Bug 119237 - Content filter replacement data uses the encoding from the blocked page's response headers
Summary: Content filter replacement data uses the encoding from the blocked page's res...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Andy Estes
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2013-07-30 01:38 PDT by Andy Estes
Modified: 2013-07-30 16:45 PDT (History)
3 users (show)

See Also:


Attachments
Patch (4.78 KB, patch)
2013-07-30 02:03 PDT, Andy Estes
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andy Estes 2013-07-30 01:38:02 PDT
Content filter replacement data uses the encoding from the blocked page's response headers
Comment 1 Andy Estes 2013-07-30 02:00:51 PDT
<rdar://problem/14374097>
Comment 2 Andy Estes 2013-07-30 02:03:29 PDT
Created attachment 207700 [details]
Patch
Comment 3 Alexey Proskuryakov 2013-07-30 10:22:02 PDT
Comment on attachment 207700 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=207700&action=review

> Source/WebCore/ChangeLog:15
> +        Forget about encodings determined from these sources. The replacement
> +        data will specify an encoding in a <meta charset>, so let that be used
> +        instead.

This doesn’t give me a good feeling. Why is charset special? What about all other HTTP header fields? Just as one example, will we still download if disposition was an attachment?

> Source/WebCore/loader/DocumentLoader.h:407
>          RefPtr<ContentFilter> m_contentFilter;
> +        bool m_loadWasBlockedByContentFiltering;

Can we get this information from m_contentFilter instead of tracking it separately?
Comment 4 Darin Adler 2013-07-30 12:32:25 PDT
Comment on attachment 207700 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=207700&action=review

r=me but please consider Alexey’s question

> Source/WebCore/loader/DocumentLoader.cpp:807
> +        // The content filter's replacement data has a known encoding. Ignore
> +        // both the override encoding and the blocked response's encoding.
> +        if (m_loadWasBlockedByContentFiltering) {
> +            userChosen = false;
> +            encoding = String();
> +        }

It’s a little awkward to compute and then ignore userChosen and encoding. Would be nice to structure the code so they were not even computed.
Comment 5 Andy Estes 2013-07-30 15:28:04 PDT
(In reply to comment #3)
> (From update of attachment 207700 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=207700&action=review
> 
> > Source/WebCore/ChangeLog:15
> > +        Forget about encodings determined from these sources. The replacement
> > +        data will specify an encoding in a <meta charset>, so let that be used
> > +        instead.
> 
> This doesn’t give me a good feeling. Why is charset special? What about all other HTTP header fields? Just as one example, will we still download if disposition was an attachment?

I don't think charset is special. As well as the problem you mention, mitzpettel also pointed out that we parse the replacement data using the content-type of the original response (luckily we control this data). So I'm also uneasy about how we're using headers from the original response.

This bug isn't trying to address these issues across the board, though. The problem of using the wrong encoding for the replacement data was encountered often enough that a targeted fix made sense to me, the other issues notwithstanding.

> 
> > Source/WebCore/loader/DocumentLoader.h:407
> >          RefPtr<ContentFilter> m_contentFilter;
> > +        bool m_loadWasBlockedByContentFiltering;
> 
> Can we get this information from m_contentFilter instead of tracking it separately?

Yes, good idea. I'll make the change under the auspices of Darin's review, since he asked me to consider your feedback.
Comment 6 Lucas Forschler 2013-07-30 16:05:29 PDT
<rdar://problem/14374097>
Comment 7 Andy Estes 2013-07-30 16:13:21 PDT
Committed r153502: <http://trac.webkit.org/changeset/153502>