Summary: | EventSource should only accept UTF-8 charset | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Julien Chaffraix <jchaffraix> | ||||||||||
Component: | WebCore Misc. | Assignee: | Nobody <webkit-unassigned> | ||||||||||
Status: | RESOLVED FIXED | ||||||||||||
Severity: | Normal | CC: | adam.bergkvist, annevk, ap, commit-queue, ian, webkit.review.bot | ||||||||||
Priority: | P2 | ||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||
Hardware: | All | ||||||||||||
OS: | All | ||||||||||||
See Also: | https://bugs.webkit.org/show_bug.cgi?id=225416 | ||||||||||||
Attachments: |
|
Description
Julien Chaffraix
2011-03-23 11:33:06 PDT
Per spec only streams with the exact MIME type "text/event-stream" should be accepted. If you think other streams should be accepted too, please make sure to contact the mailing list as well so we can update the spec accordingly. Created attachment 86657 [details]
Proposed fix: Accept charset UTF-8, block and log any other charset
Comment on attachment 86657 [details] Proposed fix: Accept charset UTF-8, block and log any other charset View in context: https://bugs.webkit.org/attachment.cgi?id=86657&action=review Does charset="UTF-8" (with quotes) work? Could you add a test? Pelase do send a message to HTML mailing list, as suggested by Ian. > LayoutTests/http/tests/eventsource/eventsource-content-type-charset-expected.txt:1 > +CONSOLE MESSAGE: line 1: EventSource's response had a charset ("Windows-1152") that is not UTF-8. Aborting the connection. There are two failure subtests, so why only one console message? > Source/WebCore/page/EventSource.cpp:186 > - if (statusCode == 200 && response.mimeType() == "text/event-stream") { > + bool isResponseValid = statusCode == 200 && response.mimeType() == "text/event-stream"; I'm confused. The bug description sounds like we should be allowing more than we do today, but the actual fix makes code more restrictive. Which is correct? Also, s/isResponseValid/responseIsValid/ (or isValidResponse). > Source/WebCore/page/EventSource.cpp:194 > + String message = "EventSource's response had a charset (\""; > + message += charset; > + message += "\") that is not UTF-8. Aborting the connection."; I think it's wrong grammar to have different tenses in one sentence (had... is). > Does charset="UTF-8" (with quotes) work? Could you add a test? It does, I will add some testing along those lines. > Pelase do send a message to HTML mailing list, as suggested by Ian. Sure. Ian, we are actually already allowing a charset in the Content-Type header. > > LayoutTests/http/tests/eventsource/eventsource-content-type-charset-expected.txt:1 > > +CONSOLE MESSAGE: line 1: EventSource's response had a charset ("Windows-1152") that is not UTF-8. Aborting the connection. > > There are two failure subtests, so why only one console message? There is 2 failures but only one failure involves an invalid charset. We only log the error when we have a valid mime-type but an invalid charset thus only one console message. Would like me to log also if the mime-type is wrong? > > Source/WebCore/page/EventSource.cpp:186 > > - if (statusCode == 200 && response.mimeType() == "text/event-stream") { > > + bool isResponseValid = statusCode == 200 && response.mimeType() == "text/event-stream"; > > I'm confused. The bug description sounds like we should be allowing more than we do today, but the actual fix makes code more restrictive. Which is correct? This is correct. This change is restricting the behavior to only accept UTF-8 as was suggested in the previous bug. > Also, s/isResponseValid/responseIsValid/ (or isValidResponse). OK. > > Source/WebCore/page/EventSource.cpp:194 > > + String message = "EventSource's response had a charset (\""; > > + message += charset; > > + message += "\") that is not UTF-8. Aborting the connection."; > > I think it's wrong grammar to have different tenses in one sentence (had... is). I will fix it to use present in both cases. Comment on attachment 86657 [details] Proposed fix: Accept charset UTF-8, block and log any other charset > This is correct. This change is restricting the behavior to only accept UTF-8 as was suggested in the previous bug. OK. Please rename the bug in ChangeLog. > Would like me to log also if the mime-type is wrong? Sounds like that would be useful for developers. Brought this change to the WHATWG group: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-March/031003.html Created attachment 87251 [details]
New patch following Alexey's comments: Only accept UTF-8 and log most rejection cases
As the proposal did raise any objection, here is the updated patch.
Comment on attachment 87251 [details] New patch following Alexey's comments: Only accept UTF-8 and log most rejection cases Rejecting attachment 87251 [details] from commit-queue. Failed to run "['./Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=cr-jail-8', 'build-..." exit_code: 2 Last 500 characters of output: /tests/appcache ........................................................... http/tests/cache ........ http/tests/canvas/philip/tests .......... http/tests/canvas/webgl . http/tests/cookies ..... http/tests/css .... http/tests/eventsource .. http/tests/eventsource/eventsource-content-type-charset.html -> failed Exiting early after 1 failures. 22089 tests run. 569.43s total testing time 22088 test cases (99%) succeeded 1 test case (<1%) had incorrect layout 14 test cases (<1%) had stderr output Full output: http://queues.webkit.org/results/8284290 > http/tests/eventsource/eventsource-content-type-charset.html -> failed
The cause of failure is that the Mac implementation of ResourceResponse::textEncodingName does not strip the leading and trailing quotes in the charset. Instead it escapes them and we fail the comparison to "utf-8". Qt, for example, does remove them.
I could add a work-around for that or just go ahead and fix the output of Mac, documenting what we expect textEncodingName to return.
> The cause of failure is that the Mac implementation of ResourceResponse::textEncodingName does not strip the leading and trailing quotes in the charset. I thought that I added a workaround for this in bug 46573. What did go wrong? > I thought that I added a workaround for this in bug 46573. What did go wrong?
I haven't had time to debug the problem so I don't know what's going on yet. I will look into why this is happening here.
(In reply to comment #11) > > I thought that I added a workaround for this in bug 46573. What did go wrong? > > I haven't had time to debug the problem so I don't know what's going on yet. I will look into why this is happening here. Alexey, the problem is not in WebKit. It looks like PHP on Mac escapes quotes in the charset with slashes which is causing the problem (I had to look at some network capture to find this out). I will be posting an updated patch to work around this misbehavior. Created attachment 88028 [details]
Same as previously - with a work-around for some PHP escaping issues
Attachment 88028 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style']" exit_code: 1
LayoutTests/ChangeLog-2011-02-16:27403: ChangeLog entry has no bug number [changelog/bugnumber] [5]
Total errors found: 1 in 8 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 88028 [details] Same as previously - with a work-around for some PHP escaping issues View in context: https://bugs.webkit.org/attachment.cgi?id=88028&action=review > LayoutTests/ChangeLog:21 > + in the URL. Also added a work-around a bug in PHP. I don't think that it's a bug. A quick web search suggested that it's a configuration option, and we should do something like: if (get_magic_quotes_gpc()){ $value = stripslashes($value); } See also: <http://php.net/manual/en/function.stripslashes.php>, <http://php.net/manual/en/function.get-magic-quotes-gpc.php>. Please use the proposed pattern, and fix the ChangeLog accordingly. > LayoutTests/http/tests/eventsource/eventsource-content-type-charset.html:35 > + 'text/event-stream; charset=windows-1152', Did you mean to use a non-existent charset here? If so, 1152 is an overly subtle way to express that. I think that we should test windows-1251 or us-ascii. Comment on attachment 87251 [details] New patch following Alexey's comments: Only accept UTF-8 and log most rejection cases Cleared Alexey Proskuryakov's review+ from obsolete attachment 87251 [details] so that this bug does not appear in http://webkit.org/pending-commit. > > LayoutTests/ChangeLog:21 > > + in the URL. Also added a work-around a bug in PHP. > > I don't think that it's a bug. A quick web search suggested that it's a configuration option, and we should do something like: > > if (get_magic_quotes_gpc()){ > $value = stripslashes($value); > } > > See also: <http://php.net/manual/en/function.stripslashes.php>, <http://php.net/manual/en/function.get-magic-quotes-gpc.php>. Indeed. I should have expected some different options on Mac. Do we know why the configuration is different between Linux and Mac? It looks like something that will bite us at some point and would like to fix. > > LayoutTests/http/tests/eventsource/eventsource-content-type-charset.html:35 > > + 'text/event-stream; charset=windows-1152', > > Did you mean to use a non-existent charset here? If so, 1152 is an overly subtle way to express that. No, this is a typo. Changed it to windows-1251 as the logic does not care if the charset exists or not. > Do we know why the configuration is different between Linux and Mac?
I don't know.
Created attachment 88746 [details]
Same as previous but fixed Alexey's comments
Attachment 88746 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/Chan..." exit_code: 1
LayoutTests/ChangeLog-2011-02-16:27403: ChangeLog entry has no bug number [changelog/bugnumber] [5]
Total errors found: 1 in 8 files
If any of these errors are false positives, please file a bug against check-webkit-style.
The commit-queue encountered the following flaky tests while processing attachment 88746 [details]: fast/loader/recursive-before-unload-crash.html bug 50880 (authors: beidson@apple.com and eric@webkit.org) The commit-queue is continuing to process your patch. Comment on attachment 88746 [details] Same as previous but fixed Alexey's comments Clearing flags on attachment: 88746 Committed r83260: <http://trac.webkit.org/changeset/83260> All reviewed patches have been landed. Closing bug. |