Bug 56942

Summary: EventSource should only accept UTF-8 charset
Product: WebKit Reporter: Julien Chaffraix <jchaffraix>
Component: WebCore Misc.Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: adam.bergkvist, annevk, ap, commit-queue, ian, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
See Also: https://bugs.webkit.org/show_bug.cgi?id=225416
Attachments:
Description Flags
Proposed fix: Accept charset UTF-8, block and log any other charset
ap: review-, ap: commit-queue-
New patch following Alexey's comments: Only accept UTF-8 and log most rejection cases
none
Same as previously - with a work-around for some PHP escaping issues
ap: review+, ap: commit-queue-
Same as previous but fixed Alexey's comments none

Julien Chaffraix
Reported 2011-03-23 11:33:06 PDT
This is a follow-up of bug45372. The discussion hinted at accepting UTF-8 charset for EventSource (the charset mentioned in the spec) but failing other charset and logging it to the console. What got landed is not accepting any charset altogether. This bug is about implementing the previous behavior for compatibility with some servers sending charset automatically (tomcat was mentioned).
Attachments
Proposed fix: Accept charset UTF-8, block and log any other charset (11.52 KB, patch)
2011-03-23 11:46 PDT, Julien Chaffraix
ap: review-
ap: commit-queue-
New patch following Alexey's comments: Only accept UTF-8 and log most rejection cases (13.82 KB, patch)
2011-03-28 19:50 PDT, Julien Chaffraix
no flags
Same as previously - with a work-around for some PHP escaping issues (14.06 KB, patch)
2011-04-03 19:53 PDT, Julien Chaffraix
ap: review+
ap: commit-queue-
Same as previous but fixed Alexey's comments (14.20 KB, patch)
2011-04-07 18:10 PDT, Julien Chaffraix
no flags
Ian 'Hixie' Hickson
Comment 1 2011-03-23 11:39:42 PDT
Per spec only streams with the exact MIME type "text/event-stream" should be accepted. If you think other streams should be accepted too, please make sure to contact the mailing list as well so we can update the spec accordingly.
Julien Chaffraix
Comment 2 2011-03-23 11:46:20 PDT
Created attachment 86657 [details] Proposed fix: Accept charset UTF-8, block and log any other charset
Alexey Proskuryakov
Comment 3 2011-03-23 12:31:10 PDT
Comment on attachment 86657 [details] Proposed fix: Accept charset UTF-8, block and log any other charset View in context: https://bugs.webkit.org/attachment.cgi?id=86657&action=review Does charset="UTF-8" (with quotes) work? Could you add a test? Pelase do send a message to HTML mailing list, as suggested by Ian. > LayoutTests/http/tests/eventsource/eventsource-content-type-charset-expected.txt:1 > +CONSOLE MESSAGE: line 1: EventSource's response had a charset ("Windows-1152") that is not UTF-8. Aborting the connection. There are two failure subtests, so why only one console message? > Source/WebCore/page/EventSource.cpp:186 > - if (statusCode == 200 && response.mimeType() == "text/event-stream") { > + bool isResponseValid = statusCode == 200 && response.mimeType() == "text/event-stream"; I'm confused. The bug description sounds like we should be allowing more than we do today, but the actual fix makes code more restrictive. Which is correct? Also, s/isResponseValid/responseIsValid/ (or isValidResponse). > Source/WebCore/page/EventSource.cpp:194 > + String message = "EventSource's response had a charset (\""; > + message += charset; > + message += "\") that is not UTF-8. Aborting the connection."; I think it's wrong grammar to have different tenses in one sentence (had... is).
Julien Chaffraix
Comment 4 2011-03-23 14:05:53 PDT
> Does charset="UTF-8" (with quotes) work? Could you add a test? It does, I will add some testing along those lines. > Pelase do send a message to HTML mailing list, as suggested by Ian. Sure. Ian, we are actually already allowing a charset in the Content-Type header. > > LayoutTests/http/tests/eventsource/eventsource-content-type-charset-expected.txt:1 > > +CONSOLE MESSAGE: line 1: EventSource's response had a charset ("Windows-1152") that is not UTF-8. Aborting the connection. > > There are two failure subtests, so why only one console message? There is 2 failures but only one failure involves an invalid charset. We only log the error when we have a valid mime-type but an invalid charset thus only one console message. Would like me to log also if the mime-type is wrong? > > Source/WebCore/page/EventSource.cpp:186 > > - if (statusCode == 200 && response.mimeType() == "text/event-stream") { > > + bool isResponseValid = statusCode == 200 && response.mimeType() == "text/event-stream"; > > I'm confused. The bug description sounds like we should be allowing more than we do today, but the actual fix makes code more restrictive. Which is correct? This is correct. This change is restricting the behavior to only accept UTF-8 as was suggested in the previous bug. > Also, s/isResponseValid/responseIsValid/ (or isValidResponse). OK. > > Source/WebCore/page/EventSource.cpp:194 > > + String message = "EventSource's response had a charset (\""; > > + message += charset; > > + message += "\") that is not UTF-8. Aborting the connection."; > > I think it's wrong grammar to have different tenses in one sentence (had... is). I will fix it to use present in both cases.
Alexey Proskuryakov
Comment 5 2011-03-23 14:41:52 PDT
Comment on attachment 86657 [details] Proposed fix: Accept charset UTF-8, block and log any other charset > This is correct. This change is restricting the behavior to only accept UTF-8 as was suggested in the previous bug. OK. Please rename the bug in ChangeLog. > Would like me to log also if the mime-type is wrong? Sounds like that would be useful for developers.
Julien Chaffraix
Comment 6 2011-03-24 07:23:20 PDT
Julien Chaffraix
Comment 7 2011-03-28 19:50:00 PDT
Created attachment 87251 [details] New patch following Alexey's comments: Only accept UTF-8 and log most rejection cases As the proposal did raise any objection, here is the updated patch.
WebKit Commit Bot
Comment 8 2011-03-29 05:54:47 PDT
Comment on attachment 87251 [details] New patch following Alexey's comments: Only accept UTF-8 and log most rejection cases Rejecting attachment 87251 [details] from commit-queue. Failed to run "['./Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=cr-jail-8', 'build-..." exit_code: 2 Last 500 characters of output: /tests/appcache ........................................................... http/tests/cache ........ http/tests/canvas/philip/tests .......... http/tests/canvas/webgl . http/tests/cookies ..... http/tests/css .... http/tests/eventsource .. http/tests/eventsource/eventsource-content-type-charset.html -> failed Exiting early after 1 failures. 22089 tests run. 569.43s total testing time 22088 test cases (99%) succeeded 1 test case (<1%) had incorrect layout 14 test cases (<1%) had stderr output Full output: http://queues.webkit.org/results/8284290
Julien Chaffraix
Comment 9 2011-03-31 07:50:41 PDT
> http/tests/eventsource/eventsource-content-type-charset.html -> failed The cause of failure is that the Mac implementation of ResourceResponse::textEncodingName does not strip the leading and trailing quotes in the charset. Instead it escapes them and we fail the comparison to "utf-8". Qt, for example, does remove them. I could add a work-around for that or just go ahead and fix the output of Mac, documenting what we expect textEncodingName to return.
Alexey Proskuryakov
Comment 10 2011-03-31 09:04:36 PDT
> The cause of failure is that the Mac implementation of ResourceResponse::textEncodingName does not strip the leading and trailing quotes in the charset. I thought that I added a workaround for this in bug 46573. What did go wrong?
Julien Chaffraix
Comment 11 2011-03-31 13:47:49 PDT
> I thought that I added a workaround for this in bug 46573. What did go wrong? I haven't had time to debug the problem so I don't know what's going on yet. I will look into why this is happening here.
Julien Chaffraix
Comment 12 2011-04-03 19:50:13 PDT
(In reply to comment #11) > > I thought that I added a workaround for this in bug 46573. What did go wrong? > > I haven't had time to debug the problem so I don't know what's going on yet. I will look into why this is happening here. Alexey, the problem is not in WebKit. It looks like PHP on Mac escapes quotes in the charset with slashes which is causing the problem (I had to look at some network capture to find this out). I will be posting an updated patch to work around this misbehavior.
Julien Chaffraix
Comment 13 2011-04-03 19:53:02 PDT
Created attachment 88028 [details] Same as previously - with a work-around for some PHP escaping issues
WebKit Review Bot
Comment 14 2011-04-03 19:55:32 PDT
Attachment 88028 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style']" exit_code: 1 LayoutTests/ChangeLog-2011-02-16:27403: ChangeLog entry has no bug number [changelog/bugnumber] [5] Total errors found: 1 in 8 files If any of these errors are false positives, please file a bug against check-webkit-style.
Alexey Proskuryakov
Comment 15 2011-04-03 21:02:38 PDT
Comment on attachment 88028 [details] Same as previously - with a work-around for some PHP escaping issues View in context: https://bugs.webkit.org/attachment.cgi?id=88028&action=review > LayoutTests/ChangeLog:21 > + in the URL. Also added a work-around a bug in PHP. I don't think that it's a bug. A quick web search suggested that it's a configuration option, and we should do something like: if (get_magic_quotes_gpc()){ $value = stripslashes($value); } See also: <http://php.net/manual/en/function.stripslashes.php>, <http://php.net/manual/en/function.get-magic-quotes-gpc.php>. Please use the proposed pattern, and fix the ChangeLog accordingly. > LayoutTests/http/tests/eventsource/eventsource-content-type-charset.html:35 > + 'text/event-stream; charset=windows-1152', Did you mean to use a non-existent charset here? If so, 1152 is an overly subtle way to express that. I think that we should test windows-1251 or us-ascii.
Eric Seidel (no email)
Comment 16 2011-04-06 10:46:09 PDT
Comment on attachment 87251 [details] New patch following Alexey's comments: Only accept UTF-8 and log most rejection cases Cleared Alexey Proskuryakov's review+ from obsolete attachment 87251 [details] so that this bug does not appear in http://webkit.org/pending-commit.
Julien Chaffraix
Comment 17 2011-04-07 17:35:12 PDT
> > LayoutTests/ChangeLog:21 > > + in the URL. Also added a work-around a bug in PHP. > > I don't think that it's a bug. A quick web search suggested that it's a configuration option, and we should do something like: > > if (get_magic_quotes_gpc()){ > $value = stripslashes($value); > } > > See also: <http://php.net/manual/en/function.stripslashes.php>, <http://php.net/manual/en/function.get-magic-quotes-gpc.php>. Indeed. I should have expected some different options on Mac. Do we know why the configuration is different between Linux and Mac? It looks like something that will bite us at some point and would like to fix. > > LayoutTests/http/tests/eventsource/eventsource-content-type-charset.html:35 > > + 'text/event-stream; charset=windows-1152', > > Did you mean to use a non-existent charset here? If so, 1152 is an overly subtle way to express that. No, this is a typo. Changed it to windows-1251 as the logic does not care if the charset exists or not.
Alexey Proskuryakov
Comment 18 2011-04-07 17:49:36 PDT
> Do we know why the configuration is different between Linux and Mac? I don't know.
Julien Chaffraix
Comment 19 2011-04-07 18:10:09 PDT
Created attachment 88746 [details] Same as previous but fixed Alexey's comments
WebKit Review Bot
Comment 20 2011-04-07 18:12:12 PDT
Attachment 88746 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/Chan..." exit_code: 1 LayoutTests/ChangeLog-2011-02-16:27403: ChangeLog entry has no bug number [changelog/bugnumber] [5] Total errors found: 1 in 8 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Commit Bot
Comment 21 2011-04-07 23:42:30 PDT
The commit-queue encountered the following flaky tests while processing attachment 88746 [details]: fast/loader/recursive-before-unload-crash.html bug 50880 (authors: beidson@apple.com and eric@webkit.org) The commit-queue is continuing to process your patch.
WebKit Commit Bot
Comment 22 2011-04-07 23:45:50 PDT
Comment on attachment 88746 [details] Same as previous but fixed Alexey's comments Clearing flags on attachment: 88746 Committed r83260: <http://trac.webkit.org/changeset/83260>
WebKit Commit Bot
Comment 23 2011-04-07 23:45:55 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.