Bug 37682 - WebSocket crash when it receives bad header.
Summary: WebSocket crash when it receives bad header.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Fumitoshi Ukai
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-04-15 16:29 PDT by Fumitoshi Ukai
Modified: 2010-04-16 16:13 PDT (History)
1 user (show)

See Also:


Attachments
Patch (6.19 KB, patch)
2010-04-15 16:40 PDT, Fumitoshi Ukai
no flags Details | Formatted Diff | Diff
Patch (5.83 KB, patch)
2010-04-15 17:40 PDT, Fumitoshi Ukai
no flags Details | Formatted Diff | Diff
Patch (5.29 KB, patch)
2010-04-16 14:45 PDT, Fumitoshi Ukai
ap: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Fumitoshi Ukai 2010-04-15 16:29:10 PDT
If server sends back the following as websocket handshake, it crashes.

HTTP/1.1 101 Web Socket Protocol Handshake\r\n
Upgrade: WebSocket\r\n
Connection: Upgrade\r\n
¥:\r\n
\r\n

See http://code.google.com/p/chromium/issues/detail?id=41147 more detail.
Comment 1 Fumitoshi Ukai 2010-04-15 16:40:45 PDT
Created attachment 53487 [details]
Patch
Comment 2 Alexey Proskuryakov 2010-04-15 17:14:25 PDT
+static String encodeHeaderField(const char* data, size_t len)

This function needs a name that makes intended usage clear, something like encodeHeaderFieldForDisplay. Otherwise, one can't help but wonder why it does what it does.

But I'm also not sure how much it helps to display garbage headers encoded as JavaScript literal strings. For one thing, this could be natural for a JavaScript function, but HTTP is a distinct field. It's better to tell what is wrong about the header field name or value than dump a string a hex codes. An finally, this may not be an important enough case to to go into all this trouble.

+            return p;

Should this be "return 0" to indicate error?
Comment 3 Fumitoshi Ukai 2010-04-15 17:29:30 PDT
(In reply to comment #2)
> +static String encodeHeaderField(const char* data, size_t len)
> 
> This function needs a name that makes intended usage clear, something like
> encodeHeaderFieldForDisplay. Otherwise, one can't help but wonder why it does
> what it does.

Ok. I'll rename it to encodeHeaderFieldForDisplay().

> 
> But I'm also not sure how much it helps to display garbage headers encoded as
> JavaScript literal strings. For one thing, this could be natural for a
> JavaScript function, but HTTP is a distinct field. It's better to tell what is
> wrong about the header field name or value than dump a string a hex codes. An
> finally, this may not be an important enough case to to go into all this
> trouble.

I see. Make it to dump a string a hex codes.
 
> +            return p;
> 
> Should this be "return 0" to indicate error?

Oops. I'll fix.
Comment 4 Alexey Proskuryakov 2010-04-15 17:39:40 PDT
Comment on attachment 53487 [details]
Patch

> I see. Make it to dump a string a hex codes.

To re-iterate, I think that it's not great. Ideally, the text should explain what exactly is wrong with the header, and not just dump it in the face of the developer. But it's not a big deal.

Marking r- for the "return p" issue.
Comment 5 Fumitoshi Ukai 2010-04-15 17:40:23 PDT
Created attachment 53494 [details]
Patch
Comment 6 Alexey Proskuryakov 2010-04-16 12:11:08 PDT
Comment on attachment 53494 [details]
Patch

> CONSOLE MESSAGE: line 0: invalid header name: a5 

This looks even more confusing than before. What is invalid about the name "a5"?
Comment 7 Fumitoshi Ukai 2010-04-16 13:32:59 PDT
(In reply to comment #6)
> (From update of attachment 53494 [details])
> > CONSOLE MESSAGE: line 0: invalid header name: a5 
> 
> This looks even more confusing than before. What is invalid about the name
> "a5"?

"a5" only is invalid UTF-8 sequence.
Comment 8 Alexey Proskuryakov 2010-04-16 13:44:42 PDT
Yes, I understand this. But to a Web developer, this would look like an ASCII string "a5" - there is nothing to say or imply that this is hexadecimal. And most Web developers probably don't know how utf-8 works anyway.
Comment 9 Fumitoshi Ukai 2010-04-16 13:48:06 PDT
(In reply to comment #8)
> Yes, I understand this. But to a Web developer, this would look like an ASCII
> string "a5" - there is nothing to say or imply that this is hexadecimal. And
> most Web developers probably don't know how utf-8 works anyway.

Ok, so changing console log to " invalid UTF-8 sequence in header name: 0xa5 (in hexadecimal)" is fine?
Comment 10 Alexey Proskuryakov 2010-04-16 14:08:03 PDT
My preferred message would be just "invalid UTF-8 sequence in header name".
Comment 11 Fumitoshi Ukai 2010-04-16 14:21:07 PDT
(In reply to comment #10)
> My preferred message would be just "invalid UTF-8 sequence in header name".

Does it mean we don't need to dump byte sequence of header name in hexadecimal?
IIRC, raw byte sequence might not be outputted correctly in some DRT, so we might need to rebaseline expectations.
Comment 12 Alexey Proskuryakov 2010-04-16 14:29:40 PDT
I think that there is no need to dump the headers, yes.
Comment 13 Fumitoshi Ukai 2010-04-16 14:32:35 PDT
(In reply to comment #12)
> I think that there is no need to dump the headers, yes.

Ok, so I'll change log message and remove dumping byte sequence.
Comment 14 Fumitoshi Ukai 2010-04-16 14:45:42 PDT
Created attachment 53566 [details]
Patch
Comment 15 Alexey Proskuryakov 2010-04-16 16:03:27 PDT
Comment on attachment 53566 [details]
Patch

r=me
Comment 16 Fumitoshi Ukai 2010-04-16 16:13:54 PDT
Committed r57760: <http://trac.webkit.org/changeset/57760>