Bug 37682

Summary: WebSocket crash when it receives bad header.
Product: WebKit Reporter: Fumitoshi Ukai <ukai>
Component: WebCore Misc.Assignee: Fumitoshi Ukai <ukai>
Status: RESOLVED FIXED    
Severity: Normal CC: ap
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
Patch
none
Patch
none
Patch ap: review+

Description Fumitoshi Ukai 2010-04-15 16:29:10 PDT
If server sends back the following as websocket handshake, it crashes.

HTTP/1.1 101 Web Socket Protocol Handshake\r\n
Upgrade: WebSocket\r\n
Connection: Upgrade\r\n
¥:\r\n
\r\n

See http://code.google.com/p/chromium/issues/detail?id=41147 more detail.
Comment 1 Fumitoshi Ukai 2010-04-15 16:40:45 PDT
Created attachment 53487 [details]
Patch
Comment 2 Alexey Proskuryakov 2010-04-15 17:14:25 PDT
+static String encodeHeaderField(const char* data, size_t len)

This function needs a name that makes intended usage clear, something like encodeHeaderFieldForDisplay. Otherwise, one can't help but wonder why it does what it does.

But I'm also not sure how much it helps to display garbage headers encoded as JavaScript literal strings. For one thing, this could be natural for a JavaScript function, but HTTP is a distinct field. It's better to tell what is wrong about the header field name or value than dump a string a hex codes. An finally, this may not be an important enough case to to go into all this trouble.

+            return p;

Should this be "return 0" to indicate error?
Comment 3 Fumitoshi Ukai 2010-04-15 17:29:30 PDT
(In reply to comment #2)
> +static String encodeHeaderField(const char* data, size_t len)
> 
> This function needs a name that makes intended usage clear, something like
> encodeHeaderFieldForDisplay. Otherwise, one can't help but wonder why it does
> what it does.

Ok. I'll rename it to encodeHeaderFieldForDisplay().

> 
> But I'm also not sure how much it helps to display garbage headers encoded as
> JavaScript literal strings. For one thing, this could be natural for a
> JavaScript function, but HTTP is a distinct field. It's better to tell what is
> wrong about the header field name or value than dump a string a hex codes. An
> finally, this may not be an important enough case to to go into all this
> trouble.

I see. Make it to dump a string a hex codes.
 
> +            return p;
> 
> Should this be "return 0" to indicate error?

Oops. I'll fix.
Comment 4 Alexey Proskuryakov 2010-04-15 17:39:40 PDT
Comment on attachment 53487 [details]
Patch

> I see. Make it to dump a string a hex codes.

To re-iterate, I think that it's not great. Ideally, the text should explain what exactly is wrong with the header, and not just dump it in the face of the developer. But it's not a big deal.

Marking r- for the "return p" issue.
Comment 5 Fumitoshi Ukai 2010-04-15 17:40:23 PDT
Created attachment 53494 [details]
Patch
Comment 6 Alexey Proskuryakov 2010-04-16 12:11:08 PDT
Comment on attachment 53494 [details]
Patch

> CONSOLE MESSAGE: line 0: invalid header name: a5 

This looks even more confusing than before. What is invalid about the name "a5"?
Comment 7 Fumitoshi Ukai 2010-04-16 13:32:59 PDT
(In reply to comment #6)
> (From update of attachment 53494 [details])
> > CONSOLE MESSAGE: line 0: invalid header name: a5 
> 
> This looks even more confusing than before. What is invalid about the name
> "a5"?

"a5" only is invalid UTF-8 sequence.
Comment 8 Alexey Proskuryakov 2010-04-16 13:44:42 PDT
Yes, I understand this. But to a Web developer, this would look like an ASCII string "a5" - there is nothing to say or imply that this is hexadecimal. And most Web developers probably don't know how utf-8 works anyway.
Comment 9 Fumitoshi Ukai 2010-04-16 13:48:06 PDT
(In reply to comment #8)
> Yes, I understand this. But to a Web developer, this would look like an ASCII
> string "a5" - there is nothing to say or imply that this is hexadecimal. And
> most Web developers probably don't know how utf-8 works anyway.

Ok, so changing console log to " invalid UTF-8 sequence in header name: 0xa5 (in hexadecimal)" is fine?
Comment 10 Alexey Proskuryakov 2010-04-16 14:08:03 PDT
My preferred message would be just "invalid UTF-8 sequence in header name".
Comment 11 Fumitoshi Ukai 2010-04-16 14:21:07 PDT
(In reply to comment #10)
> My preferred message would be just "invalid UTF-8 sequence in header name".

Does it mean we don't need to dump byte sequence of header name in hexadecimal?
IIRC, raw byte sequence might not be outputted correctly in some DRT, so we might need to rebaseline expectations.
Comment 12 Alexey Proskuryakov 2010-04-16 14:29:40 PDT
I think that there is no need to dump the headers, yes.
Comment 13 Fumitoshi Ukai 2010-04-16 14:32:35 PDT
(In reply to comment #12)
> I think that there is no need to dump the headers, yes.

Ok, so I'll change log message and remove dumping byte sequence.
Comment 14 Fumitoshi Ukai 2010-04-16 14:45:42 PDT
Created attachment 53566 [details]
Patch
Comment 15 Alexey Proskuryakov 2010-04-16 16:03:27 PDT
Comment on attachment 53566 [details]
Patch

r=me
Comment 16 Fumitoshi Ukai 2010-04-16 16:13:54 PDT
Committed r57760: <http://trac.webkit.org/changeset/57760>