Bug 32249 - WebSocket test server handshake is not strict enough
Summary: WebSocket test server handshake is not strict enough
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-12-07 14:58 PST by Alexey Proskuryakov
Modified: 2009-12-17 18:38 PST (History)
4 users (show)

See Also:


Attachments
Update pywebsocket to 0.4.4 and make handshake checking stricter (20.90 KB, patch)
2009-12-09 01:55 PST, Yuzo Fujishima
no flags Details | Formatted Diff | Diff
Update pywebsocket to 0.4.4 and make handshake checking stricter (21.53 KB, patch)
2009-12-15 23:57 PST, Yuzo Fujishima
ap: review+
ap: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alexey Proskuryakov 2009-12-07 14:58:10 PST
WebSocket test server should reject invalid WebSocket connection attempts (thus testing that the client handshake request is valid).

Steps to reproduce:
1. run-webkit-websocketserver
2. telnet 127.0.0.1 8880
3. Paste the following handshake text:
--------------------
GET       /websocket/tests/simple      HTTP/1.1
connection:Upgrade
Upgrade:     WebSocket
Host: 127.0.0.1:8880
origin:
 http://www.example.org

--------------------

Results: a connection is established successfully. It shouldn't be.

There are several issues with this request:
- wrong whitespace (must always be a single space);
- wrong case of headers;
- wrong order of Connection and Upgrade headers;
- Origin is continued to the next line, which is not allowed for WebSocket.
Comment 1 Alexey Proskuryakov 2009-12-07 15:01:25 PST
Somehow, the spec does not mention Connection and Upgrade headers in section 5.1, and even says that Host should be the first header in request.
Comment 2 Ian 'Hixie' Hickson 2009-12-08 00:14:06 PST
Uh yeah, apparently I forgot about those. Pretend they're there...
Comment 3 Yuzo Fujishima 2009-12-09 01:55:17 PST
Created attachment 44524 [details]
Update pywebsocket to 0.4.4 and make handshake checking stricter
Comment 4 WebKit Review Bot 2009-12-09 01:55:56 PST
style-queue ran check-webkit-style on attachment 44524 [details] without any errors.
Comment 5 Yuzo Fujishima 2009-12-09 01:56:54 PST
Hi, reviewers,

Can you review this patch?

Yuzo
Comment 6 Ian 'Hixie' Hickson 2009-12-09 04:55:02 PST
(In reply to comment #2)
> Uh yeah, apparently I forgot about those. Pretend they're there...

I fixed the spec.
Comment 7 Alexey Proskuryakov 2009-12-09 08:46:44 PST
Comment on attachment 44524 [details]
Update pywebsocket to 0.4.4 and make handshake checking stricter

> +        "--strict",

I don't think this should be an option - why would we want to encourage sloppy implementations?

> +_FIRST_FIVE_LINES = map(re.compile, [
> +    r'^GET [\S]+ HTTP/1.1\r\n$',

The spec also says that path must begin with a slash.

> +_SIXTH_AND_LATER = re.compile(
> +    r'^'
> +    r'(WebSocket-Protocol: [\x20-\x7e]+\r\n)?'
> +    r'([Cc][Oo][Oo][Kk][Ii][Ee]2?:[^\r]*(\r\n[ \t][^\r]*)*\r\n)?'
> +    r'\r\n')

The Cookie header must also be in the restricted WebSocket form. The spec says its semantics matches HTTP, but doesn't say anything special about syntax.

Does this allow having both Cookie and Cookie2 headers in a single request?

>      for c in protocol:
> -        if not 0x21 <= ord(c) <= 0x7e:
> +        if not 0x20 <= ord(c) <= 0x7e:
>              raise HandshakeError('Illegal character in protocol: %r' % c)

I'm confused - bug 32266 has already introduced a test verifying that space is allowed in protocol name. How could that possibly work before this change?

> +_STRICTLY_GOOD_REQUESTS = (
<...>
> +    (  # Cookie with continuation lines

This shouldn't be allowed - all handshake header fields are subject to WebSocket syntax.

> +_NOT_STRICTLY_GOOD_REQUESTS = (
> +    (  # Extra space after GET
> +        'GET  /demo HTTP/1.1\r\n',
> +        'upgrade: WebSocket\r\n',

Upgrade doesn't have a correct case here.

Please also test path without a slash at the beginning.

> +    (  # Origin comes before Host

Please also test incorrect order of Upgrade and Connection

> +    (  # Unknow header

Typo: unknown.

> +# vi:sts=4 sw=4 et

We don't normally allow modelines in WebKit sources.

Looks good, thanks for tackling this! I can't deeply review Python code, but it looks reasonable to me, too.
Comment 8 Alexey Proskuryakov 2009-12-09 11:22:06 PST
> I'm confused - bug 32266 has already introduced a test verifying that space is
> allowed in protocol name. How could that possibly work before this change?

I now see that it's skipped, waiting for this patch to be landed.
Comment 9 Yuzo Fujishima 2009-12-15 23:56:07 PST
(In reply to comment #7)
> (From update of attachment 44524 [details])
> > +        "--strict",
> 
> I don't think this should be an option - why would we want to encourage sloppy
> implementations?

Two reasons:
- pywebsocket is for general (although experimental) use, and
generally speaking, the receiver of data should be lenient.
The order of headers is minor enough to tolerate, in my opinion.
- strict checking is difficult to enforce in the Apache module version
of pywebsocket, and we want to keep the behavior as consistent as
possible.

> 
> > +_FIRST_FIVE_LINES = map(re.compile, [
> > +    r'^GET [\S]+ HTTP/1.1\r\n$',
> 
> The spec also says that path must begin with a slash.

Done.

> 
> > +_SIXTH_AND_LATER = re.compile(
> > +    r'^'
> > +    r'(WebSocket-Protocol: [\x20-\x7e]+\r\n)?'
> > +    r'([Cc][Oo][Oo][Kk][Ii][Ee]2?:[^\r]*(\r\n[ \t][^\r]*)*\r\n)?'
> > +    r'\r\n')
> 
> The Cookie header must also be in the restricted WebSocket form. The spec says
> its semantics matches HTTP, but doesn't say anything special about syntax.
> 
> Does this allow having both Cookie and Cookie2 headers in a single request?

Fixed to allow both Cookie and Cookie2 in a same request.

> 
> >      for c in protocol:
> > -        if not 0x21 <= ord(c) <= 0x7e:
> > +        if not 0x20 <= ord(c) <= 0x7e:
> >              raise HandshakeError('Illegal character in protocol: %r' % c)
> 
> I'm confused - bug 32266 has already introduced a test verifying that space is
> allowed in protocol name. How could that possibly work before this change?

The test added by 32266 is currently skipped.

> 
> > +_STRICTLY_GOOD_REQUESTS = (
> <...>
> > +    (  # Cookie with continuation lines
> 
> This shouldn't be allowed - all handshake header fields are subject to
> WebSocket syntax.

Fixed.

> 
> > +_NOT_STRICTLY_GOOD_REQUESTS = (
> > +    (  # Extra space after GET
> > +        'GET  /demo HTTP/1.1\r\n',
> > +        'upgrade: WebSocket\r\n',
> 
> Upgrade doesn't have a correct case here.

Good catch. Thank you!

> 
> Please also test path without a slash at the beginning.

Added.

> 
> > +    (  # Origin comes before Host
> 
> Please also test incorrect order of Upgrade and Connection

Added.

> 
> > +    (  # Unknow header
> 
> Typo: unknown.

Fixed.

> 
> > +# vi:sts=4 sw=4 et
> 
> We don't normally allow modelines in WebKit sources.

pywebsocket follows a coding rule different form that of WebKit. 

> 
> Looks good, thanks for tackling this! I can't deeply review Python code, but it
> looks reasonable to me, too.

Thank you for the review. I'll upload a new patch right after this.

Yuzo
Comment 10 Yuzo Fujishima 2009-12-15 23:57:17 PST
Created attachment 44948 [details]
Update pywebsocket to 0.4.4 and make handshake checking stricter
Comment 11 WebKit Review Bot 2009-12-15 23:58:35 PST
style-queue ran check-webkit-style on attachment 44948 [details] without any errors.
Comment 12 Alexey Proskuryakov 2009-12-16 00:13:35 PST
> Two reasons:
> - pywebsocket is for general (although experimental) use, and
> generally speaking, the receiver of data should be lenient.
> The order of headers is minor enough to tolerate, in my opinion.
> - strict checking is difficult to enforce in the Apache module version
> of pywebsocket, and we want to keep the behavior as consistent as
> possible.

The specification has MUST level requirements for strict checking, so any implementation that is lenient is automatically non-conforming. This is even more important in general use than in testing.

If it's not practically feasible to achieve the level of strictness required by the spec due to Apache limitations, please bring this up on IETF hybi mailing list - the protocol may need to be changed to achieve its security goals by other means.
Comment 13 Alexey Proskuryakov 2009-12-16 00:19:48 PST
Hmm, re-reading the spec for server-side requirements, I'm actually not sure what form they take. "Requirements phrased in the imperative as part of algorithms <...> are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm" - I do not see where the algorithms are "introduced".
Comment 14 Yuzo Fujishima 2009-12-16 00:25:21 PST
MUST level applies for the data sent by the client, but not for the data sent
by the server.
http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-66#section-5.1

For example, "if the second token doesn't begin with U+002F SOLIDUS character
(/), the server should abort the connection". The server may choose to keep the
connection in this case. The server has the freedom of being as lenient as
desired in checking received data.

Yuzo
Comment 15 Yuzo Fujishima 2009-12-16 00:27:42 PST
Oops, sorry, I meant:

MUST level applies for the way the client sends data, but not for the way the
server examine the data.
Comment 16 Ian 'Hixie' Hickson 2009-12-16 09:14:24 PST
Specifically, the only requirement on even reading the handshake was "Servers may read part or all of the client's handshake before closing the connection or sending all of their side of the handshake".

I've tried to clarify this by switching the first two subsections of the parser side around, and adding a requirement that at a minimum the 0A0D0A0D bytes be read and discarded before reading frames.
Comment 17 Alexey Proskuryakov 2009-12-16 09:58:05 PST
Comment on attachment 44948 [details]
Update pywebsocket to 0.4.4 and make handshake checking stricter

 +    r'(WebSocket-Protocol: [\x20-\x7e]+\r\n)?'
> +    r'([Cc][Oo][Oo][Kk][Ii][Ee]:[^\r]*\r\n)*'
> +    r'([Cc][Oo][Oo][Kk][Ii][Ee]2:[^\r]*\r\n)?'
> +    r'([Cc][Oo][Oo][Kk][Ii][Ee]:[^\r]*\r\n)*'

You didn't address my comment about Cookie headers also being in restricted WebSocket syntax. Like other headers, they are case sensitive, require one space after colon, etc.

This doesn't correctly handle the case of multiple Cookie headers - that deserves at least a FIXME comment, if not fixing right away.

> -      version='0.4.3',
> +      version='0.4.5',

ChangeLog should be fixed to say 0.4.5.

> MUST level applies for the way the client sends data, but not for the way the
> server examine the data.

Ok.

r=me. ChangeLog will need to be fixed when landing, and please consider further improving Cookie handling in a future patch.
Comment 18 Fumitoshi Ukai 2009-12-17 18:38:21 PST
Committed r52296: <http://trac.webkit.org/changeset/52296>