Bug 35721

Summary: WebSocket closing handshake
Product: WebKit Reporter: Fumitoshi Ukai <ukai>
Component: WebCore JavaScriptAssignee: Fumitoshi Ukai <ukai>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, adamk, ap, commit-queue, dglazkov, eric, joenotcharles, kent.hansen, mike, mjs, tkent, toyoshim, tyoshino, webkit.review.bot, yutak
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 35573, 60878, 60945, 61034, 61277, 61643    
Bug Blocks: 35572, 50099    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Merge to trunk
none
The final part
none
Archive of layout-test-results from ec2-cr-linux-01
none
The final part v2
none
Ready to land none

Description Fumitoshi Ukai 2010-03-03 23:34:00 PST
WebSocket API Editor's Draft 4 March 2010 /draft-hixie-thewebsocketprotocol-76 introduces closing handshake (CLOSING state, etc)
Comment 1 Fumitoshi Ukai 2010-05-21 02:04:04 PDT
Created attachment 56682 [details]
Patch
Comment 2 Fumitoshi Ukai 2010-05-23 22:38:23 PDT
Created attachment 56848 [details]
Patch
Comment 3 Kent Hansen 2010-07-23 14:14:29 PDT
(In reply to comment #2)
> Created an attachment (id=56848) [details]
> Patch

There are no new tests for CLOSING. Isn't it possible to reliably observe this state from a test?
What about the fail() case, how is that codepath tested?
Comment 4 Fumitoshi Ukai 2010-07-25 22:10:57 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > Created an attachment (id=56848) [details] [details]
> > Patch
> 
> There are no new tests for CLOSING. Isn't it possible to reliably observe this state from a test?
> What about the fail() case, how is that codepath tested?

I'll add test for CLOSING that client has been sent closing frame (0xFF 0x00).
The fail case would be exposed by CloseEvent, which would be added by https://bugs.webkit.org/show_bug.cgi?id=35573
Could you take a look of that patch, please?
Thanks.
Comment 5 Fumitoshi Ukai 2010-07-28 01:11:54 PDT
Created attachment 62800 [details]
Patch
Comment 6 Alexey Proskuryakov 2010-10-07 21:28:15 PDT
<rdar://problem/8467487>
Comment 7 Yuta Kitamura 2010-12-14 00:41:11 PST
I think we really need to resurrect this thread and CloseEvent thread and move forward. Maybe I should rebase the patch to trunk and re-upload it for review...
Comment 8 Alexey Proskuryakov 2010-12-14 10:23:43 PST
Is there a reason to be modifying an implementation of v76 protocol? My understanding is that the next protocol is still in early stages of discussion, perhaps we should wait for that?
Comment 9 Yuta Kitamura 2010-12-14 20:28:12 PST
* CloseEvent and closing handshake is part of hixie-76 (= hybi-00) protocol. Our implementation is not compliant with hixie-76.

* I think these features are likely to be included in the next generation WebSocket protocol / API. We probably need to make progress on this patch and CloseEvent patch sooner or later.

* I would like to start to implement the new protocol as early as possible, as some flaws of the current protocol are already pointed out in the hybi list. We can also provide feedback to the protocol by implementing the protocol early.
Comment 10 Alexey Proskuryakov 2010-12-15 11:20:42 PST
Compliance with hixie-76 doesn't matter any more - our implementation is what Web developers target. We shouldn't disrupt it without a very good reason, and compliance with an obsolete spec is not such a reason.
Comment 11 Yuta Kitamura 2010-12-15 19:09:18 PST
So you are saying hixie-76 (= hybi-00) is deprecate and hybi-03 is too early to implement? This means that no protocol specification is effective for us and prevents us from doing anything. Your argument does not make much sense to me.

CloseEvent and closing handshake is unlikely to be removed from the future specification, and these features will be an important basis to implement the future protocol.

Furthermore, CloseEvent and closing handshake are mainly included in the WebSocket API specification rather than the WebSocket protocol specification. I think WebSocket API has been valid since its last update and is not deprecated...
Comment 12 Alexey Proskuryakov 2010-12-15 19:57:07 PST
> This means that no protocol specification is effective for us and prevents us from doing anything.

That cannot prevent us from doing anything.

I think that close events are fairly likely to be changed even from API perspective, and since WebSockets are no longer an experimental feature, the risk of adding the event seems to outweigh the benefit. Specifically, there is almost no known benefit - the only rationale given in bug 35573 was that the event was added to the spec.
Comment 13 Adam Barth 2010-12-15 20:20:23 PST
> WebSockets are no longer an experimental feature

What makes WebSockets no longer an experimental feature?
Comment 14 Joe Mason 2010-12-17 07:07:45 PST
(In reply to comment #12)
> I think that close events are fairly likely to be changed even from API perspective, and since WebSockets are no longer an experimental feature, the risk of adding the event seems to outweigh the benefit. Specifically, there is almost no known benefit - the only rationale given in bug 35573 was that the event was added to the spec.

Why do you think that close events will be changed?  Apart from the opening handshake the spec seems stable now.  Any further changes are likely to be driven by implementor's experiences - ie. us.

I agree that any behaviour change to the -76 protocol is a mistake.  We should do this development for the -03 protocol, in a separate branch.  (Or do them in the main branch but use a feature define to toggle between them so that users of the nightlies can play around with the new protocol.)
Comment 15 Alexey Proskuryakov 2010-12-17 08:51:03 PST
Last I checked, the semantics of the close event were very far from what TCP/IP had, so the folks in IETF may well change it.
Comment 16 Yuta Kitamura 2011-03-15 04:54:22 PDT
Discussion about close frame and closing handshake has been settled down in the IETF hybi list, and hybi draft 06 now includes closing handshake that is almost identical to hixie-76's one.

http://www.ietf.org/id/draft-ietf-hybi-thewebsocketprotocol-06.txt
(See "4.5.1. Close")

The main purpose of introducing closing handshake is (IMO) to avoid TCP RST hazard. When an endpoint receives a close frame, it is guaranteed to receive no more bytes and can safely close the underlying socket. If an endpoint tries to close a socket which has unread data, a TCP RST packet may be sent to the other endpoint and it may cause undesireble result.

So, I think it's good time to get closing handshake implemented and make our implementation conformant to hixie-76 at least.

If there is no objection, I'd like to rebase this patch to make it apply to trunk...
Comment 17 Alexey Proskuryakov 2011-03-15 08:44:31 PDT
OK, let's work on it.
Comment 18 Yuta Kitamura 2011-03-22 22:51:48 PDT
Created attachment 86572 [details]
Merge to trunk
Comment 19 Alexey Proskuryakov 2011-04-08 09:24:41 PDT
Looks like there are objections being made to WebSocket closing handshake, after all. I started researching what people are saying, and found this: <https://www.infosecisland.com/blogview/12681-The-WebSocket-Protocol-Past-Travails-To-Be-Avoided.html>.
Comment 20 Takeshi Yoshino 2011-04-12 00:17:30 PDT
(In reply to comment #19)
<https://www.infosecisland.com/blogview/12681-The-WebSocket-Protocol-Past-Travails-To-Be-Avoided.html>.

He's just arguing wording in the current spec. Not objecting to having closing handshake and its usefulness.

Purposes of closing handshake are
a) tell intermediaries (which understands WebSocket) that the connection is abandoned so it's ok to release resources for it such as memory, TCP session without depending on TCP FIN/RST
b) prevent RST hazard (An endpoint may close socket without draining received data. This causes sending out RST (since TCP stack couldn't deliver the received data to the upper layer) on some platform. This RST may impede the other peer from handling received data). This can be done by relying on TCP's FIN i.e. shutdown(SHUT_WR) and 0 from read() call on some platform, but they're not always available. We're going to address this by making sure both peers send out close frame to tell the other peer there will be no more data coming, so it's safe to call close on their socket.

For a), replying close frame is not necessary. For b), it's necessary. The current spec is adopting b).

As far as I know, no one is objecting to have a). Gabriel of Microsoft was doubting usefulness of b) but not objecting actively now. Several people on the list are demanding b)-feature to perform safe closure including Jamie Lokier one of the people who initially proposed closing handshake. In reality, when user agent doesn't care RST hazard (e.g. the browser is going down) and want to throw away resource for the connection immediately without waiting for close frame in reply, the user agent may just abandon the connection. It may be categorized as abnormal closure but not protocol violation.

There's not consensus call made, but I'm almost sure the closing handshake will be kept as is on the spec now.
Comment 21 Yuta Kitamura 2011-04-19 19:47:21 PDT
Could anybody review this? The discussion about closing semantics has been settled down for long (in IETF hybi mailing list), and no objections were raised as far as I observed. The article in comment #19 argues about just editorial issues, not protocol semantics.

It's good time to move towards implementing the new protocol.
Comment 22 Kent Tamura 2011-05-11 18:59:12 PDT
Comment on attachment 86572 [details]
Merge to trunk

View in context: https://bugs.webkit.org/attachment.cgi?id=86572&action=review

I understand the patch is harmless and we should go ahead.

Can you split the patch into multiple parts?
e.g.
  1. Adding CLOSING to WebSocket.idl, WebSocket.h, and SocketStreamHandleBase.h, and update test results.
  2. Adding ThreadableWebSocketChannelClientWrapper.cpp, and move existing inline function definitions in ThreadableWebSocketChannelClientWrapper.h to .cpp.
  3. Remaining

> LayoutTests/http/tests/websocket/tests/script-tests/close-frame.js:1
> +description("Make sure client sends closing frame.");

You can merge this file into close-frame.html.

> LayoutTests/http/tests/websocket/tests/script-tests/close-frame.js:12
> +if (window.layoutTestController)
> +    layoutTestController.waitUntilDone();
> +
> +function endTest()
> +{
> +    isSuccessfullyParsed();
> +    if (window.layoutTestController)
> +        layoutTestController.notifyDone();
> +}

You can simplify this code by jsTestIsAsync=true and finishJSTest().

> Source/WebCore/websockets/WebSocket.cpp:182
> +    if (m_state == CONNECTING) {
> +      m_state = CLOSING;
> +      m_channel->fail();
> +      return;
> +    }

Wrong indentation.

> Source/WebCore/websockets/WebSocket.cpp:188
> -        m_channel->close();
> +      m_channel->close();

ditto.
Comment 23 Yuta Kitamura 2011-05-15 22:47:02 PDT
Comment on attachment 86572 [details]
Merge to trunk

Sure, I'm going to split the patch...
Comment 24 Yuta Kitamura 2011-05-16 02:59:50 PDT
Part 1: Add CLOSING state (bug 60878)
Comment 25 Yuta Kitamura 2011-05-18 00:01:27 PDT
Part 2: Clean up ThreadableWebSocketChannelClientWrapper (bug 60945) (already committed)
Comment 26 Yuta Kitamura 2011-05-18 03:27:19 PDT
(I realized that I could still divide the remaining patch into smaller pieces, so I'm posting more patches than Kent's suggestion.)

Part 3: Use ScriptExecutionContext::Task in ThreadableWebSocketChannelClientWrapper (bug 61034).
Comment 27 Kent Tamura 2011-05-18 18:46:35 PDT
(In reply to comment #26)
> (I realized that I could still divide the remaining patch into smaller pieces, so I'm posting more patches than Kent's suggestion.)

Great!  Smaller is better.
Comment 28 Yuta Kitamura 2011-05-26 00:28:32 PDT
Created attachment 94928 [details]
The final part
Comment 29 Kent Tamura 2011-05-26 00:51:08 PDT
Comment on attachment 94928 [details]
The final part

View in context: https://bugs.webkit.org/attachment.cgi?id=94928&action=review

> Source/WebCore/websockets/WebSocketChannel.cpp:127
> +        m_closingTimer.startOneShot(2 * 2 * 60.0); // 2 * TCP Maximum Segment Lifetime.

So we had better add
  const double TCPMaximumSegmentLifetime = 2 * 60
and use it like startOneShot(2 * TCPMaximumSegmentLifetime);

> Source/WebCore/websockets/WebSocketChannelClient.h:45
> +        virtual void didClose(unsigned long /* unhandledBufferedAmount */, bool /* didReceiveClosingHandshake */) { }

Do not add bool argument.  We should use enum.
Comment 30 WebKit Review Bot 2011-05-26 02:54:09 PDT
Comment on attachment 94928 [details]
The final part

Attachment 94928 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/8736212

New failing tests:
http/tests/websocket/tests/client-close.html
http/tests/websocket/tests/server-close.html
Comment 31 WebKit Review Bot 2011-05-26 02:54:15 PDT
Created attachment 94951 [details]
Archive of layout-test-results from ec2-cr-linux-01

The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: ec2-cr-linux-01  Port: Chromium  Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Comment 32 Yuta Kitamura 2011-05-26 03:33:13 PDT
(In reply to comment #30)
> (From update of attachment 94928 [details])
> Attachment 94928 [details] did not pass chromium-ews (chromium-xvfb):
> Output: http://queues.webkit.org/results/8736212
> 
> New failing tests:
> http/tests/websocket/tests/client-close.html
> http/tests/websocket/tests/server-close.html

This seems like a false alarm; the log says the bot failed to start the new pywebsocket server:

[2011-05-26 09:45:34,085] [CRITICAL] root: mod_pywebsocket: [Errno 98] Address already in use
[2011-05-26 09:45:34,086] [CRITICAL] root: mod_pywebsocket: Traceback (most recent call last):
  File "/mnt/git/webkit-chromium-ews/Tools/Scripts/webkitpy/thirdparty/autoinstalled/pywebsocket/mod_pywebsocket/standalone.py", line 464, in _main
    WebSocketRequestHandler)
  File "/mnt/git/webkit-chromium-ews/Tools/Scripts/webkitpy/thirdparty/autoinstalled/pywebsocket/mod_pywebsocket/standalone.py", line 189, in __init__
    self.server_bind()
  File "/usr/lib/python2.6/BaseHTTPServer.py", line 108, in server_bind
    SocketServer.TCPServer.server_bind(self)
  File "/usr/lib/python2.6/SocketServer.py", line 413, in server_bind
    self.socket.bind(self.server_address)
  File "<string>", line 1, in bind
error: [Errno 98] Address already in use

I usually use chromium-linux for local development and tests, and they are passing in my local environment. So I guess they are okay...
Comment 33 Yuta Kitamura 2011-05-26 03:46:16 PDT
Created attachment 94957 [details]
The final part v2
Comment 34 WebKit Commit Bot 2011-05-26 20:24:15 PDT
Comment on attachment 94957 [details]
The final part v2

Clearing flags on attachment: 94957

Committed r87464: <http://trac.webkit.org/changeset/87464>
Comment 35 WebKit Commit Bot 2011-05-26 20:24:26 PDT
All reviewed patches have been landed.  Closing bug.
Comment 37 Adam Klein 2011-05-27 09:35:43 PDT
(In reply to comment #36)
> New tests added in this change are flaky on Windows and Mac.  See:
> 
> http://test-results.appspot.com/dashboards/flakiness_dashboard.html#showExpectations=true&tests=http%2Ftests%2Fwebsocket%2Ftests%2Fclient-close.html%2Chttp%2Ftests%2Fwebsocket%2Ftests%2Fserver-close.html
> 
> and
> 
> http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=%40ToT%20-%20chromium.org&showExpectations=true&tests=http%2Ftests%2Fwebsocket%2Ftests%2Fclient-close.html%2Chttp%2Ftests%2Fwebsocket%2Ftests%2Fserver-close.html

Since these failures don't look due to test fixture flakiness (all other websocket tests pass in the same run),  these look like real problems with the tests themselves (or the underlying code), so I've rolled your change out.
Comment 38 Yuta Kitamura 2011-05-29 20:19:39 PDT
I think these failures are caused by stale pywebsocket processes in bots.

This patch adds new pywebsocket tests, thus websocket server must be restarted. Normally the websocket server is shut down at the end of test run, but it is known that a stale pywebsocket process may sometimes continue to live; if this happens, the new websocket server process will not be created successfully, because the port is occupied by the stale process.

If there's pywebsocket.ws.log that contains "Address already in use" errors, the bot is probably affected by this issue.

So, if you see failures of client-close.html and/or server-close.html (these are new tests added in this patch), the server restart will probably fix the failures.
Comment 39 Yuta Kitamura 2011-05-29 21:19:54 PDT
I'm trying to re-land the patch and will look at the Chromium bots closely.
Comment 40 Yuta Kitamura 2011-05-29 21:24:41 PDT
Created attachment 95322 [details]
Ready to land
Comment 41 WebKit Commit Bot 2011-05-29 23:22:19 PDT
Comment on attachment 95322 [details]
Ready to land

Clearing flags on attachment: 95322

Committed r87674: <http://trac.webkit.org/changeset/87674>
Comment 42 WebKit Commit Bot 2011-05-29 23:22:30 PDT
All reviewed patches have been landed.  Closing bug.