If you create a WebSocket object during onclose() handler of another WebSocket object, the browser will crash. This crash may be related to http://crbug.com/44792.
Created attachment 60341 [details] Fix
Comment on attachment 60341 [details] Fix LGTM
Comment on attachment 60341 [details] Fix The code change looks reasonable, but there are some logical steps missing from the explanation. Why does creating a WebSocket object during onclose() handler of another WebSocket object cause this crash? They have separate channels and bridges. + while (m_workerContext && clientWrapper && !clientWrapper->syncMethodDone() && result != MessageQueueTerminated) { + result = runLoop.runInMode(m_workerContext.get(), m_taskMode); // May dereference this bridge and m_workerContext. clientWrapper = m_workerClientWrapper.get(); } What is the exact mechanism that dereferences worker context while waiting here? Is this part of the change covered by the test?
(In reply to comment #3) > (From update of attachment 60341 [details]) > The code change looks reasonable, but there are some logical steps missing from the explanation. Why does creating a WebSocket object during onclose() handler of another WebSocket object cause this crash? They have separate channels and bridges. > > + while (m_workerContext && clientWrapper && !clientWrapper->syncMethodDone() && result != MessageQueueTerminated) { > + result = runLoop.runInMode(m_workerContext.get(), m_taskMode); // May dereference this bridge and m_workerContext. > clientWrapper = m_workerClientWrapper.get(); > } > > What is the exact mechanism that dereferences worker context while waiting here? Is this part of the change covered by the test? I found the test case first, and then looked into the failure. So I'm not sure about the exact mechanism of why this test causes the failure, but it is 100% reproducible with this test. The source of the failure is that calling WorkerRunLoop::runInMode() may call WebSocket::didClose() (which frees everything). The stack trace is like the following: WebSocket::close() is called (in ws1.close()) -> WorkerThreadableWebSocketChannel::bufferedAmount() -> WorkerThreadableWebSocketChannel::Bridge::bufferedAmount() -> WorkerThreadableWebSocketChannel::Bridge::waitForMethodCompletion() -> WorkerRunLoop::runInMode() -> ... -> workerContextDidClose() -> ThreadableWebSocketChannelClientWrapper::didClose() -> ThreadableWebSocketChannelClientWrapper::processPendingEvents() -> WebSocket::didClose() This causes the channel and other things to get closed, while WebSocket::close() does not think such thing should happen. > + result = runLoop.runInMode(m_workerContext.get(), m_taskMode); // May dereference this bridge and m_workerContext. This comment was a bit inaccurate. Since m_workerContext may become null in Bridge::disconnect(), the comment should be "May causes the bridge to be disconnected, which makes m_workerContext become null." I will update the comment in a new patch.
Created attachment 60776 [details] a simpler test I'll need to think about this patch more deeply, we may be violating some more general ownership rules here. It's awesome that we have a reproducible test case! > m_bufferedAmountAfterClose = m_channel->bufferedAmount(); // May dereference m_channel. > if (m_channel) > m_channel->close(); This comment at least is wrong - we don't care about m_channel being dereferenced - all we care about to make this call is m_channel being non-null. WebSocket holds its own reference to channel. I have a simpler reproducible case, one that doesn't even need to create another WebSocket. It crashes in several different places.
Comment on attachment 60341 [details] Fix The title of the bug isn't great - there are multiple crash locations being fixed, not just waitForMethodCompletion(). And this isn't about creating a WebSocket object from onclose(), as the original description said. I think that this patch is fine, and just needs better comments.
Created attachment 61023 [details] Fix v2 (Simplify test, add comments)
Apologies for delay in response. In the new patch, I tried to: - Add more comments and clarify the situation. - Simplify the test according to Alexey's proposal. Could you take another look?
Comment on attachment 61023 [details] Fix v2 (Simplify test, add comments) + m_bufferedAmountAfterClose = m_channel->bufferedAmount(); // May dereference m_channel. This comment is still misleading, we don't care about it being dereferenced. WebSocket holds its own reference to m_channel. I thin that the real explanation is that a didClose notification may be already queued, which we will inadvertently process while waiting for bufferedAmount() to return.
Created attachment 61222 [details] Fix v3 (Improve comments as per Alexey's suggestion)
Comment on attachment 61222 [details] Fix v3 (Improve comments as per Alexey's suggestion) Clearing flags on attachment: 61222 Committed r63159: <http://trac.webkit.org/changeset/63159>
All reviewed patches have been landed. Closing bug.