Bug 44137

Summary: Crash beneath WTF::ThreadSpecificThreadExit (jump-to-null) when running websocket/tests/workers/close-in-shared-worker.html
Product: WebKit Reporter: Adam Roben (:aroben) <aroben>
Component: WebCore JavaScript Assignee: Per Arne Vollan <pvollan>
Status: NEW ---    
Severity: Normal CC: ap, atwilson, bfulgham, dimich, ggaren, joenotcharles, levin, peavo, psolanki, pvollan, ukai
Priority: P2 Keywords: InRadar, LayoutTestFailure, PlatformOnly
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Windows XP   
URL: http://build.webkit.org/results/Windows%20Release%20(Tests)/r65540%20(2834)/CrashLog_0a98_2010-08-17_15-59-44-592.txt
Bug Depends on:    
Bug Blocks: 55579    

Description Adam Roben (:aroben) 2010-08-17 16:07:39 PDT
The bots are crashing when running websocket/tests/workers/close-in-shared-worker.html. The crash logs say we're jumping to null under WTF::ThreadSpecificThreadExit (see <http://build.webkit.org/results/Windows%20Release%20(Tests)/r65540%20(2834)/CrashLog_0a98_2010-08-17_15-59-44-592.txt>).
Comment 1 Adam Roben (:aroben) 2010-08-17 16:10:31 PDT
<rdar://problem/8321688>
Comment 2 Andrew Wilson 2010-08-19 14:32:00 PDT
Looping in Ukai as this may be WebSocket related since it only seems to happen when creating/closing WebSockets on worker threads.
Comment 3 David Levin 2010-11-30 14:42:49 PST
Something went wrong with the value stored in tls.

Here's the code that crashes:

void ThreadSpecificThreadExit()
{
    for (long i = 0; i < tlsKeyCount(); i++) {
        // The layout of ThreadSpecific<T>::Data does not depend on T. So we are safe to do the static cast to ThreadSpecific<int> in order to access its data member.
        ThreadSpecific<int>::Data* data = static_cast<ThreadSpecific<int>::Data*>(TlsGetValue(tlsKeys()[i]));
        if (data)
            data->destructor(data);
    }
}

It crashed on the destructor call here: "data->destructor(data);"

It was able to read data->destructor, but the value stored there was 0, so the call jumped to address 0 and crashed.

So either
1. The tls value was overwritten in some way (some other code reused the same slot, the tls data structure was corrupted, etc.). This seems really unlikely.
2. The data structure was overwritten.
3. The data structure was freed and something new was allocated at that place which happened to have 0 there.

My gut reaction is "3".

Also it is interesting to note that there is a worker running and doing a sync xhr on thread #12 when this happened. (Of course, thread 17 could still be a former web worker thread that was exiting.)
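
To make possibility 3 concrete, here is a minimal, portable sketch of how a stale slot pointer could lead to a null destructor call. It is only an illustration under stated assumptions: `Data`, `destroyData`, `runThreadExit`, and `simulateReuse` are hypothetical stand-ins, a plain vector replaces the TlsGetValue lookup, and the struct layout is not the real ThreadSpecific<T>::Data. The null check inside the loop is there so the sketch can count the bad slots instead of crashing; it would hide the symptom, not fix the underlying lifetime problem.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ThreadSpecific<T>::Data: a value pointer plus a
// type-erased destructor callback. This is NOT the real WebKit layout.
struct Data {
    void* value = nullptr;
    void (*destructor)(void*) = nullptr;
};

// Counts destructor calls so the behavior can be observed.
static int g_destroyed = 0;

static void destroyData(void* ptr)
{
    assert(ptr);
    ++g_destroyed;
}

// Models the loop in ThreadSpecificThreadExit over a plain vector instead of
// TlsGetValue. Returns the number of slots whose destructor pointer was null,
// i.e. the slots that would have produced the jump-to-null crash.
static std::size_t runThreadExit(const std::vector<Data*>& slots)
{
    std::size_t nullDestructors = 0;
    for (Data* data : slots) {
        if (!data)
            continue;
        if (!data->destructor) { // the code quoted above has no such check
            ++nullDestructors;
            continue;
        }
        data->destructor(data);
    }
    return nullDestructors;
}

// Models possibility 3: the slot still points at the same address, but the
// memory now holds something else with 0 where the destructor pointer was.
static void simulateReuse(Data& data)
{
    data = Data{}; // value-initialized: both members are null
}
```

With two live slots, `runThreadExit` calls both destructors and reports no null entries. After `simulateReuse` is applied to one of them, the same slot list reports one null destructor, which is the state the crash log suggests thread 17 was in.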