RESOLVED FIXED 212562
Consider a Thread Specific Cache for AssemblerBuffers
https://bugs.webkit.org/show_bug.cgi?id=212562
Summary Consider a Thread Specific Cache for AssemblerBuffers
Michael Saboff
Reported 2020-05-29 21:37:06 PDT
When JIT compiling, we create AssemblerBuffers to store the generated code before linking and copying to executable memory. Those buffers start out with a built in capacity of 128 bytes. Every time we need to grow, we realloc() which is usually a malloc() followed by a copy. This causes malloc churn, possible cache locality issues and other performance hits. If we keep a thread local cache, it should allow JIT worker threads to reuse a prior buffer with little or no realloc()'s.
Attachments
Patch (6.99 KB, patch)
2020-05-29 21:48 PDT, Michael Saboff
no flags
Patch to fix build failure (6.98 KB, patch)
2020-05-30 07:57 PDT, Michael Saboff
no flags
Updated patch (7.87 KB, patch)
2020-05-30 23:01 PDT, Michael Saboff
fpizlo: review+
Michael Saboff
Comment 1 2020-05-29 21:48:09 PDT
Radar WebKit Bug Importer
Comment 2 2020-05-29 21:48:45 PDT
Michael Saboff
Comment 3 2020-05-30 07:57:27 PDT
Created attachment 400668 [details] Patch to fix build failure
Filip Pizlo
Comment 4 2020-05-30 12:13:20 PDT
Comment on attachment 400668 [details] Patch to fix build failure View in context: https://bugs.webkit.org/attachment.cgi?id=400668&action=review > Source/JavaScriptCore/assembler/AssemblerBuffer.h:202 > + *threadSpecific = WTFMove(m_storage); Worth trying just WTFMoving if the destination is clear. Or letting the bigger buffer win. > Source/JavaScriptCore/assembler/AssemblerBuffer.h:324 > + void* ptr = static_cast<AssemblerData*>(threadSpecific); I’m not sure that this placement new thing is the canonical way to initialize a thread specific?
Michael Saboff
Comment 5 2020-05-30 23:01:23 PDT
Created attachment 400696 [details] Updated patch > > Source/JavaScriptCore/assembler/AssemblerBuffer.h:202 > > + *threadSpecific = WTFMove(m_storage); > > Worth trying just WTFMoving if the destination is clear. Or letting the > bigger buffer win. Made that change and it appears to improve performance. > > Source/JavaScriptCore/assembler/AssemblerBuffer.h:324 > > + void* ptr = static_cast<AssemblerData*>(threadSpecific); > > I’m not sure that this placement new thing is the canonical way to > initialize a thread specific? Copied this from a similar recent patch. Speculative fix for the JSC i386 build failure.
Michael Saboff
Comment 6 2020-05-31 07:56:08 PDT
Saam Barati
Comment 7 2020-06-01 10:11:10 PDT
Comment on attachment 400696 [details] Updated patch View in context: https://bugs.webkit.org/attachment.cgi?id=400696&action=review > Source/JavaScriptCore/wasm/WasmWorklist.cpp:123 > + clearAssembleDataThreadSpecificCache(); Maybe clear it out on VM destruction? What about the case when a GC finishes linking some code. What thread is that done on? Maybe we should have a way of clearing that out too.
Saam Barati
Comment 8 2020-06-01 10:16:50 PDT
Comment on attachment 400696 [details] Updated patch LGTM too, but I believe you're currently leaking these for web workers. I wonder if we should just say VM destruction releases this data too.
Saam Barati
Comment 9 2020-06-01 10:18:11 PDT
Comment on attachment 400696 [details] Updated patch View in context: https://bugs.webkit.org/attachment.cgi?id=400696&action=review > Source/JavaScriptCore/assembler/AssemblerBuffer.h:131 > + void takeBufferIfLarger(AssemblerData&& other) nit: I would've used a reference instead of rvalue reference here, since it's not always being moved out of.
Saam Barati
Comment 10 2020-06-01 10:24:02 PDT
(In reply to Filip Pizlo from comment #4) > Comment on attachment 400668 [details] > Patch to fix build failure > > View in context: > https://bugs.webkit.org/attachment.cgi?id=400668&action=review > > > Source/JavaScriptCore/assembler/AssemblerBuffer.h:202 > > + *threadSpecific = WTFMove(m_storage); > > Worth trying just WTFMoving if the destination is clear. Or letting the > bigger buffer win. > > > Source/JavaScriptCore/assembler/AssemblerBuffer.h:324 > > + void* ptr = static_cast<AssemblerData*>(threadSpecific); > > I’m not sure that this placement new thing is the canonical way to > initialize a thread specific? Yeah this is how I did it recently. This seemed like a reasonable way to me to initialize non PODs, but am curious if you think there is a better way?
Michael Saboff
Comment 11 2020-06-01 11:14:36 PDT
(In reply to Saam Barati from comment #7) > Comment on attachment 400696 [details] > Updated patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=400696&action=review > > > Source/JavaScriptCore/wasm/WasmWorklist.cpp:123 > > + clearAssembleDataThreadSpecificCache(); > > Maybe clear it out on VM destruction? > > What about the case when a GC finishes linking some code. What thread is > that done on? Maybe we should have a way of clearing that out too. It puts it into the GC's thread. I really want it to go back to the original thread's cache, but felt the locking and "owning" thread logic could slow things down.
Saam Barati
Comment 12 2020-06-01 12:38:33 PDT
(In reply to Saam Barati from comment #10) > (In reply to Filip Pizlo from comment #4) > > Comment on attachment 400668 [details] > > Patch to fix build failure > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=400668&action=review > > > > > Source/JavaScriptCore/assembler/AssemblerBuffer.h:202 > > > + *threadSpecific = WTFMove(m_storage); > > > > Worth trying just WTFMoving if the destination is clear. Or letting the > > bigger buffer win. > > > > > Source/JavaScriptCore/assembler/AssemblerBuffer.h:324 > > > + void* ptr = static_cast<AssemblerData*>(threadSpecific); > > > > I’m not sure that this placement new thing is the canonical way to > > initialize a thread specific? > > Yeah this is how I did it recently. This seemed like a reasonable way to me > to initialize non PODs, but am curious if you think there is a better way? So the answer here is just to do operator* or operator-> I'm super off base here. We already automatically call ctor and dtor of T in ThreadSpecific<T>. No need for this cleanup code I'm proposing. Will fix.
Geoffrey Garen
Comment 13 2022-02-10 16:47:34 PST
Can we clear on low memory warning too?
Geoffrey Garen
Comment 14 2022-02-10 16:49:00 PST
Oh, I see: Only the compiler threads use this, and they clear right away when done compiling. Probably not profitable to clear in the middle of compiling.
Note You need to log in before you can comment on or make changes to this bug.