RESOLVED FIXED 200329
[JSC] Support WebAssembly in SamplingProfiler
https://bugs.webkit.org/show_bug.cgi?id=200329
Summary [JSC] Support WebAssembly in SamplingProfiler
Yusuke Suzuki
Reported 2019-07-31 18:40:35 PDT
Currently, sampling profiler does not list up wasm functions. We should do that to investigate JetStream2 wasm benchmarks performance bottlenecks.
Attachments
Patch (47.38 KB, patch)
2019-07-31 22:15 PDT, Yusuke Suzuki
no flags
Patch (64.67 KB, patch)
2019-08-01 00:26 PDT, Yusuke Suzuki
no flags
Archive of layout-test-results from ews115 for mac-highsierra (3.12 MB, application/zip)
2019-08-01 02:25 PDT, EWS Watchlist
no flags
Patch (72.94 KB, patch)
2019-08-01 11:35 PDT, Yusuke Suzuki
saam: review+
Yusuke Suzuki
Comment 1 2019-07-31 18:49:46 PDT
(In reply to Yusuke Suzuki from comment #0) > Currently, sampling profiler does not list up wasm functions. We should do > that to investigate JetStream2 wasm benchmarks performance bottlenecks. I'll use this to investigate wasm benchmark status in JetStream2 cli.js
Yusuke Suzuki
Comment 2 2019-07-31 22:15:39 PDT
Created attachment 375292 [details] Patch WIP
EWS Watchlist
Comment 3 2019-07-31 22:18:20 PDT
Attachment 375292 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/wasm/WasmFaultSignalHandler.cpp:78: Extra space before [. [whitespace/brackets] [5] Total errors found: 1 in 17 files If any of these errors are false positives, please file a bug against check-webkit-style.
Yusuke Suzuki
Comment 4 2019-08-01 00:26:10 PDT
EWS Watchlist
Comment 5 2019-08-01 00:28:06 PDT
Attachment 375294 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/wasm/WasmFaultSignalHandler.cpp:78: Extra space before [. [whitespace/brackets] [5] Total errors found: 1 in 22 files If any of these errors are false positives, please file a bug against check-webkit-style.
Saam Barati
Comment 6 2019-08-01 01:33:19 PDT
Comment on attachment 375294 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=375294&action=review Will look in detail tomorrow. > Source/JavaScriptCore/runtime/SamplingProfiler.cpp:91 > + , m_wasmCalleeLocker(wasmCalleeLocker) Stopping Wasm and grabbing a lock is somewhat new in the sense that Wasm::Callee lives outside VM where SamplingProfiler is per VM. Do you foresee any potential for deadlocks when running two sampling profilers? > Source/JavaScriptCore/wasm/WasmCallee.cpp:85 > + return callees->contains(callee); Nit: it’s a little weird to put all this singleton stuff on Callee. I think it’d be better to have something like a CalleeRegistry class. > Source/JavaScriptCore/wasm/WasmIndexOrName.h:38 > +// Keep this class copyable the world is stopped: do not allocate any memory while copying this. “copyable the” => “copyable when the”
Yusuke Suzuki
Comment 7 2019-08-01 02:24:21 PDT
Comment on attachment 375294 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=375294&action=review >> Source/JavaScriptCore/runtime/SamplingProfiler.cpp:91 >> + , m_wasmCalleeLocker(wasmCalleeLocker) > > Stopping Wasm and grabbing a lock is somewhat new in the sense that Wasm::Callee lives outside VM where SamplingProfiler is per VM. Do you foresee any potential for deadlocks when running two sampling profilers? So long as we keep a correct order of locking, the deadlock can be avoided. Since Wasm::Callee's lock is somewhat a leaf of a lock (we do not take other locks while holding this Wasm::Callee's lock), so long as we take this lock at the end of lock taking sequence, I think this is OK. >> Source/JavaScriptCore/wasm/WasmCallee.cpp:85 >> + return callees->contains(callee); > > Nit: it’s a little weird to put all this singleton stuff on Callee. I think it’d be better to have something like a CalleeRegistry class. The benefit of this is that we can completely hide `registerCallee` and `unregisterCallee` inside WasmCallee.cpp. The interfaces modifying this set are strictly limited: Callee::create & Callee::~Callee. But maybe, adding CalleeRegistry is simpler. Changing. >> Source/JavaScriptCore/wasm/WasmIndexOrName.h:38 >> +// Keep this class copyable the world is stopped: do not allocate any memory while copying this. > > “copyable the” => “copyable when the” Fixed.
Yusuke Suzuki
Comment 8 2019-08-01 02:24:54 PDT
BTW, here is sample output. Starting JetStream2 Running gcc-loops-wasm: Startup: 227.273 Run time: 0.588 Score: 11.561 Startup: 227.273 Runtime: 0.588 Total Score: 11.561 Sampling rate: 1000.000000 microseconds Top functions as <numSamples 'functionName:sourceID'> 2603 '<?>.wasm-function[38]:-1' 1494 '<?>.wasm-function[39]:-1' 1004 '<?>.wasm-function[36]:-1' 965 '<?>.wasm-function[35]:-1' 325 '<?>.wasm-function[34]:-1' 202 '<?>.wasm-function[546]:-1' 16 '<?>.wasm-function[40]:-1' 3 'loadString:-1' 3 '<?>.wasm-function[37]:-1' 2 'load:-1' 1 'prepareToRun:2' 1 '(unknown):-1' Sampling rate: 1000.000000 microseconds Hottest bytecodes as <numSamples 'functionName#hash:JITType:bytecodeIndex'> 2603 '<?>.wasm-function[38]#<nil>:BBQ:<nil>' 1494 '<?>.wasm-function[39]#<nil>:BBQ:<nil>' 1004 '<?>.wasm-function[36]#<nil>:BBQ:<nil>' 965 '<?>.wasm-function[35]#<nil>:BBQ:<nil>' 324 '<?>.wasm-function[34]#<nil>:OMG:<nil>' 202 '<?>.wasm-function[546]#<nil>:OMG:<nil>' 16 '<?>.wasm-function[40]#<nil>:BBQ:<nil>' 3 'loadString#<nil>:None:<nil>' 3 '<?>.wasm-function[37]#<nil>:BBQ:<nil>' 2 'load#<nil>:None:<nil>' 1 'createStandardStreams#Asr3FY:LLInt:387' 1 'filter#CXE8tr:Baseline:81' 1 'newPromiseCapability#DzQl2X:Baseline:91' 1 'prepareToRun#DqWFYU:LLInt:129' 1 'callRuntimeCallbacks#EYZrDM:LLInt:200' 1 'updateGlobalBufferViews#AqvIC2:LLInt:85' 1 '(unknown)#<nil>:None:<nil>' 1 '#<nil>:None:<nil>' 1 '<?>.wasm-function[34]#<nil>:BBQ:<nil>' 1 'doRun#C0pLVZ:LLInt:20191' 1 '#A7TOIG:LLInt:22' 1 'instantiateArrayBuffer#ALPYQ8:LLInt:22' 1 'fulfillPromise#AuvX5q:Baseline:95'
EWS Watchlist
Comment 9 2019-08-01 02:25:55 PDT
Comment on attachment 375294 [details] Patch Attachment 375294 [details] did not pass mac-debug-ews (mac): Output: https://webkit-queues.webkit.org/results/12845657 New failing tests: inspector/cpu-profiler/threads.html
EWS Watchlist
Comment 10 2019-08-01 02:25:57 PDT
Created attachment 375296 [details] Archive of layout-test-results from ews115 for mac-highsierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews115 Port: mac-highsierra Platform: Mac OS X 10.13.6
Yusuke Suzuki
Comment 11 2019-08-01 11:35:47 PDT
EWS Watchlist
Comment 12 2019-08-01 11:39:03 PDT
Attachment 375328 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/wasm/WasmFaultSignalHandler.cpp:81: Extra space before [. [whitespace/brackets] [5] Total errors found: 1 in 24 files If any of these errors are false positives, please file a bug against check-webkit-style.
Saam Barati
Comment 13 2019-08-02 14:58:37 PDT
Comment on attachment 375328 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=375328&action=review r=me > Source/JavaScriptCore/runtime/SamplingProfiler.cpp:132 > + if (unsafeCallee.isWasm()) { nit: Maybe we should name this function we're in "recordJITFrame" instead of "recordJSFrame" since it now handles Wasm > Source/JavaScriptCore/runtime/SamplingProfiler.cpp:135 > + // At this point, Wasm::Callee would be dying (ref count is 0), but its fields are still live. this is the other end of the memory ordering I was talking about in ~Callee() > Source/JavaScriptCore/runtime/SamplingProfiler.cpp:136 > + // And we can safely copy Wasm::IndexOrName even any lock is held by suspended threads. "even any" => "even when any" > Source/JavaScriptCore/runtime/SamplingProfiler.h:71 > + Optional<Wasm::CompilationMode> wasmCompilationMode; why not also put this under "#if ENABLE(WEBASSEMBLY)"? (If it makes the code more annoying, I'd just skip doing it) > Source/JavaScriptCore/runtime/SamplingProfiler.h:98 > + Optional<Wasm::CompilationMode> wasmCompilationMode; ditto > Source/JavaScriptCore/wasm/WasmCallee.cpp:52 > + CalleeRegistry::singleton().unregisterCallee(this); To be super pedantic, we may need a store store memory fence after this, since theoretically, locking doesn't prevent stores below the lock is released into moving into the lock region (just stores in the lock region can't move out). So theoretically, the compiler could move stores of the fields of |this| object before the store to remove the entry in the hash table. Maybe we don't care about this memory ordering weirdness, but I think it might be worth making sure we don't crash if those bits are bad. > Source/JavaScriptCore/wasm/WasmCallee.h:64 > + return std::make_tuple(start, end); style nit: I'd just do "return { start, end };" > Source/JavaScriptCore/wasm/WasmCalleeRegistry.cpp:43 > +void CalleeRegistry::initialize() > +{ > + static std::once_flag onceKey; > + std::call_once(onceKey, [] { > + calleeRegistry.construct(); > + }); > +} Since this is called from initializeThreading, you don't need this synchronization to happen. Alternatively, you could just not call this from initializeThreading and keep things this way.
Saam Barati
Comment 14 2019-08-02 14:58:55 PDT
(In reply to Yusuke Suzuki from comment #8) > BTW, here is sample output. > > > Starting JetStream2 > Running gcc-loops-wasm: > Startup: 227.273 > Run time: 0.588 > Score: 11.561 > > > Startup: 227.273 > Runtime: 0.588 > > Total Score: 11.561 > > > > Sampling rate: 1000.000000 microseconds > Top functions as <numSamples 'functionName:sourceID'> > 2603 '<?>.wasm-function[38]:-1' > 1494 '<?>.wasm-function[39]:-1' > 1004 '<?>.wasm-function[36]:-1' > 965 '<?>.wasm-function[35]:-1' > 325 '<?>.wasm-function[34]:-1' > 202 '<?>.wasm-function[546]:-1' > 16 '<?>.wasm-function[40]:-1' > 3 'loadString:-1' > 3 '<?>.wasm-function[37]:-1' > 2 'load:-1' > 1 'prepareToRun:2' > 1 '(unknown):-1' > > > Sampling rate: 1000.000000 microseconds > Hottest bytecodes as <numSamples 'functionName#hash:JITType:bytecodeIndex'> > 2603 '<?>.wasm-function[38]#<nil>:BBQ:<nil>' > 1494 '<?>.wasm-function[39]#<nil>:BBQ:<nil>' > 1004 '<?>.wasm-function[36]#<nil>:BBQ:<nil>' > 965 '<?>.wasm-function[35]#<nil>:BBQ:<nil>' > 324 '<?>.wasm-function[34]#<nil>:OMG:<nil>' > 202 '<?>.wasm-function[546]#<nil>:OMG:<nil>' > 16 '<?>.wasm-function[40]#<nil>:BBQ:<nil>' > 3 'loadString#<nil>:None:<nil>' > 3 '<?>.wasm-function[37]#<nil>:BBQ:<nil>' > 2 'load#<nil>:None:<nil>' > 1 'createStandardStreams#Asr3FY:LLInt:387' > 1 'filter#CXE8tr:Baseline:81' > 1 'newPromiseCapability#DzQl2X:Baseline:91' > 1 'prepareToRun#DqWFYU:LLInt:129' > 1 'callRuntimeCallbacks#EYZrDM:LLInt:200' > 1 'updateGlobalBufferViews#AqvIC2:LLInt:85' > 1 '(unknown)#<nil>:None:<nil>' > 1 '#<nil>:None:<nil>' > 1 '<?>.wasm-function[34]#<nil>:BBQ:<nil>' > 1 'doRun#C0pLVZ:LLInt:20191' > 1 '#A7TOIG:LLInt:22' > 1 'instantiateArrayBuffer#ALPYQ8:LLInt:22' > 1 'fulfillPromise#AuvX5q:Baseline:95' Excellent!!
Yusuke Suzuki
Comment 15 2019-08-02 15:57:18 PDT
Comment on attachment 375328 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=375328&action=review >> Source/JavaScriptCore/runtime/SamplingProfiler.cpp:132 >> + if (unsafeCallee.isWasm()) { > > nit: Maybe we should name this function we're in "recordJITFrame" instead of "recordJSFrame" since it now handles Wasm Sounds really nice. Fixed. >> Source/JavaScriptCore/runtime/SamplingProfiler.cpp:135 >> + // At this point, Wasm::Callee would be dying (ref count is 0), but its fields are still live. > > this is the other end of the memory ordering I was talking about in ~Callee() Discussed with Saam, and we conclude that this is OK. The rationale is described in Wasm::Callee's part. >> Source/JavaScriptCore/runtime/SamplingProfiler.cpp:136 >> + // And we can safely copy Wasm::IndexOrName even any lock is held by suspended threads. > > "even any" => "even when any" Fixed. >> Source/JavaScriptCore/runtime/SamplingProfiler.h:71 >> + Optional<Wasm::CompilationMode> wasmCompilationMode; > > why not also put this under "#if ENABLE(WEBASSEMBLY)"? > > (If it makes the code more annoying, I'd just skip doing it) Yeah, changing this part makes a code tricky, like, adding a new parameter based on ifdef. (xxx param #if ENABLE(WEBASSEMBLY) , yyy param2 #endif ) So, I intentionally added this here unconditionally :) Due to the similar reason (annoying code), DFG CompilationMode is also unconditionally defined. >> Source/JavaScriptCore/wasm/WasmCallee.cpp:52 >> + CalleeRegistry::singleton().unregisterCallee(this); > > To be super pedantic, we may need a store store memory fence after this, since theoretically, locking doesn't prevent stores below the lock is released into moving into the lock region (just stores in the lock region can't move out). > > So theoretically, the compiler could move stores of the fields of |this| object before the store to remove the entry in the hash table. > > Maybe we don't care about this memory ordering weirdness, but I think it might be worth making sure we don't crash if those bits are bad. Discussed with Saam. We conclude that we do not need to have this barrier since sampling-profiler reads the state while it is holding this lock. So half-baked modification inside the lock region is not exposed to sampling-profiler. The sampling-profiler see the changes in lock-region as an atomic op. >> Source/JavaScriptCore/wasm/WasmCallee.h:64 >> + return std::make_tuple(start, end); > > style nit: I'd just do "return { start, end };" Nice. Fixed. >> Source/JavaScriptCore/wasm/WasmCalleeRegistry.cpp:43 >> +} > > Since this is called from initializeThreading, you don't need this synchronization to happen. > > Alternatively, you could just not call this from initializeThreading and keep things this way. Removing synchronization seems simple and it just works well :) Changed.
Yusuke Suzuki
Comment 16 2019-08-02 15:58:23 PDT
Radar WebKit Bug Importer
Comment 17 2019-08-02 15:59:18 PDT
Note You need to log in before you can comment on or make changes to this bug.