WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
177305
WebAssembly: cache memory address / size on instance
https://bugs.webkit.org/show_bug.cgi?id=177305
Summary
WebAssembly: cache memory address / size on instance
JF Bastien
Reported
2017-09-21 09:31:55 PDT
(From a FIXME that didn't have a bug) In Wasm::Memory, we should cache address / size to the instances to avoid a load on instance->instance calls. This will require updating all the instances when grow is called.
Attachments
Patch
(15.85 KB, patch)
2017-12-15 14:10 PST
,
GSkachkov
no flags
Details
Formatted Diff
Diff
Patch
(19.10 KB, patch)
2017-12-16 11:30 PST
,
GSkachkov
no flags
Details
Formatted Diff
Diff
Patch
(19.95 KB, patch)
2017-12-17 10:59 PST
,
GSkachkov
no flags
Details
Formatted Diff
Diff
Patch
(20.91 KB, patch)
2017-12-20 12:11 PST
,
GSkachkov
no flags
Details
Formatted Diff
Diff
Patch
(22.39 KB, patch)
2018-01-07 02:00 PST
,
GSkachkov
no flags
Details
Formatted Diff
Diff
Patch
(22.48 KB, patch)
2018-01-11 14:14 PST
,
GSkachkov
no flags
Details
Formatted Diff
Diff
Patch
(22.73 KB, patch)
2018-02-23 12:55 PST
,
GSkachkov
no flags
Details
Formatted Diff
Diff
Show Obsolete
(6)
View All
Add attachment
proposed patch, testcase, etc.
GSkachkov
Comment 1
2017-12-04 12:11:16 PST
I can try to fix this issue. Wasm::Instance will have to add property cachedMemory and cachedSize, but Wasm::Memory should contain links to Wasm::Instance where Wasm::Memory is used to iterate within `grow`, is it correct understating of the issue?
JF Bastien
Comment 2
2017-12-04 17:05:41 PST
(In reply to GSkachkov from
comment #1
)
> I can try to fix this issue.
Cool!
> Wasm::Instance will have to add property cachedMemory and cachedSize,
Yes, and use it in the JIT instead of doing the indirection.
> Wasm::Memory should contain links to Wasm::Instance where Wasm::Memory is > used to iterate within `grow`, is it correct understating of the issue?
Wasm::Memory can be used by multiple Wasm::Instance at the same time, so the Wasm::Memory has to know all of the Wasm::Instances that refer to it. The Instance owns the memory, so the back-pointers need to be weak references. Then yes, you would update all of them from grow.
GSkachkov
Comment 3
2017-12-15 14:10:15 PST
Created
attachment 329518
[details]
Patch WiP. Caching memory size & addess in wasm::Instance
JF Bastien
Comment 4
2017-12-15 14:11:03 PST
(In reply to GSkachkov from
comment #3
)
> Created
attachment 329518
[details]
> Patch > > WiP. Caching memory size & addess in wasm::Instance
You've marked as "review?". Is it WIP or ready for review?
GSkachkov
Comment 5
2017-12-15 14:15:14 PST
(In reply to JF Bastien from
comment #4
)
> (In reply to GSkachkov from
comment #3
) > > Created
attachment 329518
[details]
> > Patch > > > > WiP. Caching memory size & addess in wasm::Instance > > You've marked as "review?". Is it WIP or ready for review?
Sorry, it is WiP and I removed review flag, but I would like to have some feedback if my changes related to what is expected to do in this task.
GSkachkov
Comment 6
2017-12-16 11:30:31 PST
Created
attachment 329580
[details]
Patch Patch for review
GSkachkov
Comment 7
2017-12-17 10:59:47 PST
Created
attachment 329610
[details]
Patch Patch for review + small clearence
JF Bastien
Comment 8
2017-12-18 09:12:52 PST
Comment on
attachment 329610
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=329610&action=review
Very cool! That's the right direction, but I have a few comments. Also for testing you might want to look at the memory fuzz test I wrote in JSTests/wasm/fuzz
> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:-1284 > - ASSERT(sizeRegs[0].sizeRegister != baseMemory);
I think we should keep this assert.
> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:1281 > ASSERT(!sizeRegs[0].sizeOffset);
Can you also assert that sizeRegs.size() == 1 ? The code below assumes it is, and would be wrong if that weren't the case. You can like to FIXME
https://bugs.webkit.org/show_bug.cgi?id=162952
> Source/JavaScriptCore/wasm/WasmBinding.cpp:-68 > - ASSERT(!sizeRegs[0].sizeOffset); // The following code assumes we start at 0, and calculates subsequent size registers relative to 0.
Same here on keeping the assert, and adding one for
https://bugs.webkit.org/show_bug.cgi?id=162952
> Source/JavaScriptCore/wasm/WasmInstance.h:77 > + updateChechedMemory();
Typo "cached".
> Source/JavaScriptCore/wasm/WasmInstance.h:79 > + void updateChechedMemory()
Typo.
> Source/JavaScriptCore/wasm/WasmInstance.h:147 > + WeakPtrFactory<Instance> m_weakPtrFactory;
I would try to organize members so that the ones accessed by the JIT are together, at the start. That way we need only small immediates, and they'll be in the same cacheline when accessed. So the cached values would go after m_context, and the weak ptr factory after the call frame callback.
> Source/JavaScriptCore/wasm/js/JSToWasm.cpp:193 > + jit.loadWasmContextInstance(baseMemory);
This now happens twice if the memory wasn't signaling and we use fast TLS. It might be easier to have 4 code paths without any sharing, for all combinations of signal/no + fast TLS/no.
> JSTests/wasm/function-tests/memory-reuse.js:4 > +{
This scope doesn't seem useful?
> JSTests/wasm/function-tests/memory-reuse.js:8 > + function createWasmModule(memory) {
This returns instances, so I'd call it "createInstance".
> JSTests/wasm/function-tests/memory-reuse.js:41 > + for (let i = 0; i < buf.length/4; ++i) {
Spaces between divide. Same thing for other binary operators below.
> JSTests/wasm/function-tests/memory-reuse.js:56 > + const pageSize = 64 * 1024;
Hoist this to the top of the file, and remove the other one above.
> JSTests/wasm/function-tests/memory-reuse.js:58 > + const modules = [];
As above, I'd rename to "instances".
> JSTests/wasm/function-tests/memory-reuse.js:68 > + }
I think the start of the test is good, but you then need to grow the memory. There are two ways to do so: as the grow_memory opcode directly in WebAssembly IR, or through the JS API on WebAssembly.Memory. You should then re-run doCheck(). The difficulty in doing this test is that you want them memory's base void* to have moved (hard!), and the size to change (easy!). You're not really testing size because signaling memory does that implicitly. I think you want a VM primitive for testing that allows you to allocate a WebAssembly.Memory that is never signaling. That way to can test size by going out-of-bounds, catching the trap, then growing, and same out-of-bounds should not trap anymore. Now with a non-signaling memory you know that enough growth will cause us to re-allocate and memmove then void* memory too! After doing so, the tests should kill instances and force-invoke the GC to try to make sure they die, and then re-run the checks again.
GSkachkov
Comment 9
2017-12-20 12:11:48 PST
Created
attachment 329942
[details]
Patch Fix comments
GSkachkov
Comment 10
2017-12-20 12:15:03 PST
Comment on
attachment 329610
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=329610&action=review
>> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:-1284 >> - ASSERT(sizeRegs[0].sizeRegister != baseMemory); > > I think we should keep this assert.
Done
>> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:1281 >> ASSERT(!sizeRegs[0].sizeOffset); > > Can you also assert that sizeRegs.size() == 1 ? The code below assumes it is, and would be wrong if that weren't the case. You can like to FIXME
https://bugs.webkit.org/show_bug.cgi?id=162952
As I can see we have this assert, line 1283/1279
>> Source/JavaScriptCore/wasm/WasmBinding.cpp:-68 >> - ASSERT(!sizeRegs[0].sizeOffset); // The following code assumes we start at 0, and calculates subsequent size registers relative to 0. > > Same here on keeping the assert, and adding one for
https://bugs.webkit.org/show_bug.cgi?id=162952
Done
>> Source/JavaScriptCore/wasm/WasmInstance.h:77 >> + updateChechedMemory(); > > Typo "cached".
Fixed
>> Source/JavaScriptCore/wasm/WasmInstance.h:79 >> + void updateChechedMemory() > > Typo.
Fixed
>> Source/JavaScriptCore/wasm/WasmInstance.h:147 >> + WeakPtrFactory<Instance> m_weakPtrFactory; > > I would try to organize members so that the ones accessed by the JIT are together, at the start. That way we need only small immediates, and they'll be in the same cacheline when accessed. > > So the cached values would go after m_context, and the weak ptr factory after the call frame callback.
Done
>> Source/JavaScriptCore/wasm/js/JSToWasm.cpp:193 >> + jit.loadWasmContextInstance(baseMemory); > > This now happens twice if the memory wasn't signaling and we use fast TLS. > > It might be easier to have 4 code paths without any sharing, for all combinations of signal/no + fast TLS/no.
I've tried rewrite this, hope I did not lost something.
>> JSTests/wasm/function-tests/memory-reuse.js:4 >> +{ > > This scope doesn't seem useful?
Removed
>> JSTests/wasm/function-tests/memory-reuse.js:8 >> + function createWasmModule(memory) { > > This returns instances, so I'd call it "createInstance".
Fixed
>> JSTests/wasm/function-tests/memory-reuse.js:41 >> + for (let i = 0; i < buf.length/4; ++i) { > > Spaces between divide. Same thing for other binary operators below.
Fixed
>> JSTests/wasm/function-tests/memory-reuse.js:56 >> + const pageSize = 64 * 1024; > > Hoist this to the top of the file, and remove the other one above.
Done
>> JSTests/wasm/function-tests/memory-reuse.js:58 >> + const modules = []; > > As above, I'd rename to "instances".
Done
>> JSTests/wasm/function-tests/memory-reuse.js:68 >> + } > > I think the start of the test is good, but you then need to grow the memory. There are two ways to do so: as the grow_memory opcode directly in WebAssembly IR, or through the JS API on WebAssembly.Memory. You should then re-run doCheck(). > > The difficulty in doing this test is that you want them memory's base void* to have moved (hard!), and the size to change (easy!). You're not really testing size because signaling memory does that implicitly. I think you want a VM primitive for testing that allows you to allocate a WebAssembly.Memory that is never signaling. That way to can test size by going out-of-bounds, catching the trap, then growing, and same out-of-bounds should not trap anymore. Now with a non-signaling memory you know that enough growth will cause us to re-allocate and memmove then void* memory too! > > After doing so, the tests should kill instances and force-invoke the GC to try to make sure they die, and then re-run the checks again.
Rewritten, I hope it is close to what is expected
JF Bastien
Comment 11
2018-01-04 09:01:46 PST
This patch will conflict with a not-yet-committed but important patch. I don't think the conflict will be very bad, but if you don't mind I'd like to hold off committing this until the other patch can land. Once it's in I'll do the other round of reviews, because this is a really neat improvement that we definitely want!
GSkachkov
Comment 12
2018-01-04 10:02:30 PST
(In reply to JF Bastien from
comment #11
)
> This patch will conflict with a not-yet-committed but important patch. I > don't think the conflict will be very bad, but if you don't mind I'd like to > hold off committing this until the other patch can land. > > Once it's in I'll do the other round of reviews, because this is a really > neat improvement that we definitely want!
Yes sure, let me know when I need to rebase.
JF Bastien
Comment 13
2018-01-05 20:30:11 PST
(In reply to GSkachkov from
comment #12
)
> (In reply to JF Bastien from
comment #11
) > > This patch will conflict with a not-yet-committed but important patch. I > > don't think the conflict will be very bad, but if you don't mind I'd like to > > hold off committing this until the other patch can land. > > > > Once it's in I'll do the other round of reviews, because this is a really > > neat improvement that we definitely want! > > Yes sure, let me know when I need to rebase.
Here's the patch:
https://bugs.webkit.org/show_bug.cgi?id=181313
There's now also an index mask that needs to be cached in the same way as memory and size.
GSkachkov
Comment 14
2018-01-07 02:00:20 PST
Created
attachment 330654
[details]
Patch Rebased patch & using cachedIndexingMask property
GSkachkov
Comment 15
2018-01-07 03:03:45 PST
(In reply to JF Bastien from
comment #13
)
> (In reply to GSkachkov from
comment #12
) > > (In reply to JF Bastien from
comment #11
) > > > This patch will conflict with a not-yet-committed but important patch. I > > > don't think the conflict will be very bad, but if you don't mind I'd like to > > > hold off committing this until the other patch can land. > > > > > > Once it's in I'll do the other round of reviews, because this is a really > > > neat improvement that we definitely want! > > > > Yes sure, let me know when I need to rebase. > > Here's the patch: >
https://bugs.webkit.org/show_bug.cgi?id=181313
> > There's now also an index mask that needs to be cached in the same way as > memory and size.
There is a new rebased patch and I added caching of the index mask in the same way as did for other properties.
GSkachkov
Comment 16
2018-01-11 14:14:14 PST
Created
attachment 331117
[details]
Patch Fix error in added tests
GSkachkov
Comment 17
2018-01-25 13:54:55 PST
JF Bastien, Are there any performance tests that I can run for Wasm and check any difference with my patch and without?
JF Bastien
Comment 18
2018-01-25 13:56:58 PST
(In reply to GSkachkov from
comment #17
)
> JF Bastien, Are there any performance tests that I can run for Wasm and > check any difference with my patch and without?
We don't have public ones because of licenses. The WebAssembly paper used: - PolyBenchC-3.2 - SciMark2.1 It's also fairly easy to compile SPEC2k (non-Fortran parts) and run the training sets. Otherwise Emscripten has benchmarks as well.
GSkachkov
Comment 19
2018-01-26 12:56:58 PST
(In reply to JF Bastien from
comment #18
)
> (In reply to GSkachkov from
comment #17
) > > JF Bastien, Are there any performance tests that I can run for Wasm and > > check any difference with my patch and without? > > We don't have public ones because of licenses. The WebAssembly paper used: > - PolyBenchC-3.2 > - SciMark2.1 > > It's also fairly easy to compile SPEC2k (non-Fortran parts) and run the > training sets. > > Otherwise Emscripten has benchmarks as well.
Thanks for pointing to tests, will prepare some figures during the week.
JF Bastien
Comment 20
2018-02-20 14:31:44 PST
Comment on
attachment 331117
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=331117&action=review
This looks pretty good, with a few comments I'm ready to r=me. Also, did you run the benchmarks you said you would?
> Source/JavaScriptCore/wasm/WasmMemory.cpp:403 > + }
We never remove weak pointers that are now nullptr, correct? That's a (very slow) leak because m_instances never shrinks. I don't think grow() is the right place to remove null entries, though. Maybe in registerInstance you could first traverse m_instances looking for a null entry, reuse the entry if there's one, and if you can't find one then append?
> Source/JavaScriptCore/wasm/js/JSToWasm.cpp:181 > + GPRReg memoryGRP = Context::useFastTLS() ? baseMemory : wasmContextInstanceGPR;
I'd rename memoryGPR to currentInstanceGPR, since we're using the basemMemory register to hold the instance.
GSkachkov
Comment 21
2018-02-21 12:43:49 PST
(In reply to JF Bastien from
comment #20
)
> Comment on
attachment 331117
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=331117&action=review
> > This looks pretty good, with a few comments I'm ready to r=me. Also, did you > run the benchmarks you said you would? > > > Source/JavaScriptCore/wasm/WasmMemory.cpp:403 > > + } > > We never remove weak pointers that are now nullptr, correct? That's a (very > slow) leak because m_instances never shrinks. I don't think grow() is the > right place to remove null entries, though. Maybe in registerInstance you > could first traverse m_instances looking for a null entry, reuse the entry > if there's one, and if you can't find one then append? > > > Source/JavaScriptCore/wasm/js/JSToWasm.cpp:181 > > + GPRReg memoryGRP = Context::useFastTLS() ? baseMemory : wasmContextInstanceGPR; > > I'd rename memoryGPR to currentInstanceGPR, since we're using the > basemMemory register to hold the instance.
Thanks for the feedback, will try to fix comments soon. Ohh yes, I forget about perf tests. I downloaded tests, but had some troubles with running then switch to another task. I'll try provide figures with next patch.
GSkachkov
Comment 22
2018-02-23 10:16:34 PST
(In reply to JF Bastien from
comment #20
)
> Comment on
attachment 331117
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=331117&action=review
> > This looks pretty good, with a few comments I'm ready to r=me. Also, did you > run the benchmarks you said you would? >
There is a result for scimark2_1c: test was build with following parameters: emcc -O *.c -s WASM=1 -s ALLOW_MEMORY_GROWTH=1 -s ABORTING_MALLOC=0 -o scimark2_1c.html and checked by following url:
https://gskachkov.github.io/wasm-perf-tests/scimark2_1c/scimark2_1c.html
Result for Safari with Patch ** ** ** SciMark2 Numeric Benchmark, see
http://math.nist.gov/scimark
** ** for details. (Results can be submitted to
pozo@nist.gov
) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 1186.26 FFT Mflops: 1281.59 (N=1024) SOR Mflops: 1269.40 (100 x 100) MonteCarlo: Mflops: 184.56 Sparse matmult Mflops: 1254.28 (N=1000, nz=5000) LU Mflops: 1941.46 (M=100, N=100) Result for Safari without Patch: ** ** ** SciMark2 Numeric Benchmark, see
http://math.nist.gov/scimark
** ** for details. (Results can be submitted to
pozo@nist.gov
) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 1182.51 FFT Mflops: 1310.47 (N=1024) SOR Mflops: 1251.27 (100 x 100) MonteCarlo: Mflops: 184.94 Sparse matmult Mflops: 1231.30 (N=1000, nz=5000) LU Mflops: 1934.58 (M=100, N=100) Result from terminal: ** ** ** SciMark2 Numeric Benchmark, see
http://math.nist.gov/scimark
** ** for details. (Results can be submitted to
pozo@nist.gov
) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 1879.40 FFT Mflops: 1681.28 (N=1024) SOR Mflops: 1253.96 (100 x 100) MonteCarlo: Mflops: 488.40 Sparse matmult Mflops: 2064.39 (N=1000, nz=5000) LU Mflops: 3908.97 (M=100, N=100) Looks like my changes do not affects this test
GSkachkov
Comment 23
2018-02-23 12:55:06 PST
Created
attachment 334544
[details]
Patch
JF Bastien
Comment 24
2018-02-23 14:44:02 PST
Running the non-Fortran part of spec2k, I see roughly 5% throughput win on x86-64. Pretty nice!
JF Bastien
Comment 25
2018-02-23 14:44:38 PST
Comment on
attachment 334544
[details]
Patch r=me
GSkachkov
Comment 26
2018-02-23 15:07:08 PST
(In reply to JF Bastien from
comment #24
)
> Running the non-Fortran part of spec2k, I see roughly 5% throughput win on > x86-64. Pretty nice!
Cool! Thanks! Where I can download spec2k? I only found by line
https://www.spec.org/order.html
and I'm not sure what exact version I should download to run tests? CPU2006 V1.2?
JF Bastien
Comment 27
2018-02-23 15:09:53 PST
(In reply to GSkachkov from
comment #26
)
> (In reply to JF Bastien from
comment #24
) > > Running the non-Fortran part of spec2k, I see roughly 5% throughput win on > > x86-64. Pretty nice! > > Cool! Thanks! > Where I can download spec2k? I only found by line >
https://www.spec.org/order.html
and I'm not sure what exact version I should > download to run tests? CPU2006 V1.2?
We have a access to it internally, but cannot make it public because of licensing. I'm using CPU2000 because 2006 uses well over 1GiB of memory on many tests (unless you benchmark the training set).
WebKit Commit Bot
Comment 28
2018-02-23 15:17:03 PST
Comment on
attachment 334544
[details]
Patch Clearing flags on attachment: 334544 Committed
r228966
: <
https://trac.webkit.org/changeset/228966
>
WebKit Commit Bot
Comment 29
2018-02-23 15:17:05 PST
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 30
2018-02-23 15:18:25 PST
<
rdar://problem/37841252
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug