WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
196091
B3::Value should have different kinds of adjacency lists
https://bugs.webkit.org/show_bug.cgi?id=196091
Summary
B3::Value should have different kinds of adjacency lists
Robin Morisset
Reported
2019-03-21 10:43:07 PDT
B3::Value currently uses Vector<Value*, 3> as its adjacency list, taking 40 bytes no matter the type of value. In particular, things like constants (which are very common) waste 40 of the 80 bytes they use. And a similar although less extreme thing is true of Identity (wastes 32 bytes out of 72), binary operations (waste 24 bytes out of 72), etc.. Since many kind of b3 values are already subclasses of B3::Value, I hope to try the following plan: - Remove setOpcodeUnsafely and setKindUnsafely. They are only used in two places and are fairly easy to fix - Replace all code "children().size()" by numChildren(). - Sort every kind of b3 value into one subclass of B3Value (creating some subclasses for the values that are currently represented by a bare B3::Value*) - Make B3::Value an abstract class, and the methods that access m_children pure virtual, to be implemented by the subclasses (or by some intermediate ValueImpl that the subclasses inherit from and that inherit from Value) - Split ValueImpl into three: ValueImpl<FixedAdjacencyList<n>>, ValueImpl<BoundedAdjacencyList<n>> and ValueImpl<UnboundedAdjacencyList>, using respectively a bare array, an array + a size, or a Vector<> as their implementation of m_children. One slightly tricky part is the following functions: - replaceWithOops - replaceWithPhi - replaceWithNop - replaceWithNopIgnoringType - replaceWithJump - replaceWithBottom - replaceWithIdentity The implementations will have to be moved to a templated function that does a static_assert that the actual parameterized ValueType is large enough to hold the value we are trying to store in it (in other words, we want to make sure that Oops/Phi/Nop/Jump/Bottom/Identity all have the same size, and are the smallest of B3Values). Then the actual methods will become pure virtual, and have to be implemented by a call to this function in the subclasses, enforcing that all subclasses must be large enough to fit an Identity (which will probably require one word of padding for Oops/Jump/etc..). I will have to verify that the overhead of calling these virtual functions is negligible. I am optimistic because we only rarely access the children of a B3::Value without first switching on its opcode, at which point we know its actual type, and can use it if we see a performance cost.
Attachments
WIP
(65.58 KB, patch)
2019-03-22 19:11 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
WIP_nearlyDone
(78.69 KB, patch)
2019-03-25 11:51 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
WIP_reallyNearlyDone
(77.96 KB, patch)
2019-03-25 12:46 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Patch
(76.08 KB, patch)
2019-03-25 13:00 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Patch
(84.24 KB, patch)
2019-03-25 17:51 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Patch
(84.21 KB, patch)
2019-03-25 17:58 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Patch
(83.86 KB, patch)
2019-03-27 15:47 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Patch
(83.92 KB, patch)
2019-03-29 12:38 PDT
,
Robin Morisset
fpizlo
: review+
Details
Formatted Diff
Diff
Numbers file measuring the JetStream2 impact
(418.56 KB, application/zip)
2019-04-15 14:23 PDT
,
Robin Morisset
no flags
Details
PatchForLanding
(83.91 KB, patch)
2019-04-15 16:13 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Show Obsolete
(8)
View All
Add attachment
proposed patch, testcase, etc.
Filip Pizlo
Comment 1
2019-03-21 13:46:24 PDT
(In reply to Robin Morisset from
comment #0
)
> B3::Value currently uses Vector<Value*, 3> as its adjacency list, taking 40 > bytes no matter the type of value. > In particular, things like constants (which are very common) waste 40 of the > 80 bytes they use. And a similar although less extreme thing is true of > Identity (wastes 32 bytes out of 72), binary operations (waste 24 bytes out > of 72), etc.. > Since many kind of b3 values are already subclasses of B3::Value, I hope to > try the following plan: > - Remove setOpcodeUnsafely and setKindUnsafely. They are only used in two > places and are fairly easy to fix > - Replace all code "children().size()" by numChildren(). > - Sort every kind of b3 value into one subclass of B3Value (creating some > subclasses for the values that are currently represented by a bare > B3::Value*) > - Make B3::Value an abstract class, and the methods that access m_children > pure virtual, to be implemented by the subclasses (or by some intermediate > ValueImpl that the subclasses inherit from and that inherit from Value) > - Split ValueImpl into three: ValueImpl<FixedAdjacencyList<n>>, > ValueImpl<BoundedAdjacencyList<n>> and ValueImpl<UnboundedAdjacencyList>, > using respectively a bare array, an array + a size, or a Vector<> as their > implementation of m_children.
How about do this without any virtual anything? 1) Value allocated its adjacency list as a variable length array at the tail of Value rather than a Vector with inline capacity. 2) have something much more clever for out-of-line children, for example where every access branches (small index -> inline, large index -> out of line).
> > One slightly tricky part is the following functions: > - replaceWithOops > - replaceWithPhi > - replaceWithNop > - replaceWithNopIgnoringType > - replaceWithJump > - replaceWithBottom > - replaceWithIdentity > The implementations will have to be moved to a templated function that does > a static_assert that the actual parameterized ValueType is large enough to > hold the value we are trying to store in it (in other words, we want to make > sure that Oops/Phi/Nop/Jump/Bottom/Identity all have the same size, and are > the smallest of B3Values). > Then the actual methods will become pure virtual, and have to be implemented > by a call to this function in the subclasses, enforcing that all subclasses > must be large enough to fit an Identity (which will probably require one > word of padding for Oops/Jump/etc..). > > I will have to verify that the overhead of calling these virtual functions > is negligible. I am optimistic because we only rarely access the children of > a B3::Value without first switching on its opcode, at which point we know > its actual type, and can use it if we see a performance cost.
Robin Morisset
Comment 2
2019-03-21 13:55:31 PDT
> > How about do this without any virtual anything? > > 1) Value allocated its adjacency list as a variable length array at the tail > of Value rather than a Vector with inline capacity.
This was my original idea, the problem is that there are already quite a few subclasses of Value, and they add fields to it. For example constants add the constant value, MemoryValue adds a heap range, a fence range and an offset, etc.. So it seems to me that these fields would be after the end of the fields of Value, and thus conflict with the variable size array.
Filip Pizlo
Comment 3
2019-03-21 13:57:13 PDT
(In reply to Robin Morisset from
comment #2
)
> > > > How about do this without any virtual anything? > > > > 1) Value allocated its adjacency list as a variable length array at the tail > > of Value rather than a Vector with inline capacity. > > This was my original idea, the problem is that there are already quite a few > subclasses of Value, and they add fields to it. For example constants add > the constant value, MemoryValue adds a heap range, a fence range and an > offset, etc.. > So it seems to me that these fields would be after the end of the fields of > Value, and thus conflict with the variable size array.
Then prepend the adjaceny list to Valur and make it negatively indexed. :-)
Robin Morisset
Comment 4
2019-03-21 14:05:46 PDT
(In reply to Filip Pizlo from
comment #3
)
> (In reply to Robin Morisset from
comment #2
) > > > > > > How about do this without any virtual anything? > > > > > > 1) Value allocated its adjacency list as a variable length array at the tail > > > of Value rather than a Vector with inline capacity. > > > > This was my original idea, the problem is that there are already quite a few > > subclasses of Value, and they add fields to it. For example constants add > > the constant value, MemoryValue adds a heap range, a fence range and an > > offset, etc.. > > So it seems to me that these fields would be after the end of the fields of > > Value, and thus conflict with the variable size array. > > Then prepend the adjaceny list to Valur and make it negatively indexed. :-)
I had not thought of that trick. Thanks, I'll change my approach to that.
Robin Morisset
Comment 5
2019-03-21 14:49:14 PDT
The size of the children (0, 1, 2, 3, or varargs) will be stored in a field added in the free byte between m_type and m_origin. In order to correctly support replaceWithIdentity (which needs one child), even values with no children will have one extra free word allocated before them (we could optionally make constants store their value there eventually, in a separate patch).
Robin Morisset
Comment 6
2019-03-22 19:11:09 PDT
Created
attachment 365794
[details]
WIP Not ready for review, just sending it as a backup. Still needs to be done: - Adding an explicit copy constructor that calls buildAdjacencyList (this currently causes the patch to fail testb3) - probably more bugs still to fix.. - Add an optimization where the methods when called on a subclass don't need to test whether or not the value is a varargs - General clean-up, moving things to the .cpp from the headers, etc..
Robin Morisset
Comment 7
2019-03-25 11:51:45 PDT
Created
attachment 365879
[details]
WIP_nearlyDone The last parts remaining are running benchmarks to get performance numbers, and writing the changelog.
EWS Watchlist
Comment 8
2019-03-25 11:57:30 PDT
Attachment 365879
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/b3/B3Const32Value.h:87: Wrong number of spaces before statement. (expected: 8) [whitespace/indent] [4] ERROR: Source/JavaScriptCore/b3/B3Const32Value.h:88: Wrong number of spaces before statement. (expected: 8) [whitespace/indent] [4] ERROR: Source/JavaScriptCore/b3/B3Value.h:125: An else statement can be removed when the prior "if" concludes with a return, break, continue or goto statement. [readability/control_flow] [4] ERROR: Source/JavaScriptCore/b3/B3Value.h:135: An else statement can be removed when the prior "if" concludes with a return, break, continue or goto statement. [readability/control_flow] [4] ERROR: Source/JavaScriptCore/b3/B3Value.h:385: A case label should not be indented, but line up with its switch statement. [whitespace/indent] [4] ERROR: Source/JavaScriptCore/b3/B3Value.h:529: Extra space before last semicolon. If this should be an empty statement, use { } instead. [whitespace/semicolon] [5] ERROR: Source/JavaScriptCore/b3/B3ValueInlines.h:55: Missing space before ( in switch( [whitespace/parens] [5] ERROR: Source/JavaScriptCore/b3/B3ValueInlines.h:167: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/B3ValueInlines.h:173: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/B3MoveConstants.cpp:42: Bad include order. Mixing system and custom headers. [build/include_order] [4] ERROR: Source/JavaScriptCore/b3/B3CCallValue.h:60: Should have a space between // and comment [whitespace/comments] [4] ERROR: Source/JavaScriptCore/b3/B3CCallValue.h:69: Should have a space between // and comment [whitespace/comments] [4] ERROR: Source/JavaScriptCore/b3/B3Const64Value.h:87: Wrong number of spaces before statement. (expected: 8) [whitespace/indent] [4] ERROR: Source/JavaScriptCore/b3/B3Const64Value.h:88: Wrong number of spaces before statement. (expected: 8) [whitespace/indent] [4] Total errors found: 14 in 38 files If any of these errors are false positives, please file a bug against check-webkit-style.
Robin Morisset
Comment 9
2019-03-25 12:46:02 PDT
Created
attachment 365884
[details]
WIP_reallyNearlyDone Fix the build and lint issues.
Robin Morisset
Comment 10
2019-03-25 13:00:13 PDT
Created
attachment 365887
[details]
Patch Fixing one last build issue (a missing import of wtf/Vector.h). Really surprising that they did not show up on my machine.
Robin Morisset
Comment 11
2019-03-25 17:51:11 PDT
Created
attachment 365928
[details]
Patch - Added a changelog - Fixed the build issue (it was just a missing import of wtf/Vector.h) - Made B3::Value WTF_MAKE_FAST_ALLOCATED again (I had removed it because my new did not match my delete temporarily and forgot about it).
Robin Morisset
Comment 12
2019-03-25 17:54:36 PDT
I launched two A/B tasks to see the effect on RAMification and JetStream2:
https://perf-safari.apple.com/v3/#/analysis/task/6342
https://perf-safari.apple.com/v3/#/analysis/task/6341
I think my earlier confusing results were due to having deleted WTF_MAKE_FAST_ALLOCATED.
Robin Morisset
Comment 13
2019-03-25 17:58:44 PDT
Created
attachment 365931
[details]
Patch And this time putting back #include <wtf/FastMalloc.h> before the bots complain (for some reason it compiled find on my system despite lacking it).
Robin Morisset
Comment 14
2019-03-26 13:33:20 PDT
This was a regression on RAMification, I think because I removed the inline capacity in the VarArgs case. I am now looking at adding it back, ideally tuning it per opcode, and depending on whether we are creating a value from scratch or copying another one.
Robin Morisset
Comment 15
2019-03-27 15:47:35 PDT
Created
attachment 366116
[details]
Patch This uses an inline capacity of 3 (as before) for the VarArgs case. I have an alternative patch that customizes the inline capacity to the opcode, but the implementation gets extremely gross, for (I think) little benefit. A global run on RAMification was inconclusive (dominated by noise in LuaJSFight among other things), so I ran 4 runs of RAMification on my machine, only looking at the peak footprint of the wasm benchmarks, after having disabled their call to main to only measure the compilation footprint. The result was a rather disappointing 1% reduction in peak footprint. I'd like to land this since it is still an improvement, and monitor the bots for any regression/progression.
Robin Morisset
Comment 16
2019-03-29 12:38:21 PDT
Created
attachment 366298
[details]
Patch Almost the same as the last one, just add a missing cast to (hopefully) fix the build failure on the gtk bot.
Saam Barati
Comment 17
2019-04-04 16:22:30 PDT
What’s the results on memory on RAMification?
Robin Morisset
Comment 18
2019-04-04 16:59:20 PDT
(In reply to Saam Barati from
comment #17
)
> What’s the results on memory on RAMification?
It is a regression, purely from LuaJSFight, which this has no way of altering in any way.. I'll relaunch the bots with more repetitions to bring the noise down.
Robin Morisset
Comment 19
2019-04-15 14:22:47 PDT
Here are some basic results from running JetStream2 with some instrumentation in B3Generate.cpp This patch only saves an average of 13kB per function being compiled to B3. There are some significant outliers, here is the top 10: 57262 1781296 Tsf-wasm 13780 403232 OfflineAssembler 9715 271584 Regexp 7331 205880 Regexp 6237 184896 Gbemu 5361 163424 Mandreel 5559 160184 Regexp 5490 157752 Typescript 5191 145760 Typescript 5077 144680 Typescript (left: number of values allocated, middle: number of bytes saved on those, right: jetstream2 benchmark it came from) Here is the top outlier per benchmark that reach B3: 3264 98424 3d-cube-sp 3827 119768 3d-raytrace-SP 3055 89216 Acorn-wtb 1560 44784 ai-astar 713 23744 Air 1691 49912 async-fs 2856 81608 babylon 3130 86392 babylon-wtb 1005 30568 base64-SP 1120 32296 Basic 2724 75528 Box2D 1842 53288 Cdjs 1358 40008 Chai-wtb 817 24280 coffeescript-wtb 676 19312 Crypto 1769 51408 crypto-aes-sp 4771 138160 crypto-md5-sp 1104 32592 crypto-sha1-SP 2703 82136 date-format-tofte-sp 1645 49176 Date-format-xparb-sp 1231 35768 delta-blue 647 19016 earley-boyer 2407 71424 Espree-wtb 1087 32520 flightPlanner 1108 33176 Gaussian-blur 6237 184896 Gbemu 1079 30944 gcc-loops-wasm 1136 32576 hash-map 3044 90432 HashSet-wasm 3002 88296 jshint-wtb 3147 90416 json-stringify-inspector (only function to reach B3) 1609 49736 lebab-wtb 5361 163424 Mandreel 1095 31632 ML 637 17496 N-body-SP 690 17784 Navier-stokes 4075 125600 Octane-zlib 13780 403232 OfflineAssembler 3563 107248 Pdfs 1084 30632 Prepack-wtb 118 3632 quicksort-wasm 3400 98560 Raytrace 9715 271584 Regexp 997 28472 Richards 308 9480 Richards-wasm 986 30280 splay 1268 35336 Stanford-crypto-aes 1098 31288 Stanford-crypto-pbkdf2 646 19400 Stanford-crypto-sha256 720 20264 string-unpack-code-SP 1021 30800 tagcloud-SP 57262 1781296 Tsf-wasm 5490 157752 Typescript 1163 37080 Uglify-js-wtb 2908 83256 UniPoker 3251 92808 WSL
Robin Morisset
Comment 20
2019-04-15 14:23:44 PDT
Created
attachment 367458
[details]
Numbers file measuring the JetStream2 impact From which the comment above was written.
Robin Morisset
Comment 21
2019-04-15 16:13:29 PDT
Created
attachment 367474
[details]
PatchForLanding
WebKit Commit Bot
Comment 22
2019-04-15 16:53:29 PDT
Comment on
attachment 367474
[details]
PatchForLanding Clearing flags on attachment: 367474 Committed
r244309
: <
https://trac.webkit.org/changeset/244309
>
WebKit Commit Bot
Comment 23
2019-04-15 16:53:31 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 24
2019-04-15 16:54:21 PDT
<
rdar://problem/49923813
>
Robin Morisset
Comment 25
2019-07-12 15:47:45 PDT
***
Bug 196093
has been marked as a duplicate of this bug. ***
Robin Morisset
Comment 26
2019-07-12 15:48:12 PDT
***
Bug 196092
has been marked as a duplicate of this bug. ***
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug