When discussing with Phil the padding in NodeOrigin he suggested compressing CodeOrigin by putting the bytecodeIndex in the high bits of the InlineCallFrame* in the common case where they are not unreasonable big.
Created attachment 365111 [details] Patch Note: I have not tested this code on ARM, I am waiting to see what EWS says.
Comment on attachment 365111 [details] Patch Attachment 365111 [details] did not pass mac-ews (mac): Output: https://webkit-queues.webkit.org/results/11558057 New failing tests: js/date-set-to-nan.html
Created attachment 365117 [details] Archive of layout-test-results from ews100 for mac-highsierra The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews100 Port: mac-highsierra Platform: Mac OS X 10.13.6
I haven’t looked at the patch but I could guess we could probably just encode the CodeOrigin with two 32 bit words StructureID style. Not sure how much it matters too much one way or the other though.
Comment on attachment 365111 [details] Patch Attachment 365111 [details] did not pass mac-debug-ews (mac): Output: https://webkit-queues.webkit.org/results/11559251 New failing tests: js/date-set-to-nan.html
Created attachment 365132 [details] Archive of layout-test-results from ews114 for mac-highsierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-highsierra Platform: Mac OS X 10.13.6
Comment on attachment 365111 [details] Patch Attachment 365111 [details] did not pass mac-wk2-ews (mac-wk2): Output: https://webkit-queues.webkit.org/results/11560119 New failing tests: js/date-set-to-nan.html imported/w3c/web-platform-tests/mediacapture-record/MediaRecorder-constructor.html
Created attachment 365139 [details] Archive of layout-test-results from ews106 for mac-highsierra-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews106 Port: mac-highsierra-wk2 Platform: Mac OS X 10.13.6
Comment on attachment 365111 [details] Patch Attachment 365111 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: https://webkit-queues.webkit.org/results/11562607 New failing tests: js/dfg-arrayify-when-late-prevent-extensions.html
Created attachment 365147 [details] Archive of layout-test-results from ews125 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews125 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.6
Comment on attachment 365111 [details] Patch Attachment 365111 [details] did not pass jsc-ews (mac): Output: https://webkit-queues.webkit.org/results/11563059 New failing tests: mozilla-tests.yaml/ecma/GlobalObject/15.1.2.3-1.js.mozilla-baseline airjs-tests.yaml/stress-test.js.default airjs-tests.yaml/stress-test.js.ftl-no-cjit mozilla-tests.yaml/ecma/FunctionObjects/15.3.2.1-3.js.mozilla airjs-tests.yaml/stress-test.js.no-ftl modules.yaml/modules/module-jit-reachability.js.no-llint-modules stress/symbol-is-destructed-before-refing-underlying-symbol-impl.js.default mozilla-tests.yaml/js1_5/Regress/regress-111557.js.mozilla-baseline ChakraCore.yaml/ChakraCore/test/Object/defineProperty.js.default stress/regress-169445.js.dfg-eager-no-cjit-validate stress/broken-have-a-bad-time-with-arguments-for-gc-testing.js.misc-ftl-no-cjit mozilla-tests.yaml/ecma/GlobalObject/15.1.2.3-1.js.mozilla-dfg-eager-no-cjit-validate-phases mozilla-tests.yaml/ecma/FunctionObjects/15.3.5-1.js.mozilla mozilla-tests.yaml/ecma/FunctionObjects/15.3.1.1-3.js.mozilla modules.yaml/modules/module-jit-reachability.js.ftl-eager-no-cjit-modules mozilla-tests.yaml/js1_5/Regress/regress-111557.js.mozilla-dfg-eager-no-cjit-validate-phases stress/have-a-bad-time-with-arguments.js.misc-ftl-no-cjit mozilla-tests.yaml/ecma_3/RegExp/perlstress-001.js.mozilla-ftl-eager-no-cjit-validate-phases stress/async-iteration-basic.js.ftl-eager-no-cjit modules.yaml/modules/module-jit-reachability.js.no-cjit-validate-phases-modules ChakraCore.yaml/ChakraCore/test/Strings/long_concatstr.js.default mozilla-tests.yaml/ecma_3/RegExp/perlstress-002.js.mozilla-ftl-eager-no-cjit-validate-phases mozilla-tests.yaml/ecma_3/RegExp/perlstress-002.js.mozilla-dfg-eager-no-cjit-validate-phases mozilla-tests.yaml/ecma_3/RegExp/perlstress-002.js.mozilla-baseline stress/regress-169445.js.dfg-eager stress/regress-169445.js.ftl-eager airjs-tests.yaml/stress-test.js.ftl-eager-no-cjit mozilla-tests.yaml/ecma/GlobalObject/15.1.2.3-1.js.mozilla-ftl-eager-no-cjit-validate-phases stress/regress-169445.js.no-llint modules.yaml/modules/module-jit-reachability.js.ftl-no-cjit-validate-modules mozilla-tests.yaml/ecma_3/RegExp/perlstress-001.js.mozilla-baseline modules.yaml/modules/module-jit-reachability.js.ftl-no-cjit-small-pool-modules stress/regress-169445.js.ftl-eager-no-cjit modules.yaml/modules/module-jit-reachability.js.ftl-no-cjit-no-inline-validate-modules mozilla-tests.yaml/ecma_3/RegExp/perlstress-001.js.mozilla-dfg-eager-no-cjit-validate-phases stress/async-iteration-basic.js.no-llint stress/async-iteration-basic.js.dfg-eager-no-cjit-validate modules.yaml/modules/module-jit-reachability.js.dfg-eager-no-cjit-validate-modules mozilla-tests.yaml/js1_5/Regress/regress-111557.js.mozilla-ftl-eager-no-cjit-validate-phases stress/tagged-templates-raw-strings.js.ftl-eager stress/regress-169445.js.ftl-eager-no-cjit-b3o1 stress/async-iteration-basic.js.ftl-eager-no-cjit-b3o1 apiTests
Created attachment 365200 [details] Patch
I had two errors: - Missing move/copy assignment operators and constructors to make sure that each pointer to out-of-line storage is unique - The deletedMarker() returned a pointer with its lowest bit set, naturally causing confusion. A pointer of 0x10 (what I currently use) should work fines I believe that the entire first page of memory is left unused.
Created attachment 365215 [details] Patch
I just fixed a missing #ifdef, and moved OutOfLineCodeOrigin.
Comment on attachment 365215 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365215&action=review > Source/JavaScriptCore/ChangeLog:8 > + The trick is that pointers only take 45 bits on x86_64 in practice, and even less on ARM64. Don't you mean 48? > Source/JavaScriptCore/bytecode/CodeOrigin.h:78 > + if (UNLIKELY(this->m_compositeValue & 1)) Can we make the value "1" here some constant? Or perhaps have a getter saying if we're out of line? > Source/JavaScriptCore/bytecode/CodeOrigin.h:79 > + delete bitwise_cast<OutOfLineCodeOrigin*>(m_compositeValue & s_maskCompositeValueForPointer); can we have a getter called outOfLineCodeOrigin or something? Then this could be: if (isOutOfLine()) delete outOfLineCodeOrigin() > Source/JavaScriptCore/bytecode/CodeOrigin.h:124 > + return !(m_compositeValue & 2); can we give "2" a name? > Source/JavaScriptCore/bytecode/CodeOrigin.h:201 > + static const uintptr_t s_maskCompositeValueForPointer = 0x0000fffffffffff8; This seems wrong. It doesn't line up with s_freeBitsAtTop > Source/JavaScriptCore/bytecode/CodeOrigin.h:237 > + return reinterpret_cast<uintptr_t>(inlineCallFrame) | 2; style nit: bitwise_cast instead of reinterpret_cast here and elsewhere > Source/JavaScriptCore/bytecode/CodeOrigin.h:245 > + ASSERT(!(encodedBytecodeIndex & reinterpret_cast<uintptr_t>(inlineCallFrame))); can we also assert this in a stronger way with some mask for the max bytecode index bits? > Source/JavaScriptCore/bytecode/CodeOrigin.h:250 > +#if CPU(X86_64) || CPU(ARM64) You really want: ... || (CPU(ARM64) && ADDRESS(64)) Not just arm64 > Source/JavaScriptCore/bytecode/CodeOrigin.h:272 > +#if CPU(X86_64) || CPU(ARM64) instead of using this everywhere, can we just define some bit that stands for this type of memory model?
Created attachment 365251 [details] Patch
Thanks for the review! (In reply to Saam Barati from comment #16) > Comment on attachment 365215 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=365215&action=review > > > Source/JavaScriptCore/ChangeLog:8 > > + The trick is that pointers only take 45 bits on x86_64 in practice, and even less on ARM64. > > Don't you mean 48? I actually meant 45 because we can use the bottom 3 bits for alignment reasons. I clarified my comment. > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:78 > > + if (UNLIKELY(this->m_compositeValue & 1)) > > Can we make the value "1" here some constant? Done > > Or perhaps have a getter saying if we're out of line? > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:79 > > + delete bitwise_cast<OutOfLineCodeOrigin*>(m_compositeValue & s_maskCompositeValueForPointer); > > can we have a getter called outOfLineCodeOrigin or something? > > Then this could be: > > if (isOutOfLine()) delete outOfLineCodeOrigin() Done, and it looks a lot better. Thanks for the suggestion. > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:124 > > + return !(m_compositeValue & 2); > > can we give "2" a name? Done > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:201 > > + static const uintptr_t s_maskCompositeValueForPointer = 0x0000fffffffffff8; > > This seems wrong. It doesn't line up with s_freeBitsAtTop Oops, fixed. > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:237 > > + return reinterpret_cast<uintptr_t>(inlineCallFrame) | 2; > > style nit: bitwise_cast instead of reinterpret_cast here and elsewhere Done > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:245 > > + ASSERT(!(encodedBytecodeIndex & reinterpret_cast<uintptr_t>(inlineCallFrame))); > > can we also assert this in a stronger way with some mask for the max > bytecode index bits? I added an assert in the constructor, that the InlineCallFrame* fit within the mask. > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:250 > > +#if CPU(X86_64) || CPU(ARM64) > > You really want: > ... || (CPU(ARM64) && ADDRESS(64)) > > Not just arm64 > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:272 > > +#if CPU(X86_64) || CPU(ARM64) > > instead of using this everywhere, can we just define some bit that stands > for this type of memory model? Done, added HAS_FREE_BITS_AT_TOP_OF_POINTERS (the name is long, but I could not think of a shorter name that still fits).
> > > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:250 > > > +#if CPU(X86_64) || CPU(ARM64) > > > > You really want: > > ... || (CPU(ARM64) && ADDRESS(64)) > > > > Not just arm64 I think you can just use CPU(ADDRESS64) since that’s only ever true on 64-but platforms. > > > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:272 > > > +#if CPU(X86_64) || CPU(ARM64) > > > > instead of using this everywhere, can we just define some bit that stands > > for this type of memory model? > > Done, added HAS_FREE_BITS_AT_TOP_OF_POINTERS (the name is long, but I could > not think of a shorter name that still fits). I would just use CPU(ADDRESS64).
Comment on attachment 365251 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365251&action=review r=me > Source/JavaScriptCore/bytecode/CodeOrigin.h:43 > +#define HAS_FREE_BITS_AT_TOP_OF_POINTERS 0 can we undef this at the end of the header? > Source/JavaScriptCore/bytecode/CodeOrigin.h:108 > + // For the same reason we must provide explicit copy and move constructors. nit: Not sure we really need this comment given above comment sets the stage. Perhaps just drop assignment operators from above. Or you could remove the comments entirely. It's pretty self evident from reading the code what's going on. > Source/JavaScriptCore/bytecode/CodeOrigin.h:111 > + // We don't use the initializer list because it would not let us optimize the common case where there is no out-of-line storage initializer list? > Source/JavaScriptCore/bytecode/CodeOrigin.h:209 > + struct OutOfLineCodeOrigin { make this WTF_MAKE_FAST_ALLOCATED > Source/JavaScriptCore/bytecode/CodeOrigin.h:233 > + auto value = static_cast<uintptr_t>(1<<3); style: 1<<3 => 1 << 3 > Source/JavaScriptCore/bytecode/CodeOrigin.h:242 > + static const unsigned s_freeBitsAtTop = 16; nit: I'd make all of these constexpr. > Source/JavaScriptCore/bytecode/CodeOrigin.h:247 > + // On ARM64 cpus we only use the bottom 36 bits of pointers so we've got an enormous 28 free bits at the top. > + // If we could guarantee the value of Options::maximumOptimizationCandidateInstructionCount at compile-time, we could completely avoid the path for OutOfLine allocation. > + // Since we cannot guarantee that this option will never get ludicrously large, we just let this in-practice dead branch around. It should be trivial to predict anyway. Not sure this comment is really adding much since we use CodeOrigin outside of the compiler as well. So we'd need to never generate a CodeBlock w/ >= 2^28 instructions. > Source/JavaScriptCore/bytecode/CodeOrigin.h:269 > + // or a deletion marker for a hash table). style: remove this whitespace.
Thanks! (In reply to Saam Barati from comment #20) > Comment on attachment 365251 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=365251&action=review > > r=me > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:43 > > +#define HAS_FREE_BITS_AT_TOP_OF_POINTERS 0 > > can we undef this at the end of the header? I will follow Keith's advice and replace it by CPU(ADDRESS64) > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:108 > > + // For the same reason we must provide explicit copy and move constructors. > > nit: Not sure we really need this comment given above comment sets the > stage. Perhaps just drop assignment operators from above. > > Or you could remove the comments entirely. It's pretty self evident from > reading the code what's going on. I will remove the comments. I added them when this code was much less readable. > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:111 > > + // We don't use the initializer list because it would not let us optimize the common case where there is no out-of-line storage > > initializer list? I missed a word, it is called "member initializer list" (the part after ':' in a constructor). Will fix the comment. > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:209 > > + struct OutOfLineCodeOrigin { > > make this WTF_MAKE_FAST_ALLOCATED Will do. > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:233 > > + auto value = static_cast<uintptr_t>(1<<3); > > style: 1<<3 => 1 << 3 Will do. > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:242 > > + static const unsigned s_freeBitsAtTop = 16; > > nit: I'd make all of these constexpr. Will do. > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:247 > > + // On ARM64 cpus we only use the bottom 36 bits of pointers so we've got an enormous 28 free bits at the top. > > + // If we could guarantee the value of Options::maximumOptimizationCandidateInstructionCount at compile-time, we could completely avoid the path for OutOfLine allocation. > > + // Since we cannot guarantee that this option will never get ludicrously large, we just let this in-practice dead branch around. It should be trivial to predict anyway. > > Not sure this comment is really adding much since we use CodeOrigin outside > of the compiler as well. So we'd need to never generate a CodeBlock w/ >= > 2^28 instructions. I had not realized that we use CodeOrigin outside the compiler. I will remove the comment. > > > Source/JavaScriptCore/bytecode/CodeOrigin.h:269 > > + // or a deletion marker for a hash table). > > style: remove this whitespace. Will do.
Created attachment 365361 [details] Patch
I found the following comment in DFGNode.h: // NB. This class must have a trivial destructor. Since DFG::Node contains a NodeOrigin that contains two CodeOrigin that get a non-trivial destructor in this patch, I won't land this until I've understood why this comment exists.
Comment on attachment 365361 [details] Patch Turns out this comment was obsolete, see https://bugs.webkit.org/show_bug.cgi?id=196019
Comment on attachment 365361 [details] Patch Clearing flags on attachment: 365361 Committed r243232: <https://trac.webkit.org/changeset/243232>
All reviewed patches have been landed. Closing bug.
<rdar://problem/49076594>
This broke build on platforms different from x86_64 and arm64 due to: #if CPU(X86_64) && CPU(ADDRESS64) static constexpr unsigned s_freeBitsAtTop = 16; static constexpr uintptr_t s_maskCompositeValueForPointer = 0x0000fffffffffff8; #elif CPU(ARM64) && CPU(ADDRESS64) static constexpr unsigned s_freeBitsAtTop = 28; static constexpr uintptr_t s_maskCompositeValueForPointer = 0x0000000ffffffff8; #endif shouldn't be there some generic value for other arches (ppc64, ppc64le, s390x here)?
(In reply to Tomas Popela from comment #28) > This broke build on platforms different from x86_64 and arm64 due to: > > #if CPU(X86_64) && CPU(ADDRESS64) > static constexpr unsigned s_freeBitsAtTop = 16; > static constexpr uintptr_t s_maskCompositeValueForPointer = > 0x0000fffffffffff8; > #elif CPU(ARM64) && CPU(ADDRESS64) > static constexpr unsigned s_freeBitsAtTop = 28; > static constexpr uintptr_t s_maskCompositeValueForPointer = > 0x0000000ffffffff8; > #endif > > shouldn't be there some generic value for other arches (ppc64, ppc64le, > s390x here)? Tomas, those aren't officially supported platforms. One solution would be to not use the compression scheme if your platform isn't one of the officially supported ones. If you take a look at this patch, you should see that it's not too hard to factor that out. I suggest introducing a USE_COMPRESSED_CODE_ORIGIN at the top of CodeOrigin.h and checking for USE(COMPRESSED_CODE_ORIGIN) instead of CPU(ADDRESS64) at the places that uses the compressed scheme. Basically, for the !USE(COMPRESSED_CODE_ORIGIN) case, make it work like the current !CPU(ADDRESS64) case. Feel free to create a new bug and upload a patch to fix this.
(In reply to Mark Lam from comment #29) > Tomas, those aren't officially supported platforms. One solution would be > to not use the compression scheme if your platform isn't one of the > officially supported ones. If you take a look at this patch, you should see > that it's not too hard to factor that out. I suggest introducing a > USE_COMPRESSED_CODE_ORIGIN at the top of CodeOrigin.h and checking for > USE(COMPRESSED_CODE_ORIGIN) instead of CPU(ADDRESS64) at the places that > uses the compressed scheme. Basically, for the !USE(COMPRESSED_CODE_ORIGIN) > case, make it work like the current !CPU(ADDRESS64) case. > > Feel free to create a new bug and upload a patch to fix this. Thank you Mark, for hints! I didn't know whether the is some complexity hidden, but looks like it isn't, so I opened bug 196072 for that.