RESOLVED FIXED199891
[X86] Emit BT instruction for shift + mask in B3
https://bugs.webkit.org/show_bug.cgi?id=199891
Summary [X86] Emit BT instruction for shift + mask in B3
Justin Michaud
Reported 2019-07-17 17:15:36 PDT
We should emit the BT instruction for patterns like: if (a & (1<<b)) if ((a>>b)&1) if ((~a>>b)&1) if (~a & (1<<b))
Attachments
Patch (29.21 KB, patch)
2019-07-17 17:25 PDT, Justin Michaud
no flags
Patch (29.29 KB, patch)
2019-07-17 17:34 PDT, Justin Michaud
no flags
Patch (32.83 KB, patch)
2019-07-18 10:28 PDT, Justin Michaud
no flags
Archive of layout-test-results from ews115 for mac-highsierra (2.81 MB, application/zip)
2019-07-18 12:29 PDT, EWS Watchlist
no flags
Patch (33.02 KB, patch)
2019-07-22 19:50 PDT, Justin Michaud
no flags
Patch (32.94 KB, patch)
2019-07-26 17:10 PDT, Justin Michaud
no flags
Patch (33.03 KB, patch)
2019-07-26 19:50 PDT, Justin Michaud
no flags
patch (33.03 KB, patch)
2019-07-26 19:52 PDT, Justin Michaud
no flags
Patch (1.17 KB, patch)
2019-07-29 10:53 PDT, Justin Michaud
rmorisset: review+
patch (1.23 KB, patch)
2019-07-29 11:12 PDT, Justin Michaud
no flags
Justin Michaud
Comment 1 2019-07-17 17:25:30 PDT
Justin Michaud
Comment 2 2019-07-17 17:34:57 PDT
Robin Morisset
Comment 3 2019-07-17 17:57:49 PDT
Comment on attachment 374357 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=374357&action=review > Source/JavaScriptCore/assembler/testmasm.cpp:1300 > +#endif Is your code tested at all on 32-bits x86 ? > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3333 > + Air::Opcode opcode = opcodeForType(BranchTestBit32, BranchTestBit64, andValue->type()); What happens to these Air opcodes on ARM? > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3341 > + // Turn if ((val >> x)&1) -> Bt val x You have three cases in this file: if ((val >> x)&1) -> Bt val x if (val & (1<<x)) -> Bt val x if (~val & (1<<x)) But you have 4 cases in the changelog: if (a & (1<<b)) if ((a>>b)&1) if ((~a>>b)&1) if (~a & (1<<b)) It looks like you are missing if((~a>>b)&1) here. And in bit-test-non-constant, you test: if ((number>>bit)&1) if (((~number)>>(bit+1))&1) if (number & (1<<(bit-1))) So you're both testing a case you're not optimizing (~a>>b)&1, and not testing a case that you are optimizing (~a & (1<<b)). In theory you could even add a fifth case if you want to be exhaustive: ~(a>>b)&1 > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3342 > + // For shifts, we always mask the mask with 31/61 so we do not need to worry about large values of x I am not entirely sure I understand this comment. The semantic of shifts at B3 indeed requires this masking, but where is it actually applied in the code below?
Justin Michaud
Comment 4 2019-07-17 19:34:03 PDT
(In reply to Robin Morisset from comment #3) > Comment on attachment 374357 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=374357&action=review > > > Source/JavaScriptCore/assembler/testmasm.cpp:1300 > > +#endif > > Is your code tested at all on 32-bits x86 ? No, I don't have any hardware > > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3333 > > + Air::Opcode opcode = opcodeForType(BranchTestBit32, BranchTestBit64, andValue->type()); > > What happens to these Air opcodes on ARM? They will have isValidForm() = false. I will double check that all of the tests run correctly. > > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3341 > > + // Turn if ((val >> x)&1) -> Bt val x > > You have three cases in this file: > if ((val >> x)&1) -> Bt val x > if (val & (1<<x)) -> Bt val x > if (~val & (1<<x)) > But you have 4 cases in the changelog: > if (a & (1<<b)) > if ((a>>b)&1) > if ((~a>>b)&1) > if (~a & (1<<b)) > It looks like you are missing if((~a>>b)&1) here. The comment is misleading. The case: >if (bitBase && bitBase->opcode() == BitXor && (bitBase->child(1)->isInt32(-1) || bitBase->child(1)->isInt64(-1l))) { Runs after bitBase is set, so it works in both cases. > And in bit-test-non-constant, you test: > if ((number>>bit)&1) > if (((~number)>>(bit+1))&1) > if (number & (1<<(bit-1))) > So you're both testing a case you're not optimizing (~a>>b)&1, and not > testing a case that you are optimizing (~a & (1<<b)). I could add another test for the second case, but I verified the first case does get optimized. > In theory you could even add a fifth case if you want to be exhaustive: > ~(a>>b)&1 This complicates the commitInternal logic, so I avoided it. I will take a second look. > > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3342 > > + // For shifts, we always mask the mask with 31/61 so we do not need to worry about large values of x > > I am not entirely sure I understand this comment. The semantic of shifts at > B3 indeed requires this masking, but where is it actually applied in the > code below? The comment should probably be removed. The masking happens in the air opcode / x86 bt instruction
Justin Michaud
Comment 5 2019-07-18 10:28:34 PDT
EWS Watchlist
Comment 6 2019-07-18 12:29:25 PDT
Comment on attachment 374394 [details] Patch Attachment 374394 [details] did not pass mac-debug-ews (mac): Output: https://webkit-queues.webkit.org/results/12767403 New failing tests: storage/indexeddb/pending-version-change-on-exit-private.html
EWS Watchlist
Comment 7 2019-07-18 12:29:27 PDT
Created attachment 374404 [details] Archive of layout-test-results from ews115 for mac-highsierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews115 Port: mac-highsierra Platform: Mac OS X 10.13.6
Justin Michaud
Comment 8 2019-07-18 12:39:54 PDT
Test failures seem unrelated
Saam Barati
Comment 9 2019-07-22 18:01:13 PDT
Comment on attachment 374394 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=374394&action=review > Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h:2649 > + m_assembler.bt_ir(static_cast<unsigned>(bit.m_value)%32, reg); add a space between ) and % and 3 > Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h:2659 > + m_assembler.bt_im(static_cast<unsigned>(bit.m_value)%32, bitBase.offset, bitBase.base); ditto > Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h:1108 > + m_assembler.btw_ir(static_cast<unsigned>(bit.m_value)%64, bitBase); style: spacing > Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h:1118 > + m_assembler.btw_im(static_cast<unsigned>(bit.m_value)%64, bitBase.offset, bitBase.base); ditto > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3343 > + && andValue->opcode() == BitXor && (andValue->child(1)->isInt32(-1) || andValue->child(1)->isInt64(-1l)) && canBeInternal(andValue) -1l => -1? Seems easier to read > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3363 > + bitOffset = andMask->child(1); Does this work for RotL if andMask->child(1) is > 32/64? Can you add a test for the expected behavior? Ditto above for RotR > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3368 > + if (!negationNode && bitBase && bitBase->opcode() == BitXor && (bitBase->child(1)->isInt32(-1) || bitBase->child(1)->isInt64(-1l))) { I think you need to check that bitBase canBeInternal
Justin Michaud
Comment 10 2019-07-22 19:06:00 PDT
(In reply to Saam Barati from comment #9) > Comment on attachment 374394 [details] > Patch > > > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3343 > > + && andValue->opcode() == BitXor && (andValue->child(1)->isInt32(-1) || andValue->child(1)->isInt64(-1l)) && canBeInternal(andValue) > > -1l => -1? Seems easier to read Different semantics. -1 is 0x0FFFFFFFF, but -1l is all 1s. > > > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3363 > > + bitOffset = andMask->child(1); > > Does this work for RotL if andMask->child(1) is > 32/64? Can you add a test > for the expected behavior? Ditto above for RotR Good point, just going to remove it if that's OK with you.
Justin Michaud
Comment 11 2019-07-22 19:50:06 PDT
EWS Watchlist
Comment 12 2019-07-22 22:15:21 PDT
Comment on attachment 374666 [details] Patch Attachment 374666 [details] did not pass jsc-ews (mac): Output: https://webkit-queues.webkit.org/results/12792718 New failing tests: mozilla-tests.yaml/js1_5/Array/regress-101964.js.mozilla-dfg-eager-no-cjit-validate-phases
Filip Pizlo
Comment 13 2019-07-26 11:41:52 PDT
Comment on attachment 374666 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=374666&action=review > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3388 > + } This needs an else case for inclusive matching, or it needs to just use the commit unconditionally. I recommend committing unconditionally since that's just easier.
Justin Michaud
Comment 14 2019-07-26 17:10:45 PDT
Keith Miller
Comment 15 2019-07-26 17:58:57 PDT
Comment on attachment 375001 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=375001&action=review r=me. > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3335 > + Value* bitBase = nullptr; Can we call this value or testValue? bitBase is kinda confusing. It would probably be good to change this everywhere. > Source/JavaScriptCore/b3/B3LowerToAir.cpp:3375 > + for (auto& basePromise : Vector<ArgPromise>::from(loadPromise(bitBase), tmpPromise(bitBase))) { I think you can just do: for (auto& basePromise : { loadPromise(bitBase), tmpPromise(bitBase) }) { You also won't do a heap allocation that way. Although, maybe the compiler will elide that?
Justin Michaud
Comment 16 2019-07-26 19:10:43 PDT
(In reply to Keith Miller from comment #15) > > I think you can just do: > > for (auto& basePromise : { loadPromise(bitBase), tmpPromise(bitBase) }) { > > You also won't do a heap allocation that way. Although, maybe the compiler > will elide that? It needs to be mutable, so an initializer list won't work :(
Justin Michaud
Comment 17 2019-07-26 19:50:30 PDT
Justin Michaud
Comment 18 2019-07-26 19:52:13 PDT
WebKit Commit Bot
Comment 19 2019-07-27 00:08:08 PDT
Comment on attachment 375012 [details] patch Clearing flags on attachment: 375012 Committed r247889: <https://trac.webkit.org/changeset/247889>
WebKit Commit Bot
Comment 20 2019-07-27 00:08:10 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 21 2019-07-27 00:09:19 PDT
Ryan Haddad
Comment 22 2019-07-29 09:34:56 PDT
microbenchmarks/bit-test-load.js.ftl-no-cjit-small-pool is timing out on the debug JSC bot. It is passing on the release bot, so maybe we need to skip it on debug because it is very slow?
Justin Michaud
Comment 23 2019-07-29 10:45:17 PDT
(In reply to Ryan Haddad from comment #22) > microbenchmarks/bit-test-load.js.ftl-no-cjit-small-pool is timing out on the > debug JSC bot. > > It is passing on the release bot, so maybe we need to skip it on debug > because it is very slow? I will just cut the number of iterations down. Just a sec.
Justin Michaud
Comment 24 2019-07-29 10:53:57 PDT
Reopening to attach new patch.
Justin Michaud
Comment 25 2019-07-29 10:53:58 PDT
Robin Morisset
Comment 26 2019-07-29 11:08:19 PDT
Comment on attachment 375084 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=375084&action=review r=me > JSTests/ChangeLog:8 > + * microbenchmarks/bit-test-load.js: Maybe just add a line to the changelog?
Justin Michaud
Comment 27 2019-07-29 11:12:02 PDT
WebKit Commit Bot
Comment 28 2019-07-29 11:54:52 PDT
Comment on attachment 375086 [details] patch Clearing flags on attachment: 375086 Committed r247913: <https://trac.webkit.org/changeset/247913>
WebKit Commit Bot
Comment 29 2019-07-29 11:54:54 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.