WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 177281
Add Above/Below comparisons for UInt32 patterns
https://bugs.webkit.org/show_bug.cgi?id=177281
Summary
Add Above/Below comparisons for UInt32 patterns
Yusuke Suzuki
Reported
2017-09-20 16:47:07 PDT
Add Above/Below comparisons for UInt32 patterns
Attachments
Patch
(64.55 KB, patch)
2017-09-20 17:06 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews116 for mac-elcapitan
(666.87 KB, application/zip)
2017-09-20 18:04 PDT
,
Build Bot
no flags
Details
WIP
(59.93 KB, patch)
2017-09-21 06:53 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
WIP
(58.02 KB, patch)
2017-09-21 07:34 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(66.47 KB, patch)
2017-09-21 09:33 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(66.50 KB, patch)
2017-09-23 05:40 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch for landing
(66.71 KB, patch)
2017-09-26 13:18 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(64.15 KB, patch)
2017-10-14 08:35 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Show Obsolete
(6)
View All
Add attachment
proposed patch, testcase, etc.
Yusuke Suzuki
Comment 1
2017-09-20 17:06:25 PDT
Created
attachment 321391
[details]
Patch
Yusuke Suzuki
Comment 2
2017-09-20 17:07:32 PDT
Spawn from
bug 177100
. WIP, adding tests now. But ready for early review.
Yusuke Suzuki
Comment 3
2017-09-20 17:11:48 PDT
Comment on
attachment 321391
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=321391&action=review
> Source/JavaScriptCore/dfg/DFGIntegerRangeOptimizationPhase.cpp:1144 > + // TODO: Handle CompareAbove etc.
I'll do this.
> Source/JavaScriptCore/dfg/DFGMayExit.cpp:102 > + case CompareAboveEq:
This is correct, but I don't think they are necessary. I'll drop this part.
> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:2543 > + case CompareBelow: > + if (compare(node, JITCompiler::Below, JITCompiler::DoubleLessThan, operationCompareLess)) > + return; > + break; > + > + case CompareBelowEq: > + if (compare(node, JITCompiler::BelowOrEqual, JITCompiler::DoubleLessThanOrEqual, operationCompareLessEq)) > + return; > + break; > + > + case CompareAbove: > + if (compare(node, JITCompiler::Above, JITCompiler::DoubleGreaterThan, operationCompareGreater)) > + return; > + break; > + > + case CompareAboveEq: > + if (compare(node, JITCompiler::AboveOrEqual, JITCompiler::DoubleGreaterThanOrEqual, operationCompareGreaterEq)) > + return; > + break;
I'll add `compareUnsigned`, which only accepts KnownInt32 binary uses and perform unsigned comparison.
> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:2685 > + break;
Ditto.
Build Bot
Comment 4
2017-09-20 18:04:22 PDT
Comment on
attachment 321391
[details]
Patch
Attachment 321391
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/4610048
Number of test failures exceeded the failure limit.
Build Bot
Comment 5
2017-09-20 18:04:23 PDT
Created
attachment 321398
[details]
Archive of layout-test-results from ews116 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews116 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Yusuke Suzuki
Comment 6
2017-09-21 06:53:47 PDT
Created
attachment 321425
[details]
WIP
Yusuke Suzuki
Comment 7
2017-09-21 07:34:15 PDT
Created
attachment 321427
[details]
WIP
Yusuke Suzuki
Comment 8
2017-09-21 09:33:11 PDT
Created
attachment 321436
[details]
Patch
Yusuke Suzuki
Comment 9
2017-09-23 05:40:51 PDT
Created
attachment 321629
[details]
Patch Just updating ChangeLog
Darin Adler
Comment 10
2017-09-23 17:12:29 PDT
Comment on
attachment 321629
[details]
Patch To me, nothing about the names "above" and "below" say "treat negative numbers as if they were large numbers" or "treat as unsigned without checking". I would consider names that make that explicit, maybe a "u" added to the existing names?
Yusuke Suzuki
Comment 11
2017-09-23 17:24:35 PDT
(In reply to Darin Adler from
comment #10
)
> Comment on
attachment 321629
[details]
> Patch > > To me, nothing about the names "above" and "below" say "treat negative > numbers as if they were large numbers" or "treat as unsigned without > checking". I would consider names that make that explicit, maybe a "u" added > to the existing names?
`above` / `below` is used in assembly lang as unsigned comparison, but using op_uless / op_ugreater is OK to me.
Saam Barati
Comment 12
2017-09-23 18:43:28 PDT
(In reply to Darin Adler from
comment #10
)
> Comment on
attachment 321629
[details]
> Patch > > To me, nothing about the names "above" and "below" say "treat negative > numbers as if they were large numbers" or "treat as unsigned without > checking". I would consider names that make that explicit, maybe a "u" added > to the existing names?
I agree with your intuition and reasoning here, but I think it’s more important to be consistent with the rest of JSC here. We adopt various assembly language conventions and use Above to mean unsigned comparison and GreaterThan to mean signed comparison throughout the various compilers and MacroAssemblers.
Darin Adler
Comment 13
2017-09-23 18:53:25 PDT
(In reply to Saam Barati from
comment #12
)
> I agree with your intuition and reasoning here, but I think it’s more > important to be consistent with the rest of JSC here. We adopt various > assembly language conventions and use Above to mean unsigned comparison and > GreaterThan to mean signed comparison throughout the various compilers and > MacroAssemblers.
Good to know. My knowledge is out of date. The instruction sets I know best, ones other than X86, call unsigned comparison "higher" and "lower" rather than "above" and "below". But I recall now that the signed comparisons were the only ones called "greater" and "less than" and I suppose I had forgotten that. I see how this is following tradition, especially for people familiar with X86; sorry for the distraction.
Saam Barati
Comment 14
2017-09-23 19:16:53 PDT
(In reply to Darin Adler from
comment #13
)
> (In reply to Saam Barati from
comment #12
) > > I agree with your intuition and reasoning here, but I think it’s more > > important to be consistent with the rest of JSC here. We adopt various > > assembly language conventions and use Above to mean unsigned comparison and > > GreaterThan to mean signed comparison throughout the various compilers and > > MacroAssemblers. > > Good to know. > > My knowledge is out of date. The instruction sets I know best, ones other > than X86, call unsigned comparison "higher" and "lower" rather than "above" > and "below". But I recall now that the signed comparisons were the only ones > called "greater" and "less than" and I suppose I had forgotten that. > > I see how this is following tradition, especially for people familiar with > X86; sorry for the distraction.
FWIW, this distinction has caused cognitive overhead for me in the past. At some point in the last year or so I crossed over to these names intuitively meaning what they do, but before that, I had to keep looking it up. I’m not against us using better names than X86, but if we do it, I think we should pull the trigger and just do it everywhere. Perhaps there is enough inertia behind the names that it’s not actually worth changing. My main comment here is just that I think it’s best to be consistent within JSC.
Filip Pizlo
Comment 15
2017-09-23 19:34:04 PDT
(In reply to Saam Barati from
comment #14
)
> (In reply to Darin Adler from
comment #13
) > > (In reply to Saam Barati from
comment #12
) > > > I agree with your intuition and reasoning here, but I think it’s more > > > important to be consistent with the rest of JSC here. We adopt various > > > assembly language conventions and use Above to mean unsigned comparison and > > > GreaterThan to mean signed comparison throughout the various compilers and > > > MacroAssemblers. > > > > Good to know. > > > > My knowledge is out of date. The instruction sets I know best, ones other > > than X86, call unsigned comparison "higher" and "lower" rather than "above" > > and "below". But I recall now that the signed comparisons were the only ones > > called "greater" and "less than" and I suppose I had forgotten that. > > > > I see how this is following tradition, especially for people familiar with > > X86; sorry for the distraction. > > FWIW, this distinction has caused cognitive overhead for me in the past. At > some point in the last year or so I crossed over to these names intuitively > meaning what they do, but before that, I had to keep looking it up. I’m not > against us using better names than X86, but if we do it, I think we should > pull the trigger and just do it everywhere. Perhaps there is enough inertia > behind the names that it’s not actually worth changing. My main comment here > is just that I think it’s best to be consistent within JSC.
Yeah... in my case I was already fluent in this terminology before hacking on WebKit and I was super happy that WebKit used these terms. I’ve also hacked on systems that say ULessThan/UGreaterThan. I didn’t like that as much. If I had to guess why, I would say it’s because these are super common operations internally and having one-word names makes it easier to speak about it (below is two syllables, unsigned less than is four).
Yusuke Suzuki
Comment 16
2017-09-25 09:44:42 PDT
OK, for now, we use above/below :)
Yusuke Suzuki
Comment 17
2017-09-26 10:22:31 PDT
Ping?
Saam Barati
Comment 18
2017-09-26 11:32:58 PDT
Comment on
attachment 321629
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=321629&action=review
r=me I'm not entirely convinced we want all this machinery for a very specific type of program written in a very specific way at the JS source level. That said, I don't want to keep you from landing this. I still wonder if a better approach would've been to teach OSR exit in a way similar to it undoing math operations that we eliminate Uint32ToNumber and just perform the operation upon exit.
> Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:2005 > + static_cast<BinaryOpNode*>(left)->m_notCastUnsigned = true;
I'd assert here it's urshift
> Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:2007 > + static_cast<BinaryOpNode*>(right)->m_notCastUnsigned = true;
ditto
> Source/JavaScriptCore/dfg/DFGIntegerRangeOptimizationPhase.cpp:1143 > + // FIXME: Handle CompareBelow and CompareBelowEq.
Please remove or link to a bug.
> Source/JavaScriptCore/parser/Nodes.h:1084 > + bool m_notCastUnsigned { false };
Name nit: I think I'd flip the boolean state of this variable and maybe name it as: bool m_shouldToUnsignedResult { true } and the pattern matching code can set it to false I think that'll be more clear from reading the code
Yusuke Suzuki
Comment 19
2017-09-26 11:59:30 PDT
Comment on
attachment 321629
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=321629&action=review
Thanks. I think this patch is worth introducing because of the nature of JS. JS does not support uint32 efficiently. But it is not a large problem since we can do int32 operations. add, sub etc. are OK since we can perform int32 operations and recognize the resulted value as uint32 bit patterns. But one problematic thing is comparison. It is frequently executed and it recognizes unsignedness. So, performing >>> 0 operation when doing comparison is one of the frequently seen pattern in JS, and actually it is shown in Octane/zlib. So I think this optimization covers large part of JS programs using uint32 comparison. In addition, we also introduced DFG strength reduction rule for unsigned comparison. So once our DFG MovHint eliminations's conservative analysis is fixed, these UInt32ToNumber will be removed, which is kept by MovHint. I think this could remove large part of UInt32ToNumber in the future. After that, I think we could have UInt32ToNumber yet. They can be removed by using somewhat PhantoUInt32ToNumber which will materialize UInt32ToNumber resulted value when performing OSR exit. But I think this requires OSR availability analysis for MovHint analysis, so it is only done in FTL. So, I think this optimization is worth introducing since it can improve some patterns frequently seen in JS supported by this reasonable story.
>> Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:2005 >> + static_cast<BinaryOpNode*>(left)->m_notCastUnsigned = true; > > I'd assert here it's urshift
Fixed.
>> Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:2007 >> + static_cast<BinaryOpNode*>(right)->m_notCastUnsigned = true; > > ditto
Fixed.
>> Source/JavaScriptCore/dfg/DFGIntegerRangeOptimizationPhase.cpp:1143 >> + // FIXME: Handle CompareBelow and CompareBelowEq. > > Please remove or link to a bug.
Added a bug link for this.
>> Source/JavaScriptCore/parser/Nodes.h:1084 >> + bool m_notCastUnsigned { false }; > > Name nit: I think I'd flip the boolean state of this variable and maybe name it as: > bool m_shouldToUnsignedResult { true } > and the pattern matching code can set it to false > I think that'll be more clear from reading the code
Sounds nice. Fixed.
Yusuke Suzuki
Comment 20
2017-09-26 13:18:24 PDT
Created
attachment 321860
[details]
Patch for landing
Yusuke Suzuki
Comment 21
2017-09-26 13:32:26 PDT
Committed
r222518
: <
http://trac.webkit.org/changeset/222518
>
WebKit Commit Bot
Comment 22
2017-09-26 14:21:54 PDT
Re-opened since this is blocked by
bug 177507
Yusuke Suzuki
Comment 23
2017-09-26 17:22:01 PDT
(In reply to WebKit Commit Bot from
comment #22
)
> Re-opened since this is blocked by
bug 177507
I'll land it later with a fix. BTW, this patch actually improved performance of Octane/zlib by 5%.
https://arewefastyet.com/#machine=29&view=single&suite=octane&subtest=zlib
Yusuke Suzuki
Comment 24
2017-09-27 11:51:21 PDT
Committed
r222564
: <
http://trac.webkit.org/changeset/222564
>
Radar WebKit Bug Importer
Comment 25
2017-09-27 12:19:27 PDT
<
rdar://problem/34693039
>
Saam Barati
Comment 26
2017-10-01 10:13:24 PDT
This patch is a 50% slowdown on JetStream/bigfig on iOS. This is leading to a 2% JetStream regression on iOS.
WebKit Commit Bot
Comment 27
2017-10-01 10:20:08 PDT
Re-opened since this is blocked by
bug 177720
Yusuke Suzuki
Comment 28
2017-10-02 07:36:46 PDT
(In reply to WebKit Commit Bot from
comment #27
)
> Re-opened since this is blocked by
bug 177720
I've measured the execution time of bigfib.cpp in several configurations. In my Linux box (Xeon, 8 physical cores with hyperthreading), this patch rather poses performance progression. baseline patched bigfib.cpp 570.4888+-37.6582 563.6416+-35.9494 might be 1.0121x faster In my MacBook Pro, without any CPU configuration change, this patch rather improves performance by 1.8%. With CPU configuration (disabling hyperthreading and using only 2 cores), it does not show any differences. Saam, could you ensure the following two things? 1. After reverting this patch, the regression is fixed in iOS bot. 2. MacBook Pro / MacPro bots do not show regression due to this patch.
Filip Pizlo
Comment 29
2017-10-02 08:04:20 PDT
(In reply to Saam Barati from
comment #18
)
> Comment on
attachment 321629
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=321629&action=review
> > r=me > I'm not entirely convinced we want all this machinery for a very specific > type of program written in a very specific way at the JS source level. That > said, I don't want to keep you from landing this. > > I still wonder if a better approach would've been to teach OSR exit in a way > similar to it undoing math operations that we eliminate Uint32ToNumber and > just perform the operation upon exit.
Yes. This. -Filip
> > > Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:2005 > > + static_cast<BinaryOpNode*>(left)->m_notCastUnsigned = true; > > I'd assert here it's urshift > > > Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:2007 > > + static_cast<BinaryOpNode*>(right)->m_notCastUnsigned = true; > > ditto > > > Source/JavaScriptCore/dfg/DFGIntegerRangeOptimizationPhase.cpp:1143 > > + // FIXME: Handle CompareBelow and CompareBelowEq. > > Please remove or link to a bug. > > > Source/JavaScriptCore/parser/Nodes.h:1084 > > + bool m_notCastUnsigned { false }; > > Name nit: I think I'd flip the boolean state of this variable and maybe name > it as: > bool m_shouldToUnsignedResult { true } > and the pattern matching code can set it to false > I think that'll be more clear from reading the code
Saam Barati
Comment 30
2017-10-02 13:08:42 PDT
(In reply to Yusuke Suzuki from
comment #28
)
> (In reply to WebKit Commit Bot from
comment #27
) > > Re-opened since this is blocked by
bug 177720
> > I've measured the execution time of bigfib.cpp in several configurations. > > In my Linux box (Xeon, 8 physical cores with hyperthreading), this patch > rather poses performance progression. > > baseline patched > > > bigfib.cpp 570.4888+-37.6582 563.6416+-35.9494 > might be 1.0121x faster > > In my MacBook Pro, without any CPU configuration change, this patch rather > improves performance by 1.8%. > With CPU configuration (disabling hyperthreading and using only 2 cores), it > does not show any differences. > > Saam, could you ensure the following two things? > > 1. After reverting this patch, the regression is fixed in iOS bot.
iOS perf bots recovered this 2%.
> 2. MacBook Pro / MacPro bots do not show regression due to this patch.
I don't have enough data to answer this question yet.
Yusuke Suzuki
Comment 31
2017-10-03 05:43:06 PDT
(In reply to Saam Barati from
comment #30
)
> (In reply to Yusuke Suzuki from
comment #28
) > > (In reply to WebKit Commit Bot from
comment #27
) > > > Re-opened since this is blocked by
bug 177720
> > > > I've measured the execution time of bigfib.cpp in several configurations. > > > > In my Linux box (Xeon, 8 physical cores with hyperthreading), this patch > > rather poses performance progression. > > > > baseline patched > > > > > > bigfib.cpp 570.4888+-37.6582 563.6416+-35.9494 > > might be 1.0121x faster > > > > In my MacBook Pro, without any CPU configuration change, this patch rather > > improves performance by 1.8%. > > With CPU configuration (disabling hyperthreading and using only 2 cores), it > > does not show any differences. > > > > Saam, could you ensure the following two things? > > > > 1. After reverting this patch, the regression is fixed in iOS bot. > iOS perf bots recovered this 2%. > > > 2. MacBook Pro / MacPro bots do not show regression due to this patch. > I don't have enough data to answer this question yet.
I've a bit hacked in
bug 177806
. I extended Object Allocation Sinking phase to handle wider range of materialization (including UInt32ToNumber). Converting UInt32ToNumber instead. With this patch's DFG part, almost all UInt32ToNumber becomes PhantomUInt32ToNumber, which is no-op in FTL. Actually, we get 2% improvement in the above approach. But this patch offers 5% improvement, it is better. I think this is because of LLInt/Baseline/DFG optimization effect in this patch. In the above approach, only FTL gets performance boost. But this patch's simple pattern matching can make LLInt/Baseline/DFG faster. And this pattern can be seen in the actual code. So I think having this patch is good. Yeah, after that, we would like to introdue PhantomUInt32ToNumber thing too (in
bug 177806
).
Yusuke Suzuki
Comment 32
2017-10-11 04:47:11 PDT
(In reply to Yusuke Suzuki from
comment #31
)
> (In reply to Saam Barati from
comment #30
) > > (In reply to Yusuke Suzuki from
comment #28
) > > > (In reply to WebKit Commit Bot from
comment #27
) > > > > Re-opened since this is blocked by
bug 177720
> > > > > > I've measured the execution time of bigfib.cpp in several configurations. > > > > > > In my Linux box (Xeon, 8 physical cores with hyperthreading), this patch > > > rather poses performance progression. > > > > > > baseline patched > > > > > > > > > bigfib.cpp 570.4888+-37.6582 563.6416+-35.9494 > > > might be 1.0121x faster > > > > > > In my MacBook Pro, without any CPU configuration change, this patch rather > > > improves performance by 1.8%. > > > With CPU configuration (disabling hyperthreading and using only 2 cores), it > > > does not show any differences. > > > > > > Saam, could you ensure the following two things? > > > > > > 1. After reverting this patch, the regression is fixed in iOS bot. > > iOS perf bots recovered this 2%. > > > > > 2. MacBook Pro / MacPro bots do not show regression due to this patch. > > I don't have enough data to answer this question yet. > > I've a bit hacked in
bug 177806
. I extended Object Allocation Sinking phase > to handle wider range of materialization (including UInt32ToNumber). > Converting UInt32ToNumber instead. With this patch's DFG part, almost all > UInt32ToNumber becomes PhantomUInt32ToNumber, which is no-op in FTL. > > Actually, we get 2% improvement in the above approach. But this patch offers > 5% improvement, it is better. I think this is because of LLInt/Baseline/DFG > optimization effect in this patch. In the above approach, only FTL gets > performance boost. But this patch's simple pattern matching can make > LLInt/Baseline/DFG faster. > > And this pattern can be seen in the actual code. So I think having this > patch is good. Yeah, after that, we would like to introdue > PhantomUInt32ToNumber thing too (in
bug 177806
).
I would like to reland this patch without DFG strength reduction part. This means that only `a >>> 0 <=> b >>>0` pattern will be handled. In that case, integer range analysis does not offer any values for this since `a >>> 0` is only used for this part. So even without this extension, it does not disable the existing optimization. If that patch causes regression, it's a bit difficult to investigate without actual machine... Maybe, due to accidental code generation in FTL. Anyway, reducing the patch would offer a chance to bisect the culprit.
Yusuke Suzuki
Comment 33
2017-10-14 08:35:03 PDT
Created
attachment 323810
[details]
Patch
Yusuke Suzuki
Comment 34
2017-10-14 08:35:59 PDT
Committed
r223318
: <
https://trac.webkit.org/changeset/223318
>
Zan Dobersek
Comment 35
2017-10-17 00:35:15 PDT
Comment on
attachment 323810
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=323810&action=review
> Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:1996 > + int32_t value = static_cast<int32_t>(static_cast<IntegerNode*>(node)->value());
This conversion is causing a failure on Linux ARM64, likely due to the conversion instruction that GCC is generating. Opened
bug #178379
.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug