WebKit Bugzilla
New
Browse
Search+
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
NEW
124780
Kraken causes a lot of slow path calls when using Math.pow()
https://bugs.webkit.org/show_bug.cgi?id=124780
Summary
Kraken causes a lot of slow path calls when using Math.pow()
Romain Perier
Reported
2013-11-22 09:32:04 PST
Kraken causes a lot of OSR exits when using Math.pow()
Attachments
Patch
(4.44 KB, patch)
2013-11-22 09:45 PST
,
Romain Perier
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from webkit-ews-09 for mac-mountainlion-wk2
(471.09 KB, application/zip)
2013-11-22 10:39 PST
,
Build Bot
no flags
Details
Archive of layout-test-results from webkit-ews-05 for mac-mountainlion
(521.96 KB, application/zip)
2013-11-22 11:28 PST
,
Build Bot
no flags
Details
Patch
(4.45 KB, patch)
2013-11-23 05:30 PST
,
Romain Perier
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from webkit-ews-15 for mac-mountainlion-wk2
(471.20 KB, application/zip)
2013-11-23 06:21 PST
,
Build Bot
no flags
Details
Archive of layout-test-results from webkit-ews-06 for mac-mountainlion
(515.35 KB, application/zip)
2013-11-23 07:07 PST
,
Build Bot
no flags
Details
Archive of layout-test-results from webkit-ews-01 for mac-mountainlion
(525.28 KB, application/zip)
2013-11-23 08:27 PST
,
Build Bot
no flags
Details
Patch
(6.04 KB, patch)
2013-12-07 10:50 PST
,
Romain Perier
no flags
Details
Formatted Diff
Diff
performance gain
(671 bytes, text/plain)
2013-12-08 03:13 PST
,
Romain Perier
no flags
Details
Patch
(5.47 KB, patch)
2013-12-09 02:33 PST
,
Romain Perier
no flags
Details
Formatted Diff
Diff
Patch
(8.14 KB, patch)
2013-12-16 08:17 PST
,
Romain Perier
no flags
Details
Formatted Diff
Diff
Patch
(8.62 KB, patch)
2013-12-26 23:37 PST
,
Romain Perier
no flags
Details
Formatted Diff
Diff
Patch
(8.64 KB, patch)
2013-12-26 23:44 PST
,
Romain Perier
benjamin
: review-
Details
Formatted Diff
Diff
Show Obsolete
(11)
View All
Add attachment
proposed patch, testcase, etc.
Romain Perier
Comment 1
2013-11-22 09:40:16 PST
The Kraken benchmark produces a lot of OSR exits when you start it from Safari or webkit nightly. If you have a look at Instruments, you can find that's the function " mathProtoFuncPow(ExecState*)" which slows the benchmark significantly. Now take a look at the kraken code source, it call Math.pow() a lot of times with negative integers or floating point numbers as exponents. These cases are not handled in jit/ThunkGenerators.cpp (except when the exponent is equal to 1/2) which causes OSR exits. I propose the following patch which basically improves the performances between 430 and 900%. (Just test it with PerformanceTests with a simple run per seconds). I will probably use the DFG compiler when the code is "hot" , but it will be a future contribution
Romain Perier
Comment 2
2013-11-22 09:45:55 PST
Created
attachment 217695
[details]
Patch
Romain Perier
Comment 3
2013-11-22 09:47:42 PST
As said in the changelog, it also improves the kraken performances (between 1 and 2.5%). Sorry for the first useless comment by the way, I just tested to create the bug using the script Tools/Script/webkit-patch but apparently it did not work completely ;)
Romain Perier
Comment 4
2013-11-22 09:54:14 PST
So the plan here is to use the baselineJIT in order to reduce as most as possible the number of OSR exits caused by kraken. the binary size is also very important, this is why I created a wrapper to the libc function pow. What I have in mind is: we could give types and code informations to the DFG compiler in parallel of this code to improve the performances even better when the code is really "hot". I am still a beginner in JSC so my point of view or my code are not probably perfects.
Filip Pizlo
Comment 5
2013-11-22 10:17:32 PST
Comment on
attachment 217695
[details]
Patch I'm very confused. The DFG doesn't support have a Math.pow() intrinsic. So, where are the OSR exits?
Romain Perier
Comment 6
2013-11-22 10:23:28 PST
By default the code of Math.pow is produced by the ThunkGenerator "powThunkGenerator" in jit/ThunkGenerators.cpp . The unhandled cases in this ThunkGenerator (when the exponent is a negative integer, or when the exponent is a double != -1/2) causes OSR exits -> mathProtoFuncPow() is called. (it don't know if it's the good technical term for that)
Romain Perier
Comment 7
2013-11-22 10:24:11 PST
The DFG is an idea, for future improvements . Nothing more :)
Build Bot
Comment 8
2013-11-22 10:25:56 PST
Comment on
attachment 217695
[details]
Patch
Attachment 217695
[details]
did not pass win-ews (win): Output:
http://webkit-queues.appspot.com/results/34448034
Filip Pizlo
Comment 9
2013-11-22 10:26:37 PST
(In reply to
comment #6
)
> By default the code of Math.pow is produced by the ThunkGenerator "powThunkGenerator" in jit/ThunkGenerators.cpp . The unhandled cases in this ThunkGenerator (when the exponent is a negative integer, or when the exponent is a double != -1/2) causes OSR exits -> mathProtoFuncPow() is called. (it don't know if it's the good technical term for that)
That's not an OSR exit. ;-) That's a slow path call. OSR implies that a piece of compiled code gets replaced, while on the stack, with a different piece of compiled code that does the same thing. powThunkGenerator has a call to mathProtoFuncPow. It's not an exit. ;-)
Build Bot
Comment 10
2013-11-22 10:39:48 PST
Comment on
attachment 217695
[details]
Patch
Attachment 217695
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.appspot.com/results/33598087
New failing tests: sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.13_pow/S15.8.2.13_A7.html js/math.html sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.13_pow/S15.8.2.13_A8.html
Build Bot
Comment 11
2013-11-22 10:39:50 PST
Created
attachment 217699
[details]
Archive of layout-test-results from webkit-ews-09 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-09 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Build Bot
Comment 12
2013-11-22 11:28:56 PST
Comment on
attachment 217695
[details]
Patch
Attachment 217695
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.appspot.com/results/34498050
New failing tests: sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.13_pow/S15.8.2.13_A7.html js/math.html sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.13_pow/S15.8.2.13_A8.html
Build Bot
Comment 13
2013-11-22 11:28:58 PST
Created
attachment 217704
[details]
Archive of layout-test-results from webkit-ews-05 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-05 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Romain Perier
Comment 14
2013-11-23 05:30:39 PST
Created
attachment 217746
[details]
Patch
Build Bot
Comment 15
2013-11-23 06:10:46 PST
Comment on
attachment 217746
[details]
Patch
Attachment 217746
[details]
did not pass win-ews (win): Output:
http://webkit-queues.appspot.com/results/35308125
Build Bot
Comment 16
2013-11-23 06:21:00 PST
Comment on
attachment 217746
[details]
Patch
Attachment 217746
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.appspot.com/results/35308130
New failing tests: sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.13_pow/S15.8.2.13_A7.html js/math.html sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.13_pow/S15.8.2.13_A8.html
Build Bot
Comment 17
2013-11-23 06:21:02 PST
Created
attachment 217747
[details]
Archive of layout-test-results from webkit-ews-15 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-15 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Build Bot
Comment 18
2013-11-23 07:07:19 PST
Comment on
attachment 217746
[details]
Patch
Attachment 217746
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.appspot.com/results/35368127
New failing tests: sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.13_pow/S15.8.2.13_A7.html js/math.html sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.13_pow/S15.8.2.13_A8.html
Build Bot
Comment 19
2013-11-23 07:07:26 PST
Created
attachment 217748
[details]
Archive of layout-test-results from webkit-ews-06 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-06 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Build Bot
Comment 20
2013-11-23 08:27:21 PST
Comment on
attachment 217746
[details]
Patch
Attachment 217746
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.appspot.com/results/35218136
New failing tests: sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.13_pow/S15.8.2.13_A7.html js/math.html sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.13_pow/S15.8.2.13_A8.html
Build Bot
Comment 21
2013-11-23 08:27:24 PST
Created
attachment 217749
[details]
Archive of layout-test-results from webkit-ews-01 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-01 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Geoffrey Garen
Comment 22
2013-11-23 11:38:52 PST
Comment on
attachment 217746
[details]
Patch Looks like this is failing some Sputnik tests. Do you know why?
Romain Perier
Comment 23
2013-11-24 02:35:23 PST
I will investigate why
Romain Perier
Comment 24
2013-12-02 05:50:27 PST
The tests fail because the cases where the exponent is NaN and infinity are not handled in my code (nothing really complicated). However, I was wondering how "NaN" is encoded in the VM. If I try to test if the exponent is NaN (exponent != exponent or using std::isnan) it does not work. Any ideas ? I did the following tests: - branchDouble(DoubleNotEqual with the exponent => it does not work - I saved the exponent on the stack (using storeDouble) and used std::isnan => it does work work - branchDouble(DoubleEqual ==> does not make sense
Romain Perier
Comment 25
2013-12-02 23:32:48 PST
branchDouble with DoubleNotEqualOrUnordered works ;)
Romain Perier
Comment 26
2013-12-07 10:50:37 PST
Created
attachment 218659
[details]
Patch
Build Bot
Comment 27
2013-12-07 11:32:22 PST
Comment on
attachment 218659
[details]
Patch
Attachment 218659
[details]
did not pass win-ews (win): Output:
http://webkit-queues.appspot.com/results/45948151
Romain Perier
Comment 28
2013-12-08 03:13:53 PST
Created
attachment 218679
[details]
performance gain See performance gain above
Romain Perier
Comment 29
2013-12-09 02:33:03 PST
Created
attachment 218741
[details]
Patch
Build Bot
Comment 30
2013-12-09 03:13:37 PST
Comment on
attachment 218741
[details]
Patch
Attachment 218741
[details]
did not pass win-ews (win): Output:
http://webkit-queues.appspot.com/results/47258045
Benjamin Poulain
Comment 31
2013-12-10 16:25:52 PST
Comment on
attachment 218741
[details]
Patch Please add a ARMv7 wrapper as well. The "jit.sqrtDouble" generator should be behind jit.supportsFloatingPointSqrt. And Windows does not build.
Romain Perier
Comment 32
2013-12-16 08:17:10 PST
Created
attachment 219320
[details]
Patch
Romain Perier
Comment 33
2013-12-16 08:18:35 PST
Could you check and test this patch on ARM ? Unfortunately I have not ARM devices. Thanks
Romain Perier
Comment 34
2013-12-17 23:54:52 PST
Ping
Benjamin Poulain
Comment 35
2013-12-20 14:05:25 PST
I just tried on ARM device. I get 2.5% better timing on Kraken.
Filip Pizlo
Comment 36
2013-12-20 14:35:35 PST
Comment on
attachment 219320
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=219320&action=review
> Source/JavaScriptCore/ChangeLog:12 > + calls significantly and improves performance between 15% and 185%, > + also, it improves the kraken benchmark (5%).
Can you be more specific about which benchmarks are so affected? If any of those benchmarks are available under open source terms, can you put them in LayoutTests/js/regress?
> Source/JavaScriptCore/jit/ThunkGenerators.cpp:669 > +#define defineBinaryDoubleOpWrapper(function) \ > + asm( \ > + ".text\n" \ > + ".globl " SYMBOL_STRING(function##Thunk) "\n" \ > + HIDE_SYMBOL(function##Thunk) "\n" \ > + SYMBOL_STRING(function##Thunk) ":" "\n" \ > + "subl $8, %esp\n" \ > + "movsd %xmm0, (%esp) \n" \ > + "subl $8, %esp\n" \ > + "movsd %xmm1, (%esp) \n" \ > + "call " GLOBAL_REFERENCE(function) "\n" \ > + "fstpl (%esp) \n" \ > + "addl $8, %esp\n" \ > + "movsd (%esp), %xmm0 \n" \ > + "addl $8, %esp\n" \ > + "ret\n" \ > + ); \ > + extern "C" { \ > + MathThunkCallingConvention function##Thunk(MathThunkCallingConvention); \ > + } \ > + static MathThunk UnaryDoubleOpWrapper(function) = &function##Thunk;
Can you explain this change?
> Source/JavaScriptCore/jit/ThunkGenerators.cpp:719 > +#define defineBinaryDoubleOpWrapper(function) \ > + asm( \ > + ".text\n" \ > + ".align 2\n" \ > + ".globl " SYMBOL_STRING(function##Thunk) "\n" \ > + HIDE_SYMBOL(function##Thunk) "\n" \ > + ".thumb\n" \ > + ".thumb_func " THUMB_FUNC_PARAM(function##Thunk) "\n" \ > + SYMBOL_STRING(function##Thunk) ":" "\n" \ > + "push {lr}\n" \ > + "vmov r0, r1, d0\n" \ > + "vmov r2, r3, d1\n" \ > + "blx " GLOBAL_REFERENCE(function) "\n" \ > + "vmov d0, r0, r1\n" \ > + "pop {lr}\n" \ > + "bx lr\n" \ > + ); \ > + extern "C" { \ > + MathThunkCallingConvention function##Thunk(MathThunkCallingConvention); \ > + } \ > + static MathThunk UnaryDoubleOpWrapper(function) = &function##Thunk; > + > +#elif CPU(ARM64) > +#define BinaryDoubleOpWrapper(function) UnaryDoubleOpWrapper(function) > +#define defineBinaryDoubleOpWrapper(function) defineUnaryDoubleOpWrapper(function)
And this?
Romain Perier
Comment 37
2013-12-26 23:37:59 PST
Created
attachment 220049
[details]
Patch
Romain Perier
Comment 38
2013-12-26 23:44:12 PST
Created
attachment 220051
[details]
Patch
Romain Perier
Comment 39
2013-12-26 23:52:06 PST
@Benjamin: thanks for your tests @Filip: For PerformanceTests see the attached file "performance gain" (the unit test is at the top of the file) As I said in the changelog, this patch improves the global score of kraken. The main difference is for imaging tests. I am not sure that it does make sense to import Kraken in LayoutTests/js/regress. The kraken benchmark is a big project... (however I did not find informations about the licenses terms)
Michael Saboff
Comment 40
2014-01-15 15:09:13 PST
Comment on
attachment 220051
[details]
Patch I looked at this specifically for ARM64 correctness. Given that the arguments are already in V registers, this change looks good to me. The 32 bit wrappers look good as well.
Benjamin Poulain
Comment 41
2014-01-27 19:42:01 PST
Comment on
attachment 220051
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=220051&action=review
> Source/JavaScriptCore/jit/ThunkGenerators.cpp:915 > + jit.neg32(SpecializedThunkJIT::regT0); > + jit.divDouble(SpecializedThunkJIT::fpRegT0, SpecializedThunkJIT::fpRegT1); > + jit.moveDouble(SpecializedThunkJIT::fpRegT1, SpecializedThunkJIT::fpRegT0); > + jit.loadDouble(&oneConstant, SpecializedThunkJIT::fpRegT1); > + exponentIsPositive.link(&jit);
I have the feeling jit.neg32(SpecializedThunkJIT::regT0); does not work for INT_MIN (since -INT_MIN > INT_MAX). Am I missing something here?
> Source/JavaScriptCore/jit/ThunkGenerators.cpp:947 > + nonIntExponent.link(&jit); > + jit.loadDoubleArgument(1, SpecializedThunkJIT::fpRegT1, SpecializedThunkJIT::regT0); > + jit.loadDouble(&negativeHalfConstant, SpecializedThunkJIT::fpRegT2); > + MacroAssembler::Jump exponentIsInverseSqrt = jit.branchDouble(MacroAssembler::DoubleEqual, SpecializedThunkJIT::fpRegT1, SpecializedThunkJIT::fpRegT2); > > - SpecializedThunkJIT::JumpList doubleResult; > - jit.branchConvertDoubleToInt32(SpecializedThunkJIT::fpRegT1, SpecializedThunkJIT::regT0, doubleResult, SpecializedThunkJIT::fpRegT0); > - jit.returnInt32(SpecializedThunkJIT::regT0); > - doubleResult.link(&jit); > - jit.returnDouble(SpecializedThunkJIT::fpRegT1); > - } else > - jit.appendFailure(nonIntExponent); > + jit.callDoubleToDoublePreservingReturn(BinaryDoubleOpWrapper(jsPow)); > + jit.returnDouble(SpecializedThunkJIT::fpRegT0); > > + exponentIsInverseSqrt.link(&jit); > + jit.loadDouble(&oneConstant, SpecializedThunkJIT::fpRegT1); > + jit.sqrtDouble(SpecializedThunkJIT::fpRegT0, SpecializedThunkJIT::fpRegT0); > + jit.divDouble(SpecializedThunkJIT::fpRegT0, SpecializedThunkJIT::fpRegT1); > + jit.returnDouble(SpecializedThunkJIT::fpRegT1);
The branch using jit.sqrtDouble should be behind test for jit.supportsFloatingPointSqrt().
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug