WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
155594
[JSC] Make CSE's ImpureData faster when dealing with large blocks
https://bugs.webkit.org/show_bug.cgi?id=155594
Summary
[JSC] Make CSE's ImpureData faster when dealing with large blocks
Benjamin Poulain
Reported
2016-03-17 12:48:16 PDT
[JSC] Make CSE's ImpureData faster when dealing with large blocks
Attachments
Patch
(16.70 KB, patch)
2016-03-17 16:14 PDT
,
Benjamin Poulain
no flags
Details
Formatted Diff
Diff
Patch
(16.68 KB, patch)
2016-03-17 16:21 PDT
,
Benjamin Poulain
no flags
Details
Formatted Diff
Diff
Patch
(18.63 KB, patch)
2016-03-17 18:43 PDT
,
Benjamin Poulain
no flags
Details
Formatted Diff
Diff
Patch
(20.00 KB, patch)
2016-03-17 19:22 PDT
,
Benjamin Poulain
no flags
Details
Formatted Diff
Diff
Patch
(20.01 KB, patch)
2016-03-17 19:23 PDT
,
Benjamin Poulain
no flags
Details
Formatted Diff
Diff
Show Obsolete
(4)
View All
Add attachment
proposed patch, testcase, etc.
Benjamin Poulain
Comment 1
2016-03-17 16:14:30 PDT
Created
attachment 274337
[details]
Patch
Benjamin Poulain
Comment 2
2016-03-17 16:21:36 PDT
Created
attachment 274338
[details]
Patch
Benjamin Poulain
Comment 3
2016-03-17 17:15:33 PDT
Worth about 1% on Sunspider on my Haswell MBP: Conf#1 Conf#2 SunSpider: 3d-cube 4.7620+-0.1528 ? 5.0014+-0.2815 ? might be 1.0503x slower 3d-morph 5.6442+-0.6014 5.1633+-0.0740 might be 1.0931x faster 3d-raytrace 5.6119+-0.0455 ? 5.6422+-0.1120 ? access-binary-trees 2.1840+-0.0982 ? 2.2405+-0.4026 ? might be 1.0259x slower access-fannkuch 5.9225+-0.6964 ? 6.0875+-0.0801 ? might be 1.0279x slower access-nbody 2.6172+-0.0474 2.5616+-0.1011 might be 1.0217x faster access-nsieve 3.0339+-0.0196 ? 3.0972+-0.0746 ? might be 1.0209x slower bitops-3bit-bits-in-byte 1.1820+-0.0544 1.1519+-0.0194 might be 1.0261x faster bitops-bits-in-byte 2.8886+-0.3271 2.7882+-0.0511 might be 1.0360x faster bitops-bitwise-and 2.0168+-0.0254 ? 2.0743+-0.1251 ? might be 1.0285x slower bitops-nsieve-bits 3.2024+-0.1270 3.1933+-0.0690 controlflow-recursive 2.3562+-0.0547 2.3550+-0.0283 crypto-aes 4.2165+-0.4866 4.0472+-0.0830 might be 1.0418x faster crypto-md5 2.5848+-0.0138 ^ 2.5099+-0.0507 ^ definitely 1.0298x faster crypto-sha1 2.3505+-0.0463 ? 2.4702+-0.3803 ? might be 1.0509x slower date-format-tofte 7.0665+-0.1434 6.7672+-0.2004 might be 1.0442x faster date-format-xparb 5.0988+-0.4837 4.7952+-0.0534 might be 1.0633x faster math-cordic 2.8953+-0.0496 ? 2.9280+-0.1650 ? might be 1.0113x slower math-partial-sums 4.9780+-0.3295 ? 5.0692+-0.5964 ? might be 1.0183x slower math-spectral-norm 2.1288+-0.3209 2.0122+-0.0190 might be 1.0580x faster regexp-dna 6.2723+-1.3781 6.0810+-0.7191 might be 1.0315x faster string-base64 4.5028+-0.1108 ? 4.7388+-0.9072 ? might be 1.0524x slower string-fasta 6.1963+-0.3864 5.9904+-0.1745 might be 1.0344x faster string-tagcloud 8.5071+-1.0164 8.0998+-0.1173 might be 1.0503x faster string-unpack-code 18.6016+-0.8419 ? 18.9164+-0.9788 ? might be 1.0169x slower string-validate-input 4.2703+-0.2170 4.2188+-0.1649 might be 1.0122x faster <arithmetic> 4.6574+-0.1135 4.6154+-0.1004 might be 1.0091x faster Conf#1 Conf#2 Octane: encrypt 0.15853+-0.00595 ? 0.15979+-0.00351 ? decrypt 2.89425+-0.07303 2.83491+-0.02284 might be 1.0209x faster deltablue x2 0.13968+-0.00411 0.13907+-0.00364 earley 0.28603+-0.00424 0.28510+-0.00099 boyer 5.00685+-0.03148 4.90460+-0.29752 might be 1.0208x faster navier-stokes x2 4.93707+-0.03806 ? 4.94465+-0.00570 ? raytrace x2 0.89199+-0.00800 0.89028+-0.01185 richards x2 0.08227+-0.00105 0.08128+-0.00171 might be 1.0121x faster splay x2 0.34863+-0.00383 0.34622+-0.00583 regexp x2 19.92103+-0.05291 19.89375+-0.63229 pdfjs x2 39.68920+-1.64384 39.39607+-1.54674 mandreel x2 42.47091+-0.56361 42.23536+-0.30597 gbemu x2 24.69891+-0.19513 24.56994+-0.18549 closure 0.56687+-0.00601 ? 0.57261+-0.01139 ? might be 1.0101x slower jquery 7.45649+-0.12023 7.42992+-0.12502 box2d x2 9.31075+-0.09297 9.29540+-0.02329 zlib x2 377.59475+-29.33945 363.35997+-5.49547 might be 1.0392x faster typescript x2 665.67371+-11.03509 643.76984+-11.76872 might be 1.0340x faster <geometric> 5.21253+-0.03424 5.16709+-0.03924 might be 1.0088x faster Conf#1 Conf#2 Kraken: ai-astar 88.362+-1.991 87.599+-0.827 audio-beat-detection 45.229+-0.313 45.172+-0.230 audio-dft 100.450+-6.442 97.193+-2.396 might be 1.0335x faster audio-fft 35.810+-0.112 35.743+-0.116 audio-oscillator 47.443+-0.509 ? 47.564+-0.390 ? imaging-darkroom 60.868+-2.461 60.726+-1.790 imaging-desaturate 44.586+-0.788 ? 45.959+-5.253 ? might be 1.0308x slower imaging-gaussian-blur 60.520+-7.612 60.275+-7.601 json-parse-financial 36.993+-0.137 36.739+-0.225 json-stringify-tinderbox 23.397+-1.573 23.249+-2.266 stanford-crypto-aes 40.144+-0.079 ? 40.335+-0.540 ? stanford-crypto-ccm 35.667+-2.626 35.213+-1.648 might be 1.0129x faster stanford-crypto-pbkdf2 101.108+-0.954 ? 101.879+-1.316 ? stanford-crypto-sha256-iterative 38.750+-0.506 ? 39.067+-1.336 ? <arithmetic> 54.238+-1.117 54.051+-0.758 might be 1.0035x faster Conf#1 Conf#2 AsmBench: bigfib.cpp 437.6711+-19.6185 ? 440.0609+-12.9610 ? cray.c 358.5124+-5.3609 ? 361.3328+-9.4286 ? dry.c 417.5919+-6.3117 ? 486.2893+-128.8717 ? might be 1.1645x slower FloatMM.c 715.5072+-8.4764 712.2838+-10.9938 gcc-loops.cpp 3666.7940+-24.0637 ? 3694.4663+-37.0843 ? n-body.c 807.5378+-5.3084 ? 815.3618+-11.8043 ? Quicksort.c 390.3484+-4.3191 ? 390.7597+-1.3254 ? stepanov_container.cpp 3314.4139+-54.8842 3257.5195+-19.6485 might be 1.0175x faster Towers.c 267.8196+-0.9613 ? 269.1882+-2.7965 ? <geometric> 713.7646+-4.6354 ? 726.4106+-21.4725 ? might be 1.0177x slower Conf#1 Conf#2 Geomean of preferred means: <scaled-result> 31.1351+-0.3067 31.1058+-0.3086 might be 1.0009x faster
Filip Pizlo
Comment 4
2016-03-17 17:28:11 PDT
Comment on
attachment 274338
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=274338&action=review
I suspect that the SideState map is not needed.
> Source/JavaScriptCore/ChangeLog:36 > + -SideState
Which heap locations list SideState or World as their heap?
Benjamin Poulain
Comment 5
2016-03-17 17:33:51 PDT
(In reply to
comment #4
)
> Comment on
attachment 274338
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=274338&action=review
> > I suspect that the SideState map is not needed. > > > Source/JavaScriptCore/ChangeLog:36 > > + -SideState > > Which heap locations list SideState or World as their heap?
I don't think any. It is just to be complete if someone where to do that. Since the maps are empty on clobber(), that operation does not do much.
Filip Pizlo
Comment 6
2016-03-17 17:44:13 PDT
(In reply to
comment #5
)
> (In reply to
comment #4
) > > Comment on
attachment 274338
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=274338&action=review
> > > > I suspect that the SideState map is not needed. > > > > > Source/JavaScriptCore/ChangeLog:36 > > > + -SideState > > > > Which heap locations list SideState or World as their heap? > > I don't think any. It is just to be complete if someone where to do that. > Since the maps are empty on clobber(), that operation does not do much.
I believe that it would be wrong to use SideState as the heap of a HeapLocation. It would also be wrong to use World or Stack directly. The reason is that a HeapLocation describes some place in the heap that you've already precisely resolved - i.e. it's a global variable and you know which one, or you know it's a field and you know the field's name and the base object Node, or you know it's an array element and you know the kind of array, the base object Node, and the index Node. If you have resolved something so precisely, then you have no need to use a super abstract catch-all heap like World. Also, SideState is write-only, so you'd never want to track it for CSE. It's for writes to things we will read on OSR for example. It would be wrong to ever read(SideState). It would be wrong to def using an abstract heap that is never read. Therefore, SideState is incorrect for HeapLocation. I think it would be best to actually add a validation rule that rules clobberize() and checks that def() never creates a HeapLocation that refers directly to: - SideState - World - Heap I think it's OK to have a HeapLocation that refers to Stack, because you might be indexing into the stack. But if you have a HeapLocation that wants to use World or Heap, then you really should have been more specific.
Benjamin Poulain
Comment 7
2016-03-17 17:47:07 PDT
Comment on
attachment 274338
[details]
Patch I'll make the patch stricter. That's probably be faster too.
Benjamin Poulain
Comment 8
2016-03-17 18:43:30 PDT
Created
attachment 274353
[details]
Patch
Filip Pizlo
Comment 9
2016-03-17 18:52:01 PDT
Comment on
attachment 274353
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=274353&action=review
> Source/JavaScriptCore/dfg/DFGCSEPhase.cpp:194 > + auto addResult = m_abstractHeapStackMap.add(abstractHeap, nullptr);
Don't you want this to just be an integer-indexed HashMap? Once you know that you have Stack, the only thing that matters is the payload, which is an int64_t. But really, it will contain a VirtualRegister, which is just an int32. That means you can use INT_MIN/INT_MAX for the empty/deleted values, for example.
Benjamin Poulain
Comment 10
2016-03-17 18:57:16 PDT
(In reply to
comment #9
)
> Comment on
attachment 274353
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=274353&action=review
> > > Source/JavaScriptCore/dfg/DFGCSEPhase.cpp:194 > > + auto addResult = m_abstractHeapStackMap.add(abstractHeap, nullptr); > > Don't you want this to just be an integer-indexed HashMap? Once you know > that you have Stack, the only thing that matters is the payload, which is an > int64_t. But really, it will contain a VirtualRegister, which is just an > int32. That means you can use INT_MIN/INT_MAX for the empty/deleted values, > for example.
That's a good idea, I did not realize the payload was int32.
Benjamin Poulain
Comment 11
2016-03-17 19:22:44 PDT
Created
attachment 274357
[details]
Patch
Benjamin Poulain
Comment 12
2016-03-17 19:23:38 PDT
Created
attachment 274358
[details]
Patch
WebKit Commit Bot
Comment 13
2016-03-17 21:07:05 PDT
Comment on
attachment 274358
[details]
Patch Clearing flags on attachment: 274358 Committed
r198376
: <
http://trac.webkit.org/changeset/198376
>
WebKit Commit Bot
Comment 14
2016-03-17 21:07:09 PDT
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug