WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
163548
Air::IRC doesn't need to have a special-case for uncolored Tmps since they all get colored
https://bugs.webkit.org/show_bug.cgi?id=163548
Summary
Air::IRC doesn't need to have a special-case for uncolored Tmps since they al...
Filip Pizlo
Reported
2016-10-17 09:55:34 PDT
IRC used to assume that no-degree Tmps can get any color. That's not true. No degree does not really mean no interference because of coalescable Moves.
Bug 163509
was about that. Now we just need to remove the other side of this special case: IRC still has a special case for uncolored Tmps. We don't need that anymore - a truly unused Tmp will get handled naturally by the algorithm. If we ever find that large numbers of unused Tmps are putting a stress on IRC and making it run longer, then we could just add a Tmp pruning pass. Also, it turns out that this special case is hiding other bugs. I'm investigating those now.
Attachments
the patch
(24.90 KB, patch)
2016-10-17 11:42 PDT
,
Filip Pizlo
no flags
Details
Formatted Diff
Diff
the patch
(25.55 KB, patch)
2016-10-17 11:51 PDT
,
Filip Pizlo
no flags
Details
Formatted Diff
Diff
the patch
(25.71 KB, patch)
2016-10-17 11:58 PDT
,
Filip Pizlo
ggaren
: review+
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Filip Pizlo
Comment 1
2016-10-17 09:58:32 PDT
Here's a horrible bug. Say you have a Patchpoint that late-clobbers all regs followed by a Patchpoint that wants a scratch. If you do this, the scratch is uncolorable. We have this bug right now because we don't have callee-save FPRs on x86. So if you have a CCall or JS call Patchpoint followed by a Patchpoint that wants FP scratches, then the FP scratch will not get colored and it will be left as %xmm0 at the end of IRC. That may be wrong, if the return value of the CCall or Patchpoint just before was %xmm0, for example.
Filip Pizlo
Comment 2
2016-10-17 10:17:03 PDT
This bug exists because our strategy to reason about interference at instruction boundaries doesn't capture the essence of what scratch registers are all about. For any other kind of use, we know that the tmp holding the use will have to have been live across the previous instruction, so it will definitely interfere with the previous instruction's late defs and late clobbers. So, we can safely conflate the execution of the previous instruction's late actions with the execution of this instruction's early actions. Reasoning about interference only at instruction boundaries is also imprecise for register clobber sets - but only just. Imagine a pair of consecutive insts where the first has a dead def and the second has an early clobber. In fact, the dead def could use the same register as the early clobber, but we won't let that happen. Perhaps a better example is an inst with a late clobber followed by an inst with an early def. Clearly, the early def could use the same register as the previous instruction's late clobber, but we won't let it. That's actually a serious bug. It would mean that if you early define a floating point value and the thing before you is a CCall, you'll have a really bad time. Fortunately, the only thing we have to do to ensure that things don't go wrong is to insert Nops to pad between instructions that have incompatible constraints. Here are some Nop insertion rules that we could apply: - Insert Nop before any instruction that claims early reg clobber and after any instruction that claims late reg clobber. I think that this would fix all known cases of this bug. This still means that the scratch registers of one inst could affect the colorability of early defs of the next inst, which is pretty weird. - Insert Nop around scratch users. This wouldn't be enough, since it wouldn't fix the early def problem. Maybe one way to look at this is that Scratch and EarlyDef are the problem children. Let's consider if this is really true by looking at the four main forms of arg kinds: early versus late crossed with use versus def. Early Def: We know that this case is problematic, because it's exactly what happens in my current crash. It happens to be an early def triggered by Scratch. Early Use: Uses cause liveness. Either the thing that is being used is defined by the previous instruction or it's live before the previous instruction - so either way, it interferes with all of the previous instruction's clobbers and defs. No problem here. Late Def: This case is also problematic. You could have a late def followed by something that early-clobbers all registers. In that case we'd want the def to use one of those registers and then insert spill code, but we can't do that if the def cannot get any regs at all. Late Use: This could be a killing use. Imagine a late use followed by something that early-clobbers all registers. Right now we don't have any clients that early-clobber all registers. But they may early-clobber some registers, and maybe this is enough to worry about. Note that we can generalize this a bit further. A clobber is like a dead def. So the patterns of interference we have to worry about are: Late Def or Late Clobber followed by Early Def or Early Clobber Late Use followed by Early Def or Early Clobber. Maybe what we should do is have a pre-pass for register allocators that inserts Nops to pad those bad cases.
Radar WebKit Bug Importer
Comment 3
2016-10-17 11:37:11 PDT
<
rdar://problem/28804381
>
Filip Pizlo
Comment 4
2016-10-17 11:42:15 PDT
Created
attachment 291848
[details]
the patch
WebKit Commit Bot
Comment 5
2016-10-17 11:44:41 PDT
Attachment 291848
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/ChangeLog:15: Please consider whether the use of security-sensitive phrasing could help someone exploit WebKit: fuzzing [changelog/unwantedsecurityterms] [3] Total errors found: 1 in 15 files If any of these errors are false positives, please file a bug against check-webkit-style.
Filip Pizlo
Comment 6
2016-10-17 11:51:17 PDT
Created
attachment 291850
[details]
the patch
WebKit Commit Bot
Comment 7
2016-10-17 11:54:12 PDT
Attachment 291850
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/ChangeLog:16: Please consider whether the use of security-sensitive phrasing could help someone exploit WebKit: fuzzing [changelog/unwantedsecurityterms] [3] Total errors found: 1 in 15 files If any of these errors are false positives, please file a bug against check-webkit-style.
Filip Pizlo
Comment 8
2016-10-17 11:58:06 PDT
Created
attachment 291852
[details]
the patch
WebKit Commit Bot
Comment 9
2016-10-17 11:59:27 PDT
Attachment 291852
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/ChangeLog:16: Please consider whether the use of security-sensitive phrasing could help someone exploit WebKit: fuzzing [changelog/unwantedsecurityterms] [3] Total errors found: 1 in 15 files If any of these errors are false positives, please file a bug against check-webkit-style.
Filip Pizlo
Comment 10
2016-10-17 12:26:24 PDT
Perf is neutral: Benchmark report for SunSpider, LongSpider, V8Spider, Octane, Kraken, and AsmBench on murderface (MacBookPro11,5). VMs tested: "TipOfTree" at /Volumes/Data/tertiary/OpenSource/WebKitBuild/Release/jsc (
r207408
) "Things" at /Volumes/Data/secondary/OpenSource/WebKitBuild/Release/jsc (
r207408
) Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree Things SunSpider: 3d-cube 4.5635+-0.2077 ! 4.8669+-0.0921 ! definitely 1.0665x slower 3d-morph 4.6061+-0.0371 ? 4.7535+-0.2435 ? might be 1.0320x slower 3d-raytrace 4.7595+-0.1825 ? 4.8223+-0.1327 ? might be 1.0132x slower access-binary-trees 1.9661+-0.0268 ? 2.0055+-0.0611 ? might be 1.0201x slower access-fannkuch 4.7254+-0.1757 ? 4.7303+-0.1320 ? access-nbody 2.3392+-0.0273 ? 2.3603+-0.0908 ? access-nsieve 3.2064+-0.0675 3.1633+-0.0407 might be 1.0136x faster bitops-3bit-bits-in-byte 1.1117+-0.0726 ? 1.1221+-0.0309 ? bitops-bits-in-byte 2.7420+-0.0654 2.7278+-0.0856 bitops-bitwise-and 1.9352+-0.0342 ? 1.9899+-0.1687 ? might be 1.0283x slower bitops-nsieve-bits 2.9925+-0.0412 ? 3.0083+-0.0516 ? controlflow-recursive 2.2332+-0.0307 ? 2.2554+-0.0644 ? crypto-aes 4.2394+-0.0621 ? 4.3119+-0.1421 ? might be 1.0171x slower crypto-md5 2.6076+-0.0831 ? 2.6100+-0.0558 ? crypto-sha1 2.7259+-0.1662 2.6991+-0.0678 date-format-tofte 6.6424+-0.1466 ? 6.6534+-0.1654 ? date-format-xparb 4.3614+-0.0681 ? 4.4875+-0.2865 ? might be 1.0289x slower math-cordic 2.6462+-0.0672 2.6402+-0.0526 math-partial-sums 3.8569+-0.0742 3.8362+-0.1170 math-spectral-norm 1.9379+-0.0141 ? 1.9799+-0.0537 ? might be 1.0217x slower regexp-dna 6.0219+-0.1280 ? 6.0466+-0.2232 ? string-base64 4.5788+-0.2945 4.5485+-0.1798 string-fasta 5.3533+-0.2661 5.3433+-0.0949 string-tagcloud 8.2626+-0.2509 8.2262+-0.5519 string-unpack-code 18.8714+-0.9339 18.3337+-0.6285 might be 1.0293x faster string-validate-input 4.2263+-0.1359 4.2087+-0.0681 <arithmetic> 4.3659+-0.0488 ? 4.3743+-0.0422 ? might be 1.0019x slower TipOfTree Things LongSpider: 3d-cube 766.3006+-7.8896 ? 774.9436+-10.7597 ? might be 1.0113x slower 3d-morph 564.8380+-5.7219 560.7331+-2.7138 3d-raytrace 451.1555+-6.5341 ? 452.2502+-4.7283 ? access-binary-trees 785.1498+-3.1568 777.8087+-7.5539 access-fannkuch 236.9313+-15.8096 228.9799+-4.0861 might be 1.0347x faster access-nbody 503.0647+-5.3581 502.2517+-4.3050 access-nsieve 283.2287+-12.0946 281.6157+-7.3657 bitops-3bit-bits-in-byte 31.6905+-1.1717 31.3657+-0.7230 might be 1.0104x faster bitops-bits-in-byte 81.0870+-1.7034 ? 82.3441+-2.7083 ? might be 1.0155x slower bitops-nsieve-bits 368.1612+-5.3194 ? 371.5583+-10.6290 ? controlflow-recursive 431.3577+-4.9020 426.2359+-2.2779 might be 1.0120x faster crypto-aes 536.8144+-7.0910 530.1278+-9.4802 might be 1.0126x faster crypto-md5 454.6825+-11.5408 447.8979+-9.3365 might be 1.0151x faster crypto-sha1 593.3735+-8.6011 588.5035+-7.0299 date-format-tofte 329.7884+-8.0984 325.7576+-3.4504 might be 1.0124x faster date-format-xparb 597.3298+-3.5808 ? 597.3829+-3.6885 ? hash-map 135.3798+-1.8664 134.5875+-0.5538 math-cordic 406.5844+-3.0753 406.4504+-1.8212 math-partial-sums 280.3720+-1.1693 ? 285.6557+-7.9468 ? might be 1.0188x slower math-spectral-norm 513.7357+-1.8481 ? 514.9935+-3.5529 ? string-base64 465.3550+-2.4841 ? 470.2137+-4.4853 ? might be 1.0104x slower string-fasta 323.7067+-4.7078 ? 329.2609+-7.7219 ? might be 1.0172x slower string-tagcloud 161.2982+-2.1852 ? 162.5371+-0.8447 ? <geometric> 334.5645+-1.6294 334.0034+-0.7251 might be 1.0017x faster TipOfTree Things V8Spider: crypto 33.8389+-0.2162 ? 33.9471+-0.4037 ? deltablue 48.0852+-1.1778 47.8535+-1.7035 earley-boyer 34.2065+-0.4593 ? 34.5249+-0.4456 ? raytrace 18.6578+-0.3119 18.4515+-0.1656 might be 1.0112x faster regexp 49.6619+-0.5939 49.5150+-0.2479 richards 38.9279+-1.2655 37.6948+-0.7162 might be 1.0327x faster splay 28.6549+-0.9101 ? 29.2582+-0.8115 ? might be 1.0211x slower <geometric> 34.4399+-0.3001 34.3528+-0.2323 might be 1.0025x faster TipOfTree Things Octane: encrypt 0.14809+-0.00079 ? 0.14842+-0.00083 ? decrypt 2.54850+-0.03149 2.54710+-0.00805 deltablue x2 0.11653+-0.00237 ? 0.11678+-0.00111 ? earley 0.23928+-0.00170 0.23807+-0.00154 boyer 4.26517+-0.12756 4.12466+-0.14025 might be 1.0341x faster navier-stokes x2 4.61390+-0.02118 ? 4.62915+-0.03708 ? raytrace x2 0.66140+-0.00761 0.65207+-0.00489 might be 1.0143x faster richards x2 0.07742+-0.00120 0.07704+-0.00025 splay x2 0.31110+-0.00157 0.30727+-0.00233 might be 1.0124x faster regexp x2 16.47500+-0.50938 16.23519+-0.56249 might be 1.0148x faster pdfjs x2 38.62938+-0.52428 38.53210+-0.49054 mandreel x2 39.78858+-0.31988 ? 39.98960+-0.49391 ? gbemu x2 28.49413+-0.34866 28.48138+-0.21021 closure 0.47102+-0.00322 ? 0.47189+-0.00435 ? jquery 6.42113+-0.03222 ? 6.46318+-0.04181 ? box2d x2 8.63700+-0.12467 ? 8.65220+-0.07580 ? zlib x2 336.08184+-12.29731 ? 337.94990+-4.50929 ? typescript x2 600.42969+-13.20123 ? 602.13529+-14.98861 ? <geometric> 4.68792+-0.02586 4.67470+-0.02102 might be 1.0028x faster TipOfTree Things Kraken: ai-astar 88.491+-1.811 ? 89.722+-3.738 ? might be 1.0139x slower audio-beat-detection 35.869+-1.533 35.285+-0.129 might be 1.0165x faster audio-dft 96.468+-3.494 95.269+-2.450 might be 1.0126x faster audio-fft 27.391+-0.069 ? 29.140+-4.349 ? might be 1.0639x slower audio-oscillator 42.631+-0.225 ? 42.715+-0.244 ? imaging-darkroom 55.945+-0.700 55.686+-0.531 imaging-desaturate 40.610+-0.153 ? 42.115+-2.123 ? might be 1.0370x slower imaging-gaussian-blur 59.987+-3.584 58.609+-2.727 might be 1.0235x faster json-parse-financial 32.195+-1.510 31.944+-1.557 json-stringify-tinderbox 21.179+-1.261 ? 22.691+-1.995 ? might be 1.0714x slower stanford-crypto-aes 34.114+-0.352 34.014+-0.209 stanford-crypto-ccm 32.386+-2.168 32.163+-0.476 stanford-crypto-pbkdf2 88.909+-0.829 ? 91.229+-1.995 ? might be 1.0261x slower stanford-crypto-sha256-iterative 29.280+-0.517 ? 30.207+-2.015 ? might be 1.0317x slower <arithmetic> 48.961+-0.559 ? 49.342+-0.677 ? might be 1.0078x slower TipOfTree Things AsmBench: bigfib.cpp 409.4784+-7.7076 ? 409.5128+-5.0230 ? cray.c 354.0447+-5.1482 ? 354.8725+-3.2708 ? dry.c 386.9092+-4.0625 ? 390.0046+-2.7128 ? FloatMM.c 633.1066+-33.3193 619.0405+-30.2663 might be 1.0227x faster gcc-loops.cpp 3342.0907+-22.3980 ? 3353.3735+-14.7225 ? n-body.c 747.9452+-2.9462 ? 750.7775+-3.9302 ? Quicksort.c 367.4604+-6.4377 ? 371.8779+-10.8246 ? might be 1.0120x slower stepanov_container.cpp 3161.1853+-19.0877 ? 3163.3820+-22.5073 ? Towers.c 230.9992+-3.1366 ? 232.0922+-4.7698 ? <geometric> 660.0768+-4.5254 ? 660.9889+-4.0355 ? might be 1.0014x slower TipOfTree Things Geomean of preferred means: <scaled-result> 44.3607+-0.2115 ? 44.3904+-0.1871 ? might be 1.0007x slower
Geoffrey Garen
Comment 11
2016-10-17 14:22:15 PDT
Comment on
attachment 291852
[details]
the patch r=me
Filip Pizlo
Comment 12
2016-10-17 14:49:13 PDT
Landed in
https://trac.webkit.org/changeset/207434
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug