Summary: | Make the old JIT not rely on the timeout check register | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Gabor Ballabas <gaborb> | ||||||||
Component: | JavaScriptCore | Assignee: | Nobody <webkit-unassigned> | ||||||||
Status: | NEW --- | ||||||||||
Severity: | Normal | CC: | barraclough, fpizlo, gaborb, gergely, ggaren, kilvadyb, loki, mark.lam, oliver, ossy, yong.li.webkit, zherczeg | ||||||||
Priority: | P2 | ||||||||||
Version: | 528+ (Nightly build) | ||||||||||
Hardware: | Unspecified | ||||||||||
OS: | Unspecified | ||||||||||
Bug Depends on: | |||||||||||
Bug Blocks: | 79040 | ||||||||||
Attachments: |
|
Description
Gabor Ballabas
2012-09-17 06:32:18 PDT
Created attachment 164384 [details]
Patch
MIPS and SH4 versions of the emitTimeoutCheck function has been left as they were (with added
ifdef guards) due to lack of any kind of device or buildbot for testing the changes.
Add Balázs and myself to the CC list to track for MIPS support. Created attachment 166020 [details]
Updated patch
Ping for review? (In reply to comment #3) > Created an attachment (id=166020) [details] > Updated patch Unfortunately it isn't applyable now. Could you rebase the patch to ToT, please? Created attachment 173616 [details]
Rebased patch
Comment on attachment 173616 [details] Rebased patch View in context: https://bugs.webkit.org/attachment.cgi?id=173616&action=review > Source/JavaScriptCore/ChangeLog:9 > + First step to solve this DFG-JIT related > + bug:https://bugs.webkit.org/show_bug.cgi?id=79040 Perhaps a little more explanation would be good. Otherwise, the patch looks good. Oliver, Filip, anybody any thoughts? How does this relate to bug 79040? What is the plan you're pursuing for restoring timeout checking? Is this just a subset of the patch in bug 79040? We can't evaluate that patch until you post performance numbers for it. (In reply to comment #8) > How does this relate to bug 79040? What is the plan you're pursuing for restoring timeout checking? We discussed this topic with Filip Pizlo before. We should either support timeout checking everywhere or remove the feature completly (for consistency). We decided to keep the feature that time. > Is this just a subset of the patch in bug 79040? We can't evaluate that patch until you post performance numbers for it. Yes, this is the first part. We free the timeout register first so DFG-JIT could use it. Sorry for the missing performance numbers. Gabor, could you copy them here? > Yes, this is the first part.
So your plan is to do timeout checking by polling an in-memory counter at loop boundaries in the interpreter, the baseline JIT, and the DFG?
> So your plan is to do timeout checking by polling an in-memory counter at loop boundaries in the interpreter, the baseline JIT, and the DFG?
Yes. That is a consistent approach. Or perhaps do you have any better idea?
> > So your plan is to do timeout checking by polling an in-memory counter at loop boundaries in the interpreter, the baseline JIT, and the DFG?
>
> Yes. That is a consistent approach. Or perhaps do you have any better idea?
That depends on the performance of polling an in-memory counter. For example, we could use a flag and an alarm to avoid having to read-modify-write the counter. Or we could poll in the interpreter and baseline JIT, where performance matters less, and use some other solution in the DFG.
If there's no performance cost to a read-modify-write counter in all engines, that seems like a fine approach. But I suspect there will be a performance cost. Have you tested this?
Sunspider results: Sunspider: TEST COMPARISON r134387 Patch DETAILS ============================================================================= ** TOTAL **: *1.003x as slow* 205.8ms +/- 0.1% 206.3ms +/- 0.1% significant ============================================================================= 3d: *1.005x as slow* 31.2ms +/- 0.3% 31.3ms +/- 0.4% significant cube: ?? 11.0ms +/- 0.2% 11.1ms +/- 0.7% not conclusive: might be *1.005x as slow* morph: *1.012x as slow* 9.0ms +/- 0.4% 9.2ms +/- 0.8% significant raytrace: - 11.1ms +/- 0.6% 11.1ms +/- 0.6% access: - 18.0ms +/- 0.1% 18.0ms +/- 0.0% binary-trees: - 2.0ms +/- 0.0% 2.0ms +/- 0.0% fannkuch: - 8.0ms +/- 0.2% 8.0ms +/- 0.0% nbody: - 4.0ms +/- 0.0% 4.0ms +/- 0.0% nsieve: - 4.0ms +/- 0.0% 4.0ms +/- 0.0% bitops: - 12.5ms +/- 0.8% 12.4ms +/- 0.8% 3bit-bits-in-byte: - 1.0ms +/- 0.0% 1.0ms +/- 0.0% bits-in-byte: - 5.5ms +/- 1.8% 5.4ms +/- 1.8% bitwise-and: - 2.0ms +/- 0.0% 2.0ms +/- 0.0% nsieve-bits: - 4.0ms +/- 0.0% 4.0ms +/- 0.0% controlflow: - 2.0ms +/- 0.0% 2.0ms +/- 0.0% recursive: - 2.0ms +/- 0.0% 2.0ms +/- 0.0% crypto: 1.002x as fast 17.0ms +/- 0.0% 17.0ms +/- 0.2% significant aes: 1.004x as fast 10.0ms +/- 0.0% 10.0ms +/- 0.4% significant md5: - 4.0ms +/- 0.0% 4.0ms +/- 0.0% sha1: - 3.0ms +/- 0.0% 3.0ms +/- 0.0% date: - 31.0ms +/- 0.1% 31.0ms +/- 0.1% format-tofte: - 16.0ms +/- 0.0% 16.0ms +/- 0.0% format-xparb: ?? 15.0ms +/- 0.3% 15.0ms +/- 0.2% not conclusive: might be *1.001x as slow* math: ?? 18.2ms +/- 0.4% 18.3ms +/- 0.5% not conclusive: might be *1.006x as slow* cordic: - 3.0ms +/- 0.0% 3.0ms +/- 0.0% partial-sums: - 13.0ms +/- 0.2% 13.0ms +/- 0.2% spectral-norm: ?? 2.1ms +/- 3.3% 2.3ms +/- 3.8% not conclusive: might be *1.047x as slow* regexp: - 11.0ms +/- 0.0% 11.0ms +/- 0.0% dna: - 11.0ms +/- 0.0% 11.0ms +/- 0.0% string: *1.005x as slow* 65.0ms +/- 0.1% 65.4ms +/- 0.2% significant base64: - 5.0ms +/- 0.0% 5.0ms +/- 0.0% fasta: - 10.0ms +/- 0.0% 10.0ms +/- 0.0% tagcloud: *1.005x as slow* 16.0ms +/- 0.1% 16.1ms +/- 0.4% significant unpack-code: *1.010x as slow* 26.0ms +/- 0.1% 26.3ms +/- 0.3% significant validate-input: - 8.0ms +/- 0.2% 8.0ms +/- 0.0% V8: TEST COMPARISON r134387 Patch DETAILS ============================================================================= ** TOTAL **: - 762.2ms +/- 0.4% 760.7ms +/- 0.3% ============================================================================= v8: - 762.2ms +/- 0.4% 760.7ms +/- 0.3% crypto: ?? 88.1ms +/- 0.9% 88.4ms +/- 0.7% not conclusive: might be *1.003x as slow* deltablue: *1.007x as slow* 137.3ms +/- 0.4% 138.3ms +/- 0.5% significant earley-boyer: ?? 99.4ms +/- 0.5% 99.8ms +/- 0.7% not conclusive: might be *1.004x as slow* raytrace: - 72.3ms +/- 0.7% 72.0ms +/- 0.7% regexp: 1.015x as fast 109.4ms +/- 0.7% 107.8ms +/- 0.5% significant richards: - 114.6ms +/- 0.7% 113.8ms +/- 0.5% splay: - 141.1ms +/- 0.7% 140.6ms +/- 0.6% (In reply to comment #13) The results were produced on x86_64 Linux. It seems the difference is negligible. Anyway, Geoffrey, if you have a better idea just let us know. The important thing is it should be consistent across JIT and interpreter. The results in bug 79040 show a 1%-2% regression across the board. Whose results are correct? (In reply to comment #16) > The results in bug 79040 show a 1%-2% regression across the board. Whose results are correct? The results here belong to old JIT patch. The results there belong to the DFG JIT patch. Since we are new to DFG JIT, that patch certanly could be improved. Do you have any suggestions how to do that? > Since we are new to DFG JIT, that patch certanly could be improved. Do you have any suggestions how to do that?
I'd start by trying a patch that polls a memory location but doesn't modify the location. If that avoids the regression, we can use an alarm to update the memory location after a certain amount of time, without cost.
For now, I guess I'll mark this patch r-, since it's a step toward a design change that's a performance regression.
As far as I know Gábor won't work on it in the future. Could you confirm it? (In reply to comment #19) > As far as I know Gábor won't work on it in the future. Could you confirm it? No, I won't. |