WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
UNCONFIRMED
79184
Implement SpinLockMutexLocker on WTF and optimize some parts of JavaScriptCore using it
https://bugs.webkit.org/show_bug.cgi?id=79184
Summary
Implement SpinLockMutexLocker on WTF and optimize some parts of JavaScriptCor...
Kwonjin Jeong
Reported
2012-02-21 19:08:58 PST
Implement SpinLockLocker on WTF and optimize some parts of JavaScriptCore using SpinLockLocker
Attachments
Patch
(9.04 KB, patch)
2012-02-21 19:34 PST
,
Kwonjin Jeong
eric
: review-
Details
Formatted Diff
Diff
View All
Add attachment
proposed patch, testcase, etc.
Kwonjin Jeong
Comment 1
2012-02-21 19:18:11 PST
Change the class name to SpinLockMutexLocker.
Kwonjin Jeong
Comment 2
2012-02-21 19:34:45 PST
Created
attachment 128114
[details]
Patch
Alexey Proskuryakov
Comment 3
2012-02-22 10:17:01 PST
How did you confirm that this patch is an optimization?
Kwonjin Jeong
Comment 4
2012-02-22 18:18:03 PST
(In reply to
comment #3
)
> How did you confirm that this patch is an optimization?
At first, I compare the performance just between MutexLocker and SpinLockMutexLocker. Upon examination, SpinLockMutexLocker is two times faster than MutexLocker. I created a performance test program that has a loop within a simple calculation and the Lockers. Then, the program loops one billion times. And then, I applied SpinLockMutexLocker on some parts of JavaScriptCore, then ran 'run-sunspider', and then compare the results. Actually, the result using SpinLockMutexLocker is 0.5% ~ 1.0% faster than MutexLocker's - it's within the margin of error. But I guess that the performance can be better by using SpinLockMutexLocker on some more parts. My test environment and test result is as written below. Port: QtWebKit OS: Linux (Ubuntu 11.10) CPU: Intel(R) Xeon(R) CPU X5650 @ 2.67GHz (6 core) RAM: 12GiB Loop test result: Mutex : 17047.357910 ms SpinLockMutex: 7897.659180 ms Simple Loop : 656.346680 ms Sum : 565039360 Sunspider result: The test program is WebKit/Source/JavaScriptCore/jsc.cpp with some modification. I just modify jscmain() function. int jscmain(int argc, char** argv, JSGlobalData* globalData) { int i; int sum = 0; Mutex mutex; SpinLockMutex spinLockMutex; double startTime; double endTimeMutexLocker; double endTimeSpinLock; double endTimeSimpleLoop; startTime = currentTime(); for (i = 0; i < 1000000000; i++) { MutexLocker locker(mutex); sum += i; } endTimeMutexLocker = currentTime(); for (i = 0; i < 1000000000; i++) { SpinLockMutexLocker locker(spinLockMutex); sum += i; } endTimeSpinLock = currentTime(); for (i = 0; i < 1000000000; i++) { sum += i; } endTimeSimpleLoop = currentTime(); printf("Mutex : %lf ms\n", endTimeMutexLocker * 1000.0 - startTime * 1000.0); printf("SpinLockMutex: %lf ms\n", endTimeSpinLock * 1000.0 - endTimeMutexLocker * 1000.0); printf("Simple Loop : %lf ms\n", endTimeSimpleLoop * 1000.0 - endTimeSpinLock * 1000.0); printf("Sum : %d\n", sum); return 0; }
Kwonjin Jeong
Comment 5
2012-02-22 18:42:05 PST
(In reply to
comment #3
)
> How did you confirm that this patch is an optimization?
I attach the sunspider results below. MutexLocker ============================================ RESULTS (means and 95% confidence intervals) -------------------------------------------- Total: 188.7ms +/- 0.5% -------------------------------------------- 3d: 26.9ms +/- 1.5% cube: 8.0ms +/- 0.0% morph: 10.1ms +/- 2.2% raytrace: 8.8ms +/- 3.4% access: 17.7ms +/- 2.0% binary-trees: 2.0ms +/- 0.0% fannkuch: 8.3ms +/- 4.2% nbody: 4.0ms +/- 0.0% nsieve: 3.4ms +/- 10.9% bitops: 15.8ms +/- 1.9% 3bit-bits-in-byte: 1.0ms +/- 0.0% bits-in-byte: 5.8ms +/- 5.2% bitwise-and: 3.0ms +/- 0.0% nsieve-bits: 6.0ms +/- 0.0% controlflow: 2.0ms +/- 0.0% recursive: 2.0ms +/- 0.0% crypto: 12.2ms +/- 2.5% aes: 8.1ms +/- 2.8% md5: 2.0ms +/- 0.0% sha1: 2.1ms +/- 10.8% date: 22.8ms +/- 2.0% format-tofte: 12.2ms +/- 2.5% format-xparb: 10.6ms +/- 3.5% math: 23.7ms +/- 2.5% cordic: 8.0ms +/- 0.0% partial-sums: 13.3ms +/- 2.6% spectral-norm: 2.4ms +/- 15.4% regexp: 11.0ms +/- 0.0% dna: 11.0ms +/- 0.0% string: 56.6ms +/- 1.4% base64: 4.6ms +/- 8.0% fasta: 8.1ms +/- 2.8% tagcloud: 14.1ms +/- 1.6% unpack-code: 23.8ms +/- 1.9% validate-input: 6.0ms +/- 0.0% SpinLockMutexLocker ============================================ RESULTS (means and 95% confidence intervals) -------------------------------------------- Total: 187.4ms +/- 0.4% -------------------------------------------- 3d: 27.0ms +/- 1.2% cube: 8.0ms +/- 0.0% morph: 10.1ms +/- 2.2% raytrace: 8.9ms +/- 2.5% access: 17.5ms +/- 2.2% binary-trees: 2.0ms +/- 0.0% fannkuch: 8.1ms +/- 2.8% nbody: 4.0ms +/- 0.0% nsieve: 3.4ms +/- 10.9% bitops: 16.0ms +/- 0.0% 3bit-bits-in-byte: 1.0ms +/- 0.0% bits-in-byte: 6.0ms +/- 0.0% bitwise-and: 3.0ms +/- 0.0% nsieve-bits: 6.0ms +/- 0.0% controlflow: 2.1ms +/- 10.8% recursive: 2.1ms +/- 10.8% crypto: 12.0ms +/- 0.0% aes: 8.0ms +/- 0.0% md5: 2.0ms +/- 0.0% sha1: 2.0ms +/- 0.0% date: 22.3ms +/- 1.5% format-tofte: 12.2ms +/- 2.5% format-xparb: 10.1ms +/- 2.2% math: 23.1ms +/- 1.0% cordic: 8.0ms +/- 0.0% partial-sums: 13.0ms +/- 0.0% spectral-norm: 2.1ms +/- 10.8% regexp: 11.0ms +/- 0.0% dna: 11.0ms +/- 0.0% string: 56.4ms +/- 0.9% base64: 4.4ms +/- 8.4% fasta: 8.0ms +/- 0.0% tagcloud: 14.0ms +/- 0.0% unpack-code: 23.9ms +/- 0.9% validate-input: 6.1ms +/- 3.7%
Kwonjin Jeong
Comment 6
2012-02-22 18:50:17 PST
I'm trying to fix build errors on mac and win port. It's caused by including TCSpinLock.h but the header file isn't in PrivateHeaders.
Alexey Proskuryakov
Comment 7
2012-02-22 19:34:59 PST
Please use SunSpider built-in feature to compare the results - it will tell you if the change is statistically significant. If the new locker is only two times as fast in a tight loop, it might not be worth adding complexity and bypassing all platform specific optimizations OS vendors add to regular mutex.
Kwonjin Jeong
Comment 8
2012-02-22 21:09:41 PST
(In reply to
comment #7
)
> Please use SunSpider built-in feature to compare the results - it will tell you if the change is statistically significant. > > If the new locker is only two times as fast in a tight loop, it might not be worth adding complexity and bypassing all platform specific optimizations OS vendors add to regular mutex.
Thank you for your comment. I'll try to use the feature and write the result here.
Eric Seidel (no email)
Comment 9
2012-04-19 16:42:50 PDT
Comment on
attachment 128114
[details]
Patch Waiting for the sunspider-compare-results results.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug