We use the wrong API. We should use a concurrent dispatch queue with a dispatch group instead of using a serial queue.
Created attachment 104175 [details] Patch
Performance results on methanol (https://gitorious.org/methanol/) with the svg test set: baseline: 2798.2 with 7.6% deviance parallel jobs enabled: 2663.9 with 15.5% deviance parallel jobs + patch: 1634.6 with 11.6% deviance I don't know where are the huge deviance is coming from. However I believe the results are convincing even with that.
Attachment 104175 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/wtf/ParallelJobsLibdispatch.h:67: More than one command on the same line [whitespace/newline] [4] Source/JavaScriptCore/wtf/ParallelJobsLibdispatch.h:67: Missing space before { [whitespace/braces] [5] Total errors found: 2 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.
Note that the style failure is because of the parallel libdispatch syle block: ^{(*m_threadFunction)(parameters);}
(In reply to comment #4) > Note that the style failure is because of the parallel libdispatch syle block: ^{(*m_threadFunction)(parameters);} You should have whitespace inside of the braces. In other words: ^{ (*m_threadFunction)(parameters); }
Comment on attachment 104175 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=104175&action=review > Source/JavaScriptCore/ChangeLog:12 > + implementatoin was almost identical (less then 5% win). *implementation* was almost identical (less *than* 5% win).
Comment on attachment 104175 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=104175&action=review > Source/JavaScriptCore/wtf/ParallelJobsLibdispatch.h:63 > - dispatch_queue_t parallelJobsQueue = dispatch_queue_create("ParallelJobs", 0); > + static dispatch_queue_t globalQueue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0); Good catch! However, could we use dispatch_queue_create("ParallelJobs", DISPATCH_QUEUE_CONCURRENT) instead of the global concurrent queue? I do not know what is the main difference between those (in terms of design and performance) ?
(In reply to comment #7) > (From update of attachment 104175 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=104175&action=review > > > Source/JavaScriptCore/wtf/ParallelJobsLibdispatch.h:63 > > - dispatch_queue_t parallelJobsQueue = dispatch_queue_create("ParallelJobs", 0); > > + static dispatch_queue_t globalQueue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0); > > Good catch! However, could we use dispatch_queue_create("ParallelJobs", DISPATCH_QUEUE_CONCURRENT) instead of the global concurrent queue? I do not know what is the main difference between those (in terms of design and performance) ? I don't know which solution is better. Maybe there is an overhead of creating our own queue each time we have been called. Mark, could you share your knowledge about that with us?
One obvious point is that DISPATCH_QUEUE_CONCURRENT is only available on 10.7 and newer.
By the way, couldn’t the explicit loop be replaced with a call to dispatch_apply?
(In reply to comment #10) > By the way, couldn’t the explicit loop be replaced with a call to dispatch_apply? Yes, the same idea came to my mind after a bit more documentation reading. I will do that in the next patch.
I have a new WIP patch with dispatch_apply that turns on ENABLE(THREADING_LIBDISPATCH) for testing but it crashes. If I'm not doing smg wrong than it seems like there is an issue with FELightning. According to the crash log there are 4 threads and all of them crashing in FELightning::LightningData::interior. I'm going to upload the WIP patch in a moment. 1 WTF::ByteArray::get(unsigned int) const 1 WTF::ByteArray::get(unsigned int) const 1 WTF::ByteArray::get(unsigned int) const 1 WTF::ByteArray::get(unsigned int) const 2 WebCore::FELighting::LightingData::interior(int, WebCore::IntPoint&) 2 WebCore::FELighting::LightingData::interior(int, WebCore::IntPoint&) 2 WebCore::FELighting::LightingData::interior(int, WebCore::IntPoint&) 2 WebCore::FELighting::LightingData::interior(int, WebCore::IntPoint&) 3 WebCore::FELighting::platformApplyGenericPaint(WebCore::FELighting::LightingData&, WebCore::LightSource::PaintingData&, int, int) 3 WebCore::FELighting::platformApplyGenericPaint(WebCore::FELighting::LightingData&, WebCore::LightSource::PaintingData&, int, int) 3 WebCore::FELighting::platformApplyGenericPaint(WebCore::FELighting::LightingData&, WebCore::LightSource::PaintingData&, int, int) 3 WebCore::FELighting::platformApplyGenericPaint(WebCore::FELighting::LightingData&, WebCore::LightSource::PaintingData&, int, int) 4 WebCore::FELighting::platformApplyGenericWorker(WebCore::FELighting::PlatformApplyGenericParameters*) 4 WebCore::FELighting::platformApplyGenericWorker(WebCore::FELighting::PlatformApplyGenericParameters*) 4 WebCore::FELighting::platformApplyGenericWorker(WebCore::FELighting::PlatformApplyGenericParameters*) 4 WebCore::FELighting::platformApplyGenericWorker(WebCore::FELighting::PlatformApplyGenericParameters*) 5 __execute_block_invoke_1 5 __execute_block_invoke_1 5 __execute_block_invoke_1 5 __execute_block_invoke_1 6 _dispatch_apply2 6 _dispatch_apply2 6 _dispatch_apply2 6 _dispatch_apply2 7 dispatch_apply_f 7 _dispatch_worker_thread2 7 _dispatch_worker_thread2 7 _dispatch_worker_thread2 8 WTF::ParallelEnvironment::execute(unsigned char*) 8 _pthread_wqthread 8 _pthread_wqthread 8 _pthread_wqthread 9 WTF::ParallelJobs<WebCore::FELighting::PlatformApplyGenericParameters>::execute() 9 start_wqthread 9 start_wqthread 9 start_wqthread
Created attachment 105145 [details] WIP patch
There is some debugging info for SVG experts. The parameters given to the threads are seems to be valid so I have no clue about what's going on. (gdb) f 3 #3 0x0000000101aa93bb in WebCore::FELighting::platformApplyGenericWorker (parameters=0x13c8010d0) at /Users/balazs/master_clean/Source/WebCore/platform/graphics/filters/FELighting.cpp:248 248 parameters->filter->platformApplyGenericPaint(parameters->data, parameters->paintingData, parameters->yStart, parameters->yEnd); (gdb) p parameters $2 = ('WebCore::FELighting::PlatformApplyGenericParameters' *) 0x13c8010d0 (gdb) p *parameters $3 = { filter = 0x12b851130, data = { pixels = 0x1338b2000, surfaceScale = 0.0392156877, widthMultipliedByPixelSize = 2036, widthDecreasedByOne = 508, heightDecreasedByOne = 355 }, paintingData = { lightVector = { m_x = -558, m_y = -354, m_z = 147.215683 }, colorVector = { m_x = 166.03772, m_y = 60.3773499, m_z = 0 }, lightVectorLength = 677.017334, directionVector = { m_x = 0.624695003, m_y = 0.624695003, m_z = -0.468521267 }, privateColorVector = { m_x = 176, m_y = 64, m_z = 0 }, coneCutOffLimit = -0.342020124, coneFullLight = -0.358020127, specularExponent = 1 }, yStart = 407, yEnd = 436 } (gdb) f 0 #0 0x0000000101a9e12c in WTF::ByteArray::get (this=0x1338b2000, index=828655) at ByteArray.h:67 67 ASSERT(index < m_size); (gdb) p index $4 = 828655 (gdb) p m_size $5 = 724816
I tried it on Linux with '#define ENABLE_THREADING_GENERIC 1' in Platform.h, and I increased the max number of threads from 2 to 8 (in ParallelJobsGeneric.h). I runned the methanol svg test suite on a 16 core machine and the same crash happened than on Mac with the WIP patch: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fffa24ce700 (LWP 3978)] 0x00007ffff738cfc9 in WebCore::FELighting::platformApplyGenericWorker(WebCore::FELighting::PlatformApplyGenericParameters*) () from /home/balazs/master_clean/WebKitBuild/Release/bin/../lib/libQtWebKit.so.4 (gdb) bt 10 #0 0x00007ffff738cfc9 in WebCore::FELighting::platformApplyGenericWorker(WebCore::FELighting::PlatformApplyGenericParameters*) () from /home/balazs/master_clean/WebKitBuild/Release/bin/../lib/libQtWebKit.so.4 #1 0x00007ffff74698b9 in WTF::ParallelEnvironment::ThreadPrivate::workerThread(void*) () from /home/balazs/master_clean/WebKitBuild/Release/bin/../lib/libQtWebKit.so.4 #2 0x00007ffff746841b in WTF::ThreadPrivate::run() () from /home/balazs/master_clean/WebKitBuild/Release/bin/../lib/libQtWebKit.so.4 Something is definitely wrong with FELightning.
(In reply to comment #15) > Something is definitely wrong with FELightning. Looks like somebody swapped "width" in for "height" in FELighting::platformApplyGeneric, when determining the yStep. I'll file a separate bug about that.
https://bugs.webkit.org/show_bug.cgi?id=67719 is the crash
The speedup on the methanol benchmark with this patch is quite impressive! I'm excited for this to land :-)
(In reply to comment #18) > The speedup on the methanol benchmark with this patch is quite impressive! I'm excited for this to land :-) Thank you for the fix! I think it's time to utilize the infrastructure. First I will update the patch without actually enabling PARALLEL_JOBS by default. The next step will be enabling it - the libdispatch implementation for Mac and the generic for other platforms. I hope these plans sounds good to everyone.
Current methanol results on a MacPro5,1 (6 cores) with SL Server: ref - 2582.733 (16.12% deviance) parallel jobs - 982.06 (19.9% diviance)
Created attachment 106718 [details] Patch
Ping ...
Comment on attachment 106718 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=106718&action=review > Source/JavaScriptCore/ChangeLog:12 > + was a misusage of the API). Enabling PARALLEL_JOBS is now misuse r=me with the typo fix above
Committed r96492: <http://trac.webkit.org/changeset/96492>