Bug 66378

Summary: libdispatch based ParallelJobs is not enough parallel
Product: WebKit Reporter: Balazs Kelemen <kbalazs>
Component: Web Template FrameworkAssignee: Balazs Kelemen <kbalazs>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, loki, mrowe, pandras, rhodovan.u-szeged, thorton, webkit.review.bot, zherczeg
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Mac   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
WIP patch
none
Patch none

Balazs Kelemen
Reported 2011-08-17 07:56:21 PDT
We use the wrong API. We should use a concurrent dispatch queue with a dispatch group instead of using a serial queue.
Attachments
Patch (2.41 KB, patch)
2011-08-17 08:15 PDT, Balazs Kelemen
no flags
WIP patch (3.00 KB, patch)
2011-08-25 01:45 PDT, Balazs Kelemen
no flags
Patch (3.45 KB, patch)
2011-09-08 03:09 PDT, Balazs Kelemen
no flags
Balazs Kelemen
Comment 1 2011-08-17 08:15:19 PDT
Balazs Kelemen
Comment 2 2011-08-17 08:20:04 PDT
Performance results on methanol (https://gitorious.org/methanol/) with the svg test set: baseline: 2798.2 with 7.6% deviance parallel jobs enabled: 2663.9 with 15.5% deviance parallel jobs + patch: 1634.6 with 11.6% deviance I don't know where are the huge deviance is coming from. However I believe the results are convincing even with that.
WebKit Review Bot
Comment 3 2011-08-17 08:23:49 PDT
Attachment 104175 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/wtf/ParallelJobsLibdispatch.h:67: More than one command on the same line [whitespace/newline] [4] Source/JavaScriptCore/wtf/ParallelJobsLibdispatch.h:67: Missing space before { [whitespace/braces] [5] Total errors found: 2 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.
Balazs Kelemen
Comment 4 2011-08-17 11:42:11 PDT
Note that the style failure is because of the parallel libdispatch syle block: ^{(*m_threadFunction)(parameters);}
Mark Rowe (bdash)
Comment 5 2011-08-17 15:04:53 PDT
(In reply to comment #4) > Note that the style failure is because of the parallel libdispatch syle block: ^{(*m_threadFunction)(parameters);} You should have whitespace inside of the braces. In other words: ^{ (*m_threadFunction)(parameters); }
Mark Rowe (bdash)
Comment 6 2011-08-17 15:10:13 PDT
Comment on attachment 104175 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=104175&action=review > Source/JavaScriptCore/ChangeLog:12 > + implementatoin was almost identical (less then 5% win). *implementation* was almost identical (less *than* 5% win).
Gabor Loki
Comment 7 2011-08-17 23:22:14 PDT
Comment on attachment 104175 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=104175&action=review > Source/JavaScriptCore/wtf/ParallelJobsLibdispatch.h:63 > - dispatch_queue_t parallelJobsQueue = dispatch_queue_create("ParallelJobs", 0); > + static dispatch_queue_t globalQueue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0); Good catch! However, could we use dispatch_queue_create("ParallelJobs", DISPATCH_QUEUE_CONCURRENT) instead of the global concurrent queue? I do not know what is the main difference between those (in terms of design and performance) ?
Balazs Kelemen
Comment 8 2011-08-18 02:14:08 PDT
(In reply to comment #7) > (From update of attachment 104175 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=104175&action=review > > > Source/JavaScriptCore/wtf/ParallelJobsLibdispatch.h:63 > > - dispatch_queue_t parallelJobsQueue = dispatch_queue_create("ParallelJobs", 0); > > + static dispatch_queue_t globalQueue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0); > > Good catch! However, could we use dispatch_queue_create("ParallelJobs", DISPATCH_QUEUE_CONCURRENT) instead of the global concurrent queue? I do not know what is the main difference between those (in terms of design and performance) ? I don't know which solution is better. Maybe there is an overhead of creating our own queue each time we have been called. Mark, could you share your knowledge about that with us?
Mark Rowe (bdash)
Comment 9 2011-08-18 10:34:16 PDT
One obvious point is that DISPATCH_QUEUE_CONCURRENT is only available on 10.7 and newer.
Mark Rowe (bdash)
Comment 10 2011-08-22 02:05:55 PDT
By the way, couldn’t the explicit loop be replaced with a call to dispatch_apply?
Balazs Kelemen
Comment 11 2011-08-22 02:26:14 PDT
(In reply to comment #10) > By the way, couldn’t the explicit loop be replaced with a call to dispatch_apply? Yes, the same idea came to my mind after a bit more documentation reading. I will do that in the next patch.
Balazs Kelemen
Comment 12 2011-08-25 01:44:41 PDT
I have a new WIP patch with dispatch_apply that turns on ENABLE(THREADING_LIBDISPATCH) for testing but it crashes. If I'm not doing smg wrong than it seems like there is an issue with FELightning. According to the crash log there are 4 threads and all of them crashing in FELightning::LightningData::interior. I'm going to upload the WIP patch in a moment. 1 WTF::ByteArray::get(unsigned int) const 1 WTF::ByteArray::get(unsigned int) const 1 WTF::ByteArray::get(unsigned int) const 1 WTF::ByteArray::get(unsigned int) const 2 WebCore::FELighting::LightingData::interior(int, WebCore::IntPoint&) 2 WebCore::FELighting::LightingData::interior(int, WebCore::IntPoint&) 2 WebCore::FELighting::LightingData::interior(int, WebCore::IntPoint&) 2 WebCore::FELighting::LightingData::interior(int, WebCore::IntPoint&) 3 WebCore::FELighting::platformApplyGenericPaint(WebCore::FELighting::LightingData&, WebCore::LightSource::PaintingData&, int, int) 3 WebCore::FELighting::platformApplyGenericPaint(WebCore::FELighting::LightingData&, WebCore::LightSource::PaintingData&, int, int) 3 WebCore::FELighting::platformApplyGenericPaint(WebCore::FELighting::LightingData&, WebCore::LightSource::PaintingData&, int, int) 3 WebCore::FELighting::platformApplyGenericPaint(WebCore::FELighting::LightingData&, WebCore::LightSource::PaintingData&, int, int) 4 WebCore::FELighting::platformApplyGenericWorker(WebCore::FELighting::PlatformApplyGenericParameters*) 4 WebCore::FELighting::platformApplyGenericWorker(WebCore::FELighting::PlatformApplyGenericParameters*) 4 WebCore::FELighting::platformApplyGenericWorker(WebCore::FELighting::PlatformApplyGenericParameters*) 4 WebCore::FELighting::platformApplyGenericWorker(WebCore::FELighting::PlatformApplyGenericParameters*) 5 __execute_block_invoke_1 5 __execute_block_invoke_1 5 __execute_block_invoke_1 5 __execute_block_invoke_1 6 _dispatch_apply2 6 _dispatch_apply2 6 _dispatch_apply2 6 _dispatch_apply2 7 dispatch_apply_f 7 _dispatch_worker_thread2 7 _dispatch_worker_thread2 7 _dispatch_worker_thread2 8 WTF::ParallelEnvironment::execute(unsigned char*) 8 _pthread_wqthread 8 _pthread_wqthread 8 _pthread_wqthread 9 WTF::ParallelJobs<WebCore::FELighting::PlatformApplyGenericParameters>::execute() 9 start_wqthread 9 start_wqthread 9 start_wqthread
Balazs Kelemen
Comment 13 2011-08-25 01:45:55 PDT
Created attachment 105145 [details] WIP patch
Balazs Kelemen
Comment 14 2011-08-25 02:08:33 PDT
There is some debugging info for SVG experts. The parameters given to the threads are seems to be valid so I have no clue about what's going on. (gdb) f 3 #3 0x0000000101aa93bb in WebCore::FELighting::platformApplyGenericWorker (parameters=0x13c8010d0) at /Users/balazs/master_clean/Source/WebCore/platform/graphics/filters/FELighting.cpp:248 248 parameters->filter->platformApplyGenericPaint(parameters->data, parameters->paintingData, parameters->yStart, parameters->yEnd); (gdb) p parameters $2 = ('WebCore::FELighting::PlatformApplyGenericParameters' *) 0x13c8010d0 (gdb) p *parameters $3 = { filter = 0x12b851130, data = { pixels = 0x1338b2000, surfaceScale = 0.0392156877, widthMultipliedByPixelSize = 2036, widthDecreasedByOne = 508, heightDecreasedByOne = 355 }, paintingData = { lightVector = { m_x = -558, m_y = -354, m_z = 147.215683 }, colorVector = { m_x = 166.03772, m_y = 60.3773499, m_z = 0 }, lightVectorLength = 677.017334, directionVector = { m_x = 0.624695003, m_y = 0.624695003, m_z = -0.468521267 }, privateColorVector = { m_x = 176, m_y = 64, m_z = 0 }, coneCutOffLimit = -0.342020124, coneFullLight = -0.358020127, specularExponent = 1 }, yStart = 407, yEnd = 436 } (gdb) f 0 #0 0x0000000101a9e12c in WTF::ByteArray::get (this=0x1338b2000, index=828655) at ByteArray.h:67 67 ASSERT(index < m_size); (gdb) p index $4 = 828655 (gdb) p m_size $5 = 724816
Balazs Kelemen
Comment 15 2011-09-07 03:14:32 PDT
I tried it on Linux with '#define ENABLE_THREADING_GENERIC 1' in Platform.h, and I increased the max number of threads from 2 to 8 (in ParallelJobsGeneric.h). I runned the methanol svg test suite on a 16 core machine and the same crash happened than on Mac with the WIP patch: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fffa24ce700 (LWP 3978)] 0x00007ffff738cfc9 in WebCore::FELighting::platformApplyGenericWorker(WebCore::FELighting::PlatformApplyGenericParameters*) () from /home/balazs/master_clean/WebKitBuild/Release/bin/../lib/libQtWebKit.so.4 (gdb) bt 10 #0 0x00007ffff738cfc9 in WebCore::FELighting::platformApplyGenericWorker(WebCore::FELighting::PlatformApplyGenericParameters*) () from /home/balazs/master_clean/WebKitBuild/Release/bin/../lib/libQtWebKit.so.4 #1 0x00007ffff74698b9 in WTF::ParallelEnvironment::ThreadPrivate::workerThread(void*) () from /home/balazs/master_clean/WebKitBuild/Release/bin/../lib/libQtWebKit.so.4 #2 0x00007ffff746841b in WTF::ThreadPrivate::run() () from /home/balazs/master_clean/WebKitBuild/Release/bin/../lib/libQtWebKit.so.4 Something is definitely wrong with FELightning.
Tim Horton
Comment 16 2011-09-07 11:02:36 PDT
(In reply to comment #15) > Something is definitely wrong with FELightning. Looks like somebody swapped "width" in for "height" in FELighting::platformApplyGeneric, when determining the yStep. I'll file a separate bug about that.
Tim Horton
Comment 17 2011-09-07 11:20:12 PDT
Tim Horton
Comment 18 2011-09-07 12:06:31 PDT
The speedup on the methanol benchmark with this patch is quite impressive! I'm excited for this to land :-)
Balazs Kelemen
Comment 19 2011-09-07 15:44:10 PDT
(In reply to comment #18) > The speedup on the methanol benchmark with this patch is quite impressive! I'm excited for this to land :-) Thank you for the fix! I think it's time to utilize the infrastructure. First I will update the patch without actually enabling PARALLEL_JOBS by default. The next step will be enabling it - the libdispatch implementation for Mac and the generic for other platforms. I hope these plans sounds good to everyone.
Balazs Kelemen
Comment 20 2011-09-08 02:36:02 PDT
Current methanol results on a MacPro5,1 (6 cores) with SL Server: ref - 2582.733 (16.12% deviance) parallel jobs - 982.06 (19.9% diviance)
Balazs Kelemen
Comment 21 2011-09-08 03:09:43 PDT
Balazs Kelemen
Comment 22 2011-09-19 08:35:53 PDT
Ping ...
Zoltan Herczeg
Comment 23 2011-10-03 02:02:31 PDT
Comment on attachment 106718 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=106718&action=review > Source/JavaScriptCore/ChangeLog:12 > + was a misusage of the API). Enabling PARALLEL_JOBS is now misuse r=me with the typo fix above
Balazs Kelemen
Comment 24 2011-10-03 06:06:07 PDT
Note You need to log in before you can comment on or make changes to this bug.