WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
REOPENED
202874
Use a Version 1 CFRunLoopSource for faster task dispatch
https://bugs.webkit.org/show_bug.cgi?id=202874
Summary
Use a Version 1 CFRunLoopSource for faster task dispatch
Sihui Liu
Reported
2019-10-11 19:10:30 PDT
...
Attachments
Patch
(4.63 KB, patch)
2019-10-11 19:22 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
patch - mach message
(2.62 KB, patch)
2019-10-13 08:48 PDT
,
Geoffrey Garen
no flags
Details
Formatted Diff
Diff
patch - mach message
(2.76 KB, patch)
2019-10-13 08:52 PDT
,
Geoffrey Garen
no flags
Details
Formatted Diff
Diff
patch - getUserMedia.html
(3.04 KB, patch)
2019-10-13 14:20 PDT
,
Geoffrey Garen
no flags
Details
Formatted Diff
Diff
patch - mach message
(8.26 KB, patch)
2019-10-14 20:26 PDT
,
Geoffrey Garen
no flags
Details
Formatted Diff
Diff
Patch
(14.05 KB, patch)
2019-10-16 13:12 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(13.75 KB, patch)
2019-10-16 13:27 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(13.76 KB, patch)
2019-10-16 14:11 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(13.77 KB, patch)
2019-10-16 15:31 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(17.71 KB, patch)
2019-10-17 01:30 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch for landing
(17.39 KB, patch)
2019-10-17 12:58 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(8.09 KB, patch)
2020-02-17 09:59 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(10.82 KB, patch)
2020-02-24 14:19 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(10.79 KB, patch)
2020-02-25 17:10 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch - just the RunLoop parts
(5.23 KB, patch)
2020-06-09 14:57 PDT
,
Geoffrey Garen
koivisto
: review+
koivisto
: commit-queue-
Details
Formatted Diff
Diff
Patch
(4.19 KB, patch)
2020-11-13 11:54 PST
,
Geoffrey Garen
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(4.37 KB, patch)
2020-11-13 12:00 PST
,
Geoffrey Garen
no flags
Details
Formatted Diff
Diff
Patch
(7.82 KB, patch)
2020-11-16 12:17 PST
,
Geoffrey Garen
no flags
Details
Formatted Diff
Diff
Patch
(5.93 KB, patch)
2020-11-16 20:40 PST
,
Geoffrey Garen
no flags
Details
Formatted Diff
Diff
microbenchmark
(40.26 KB, application/zip)
2020-11-20 13:28 PST
,
Geoffrey Garen
no flags
Details
Patch
(4.75 KB, patch)
2020-12-10 15:35 PST
,
Geoffrey Garen
no flags
Details
Formatted Diff
Diff
Show Obsolete
(18)
View All
Add attachment
proposed patch, testcase, etc.
Sihui Liu
Comment 1
2019-10-11 19:22:52 PDT
Created
attachment 380807
[details]
Patch
Geoffrey Garen
Comment 2
2019-10-12 11:36:10 PDT
Comment on
attachment 380807
[details]
Patch It looks like some tests are timing out. My best guess is that some calls to RunLoop::dispatch() implicitly rely on ordering, and when we don’t use the explicit function queue, we break that ordering. I think it’s worth testing dispatch_async as the implementation of wakeUp. (We tested that once, but only with a broken patch that called isMain(), which is incorrect). That way, we preserve ordering. I’d change wakeUp to take an argument indicating whether this == s_mainRunLoop. Then, if true, the CF implementation should use dispatch_async to call the function that executes all tasks, rather than CFRunLoopWakeup. If that’s not fast enough, or it breaks things, then we should just add an inline function that calls dispatch_async or RunLoop::dispatch, depending on platform. And then call that just from IDB code.
Geoffrey Garen
Comment 3
2019-10-13 08:48:05 PDT
Created
attachment 380845
[details]
patch - mach message Out of curiosity, I disassembled libdispatch to see what method it uses to signal the RunLoop. It looks like libdispatch uses a "version 1" RunLoop source <
https://developer.apple.com/documentation/corefoundation/cfrunloopsource?language=objc
>, which triggers by mach message, rather than CFRunLoopWakeup(). I've duplicated that behavior here. Can you test the performance of this approach?
Geoffrey Garen
Comment 4
2019-10-13 08:48:55 PDT
(In reply to Geoffrey Garen from
comment #3
) (I verified that this patch passes the css3 layout tests.)
Geoffrey Garen
Comment 5
2019-10-13 08:52:03 PDT
Created
attachment 380846
[details]
patch - mach message
Geoffrey Garen
Comment 6
2019-10-13 14:20:42 PDT
Created
attachment 380851
[details]
patch - getUserMedia.html It looks like the iOS API test failure for TestWebKitAPI.WebKit2.CaptureMute is real. The test is racy, and its timing has changed slightly. Here's a patch that should make the test more deterministic.
Sihui Liu
Comment 7
2019-10-13 23:35:16 PDT
(In reply to Geoffrey Garen from
comment #3
)
> Created
attachment 380845
[details]
> patch - mach message > > Out of curiosity, I disassembled libdispatch to see what method it uses to > signal the RunLoop. It looks like libdispatch uses a "version 1" RunLoop > source > <
https://developer.apple.com/documentation/corefoundation/
> cfrunloopsource?language=objc>, which triggers by mach message, rather than > CFRunLoopWakeup(). I've duplicated that behavior here. > > Can you test the performance of this approach?
It's about 15% speed up on the test PerformanceTests/IndexedDB/basic/objectstore-get.html.
Geoffrey Garen
Comment 8
2019-10-14 09:42:25 PDT
> > Can you test the performance of this approach? > > It's about 15% speed up on the test > PerformanceTests/IndexedDB/basic/objectstore-get.html.
So, still slower than dispatch_async?
Geoffrey Garen
Comment 9
2019-10-14 20:24:49 PDT
Oh, actually, it's expected that the patch above doesn't achieve all 20%, because it doesn't include the fix for callOnMainThread. I'll upload a patch that includes callOnMainThread and the regression test change all in one.
Geoffrey Garen
Comment 10
2019-10-14 20:26:46 PDT
Created
attachment 380952
[details]
patch - mach message
Sihui Liu
Comment 11
2019-10-15 00:49:17 PDT
(In reply to Geoffrey Garen from
comment #9
)
> Oh, actually, it's expected that the patch above doesn't achieve all 20%, > because it doesn't include the fix for callOnMainThread. I'll upload a patch > that includes callOnMainThread and the regression test change all in one.
I previously made the change of the callOnMainThread locally and tested. I tried with the latest patch and it seemed it's still not as fast as using dispatch_async directly. It's close though. Also I debugged the layout test failure of the dispatch_async patch. The cause is that we try to block main thread when waiting for reply of sync messages. Our implementation is: 1. Start a task on main thread 2. While not receiving reply, ask RunLoop to run. RunLoop can listen for input sources and processes data from sources(receiving and handling reply) at this time. (Thread is blocked.) 3. Continue to finish the task It looks like when we do dispatch_async, the RunLoop doesn't think it is a source and thus time out at step2.
Geoffrey Garen
Comment 12
2019-10-15 10:37:57 PDT
(In reply to Sihui Liu from
comment #11
)
> (In reply to Geoffrey Garen from
comment #9
) > > Oh, actually, it's expected that the patch above doesn't achieve all 20%, > > because it doesn't include the fix for callOnMainThread. I'll upload a patch > > that includes callOnMainThread and the regression test change all in one. > > I previously made the change of the callOnMainThread locally and tested. I > tried with the latest patch and it seemed it's still not as fast as using > dispatch_async directly. It's close though. > > Also I debugged the layout test failure of the dispatch_async patch. The > cause is that we try to block main thread when waiting for reply of sync > messages. Our implementation is: > 1. Start a task on main thread > 2. While not receiving reply, ask RunLoop to run. RunLoop can listen for > input sources and processes data from sources(receiving and handling reply) > at this time. (Thread is blocked.) > 3. Continue to finish the task > It looks like when we do dispatch_async, the RunLoop doesn't think it is a > source and thus time out at step2.
Is this in Connection::waitForSyncReply? Or something else? I suspect that dispatch_async is not willing to execute inside a nested RunLoop. I believe that's a fundamental difference between libdispatch and CFRunLoop. (libdispatch requires all tasks to dispatch in order, and that means prohibiting re-entrant dispatch. Relatedly, libdispatch explicitly documents that dispatch_sync(dispatch_get_current_queue(), ...) is always a deadlock.) So, I guess we can't use dispatch_async. Let's try to finish up the mach_message / version 1 RunLoopSource solution, since it delivers most of the speedup. Can you do that? Outstanding issues: 1. inspector/css/pseudo-creation.html fails. It used to fail sometimes (e.g.
https://build.webkit.org/builders/Apple%20High%20Sierra%20Debug%20WK1%20(Tests)/builds/11323
), and now it fails all the time. So, the code has gotten better (more deterministic), but we broke a test. It looks like something about InspectorProtocol.eventHandler is racy relative to promise resolution. We need to fix this, or perhaps disable the test. 2. TestWebKitAPI.WebKitLegacy.ScrollingDoesNotPauseMedia times out. It must either time out in its call to Util::run, or its call to callOnMainThreadAndWait. I think it's probably callOnMainThreadAndWait. If so, this test has a simple bug. It calls "callOnMainThreadAndWait" *from the main thread*. I'm not sure exactly why this ever worked, but I think you can fix it by just switching to callOnMainThread.
Sihui Liu
Comment 13
2019-10-16 13:12:41 PDT
Created
attachment 381101
[details]
Patch
Sihui Liu
Comment 14
2019-10-16 13:27:40 PDT
Created
attachment 381103
[details]
Patch
Sam Weinig
Comment 15
2019-10-16 13:35:05 PDT
Comment on
attachment 381103
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=381103&action=review
> Source/WTF/ChangeLog:9 > + We used CFRunLoopWakeUp to wake up runloop to process source, which seems to be slow according to profiling. To > + avoid calling CFRunLoopWakeUp, use CFRunloopSource1 instead of CFRunloopSource0.
What is CFRunloopSource1? Why is this faster? If it's not the current default, what is the reason it's not and what potential downsides are there?
Geoffrey Garen
Comment 16
2019-10-16 13:36:11 PDT
Comment on
attachment 381103
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=381103&action=review
Looks great, and thanks for fixing those tests. One comment on callOonMainThread.
> Source/WTF/wtf/cocoa/MainThreadCocoa.mm:136 > - [staticMainThreadCaller performSelectorOnMainThread:@selector(call) withObject:nil waitUntilDone:NO]; > + dispatch_async(dispatch_get_main_queue(), ^{ > + [staticMainThreadCaller call]; > + });
You can remove some indirection here by deleting staticMainThreadCaller and JSWTFMainThreadCaller and just calling WTF::dispatchFunctionsFromMainThread() directly. staticMainThreadCaller and JSWTFMainThreadCaller only existed for the sake of doing the async dispatch, which you're doing a different way now. Side note: Is dispatch_async better than RunLoop::main().dispatch() in this case? Also, is it OK that callOnMainThread will not run in nested run loops?
Sihui Liu
Comment 17
2019-10-16 14:11:56 PDT
Created
attachment 381106
[details]
Patch
Sihui Liu
Comment 18
2019-10-16 14:51:29 PDT
(In reply to Sam Weinig from
comment #15
)
> Comment on
attachment 381103
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=381103&action=review
> > > Source/WTF/ChangeLog:9 > > + We used CFRunLoopWakeUp to wake up runloop to process source, which seems to be slow according to profiling. To > > + avoid calling CFRunLoopWakeUp, use CFRunloopSource1 instead of CFRunloopSource0. > > What is CFRunloopSource1? Why is this faster? If it's not the current > default, what is the reason it's not and what potential downsides are there?
It's version 1 source in CFRunloop:
https://developer.apple.com/documentation/corefoundation/cfrunloopsource-rhr
According to our profiling on test like OpenSource/PerformanceTests/IndexedDB/basic/objectstore-get.html, we spent a lot of time on waking up the runloop when receiving IPC message. And we tried libdispatch, which uses version 1 source for dispatching to runloop, and is much faster. So we think we could probably adopt this for better performance. I don't know why version 0 source is our default..
Sihui Liu
Comment 19
2019-10-16 15:06:02 PDT
(In reply to Geoffrey Garen from
comment #16
)
> Comment on
attachment 381103
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=381103&action=review
> > Looks great, and thanks for fixing those tests. One comment on > callOonMainThread. > > > Source/WTF/wtf/cocoa/MainThreadCocoa.mm:136 > > - [staticMainThreadCaller performSelectorOnMainThread:@selector(call) withObject:nil waitUntilDone:NO]; > > + dispatch_async(dispatch_get_main_queue(), ^{ > > + [staticMainThreadCaller call]; > > + }); > > You can remove some indirection here by deleting staticMainThreadCaller and > JSWTFMainThreadCaller and just calling > WTF::dispatchFunctionsFromMainThread() directly. staticMainThreadCaller and > JSWTFMainThreadCaller only existed for the sake of doing the async dispatch, > which you're doing a different way now. >
staticMainThreadCaller is used in the #if USE(WEB_THREAD) case, which is a few lines above. I would change this to WTF::dispatchFunctionsFromMainThread().
> Side note: Is dispatch_async better than RunLoop::main().dispatch() in this > case? Also, is it OK that callOnMainThread will not run in nested run loops?
The reason why I didn't use RunLoop::main().dispatch() because RunLoop::main() needs to run after RunLoop::initializeMainRunLoop() and callOnMainThread does not seem to care about whether we have initialized mainRunLoop or not. Why can't callOnMainThread run in nested run loops?
Sihui Liu
Comment 20
2019-10-16 15:31:50 PDT
Created
attachment 381124
[details]
Patch
Sam Weinig
Comment 21
2019-10-16 16:46:16 PDT
(In reply to Sihui Liu from
comment #18
)
> (In reply to Sam Weinig from
comment #15
) > > Comment on
attachment 381103
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=381103&action=review
> > > > > Source/WTF/ChangeLog:9 > > > + We used CFRunLoopWakeUp to wake up runloop to process source, which seems to be slow according to profiling. To > > > + avoid calling CFRunLoopWakeUp, use CFRunloopSource1 instead of CFRunloopSource0. > > > > What is CFRunloopSource1? Why is this faster? If it's not the current > > default, what is the reason it's not and what potential downsides are there? > > It's version 1 source in CFRunloop: >
https://developer.apple.com/documentation/corefoundation/cfrunloopsource-rhr
> > According to our profiling on test like > OpenSource/PerformanceTests/IndexedDB/basic/objectstore-get.html, we spent a > lot of time on waking up the runloop when receiving IPC message. And we > tried libdispatch, which uses version 1 source for dispatching to runloop, > and is much faster. So we think we could probably adopt this for better > performance.
Does that mean this is not a speedup on it's own?
Sihui Liu
Comment 22
2019-10-16 17:22:19 PDT
(In reply to Sam Weinig from
comment #21
)
> (In reply to Sihui Liu from
comment #18
) > > (In reply to Sam Weinig from
comment #15
) > > > Comment on
attachment 381103
[details]
> > > Patch > > > > > > View in context: > > >
https://bugs.webkit.org/attachment.cgi?id=381103&action=review
> > > > > > > Source/WTF/ChangeLog:9 > > > > + We used CFRunLoopWakeUp to wake up runloop to process source, which seems to be slow according to profiling. To > > > > + avoid calling CFRunLoopWakeUp, use CFRunloopSource1 instead of CFRunloopSource0. > > > > > > What is CFRunloopSource1? Why is this faster? If it's not the current > > > default, what is the reason it's not and what potential downsides are there? > > > > It's version 1 source in CFRunloop: > >
https://developer.apple.com/documentation/corefoundation/cfrunloopsource-rhr
> > > > According to our profiling on test like > > OpenSource/PerformanceTests/IndexedDB/basic/objectstore-get.html, we spent a > > lot of time on waking up the runloop when receiving IPC message. And we > > tried libdispatch, which uses version 1 source for dispatching to runloop, > > and is much faster. So we think we could probably adopt this for better > > performance. > > Does that mean this is not a speedup on it's own?
It's a speedup on it's own. The patch gives us about 15% speedup on the test we profiled (PerformanceTests/IndexedDB/basic/objectstore-get.html). It saves time by eliminating CFRunLoopWakeUps from the trace.
Geoffrey Garen
Comment 23
2019-10-16 21:16:51 PDT
> staticMainThreadCaller is used in the #if USE(WEB_THREAD) case, which is a > few lines above.
You can change that case to stop using staticMainThreadCaller too.
> > Side note: Is dispatch_async better than RunLoop::main().dispatch() in this > > case? Also, is it OK that callOnMainThread will not run in nested run loops? > > The reason why I didn't use RunLoop::main().dispatch() because > RunLoop::main() needs to run after RunLoop::initializeMainRunLoop() and > callOnMainThread does not seem to care about whether we have initialized > mainRunLoop or not.
callOnMainThread depends on initializeMainThread() to initialize staticMainThreadCaller.
> Why can't callOnMainThread run in nested run loops?
Because dispatch_async doesn't run in nested run loops. (I confirmed this theory at
https://www.thecave.com/2015/08/10/dispatch-async-to-main-queue-doesnt-work-with-modal-window-on-mac-os-x/
.) If there isn't a specific advantage to dispatch_async here, maybe RunLoop::main() is better, to avoid this minor change in behavior.
Sihui Liu
Comment 24
2019-10-17 01:30:06 PDT
Created
attachment 381180
[details]
Patch
Sam Weinig
Comment 25
2019-10-17 07:59:00 PDT
(In reply to Sihui Liu from
comment #22
)
> (In reply to Sam Weinig from
comment #21
) > > (In reply to Sihui Liu from
comment #18
) > > > (In reply to Sam Weinig from
comment #15
) > > > > Comment on
attachment 381103
[details]
> > > > Patch > > > > > > > > View in context: > > > >
https://bugs.webkit.org/attachment.cgi?id=381103&action=review
> > > > > > > > > Source/WTF/ChangeLog:9 > > > > > + We used CFRunLoopWakeUp to wake up runloop to process source, which seems to be slow according to profiling. To > > > > > + avoid calling CFRunLoopWakeUp, use CFRunloopSource1 instead of CFRunloopSource0. > > > > > > > > What is CFRunloopSource1? Why is this faster? If it's not the current > > > > default, what is the reason it's not and what potential downsides are there? > > > > > > It's version 1 source in CFRunloop: > > >
https://developer.apple.com/documentation/corefoundation/cfrunloopsource-rhr
> > > > > > According to our profiling on test like > > > OpenSource/PerformanceTests/IndexedDB/basic/objectstore-get.html, we spent a > > > lot of time on waking up the runloop when receiving IPC message. And we > > > tried libdispatch, which uses version 1 source for dispatching to runloop, > > > and is much faster. So we think we could probably adopt this for better > > > performance. > > > > Does that mean this is not a speedup on it's own? > > It's a speedup on it's own. The patch gives us about 15% speedup on the test > we profiled (PerformanceTests/IndexedDB/basic/objectstore-get.html). It > saves time by eliminating CFRunLoopWakeUps from the trace.
Very nice!
Geoffrey Garen
Comment 26
2019-10-17 10:40:20 PDT
Comment on
attachment 381180
[details]
Patch r=me
Sihui Liu
Comment 27
2019-10-17 12:58:29 PDT
Created
attachment 381217
[details]
Patch for landing
WebKit Commit Bot
Comment 28
2019-10-17 14:09:57 PDT
The commit-queue encountered the following flaky tests while processing
attachment 381217
[details]
: inspector/console/webcore-logging.html
bug 203118
(authors:
drousso@apple.com
and
eric.carlson@apple.com
) The commit-queue is continuing to process your patch.
WebKit Commit Bot
Comment 29
2019-10-17 14:10:03 PDT
The commit-queue encountered the following flaky tests while processing
attachment 381217
[details]
: imported/w3c/web-platform-tests/websockets/bufferedAmount-unchanged-by-sync-xhr.any.worker.html
bug 202003
(author:
youennf@gmail.com
) The commit-queue is continuing to process your patch.
WebKit Commit Bot
Comment 30
2019-10-17 15:05:30 PDT
Comment on
attachment 381217
[details]
Patch for landing Clearing flags on attachment: 381217 Committed
r251261
: <
https://trac.webkit.org/changeset/251261
>
WebKit Commit Bot
Comment 31
2019-10-17 15:05:33 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 32
2019-10-17 15:06:29 PDT
<
rdar://problem/56387205
>
Truitt Savell
Comment 33
2019-10-18 09:34:03 PDT
It looks like the changes in
https://trac.webkit.org/changeset/251261/webkit
broke inspector/console/webcore-logging.html History:
https://results.webkit.org/?suite=layout-tests&test=inspector%2Fconsole%2Fwebcore-logging.html
Diff: --- /Volumes/Data/slave/mojave-release-tests-wk1/build/layout-test-results/inspector/console/webcore-logging-expected.txt +++ /Volumes/Data/slave/mojave-release-tests-wk1/build/layout-test-results/inspector/console/webcore-logging-actual.txt @@ -27,7 +27,9 @@ -- Running test case: Console.Logging.MediaLogging PASS: Media logging disabled. PASS: Media logging has been enabled. -PASS: Media log message should have source 'media'. +FAIL: Media log message should have source 'media'. + Expected: truthy + Actual: false -- Running test case: Console.Logging.LogAsJSONWithoutRepeat PASS: Media logging disabled. This reproduces easily by running the test in iterations. It fails on 251261 but not 251260.
Truitt Savell
Comment 34
2019-10-18 09:39:55 PDT
This also broke inspector/runtime/getProperties.html History
https://results.webkit.org/?suite=layout-tests&test=inspector%2Fruntime%2FgetProperties.html
Truitt Savell
Comment 35
2019-10-23 17:06:03 PDT
Reverted
r251261
for reason: This broke multiple tests Committed
r251514
: <
https://trac.webkit.org/changeset/251514
>
Sihui Liu
Comment 36
2019-10-23 17:45:12 PDT
(In reply to Truitt Savell from
comment #35
)
> Reverted
r251261
for reason: > > This broke multiple tests > > Committed
r251514
: <
https://trac.webkit.org/changeset/251514
>
inspector/console/webcore-logging.html was fixed by
r251482
(and got reverted too), which fixed the test and made another meaningful bug fix. inspector/runtime/getProperties.html is caused by that the test file doesn't handle error case correctly. (The diff is one error console message.) scrollingcoordinator/ios/non-stable-viewport-scroll.html is flaky by its nature, as I commented in
rdar://problem/56410921
and I replied in Slack. Are there any other failure caused by this patch? If yes, what are they? If no, what led this to be reverted?
Truitt Savell
Comment 37
2019-10-24 08:06:24 PDT
(In reply to Sihui Liu from
comment #36
)
> (In reply to Truitt Savell from
comment #35
) > > Reverted
r251261
for reason: > > > > This broke multiple tests > > > > Committed
r251514
: <
https://trac.webkit.org/changeset/251514
> > > inspector/console/webcore-logging.html was fixed by
r251482
(and got > reverted too), which fixed the test and made another meaningful bug fix. > > inspector/runtime/getProperties.html is caused by that the test file doesn't > handle error case correctly. (The diff is one error console message.) > > scrollingcoordinator/ios/non-stable-viewport-scroll.html is flaky by its > nature, as I commented in
rdar://problem/56410921
and I replied in Slack. > > Are there any other failure caused by this patch? If yes, what are they? > If no, what led this to be reverted?
Those are the tests that lead us to roll this patch out
Geoffrey Garen
Comment 38
2019-10-24 12:47:36 PDT
> inspector/console/webcore-logging.html was fixed by
r251482
(and got > reverted too), which fixed the test and made another meaningful bug fix.
Sounds great!
> inspector/runtime/getProperties.html is caused by that the test file doesn't > handle error case correctly. (The diff is one error console message.)
Does this mean that inspector/runtime/getProperties.html does fail when this patch is applied? If so, I'd suggest updating the test to handle error or updating the expected result to include the console message when re-landing.
> scrollingcoordinator/ios/non-stable-viewport-scroll.html is flaky by its > nature, as I commented in
rdar://problem/56410921
and I replied in Slack.
Was this test failing sometimes before this patch landed? If so, seems OK to ignore. If not, we probably need to skip this test until it gets fixed.
Geoffrey Garen
Comment 39
2019-10-24 13:25:49 PDT
> inspector/runtime/getProperties.html is caused by that the test file doesn't handle error case correctly. (The diff is one error console message.)
For this case, let's use if (window.internals) window.internals.settings.setUnhandledPromiseRejectionToConsoleEnabled(false); and then update the test result not to expect a console message.
Geoffrey Garen
Comment 40
2019-10-24 13:46:06 PDT
> scrollingcoordinator/ios/non-stable-viewport-scroll.html is flaky by its > nature, as I commented in
rdar://problem/56410921
and I replied in Slack.
For this case, the test is marked as flaky, so I think it's OK to land without fixing it, since doing so won't make the bots red. (We analyzed this test a bit, and fixing it will be non-trivial. More information from <
rdar://problem/56410921
>: "Talked to Simon, the test is flaky without my patch. We found some graphics work is done after callback of uiController.doAfterPresentationUpdate and RenderLayerCompositor::layerTreeAsText, but layerTreeAsText should show us the final result. With my patch, some of those work (result is not reflected in the -expected.txt file) has bigger chance to finish sooner than layerTreeAsText, so the test has become more flaky." )
Antti Koivisto
Comment 41
2019-10-25 01:53:33 PDT
Seems like this patch was a significant (~11%) StyleBench perf regression:
https://perf.webkit.org/v3/#/charts?since=1571129192465&paneList=((20-1649-36468600-null-(5-2.5-500
))) Rollout brought the perf back where it was. This should be looked into before relanding.
Sihui Liu
Comment 42
2019-10-25 06:22:34 PDT
(In reply to Antti Koivisto from
comment #41
)
> Seems like this patch was a significant (~11%) StyleBench perf regression: > >
https://perf.webkit.org/v3/#/charts?since=1571129192465&paneList=((20-1649
- > 36468600-null-(5-2.5-500))) > > Rollout brought the perf back where it was. This should be looked into > before relanding.
That's surprising! How can I run this test?
Geoffrey Garen
Comment 43
2019-10-25 08:14:19 PDT
(In reply to Sihui Liu from
comment #42
)
> (In reply to Antti Koivisto from
comment #41
) > > Seems like this patch was a significant (~11%) StyleBench perf regression: > > > >
https://perf.webkit.org/v3/#/charts?since=1571129192465&paneList=((20-1649
- > > 36468600-null-(5-2.5-500))) > > > > Rollout brought the perf back where it was. This should be looked into > > before relanding. > > That's surprising! How can I run this test?
OpenSource/PerformanceTests/StyleBench/index.html
Antti Koivisto
Comment 44
2019-10-25 09:07:53 PDT
cd PerformanceTests/StyleBench python -m SimpleHTTPServer 8000 navigate to localhost:8000 Alternatively individual subtests can be run with web server by opening PerformanceTests/StyleBench/InteractiveRunner.html
Antti Koivisto
Comment 45
2019-10-25 09:08:32 PDT
> Alternatively individual subtests can be run with web server by opening
*without web server
Ryosuke Niwa
Comment 46
2019-10-25 11:26:46 PDT
There is also Tools/Scripts/run-perf-tests <path>
Antti Koivisto
Comment 47
2020-02-13 13:43:54 PST
Are there any show stoppers for landing this? It might help with first paint performance (type 1 sources don't prevent runloop observers from firing and so don't delay layer flushes). I don't think StyleBench regression is a show stopper, it is not a zero regression test. It sounded like it was a bug in how measurements are made we understood what was wrong.
Sihui Liu
Comment 48
2020-02-13 14:05:31 PST
(In reply to Antti Koivisto from
comment #47
)
> Are there any show stoppers for landing this? It might help with first paint > performance (type 1 sources don't prevent runloop observers from firing and > so don't delay layer flushes). > > I don't think StyleBench regression is a show stopper, it is not a zero > regression test. It sounded like it was a bug in how measurements are made > we understood what was wrong.
No,the we could try landing this again.
Ben Nham
Comment 49
2020-02-14 11:42:32 PST
I still think we should try to commit this but I'm not sure if this will help first paint perf due to excessive batching before an observer fires (namely the one that causes layer flush to occur). As far as I can tell, it looks like servicing a type 0 source will still turn the entire run loop and fire observers. After a type 0 source is handled: 1. The GCD main queue port is polled with 0 timeout; if this succeeds, go to (5) skipping before/after waiting observers 2. Before waiting observers fired 3. wakeup/timer/source1 port set polled with a 0 timeout 4. after waiting observers fired 5. GCD main queue / wakeup / timer / source1 callback runs This patch mainly seems to optimize things by using a source1 mach port to wake up the run loop instead of CFRunLoopWakeUp. This is more efficient because this patch just sends a mach message, while CFRunLoopWakeUp both takes a lock and sends a mach message. But both before and after this patch, we can end up handling a bunch of IPC messages in a row before a run loop observer fires because of the way WTF::RunLoop::performWork batches up work.
Simon Fraser (smfr)
Comment 50
2020-02-14 11:47:22 PST
(In reply to Ben Nham from
comment #49
)
> I still think we should try to commit this but I'm not sure if this will > help first paint perf due to excessive batching before an observer fires > (namely the one that causes layer flush to occur). As far as I can tell, it > looks like servicing a type 0 source will still turn the entire run loop and > fire observers.
It does not fire "before waiting" observers, which the CoreAnimation commit observer is:
https://github.com/opensource-apple/CF/blob/master/CFRunLoop.c#L2403
https://github.com/opensource-apple/CF/blob/master/CFRunLoop.c#L2425
After a type 0 source is handled:
> > 1. The GCD main queue port is polled with 0 timeout; if this succeeds, go to > (5) skipping before/after waiting observers > 2. Before waiting observers fired
No, because of "if (!poll".
> 3. wakeup/timer/source1 port set polled with a 0 timeout > 4. after waiting observers fired > 5. GCD main queue / wakeup / timer / source1 callback runs > > This patch mainly seems to optimize things by using a source1 mach port to > wake up the run loop instead of CFRunLoopWakeUp. This is more efficient > because this patch just sends a mach message, while CFRunLoopWakeUp both > takes a lock and sends a mach message. But both before and after this patch, > we can end up handling a bunch of IPC messages in a row before a run loop > observer fires because of the way WTF::RunLoop::performWork batches up work.
Ben Nham
Comment 51
2020-02-14 11:57:19 PST
I see, I missed that. So the hypothesis is basically that an IPC that causes a layout to occur in a source 0 callback, which then calls flushLayers, and then CA's auto-commit observer doesn't run as soon as it should because of this source 0 behavior?
Simon Fraser (smfr)
Comment 52
2020-02-14 13:19:49 PST
(In reply to Ben Nham from
comment #51
)
> I see, I missed that. So the hypothesis is basically that an IPC that causes > a layout to occur in a source 0 callback, which then calls flushLayers, and > then CA's auto-commit observer doesn't run as soon as it should because of > this source 0 behavior?
More simply put, a flood of IPC messages can postpone rendering for many event loop cycles.
Sihui Liu
Comment 53
2020-02-17 09:59:53 PST
Created
attachment 390927
[details]
Patch
Sihui Liu
Comment 54
2020-02-24 14:19:34 PST
Created
attachment 391575
[details]
Patch
Sihui Liu
Comment 55
2020-02-25 17:10:17 PST
Created
attachment 391705
[details]
Patch
Geoffrey Garen
Comment 56
2020-06-09 14:57:17 PDT
Created
attachment 401478
[details]
Patch - just the RunLoop parts Let's see if we can land the refactoring part of this patch without the behavior change, to shrink the diff and help move to just one function queue.
Sihui Liu
Comment 57
2020-06-09 21:53:56 PDT
Comment on
attachment 401478
[details]
Patch - just the RunLoop parts Looks green. Do we need a new bug for this change?
Antti Koivisto
Comment 58
2020-06-09 23:32:39 PDT
Comment on
attachment 401478
[details]
Patch - just the RunLoop parts View in context:
https://bugs.webkit.org/attachment.cgi?id=401478&action=review
> Source/WTF/ChangeLog:3 > + Use CFRunLoopSource1 for faster task dispatch
The tittle does not describe this patch.
Geoffrey Garen
Comment 59
2020-06-10 12:50:25 PDT
Comment on
attachment 401478
[details]
Patch - just the RunLoop parts Yeah, I'll obsolete this patch and post in a separate bug.
Geoffrey Garen
Comment 60
2020-06-10 12:55:09 PDT
Refactoring patch is now posted @
https://bugs.webkit.org/show_bug.cgi?id=213043
.
Geoffrey Garen
Comment 61
2020-11-13 11:54:53 PST
Created
attachment 414072
[details]
Patch
Geoffrey Garen
Comment 62
2020-11-13 11:56:19 PST
Comment on
attachment 414072
[details]
Patch Re-based patch.
Geoffrey Garen
Comment 63
2020-11-13 12:00:26 PST
Created
attachment 414073
[details]
Patch
Sihui Liu
Comment 64
2020-11-13 16:05:59 PST
Note using version 1 source may cause issue in our test runner, details in <
rdar://problem/59637925
>.
Geoffrey Garen
Comment 65
2020-11-16 12:17:58 PST
Created
attachment 414266
[details]
Patch
Geoffrey Garen
Comment 66
2020-11-16 20:40:28 PST
Created
attachment 414303
[details]
Patch
Geoffrey Garen
Comment 67
2020-11-20 13:28:14 PST
Created
attachment 414710
[details]
microbenchmark Uploading a microbenchmark for posterity. Interestingly, the fastest case for V0 and V1 is about equal. But on average V0 is 2X slower; and in the worst case, it's 5X slower, causing a full 2ms of latency!
Geoffrey Garen
Comment 68
2020-11-20 13:29:37 PST
Comment on
attachment 414303
[details]
Patch Tests pass and microbenchmark confirms a speedup. So, I think this is ready for review.
EWS
Comment 69
2020-11-20 14:04:47 PST
Committed
r270132
: <
https://trac.webkit.org/changeset/270132
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 414303
[details]
.
Ryan Haddad
Comment 70
2020-12-01 10:00:55 PST
Reverted
r270132
for reason: Caused at least one regression test failure (
webkit.org/b/219401
) Committed
r270311
: <
https://trac.webkit.org/changeset/270311
>
Ryan Haddad
Comment 71
2020-12-01 13:47:01 PST
(In reply to Ryan Haddad from
comment #70
)
> Caused at least one regression test failure (
webkit.org/b/219401
)
It appears that these WPT worklet tests that had become flaky on iOS are OK again after the revert, so these were also affected:
https://results.webkit.org/?suite=layout-tests&suite=layout-tests&suite=layout-tests&test=imported%2Fw3c%2Fweb-platform-tests%2Fworklets%2Fanimation-worklet-csp.https.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fworklets%2Fanimation-worklet-referrer.https.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fworklets%2Flayout-worklet-referrer.https.html
https://build.webkit.org/results/Apple-iOS-14-Simulator-Release-WK2-Tests/r270300%20(1074)/results.html
Geoffrey Garen
Comment 72
2020-12-10 15:35:37 PST
Created
attachment 415939
[details]
Patch
Geoffrey Garen
Comment 73
2020-12-10 15:36:28 PST
Previous test failures should be resolved by:
https://bugs.webkit.org/show_bug.cgi?id=219717
https://bugs.webkit.org/show_bug.cgi?id=219656
EWS
Comment 74
2020-12-10 16:17:48 PST
Committed
r270661
: <
https://trac.webkit.org/changeset/270661
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 415939
[details]
.
Ryan Haddad
Comment 75
2020-12-12 21:22:36 PST
Looks like there are still more failures/timeouts because of this change. The failures are a bit difficult to enumerate because some are flaky, but here are the ones that clearly start with this change:
https://results.webkit.org/?suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&test=fast%2Ftext%2Fcanvas-color-fonts%2FCOLR.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-transitions%2Fproperties-value-001.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-transitions%2Fproperties-value-003.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-transitions%2Fproperties-value-inherit-001.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-transitions%2Fproperties-value-inherit-002.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-transitions%2Fproperties-value-inherit-003.html
Ryan Haddad
Comment 76
2020-12-12 21:27:20 PST
Two more that appear to be iOS specific:
https://results.webkit.org/?suite=layout-tests&suite=layout-tests&test=imported%2Fw3c%2Fweb-platform-tests%2Fworkers%2Fmodules%2Fshared-worker-import-csp.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fworklets%2Flayout-worklet-csp.https.html
Ryan Haddad
Comment 77
2020-12-12 21:40:57 PST
Another:
https://results.webkit.org/?suite=layout-tests&test=inspector%2Fformatting%2Fsource-map-javascript-2.html
Ryan Haddad
Comment 78
2020-12-12 21:44:20 PST
Reverted
r270661
for reason: Caused layout test failures and timeouts Committed
r270746
: <
https://trac.webkit.org/changeset/270746
>
Ryan Haddad
Comment 79
2020-12-12 21:46:35 PST
(In reply to Ryan Haddad from
comment #78
)
> Reverted
r270661
for reason: > > Caused layout test failures and timeouts
Here is one link that shows all the ones I've found so far:
https://results.webkit.org/?suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&test=fast%2Ftext%2Fcanvas-color-fonts%2FCOLR.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-transitions%2Fproperties-value-001.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-transitions%2Fproperties-value-003.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-transitions%2Fproperties-value-inherit-001.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-transitions%2Fproperties-value-inherit-002.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-transitions%2Fproperties-value-inherit-003.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fworkers%2Fmodules%2Fshared-worker-import-csp.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fworklets%2Flayout-worklet-csp.https.html&test=inspector%2Fformatting%2Fsource-map-javascript-2.html&test=inspector%2Fruntime%2Fparse.html
Per Arne Vollan
Comment 80
2020-12-17 03:30:51 PST
Comment on
attachment 415939
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=415939&action=review
> Source/WTF/wtf/cf/RunLoopCF.cpp:76 > + mach_msg_header_t header; > + header.msgh_bits = MACH_MSGH_BITS(MACH_MSG_TYPE_COPY_SEND, 0); > + header.msgh_size = sizeof(mach_msg_header_t); > + header.msgh_remote_port = CFMachPortGetPort(m_port.get()); > + header.msgh_local_port = MACH_PORT_NULL; > + header.msgh_id = 0; > + mach_msg_return_t result = mach_msg(&header, MACH_SEND_MSG | MACH_SEND_TIMEOUT, header.msgh_size, 0, MACH_PORT_NULL, 0, MACH_PORT_NULL); > + RELEASE_ASSERT(result == MACH_MSG_SUCCESS || result == MACH_SEND_TIMED_OUT); > + if (result == MACH_SEND_TIMED_OUT) > + mach_msg_destroy(&header);
Could this be simplified by using CFMessagePort instead of CFMachPort?
Geoffrey Garen
Comment 81
2020-12-17 08:42:25 PST
> Could this be simplified by using CFMessagePort instead of CFMachPort?
Unfortunately, CFMessagePortSendRequest still locks a pthread_mutex_t before sending, and pthread_mutex_t is most of the overhead we are trying to avoid.
Per Arne Vollan
Comment 82
2021-01-20 14:14:12 PST
(In reply to Geoffrey Garen from
comment #81
)
> > Could this be simplified by using CFMessagePort instead of CFMachPort? > > Unfortunately, CFMessagePortSendRequest still locks a pthread_mutex_t before > sending, and pthread_mutex_t is most of the overhead we are trying to avoid.
Ah, got it, thanks! If there is a little bit of flakiness associated with this patch, could this possibly be addressed by dispatching off a run loop observer for the main thread run loop, in order to make the ordering deterministic?
Geoffrey Garen
Comment 83
2021-01-21 15:27:31 PST
> If there is a little bit of flakiness associated with this patch, could this > possibly be addressed by dispatching off a run loop observer for the main > thread run loop, in order to make the ordering deterministic?
I had the same thought! :P I do think that approach *could* work. I spent a few days trying to make that work, but I ended up with too many test failures. It's possible that I did not succeed in making behavior more deterministic; or maybe I succeeded in making behavior more deterministic, but deterministically failing. One gotcha is that run loop observers don't run at *exactly* the same point as run loop sources. Another gotcha is that the relative order of run loop observers is different in the case where you first call CFRunLoopRun() vs the case where you were already sleeping inside CFRunLoopRun(). My next move is to A/B test to see if the change is worth it. If it is, I think I can make things more deterministic by standardizing how we schedule tasks. (I started this in
https://trac.webkit.org/changeset/271493/webkit
.)
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug