Summary: | editing/selection/programmatic-selection-on-mac-is-directionless.html is flaky (IPC message delivery order is unreliable) | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Alexey Proskuryakov <ap> | ||||||
Component: | WebKit2 | Assignee: | Nobody <webkit-unassigned> | ||||||
Status: | NEW --- | ||||||||
Severity: | Normal | CC: | achristensen, andersca, buildbot, cdumez, enrica, jer.noble, rniwa, sam, thorton, webkit-bug-importer | ||||||
Priority: | P2 | Keywords: | InRadar | ||||||
Version: | 528+ (Nightly build) | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
See Also: | https://bugs.webkit.org/show_bug.cgi?id=171827 | ||||||||
Description
Alexey Proskuryakov
2015-01-31 11:26:13 PST
Created attachment 245789 [details]
naive fix
This definitely fixes the test for me locally. But I don't understand this code well - there are so many mutexes there, and the fix makes us lock two mutexes at once. I'm concerned about introducing deadlocks.
Comment on attachment 245789 [details]
naive fix
Attachment 245789 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/5114670889304064
Number of test failures exceeded the failure limit.
Created attachment 245792 [details]
Archive of layout-test-results from ews104 for mac-mavericks-wk2
The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Comment on attachment 245789 [details]
naive fix
Clearly, there is a deadlock :(
A deadlock that I see in debugging is easy to resolve - it's just an attempt to recursively lock m_syncReplyStateMutex (with the patch, we call the client while holding a lock on m_syncReplyStateMutex, and the client can attempt to send a message, taking this lock again). However, looking at this, it appears that we have the opposite issue too - a DispatchMessageEvenWhenWaitingForSyncReply message can sneak in before a sync reply. This is because we process all such messages that are in m_messagesToDispatchWhileWaitingForSyncReply before returning a sync reply. That seems harder to fix.
Marked the test as flaky in r183386. Unfortunately, there are almost certainly other tests affected by this issue.
Easily reproducible for me like this:
run-webkit-tests editing/selection/programmatic-selection-on-mac-is-directionless.html --repeat 100 -v --no-build -f --child-processes=10
The test uses EventSender, which puts our IPC::Connection in a special mode where async IPC messages get sent synchronously (not just the EventSender-related ones but ALL of them). As a result of this behavior, you can imagine:
1. Send an async IPC (a)
2. EventSender does a click which puts our IPC::Connection in SyncModeForTesting
3. Send an async IPC (b) -> becomes sync and may get received before IPC (a)
(In reply to Alexey Proskuryakov from comment #8)
> Easily reproducible for me like this:
>
> run-webkit-tests
> editing/selection/programmatic-selection-on-mac-is-directionless.html
> --repeat 100 -v --no-build -f --child-processes=10
This command does not reproduce the issue for me with ToT.
I can still reproduce. I'm on a somewhat old laptop, so if you have a beefy machine, try raising the child process count.
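For readers following the three-step scenario with IPC (a) and IPC (b), here is a minimal toy model of the reordering. It is only a sketch and not WebKit code: `ToyConnection`, `send_async`, `drain_run_loop`, and `simulate` are hypothetical names, and the model assumes the synchronous path always wins. In the real bug the outcome depends on timing, which is why the test is flaky rather than failing every time.

```python
from collections import deque


class ToyConnection:
    """Toy model: a queued async path plus an immediate sync path.

    Hypothetical illustration only; this is not IPC::Connection.
    """

    def __init__(self):
        self.pending_async = deque()  # messages waiting for the receiver's run loop
        self.received = []            # order in which the receiver handled messages
        self.sync_mode_for_testing = False

    def send_async(self, name):
        if self.sync_mode_for_testing:
            # In the testing mode an "async" send is delivered synchronously,
            # bypassing anything still sitting in the async queue.
            self.received.append(name)
        else:
            self.pending_async.append(name)

    def drain_run_loop(self):
        # The receiver finally handles its queued async messages, in FIFO order.
        while self.pending_async:
            self.received.append(self.pending_async.popleft())


def simulate():
    conn = ToyConnection()
    conn.send_async("a")               # 1. async IPC (a) is queued
    conn.sync_mode_for_testing = True  # 2. the EventSender click switches the mode
    conn.send_async("b")               # 3. IPC (b) becomes sync and is handled at once
    conn.drain_run_loop()              # (a) is only handled now
    return conn.received
```

Calling `simulate()` returns the order in which the toy receiver handled the messages. In this model (b) is handled before (a) even though (a) was sent first, which is the inversion described in step 3.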