Bug 141122

Summary: editing/selection/programmatic-selection-on-mac-is-directionless.html is flaky (IPC message delivery order is unreliable)
Product: WebKit Reporter: Alexey Proskuryakov <ap>
Component: WebKit2 Assignee: Nobody <webkit-unassigned>
Status: NEW ---    
Severity: Normal CC: achristensen, andersca, buildbot, cdumez, enrica, jer.noble, rniwa, sam, thorton, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=171827
Attachments:
Description: naive fix Flags: ap: review-, buildbot: commit-queue-
Description: Archive of layout-test-results from ews104 for mac-mavericks-wk2 Flags: none

Description Alexey Proskuryakov 2015-01-31 11:26:13 PST
editing/selection/programmatic-selection-on-mac-is-directionless.html is flaky on WebKit2: execCommand("undo") sometimes doesn't work correctly.

http://webkit-test-results.appspot.com/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=editing%2Fselection%2Fprogrammatic-selection-on-mac-is-directionless.html

This is a CoreIPC bug. When a DispatchMessageEvenWhenWaitingForSyncReply message is sent while handling a sync message, it is sometimes delivered out of order relative to the reply to that sync message.
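
For illustration only, here is a toy Python model of one interleaving that would produce this symptom. It is not WebKit code. It assumes the waiting side first dispatches any queued flagged messages and only then checks for the sync reply, and that both the message and the reply can arrive between those two steps. `ToyConnection`, `wait_iteration` and `run` are hypothetical names.

```python
from collections import deque


class ToyConnection:
    """Hypothetical stand-in for the receiving side; not IPC::Connection."""

    def __init__(self):
        self.pending_dispatch = deque()  # flagged messages awaiting dispatch
        self.sync_reply = None           # slot filled when the reply arrives
        self.observed = []               # order in which the client sees things

    def receive(self, kind, name):
        # Models the connection thread filing an incoming item.
        if kind == "reply":
            self.sync_reply = name
        else:
            self.pending_dispatch.append(name)

    def drain_dispatch_queue(self):
        while self.pending_dispatch:
            self.observed.append(self.pending_dispatch.popleft())

    def wait_iteration(self, arrivals_between_checks):
        # Step 1 (assumed): dispatch flagged messages that are already queued.
        self.drain_dispatch_queue()
        # These items land after step 1 but before step 2.
        for kind, name in arrivals_between_checks:
            self.receive(kind, name)
        # Step 2 (assumed): hand back the reply if it is present.
        if self.sync_reply is not None:
            self.observed.append(self.sync_reply)
            return True
        return False


def run(early, late):
    conn = ToyConnection()
    for kind, name in early:
        conn.receive(kind, name)
    conn.wait_iteration(late)
    conn.drain_dispatch_queue()  # leftovers are dispatched after the wait ends
    return conn.observed


wire_order = [("message", "M"), ("reply", "R")]  # M was sent before R
in_order = run(early=wire_order, late=[])
racy = run(early=[], late=wire_order)
print(in_order, racy)
```

In this model, the wire order is the same in both runs; only the timing of arrival relative to the two checks differs.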
Comment 1 Alexey Proskuryakov 2015-01-31 11:45:24 PST
Created attachment 245789 [details]
naive fix

This definitely fixes the test for me locally. But I don't understand this code well - there are so many mutexes there, and the fix makes us lock two mutexes at once. I'm concerned about introducing deadlocks.
Comment 2 Build Bot 2015-01-31 12:33:11 PST
Comment on attachment 245789 [details]
naive fix

Attachment 245789 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/5114670889304064

Number of test failures exceeded the failure limit.
Comment 3 Build Bot 2015-01-31 12:33:13 PST
Created attachment 245792 [details]
Archive of layout-test-results from ews104 for mac-mavericks-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-mavericks-wk2  Platform: Mac OS X 10.9.5
Comment 4 Alexey Proskuryakov 2015-01-31 18:05:06 PST
Comment on attachment 245789 [details]
naive fix

Clearly, there is a deadlock :(
Comment 5 Alexey Proskuryakov 2015-01-31 20:27:07 PST
A deadlock that I see in debugging is easy to resolve - it's just an attempt to recursively lock m_syncReplyStateMutex (with the patch, we call the client while holding a lock on m_syncReplyStateMutex, and the client can attempt to send a message, taking this lock again).
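
A minimal Python sketch of the re-entrancy problem described above, using a plain `threading.Lock` as a stand-in for m_syncReplyStateMutex. This is an analogy, not the WebKit code: `client_sends_while_locked` is a hypothetical name, and the non-blocking acquire is only there so the sketch reports the problem instead of hanging. The `RLock` case shows what a recursive lock would do; it is not meant to suggest which resolution was intended.

```python
import threading


def client_sends_while_locked(lock):
    """Hold `lock`, then let the 'client' try to take it again on the same thread."""
    with lock:
        # The client callback tries to send a message, which needs the same lock.
        # A blocking acquire on a non-recursive lock would never return here.
        got_it = lock.acquire(blocking=False)
        if got_it:
            lock.release()
        return got_it


plain_result = client_sends_while_locked(threading.Lock())       # non-recursive
recursive_result = client_sends_while_locked(threading.RLock())  # recursive
print(plain_result, recursive_result)
```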

However, looking at this, it appears that we have the opposite issue too - a DispatchMessageEvenWhenWaitingForSyncReply message can sneak in before a sync reply. This is because we process all such messages that are in m_messagesToDispatchWhileWaitingForSyncReply before returning a sync reply. That seems harder to fix.
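
The opposite ordering problem can be sketched the same way. The toy function below is not WebKit code; it assumes only what the paragraph above states, namely that everything queued for dispatch is processed before the sync reply is returned. `wait_for_sync_reply` and its argument are hypothetical.

```python
from collections import deque


def wait_for_sync_reply(arrived):
    """`arrived` lists (kind, name) pairs in wire order, all received
    by the time the waiting thread wakes up."""
    dispatch_queue = deque()
    reply = None
    for kind, name in arrived:
        if kind == "reply":
            reply = name
        else:
            dispatch_queue.append(name)

    observed = []
    # Assumed behavior: drain every queued flagged message first...
    while dispatch_queue:
        observed.append(dispatch_queue.popleft())
    # ...and only then return the reply.
    if reply is not None:
        observed.append(reply)
    return observed


# The reply R was sent first, the flagged message M second.
observed = wait_for_sync_reply([("reply", "R"), ("message", "M")])
print(observed)
```

Here M is seen before R even though R was sent first, because the queue is drained unconditionally before the reply is handed back.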
Comment 6 Radar WebKit Bug Importer 2015-02-17 13:45:34 PST
<rdar://problem/19865231>
Comment 7 Alexey Proskuryakov 2015-04-26 23:01:12 PDT
Marked the test as flaky in r183386. Unfortunately, there are almost certainly other tests affected by this issue.
Comment 8 Alexey Proskuryakov 2016-12-05 15:41:18 PST
Easily reproducible for me like this:

run-webkit-tests editing/selection/programmatic-selection-on-mac-is-directionless.html --repeat 100 -v --no-build -f --child-processes=10
Comment 9 Chris Dumez 2019-08-01 09:22:36 PDT
The test uses EventSender, which puts our IPC::Connection in a special mode where async IPC messages get sent synchronously (not just the EventSender-related ones but ALL of them). As a result of this behavior, you can imagine the following sequence:
1. Send an async IPC (a)
2. EventSender does a click which puts our IPC::Connection in SyncModeForTesting
3. Send an async IPC (b) -> becomes sync and may get received before IPC (a)
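
The three steps above can be modeled with a toy sender. This is a sketch under assumptions, not the real IPC::Connection: it assumes an async message sits in an outbox until flushed, while a message sent in the testing mode is delivered immediately. `ToySender`, `outbox` and `flush` are hypothetical names.

```python
class ToySender:
    """Hypothetical model of a connection with a sync-for-testing mode."""

    def __init__(self):
        self.outbox = []                  # async messages still in flight
        self.received = []                # order seen by the receiver
        self.sync_mode_for_testing = False

    def send_async(self, name):
        if self.sync_mode_for_testing:
            # Assumed: the message is delivered right away, bypassing the outbox.
            self.received.append(name)
        else:
            self.outbox.append(name)

    def flush(self):
        # Models the in-flight async messages finally reaching the receiver.
        self.received.extend(self.outbox)
        self.outbox.clear()


sender = ToySender()
sender.send_async("a")               # step 1: ordinary async send
sender.sync_mode_for_testing = True  # step 2: EventSender click flips the mode
sender.send_async("b")               # step 3: sent synchronously
sender.flush()
print(sender.received)
```

The model only covers the case where (a) is still in flight when (b) is sent; if (a) has already been delivered, the order is preserved.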
Comment 10 Chris Dumez 2019-08-05 12:13:37 PDT
(In reply to Alexey Proskuryakov from comment #8)
> Easily reproducible for me like this:
> 
> run-webkit-tests
> editing/selection/programmatic-selection-on-mac-is-directionless.html
> --repeat 100 -v --no-build -f --child-processes=10

This command does not reproduce the issue for me with ToT.
Comment 11 Alexey Proskuryakov 2019-08-05 17:21:28 PDT
I can still reproduce. I'm on a somewhat old laptop, so if you have a beefy machine, try raising the child process count.