Description: editing/pasteboard/paste-without-nesting.html It should be <div>Hello</div><div>world</div>. Was <div><div>Hello</div><div>world</div></div>. The is a flaky failure that I can find the first occurrence 04/08/20 and appears to have been a flaky failure since. History: https://results.webkit.org/?suite=layout-tests&test=editing%2Fpasteboard%2Fpaste-without-nesting.html&platform=ios&flavor=wk2&style=release&limit=50000 Diff Log: --- /Volumes/Data/worker/iOS-13-Simulator-WK2-Tests-EWS/build/layout-test-results/editing/pasteboard/paste-without-nesting-expected.txt +++ /Volumes/Data/worker/iOS-13-Simulator-WK2-Tests-EWS/build/layout-test-results/editing/pasteboard/paste-without-nesting-actual.txt @@ -3,7 +3,7 @@ PASS confirmedMarkup is 'Hello' PASS confirmedMarkup is 'Hello<br>world' PASS confirmedMarkup is '<div>Hello</div><div>world</div>' -PASS confirmedMarkup is '<div>Hello</div><div>world</div>' +FAIL confirmedMarkup should be <div>Hello</div><div>world</div>. Was <div><div>Hello</div><div>world</div></div>. PASS confirmedMarkup is '<div><b><i>Hello</i></b></div><div><b><i>world</i></b></div>' PASS confirmedMarkup is '<div style="text-align: center;"><div><font color="#ff0000"><b><i>hello</i></b></font></div><div><font color="#ff0000"><b><i>world</i></b></font></div></div>' PASS confirmedMarkup is '<p>Line One</p><p>Line Two</p>'
<rdar://problem/66628493>
I have marked this tests as failing while this issue is investigated. https://trac.webkit.org/changeset/265329/webkit
Unable to reproduce this locally :( (I tried running for 1000 iterations on 4 simulators in parallel)
Created attachment 406178 [details] Repro attempt #1
Created attachment 406179 [details] Repro attempt #2
Created attachment 406180 [details] Repro attempt #3
Created attachment 406181 [details] Repro attempt #4
Created attachment 406182 [details] Repro attempt #5
> Unexpected flakiness: text-only failures (4) > editing/pasteboard/paste-without-nesting-229.html [ Failure Pass ] > editing/pasteboard/paste-without-nesting-346.html [ Failure Pass ] > editing/pasteboard/paste-without-nesting-81.html [ Failure Pass ] > media/now-playing-status-without-media.html [ Failure Pass ] > Unexpected flakiness: image-only failures (1) > css3/filters/backdrop/backdrop-filter-with-border-radius-and-reflection-remove.html [ ImageOnlyFailure Pass ] It looks like EWS is at least able to reproduce this!
Created attachment 406193 [details] Repro v2
From <https://ews-build.webkit.org/#/builders/24/builds/23457/steps/10/logs/stdio>, sample logging excerpt from failed test run: 11:47:34.525 12247 (CONSOLE) <WK>: ================== 11:47:34.525 12247 (CONSOLE) <WK>: Starting test (4). 11:47:34.525 12247 2020-08-07 11:47:34.156 WebKitTestRunnerApp[12268:1448157] PBItemCollectionServicer connection disconnected. 11:47:34.525 12247 (CONSOLE) <WK>: FAILURE! 11:47:34.525 12247 (CONSOLE) <WK>: - EXP: "<div>Hello</div><div>world</div>" 11:47:34.526 12247 (CONSOLE) <WK>: - GOT: "<div><div>Hello</div><div>world</div></div>" 11:47:34.526 12247 (CONSOLE) <WK>: ================== …compared to a successful run: 11:50:26.765 12247 (CONSOLE) <WK>: ================== 11:50:26.765 12247 (CONSOLE) <WK>: Starting test (4). 11:50:26.765 12247 2020-08-07 11:50:26.369 WebKitTestRunnerApp[12268:1459245] PBItemCollectionServicer connection disconnected. 11:50:26.765 12247 <WK>: Remove redundant blocks? 11:50:26.765 12247 <WK>: - node <DIV 0x1248ac1c0 id='' class=''> 11:50:26.765 12247 <WK>: - rootNode <DIV 0x1248ac0e0 id='container4' class=''> 11:50:26.765 12247 <WK>: *** REMOVING <DIV 0x1248ac1c0 id='' class=''> *** 11:50:26.765 12247 <WK>: *** REMOVING <DIV 0x1248ac150 id='' class=''> *** 11:50:26.765 12247 <WK>: 11:50:26.765 12247 (CONSOLE) <WK>: PASS! 11:50:26.765 12247 (CONSOLE) <WK>: ================== It looks like we never even call into DeleteSelectionCommand::removeRedundantBlocks() in the case where this test fails.
Created attachment 406200 [details] Repro v3
Created attachment 406210 [details] Repro v4
Created attachment 406212 [details] Repro v5 (is this an IPC race?)
(In reply to Wenson Hsieh from comment #14) > Created attachment 406212 [details] > Repro v5 (is this an IPC race?) From: <https://ews-build.webkit.org/#/builders/24/builds/23475/steps/10/logs/stdio>... > 15:06:44.680 3621 (CONSOLE) <WK>: ================== > 15:06:44.680 3621 (CONSOLE) <WK>: Starting test (4). > 15:06:44.680 3621 <WK>: Writing web content to platform pasteboard! (web archive data: 1222 bytes) > 15:06:44.680 3621 2020-08-07 15:06:44.281 WebKitTestRunnerApp[3643:164560] PBItemCollectionServicer connection disconnected. > 15:06:44.681 3621 <WK>: Reading pasteboard web content. > 15:06:44.681 3621 <WK>: => No info > 15:06:44.681 3621 (CONSOLE) <WK>: FAILURE! > 15:06:44.681 3621 (CONSOLE) <WK>: - EXP: "<div>Hello</div><div>world</div>" > 15:06:44.681 3621 (CONSOLE) <WK>: - GOT: "<div><div>Hello</div><div>world</div></div>" > 15:06:44.681 3621 (CONSOLE) <WK>: ================== What’s probably happening is that the changeCount is bumped up from underneath us, due to the copy command’s IPC message being handled in the UI process after asking for the changeCount (which is a sync IPC message).
Created attachment 406248 [details] Repro v6
Created attachment 406252 [details] Repro v7
Created attachment 406256 [details] Test a potential fix
Created attachment 406257 [details] Test a potential fix
Created attachment 406259 [details] Fixes the bug
Created attachment 406260 [details] Remove flaky test expectation too
Comment on attachment 406260 [details] Remove flaky test expectation too View in context: https://bugs.webkit.org/attachment.cgi?id=406260&action=review > Source/WebKit/UIProcess/Cocoa/WebPasteboardProxyCocoa.mm:457 > + if (auto* process = webProcessProxyForConnection(connection)) Writing just auto instead of auto* seems better for refactoring if we ever make this return a smart pointer instead of a raw pointer. > Source/WebKit/UIProcess/Cocoa/WebPasteboardProxyCocoa.mm:458 > + process->send(Messages::WebProcess::DidWriteToPasteboardAsynchronously(pasteboardName), 0); What gives us test coverage to make sure we didn’t miss any cases? I assume the flaky test only exercises one of these. > Source/WebKit/WebProcess/WebCoreSupport/WebPlatformStrategies.cpp:173 > + WebProcess::singleton().waitForPendingPasteboardWritesToFinish(pasteboardName); What gives us test coverage to make sure we didn’t miss any cases? I assume the flaky test only exercises one of these. > Source/WebKit/WebProcess/WebProcess.h:658 > + HashMap<String, unsigned> m_pendingPasteboardWriteCounts; Should use HashCountedSet instead. It’s designed for this kind of use. > Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:1095 > + while (m_pendingPasteboardWriteCounts.get(pasteboardName)) { This does unnecessary extra work and should just check contains. We never have a hash table entry with a 0 in it. (Will change if you use HashCountedSet.) > Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:1096 > + if (!parentProcessConnection()->waitForAndDispatchImmediately<Messages::WebProcess::DidWriteToPasteboardAsynchronously>(0, 1_s, IPC::WaitForOption::InterruptWaitingIfSyncMessageArrives)) How did we select a 1 second timeout? Because of the way this is written it will loop forever anyway, so it’s not clear why it’s OK to time out and what we will do in that case. And this case is untested. Sending extra messages seems a bit dangerous? Not sure. > Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:1099 > + m_pendingPasteboardWriteCounts.remove(pasteboardName); This seems like a no op. The count should already have been removed by didWriteToPasteboardAsynchronously.
Comment on attachment 406260 [details] Remove flaky test expectation too View in context: https://bugs.webkit.org/attachment.cgi?id=406260&action=review >> Source/WebKit/UIProcess/Cocoa/WebPasteboardProxyCocoa.mm:457 >> + if (auto* process = webProcessProxyForConnection(connection)) > > Writing just auto instead of auto* seems better for refactoring if we ever make this return a smart pointer instead of a raw pointer. Sounds good — changed to auto here, and in the 3 other places below. >> Source/WebKit/UIProcess/Cocoa/WebPasteboardProxyCocoa.mm:458 >> + process->send(Messages::WebProcess::DidWriteToPasteboardAsynchronously(pasteboardName), 0); > > What gives us test coverage to make sure we didn’t miss any cases? I assume the flaky test only exercises one of these. This flaky test in particular only exercises `writeWebContentToPasteboard`, but there are many other tests in editing/pasteboard that attempt to copy other types of content (such as links and plain text) and immediately paste, which I suspect are also flaky (just much, much more rarely). >> Source/WebKit/WebProcess/WebCoreSupport/WebPlatformStrategies.cpp:173 >> + WebProcess::singleton().waitForPendingPasteboardWritesToFinish(pasteboardName); > > What gives us test coverage to make sure we didn’t miss any cases? I assume the flaky test only exercises one of these. All native pasteboard access on Cocoa platforms is preceded by grabbing the changeCount when creating the Pasteboard object, so I think this should be sufficient. >> Source/WebKit/WebProcess/WebProcess.h:658 >> + HashMap<String, unsigned> m_pendingPasteboardWriteCounts; > > Should use HashCountedSet instead. It’s designed for this kind of use. Nice! Switched to use HashCountedSet. >> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:1095 >> + while (m_pendingPasteboardWriteCounts.get(pasteboardName)) { > > This does unnecessary extra work and should just check contains. We never have a hash table entry with a 0 in it. > > (Will change if you use HashCountedSet.) Changed to use HashCountedSet::contains()! >> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:1096 >> + if (!parentProcessConnection()->waitForAndDispatchImmediately<Messages::WebProcess::DidWriteToPasteboardAsynchronously>(0, 1_s, IPC::WaitForOption::InterruptWaitingIfSyncMessageArrives)) > > How did we select a 1 second timeout? Because of the way this is written it will loop forever anyway, so it’s not clear why it’s OK to time out and what we will do in that case. And this case is untested. Sending extra messages seems a bit dangerous? Not sure. The 1 second timeout was (mostly) arbitrarily chosen to match the timeout we use for most of the other places where we waitForAndDispatchImmediately. The behavior I implemented here will attempt to wait for IPC responses from the UI process until the named pasteboard is no longer being written to, but if the IPC message times out or fails to arrive for any reason, instead of looping infinitely, the break statement will bail us out of the while loop and the named pasteboard will be removed from the pending pasteboard writes map below so that we get stuck in a bad state. Unfortunately, I’m not quite sure how I would test this scenario... >> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:1099 >> + m_pendingPasteboardWriteCounts.remove(pasteboardName); > > This seems like a no op. The count should already have been removed by didWriteToPasteboardAsynchronously. This is only a no-op in the case where the IPC responses were received; it is here to handle the scenario where we failed when waiting above. I’ll move it into the break statement instead, and make it use removeAll().
Created attachment 406262 [details] Patch
Committed r265410: <https://trac.webkit.org/changeset/265410> All reviewed patches have been landed. Closing bug and clearing flags on attachment 406262 [details].