Bug 215218 - [ iOS wk2 ] editing/pasteboard/paste-without-nesting.html is a flaky failure
Summary: [ iOS wk2 ] editing/pasteboard/paste-without-nesting.html is a flaky failure
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: WebKit Nightly Build
Hardware: All All
: P2 Normal
Assignee: Wenson Hsieh
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-08-06 08:40 PDT by Karl Rackler
Modified: 2020-08-08 20:23 PDT (History)
14 users (show)

See Also:


Attachments
Repro attempt #1 (1.64 MB, patch)
2020-08-07 09:11 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Repro attempt #2 (1.64 MB, patch)
2020-08-07 09:12 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Repro attempt #3 (1.64 MB, patch)
2020-08-07 09:12 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Repro attempt #4 (1.64 MB, patch)
2020-08-07 09:13 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Repro attempt #5 (1.64 MB, patch)
2020-08-07 09:13 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Repro v2 (2.22 MB, patch)
2020-08-07 11:29 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Repro v3 (2.22 MB, patch)
2020-08-07 12:07 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Repro v4 (2.23 MB, patch)
2020-08-07 14:07 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Repro v5 (is this an IPC race?) (2.23 MB, patch)
2020-08-07 14:18 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Repro v6 (4.21 MB, patch)
2020-08-07 22:55 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Repro v7 (4.21 MB, patch)
2020-08-07 23:57 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Test a potential fix (1.75 MB, patch)
2020-08-08 11:40 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Test a potential fix (1.75 MB, patch)
2020-08-08 11:42 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Fixes the bug (12.25 KB, patch)
2020-08-08 14:01 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Remove flaky test expectation too (13.70 KB, patch)
2020-08-08 14:38 PDT, Wenson Hsieh
darin: review+
Details | Formatted Diff | Diff
Patch (13.60 KB, patch)
2020-08-08 18:26 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Karl Rackler 2020-08-06 08:40:24 PDT
Description:
editing/pasteboard/paste-without-nesting.html

It should be <div>Hello</div><div>world</div>. Was <div><div>Hello</div><div>world</div></div>.

The is a flaky failure that I can find the first occurrence 04/08/20 and appears to have been a flaky failure since.

History:
https://results.webkit.org/?suite=layout-tests&test=editing%2Fpasteboard%2Fpaste-without-nesting.html&platform=ios&flavor=wk2&style=release&limit=50000

Diff Log:
--- /Volumes/Data/worker/iOS-13-Simulator-WK2-Tests-EWS/build/layout-test-results/editing/pasteboard/paste-without-nesting-expected.txt
+++ /Volumes/Data/worker/iOS-13-Simulator-WK2-Tests-EWS/build/layout-test-results/editing/pasteboard/paste-without-nesting-actual.txt
@@ -3,7 +3,7 @@
 PASS confirmedMarkup is 'Hello'
 PASS confirmedMarkup is 'Hello<br>world'
 PASS confirmedMarkup is '<div>Hello</div><div>world</div>'
-PASS confirmedMarkup is '<div>Hello</div><div>world</div>'
+FAIL confirmedMarkup should be <div>Hello</div><div>world</div>. Was <div><div>Hello</div><div>world</div></div>.
 PASS confirmedMarkup is '<div><b><i>Hello</i></b></div><div><b><i>world</i></b></div>'
 PASS confirmedMarkup is '<div style="text-align: center;"><div><font color="#ff0000"><b><i>hello</i></b></font></div><div><font color="#ff0000"><b><i>world</i></b></font></div></div>'
 PASS confirmedMarkup is '<p>Line One</p><p>Line Two</p>'
Comment 1 Radar WebKit Bug Importer 2020-08-06 08:41:06 PDT
<rdar://problem/66628493>
Comment 2 Karl Rackler 2020-08-06 08:48:51 PDT
I have marked this tests as failing while this issue is investigated.
https://trac.webkit.org/changeset/265329/webkit
Comment 3 Wenson Hsieh 2020-08-07 09:06:02 PDT
Unable to reproduce this locally :(

(I tried running for 1000 iterations on 4 simulators in parallel)
Comment 4 Wenson Hsieh 2020-08-07 09:11:27 PDT Comment hidden (obsolete)
Comment 5 Wenson Hsieh 2020-08-07 09:12:14 PDT Comment hidden (obsolete)
Comment 6 Wenson Hsieh 2020-08-07 09:12:38 PDT Comment hidden (obsolete)
Comment 7 Wenson Hsieh 2020-08-07 09:13:29 PDT Comment hidden (obsolete)
Comment 8 Wenson Hsieh 2020-08-07 09:13:31 PDT Comment hidden (obsolete)
Comment 9 Wenson Hsieh 2020-08-07 10:12:45 PDT
> Unexpected flakiness: text-only failures (4)
>   editing/pasteboard/paste-without-nesting-229.html [ Failure Pass ]
>   editing/pasteboard/paste-without-nesting-346.html [ Failure Pass ]
>   editing/pasteboard/paste-without-nesting-81.html [ Failure Pass ]
>   media/now-playing-status-without-media.html [ Failure Pass ]
> Unexpected flakiness: image-only failures (1)
>   css3/filters/backdrop/backdrop-filter-with-border-radius-and-reflection-remove.html [ ImageOnlyFailure Pass ]

It looks like EWS is at least able to reproduce this!
Comment 10 Wenson Hsieh 2020-08-07 11:29:27 PDT
Created attachment 406193 [details]
Repro v2
Comment 11 Wenson Hsieh 2020-08-07 11:57:31 PDT
From <https://ews-build.webkit.org/#/builders/24/builds/23457/steps/10/logs/stdio>, sample logging excerpt from failed test run:

11:47:34.525 12247   (CONSOLE) <WK>: ==================
11:47:34.525 12247   (CONSOLE) <WK>: Starting test (4).
11:47:34.525 12247   2020-08-07 11:47:34.156 WebKitTestRunnerApp[12268:1448157] PBItemCollectionServicer connection disconnected.
11:47:34.525 12247   (CONSOLE) <WK>: FAILURE!
11:47:34.525 12247   (CONSOLE) <WK>: - EXP: "<div>Hello</div><div>world</div>"
11:47:34.526 12247   (CONSOLE) <WK>: - GOT: "<div><div>Hello</div><div>world</div></div>"
11:47:34.526 12247   (CONSOLE) <WK>: ==================

…compared to a successful run:

11:50:26.765 12247   (CONSOLE) <WK>: ==================
11:50:26.765 12247   (CONSOLE) <WK>: Starting test (4).
11:50:26.765 12247   2020-08-07 11:50:26.369 WebKitTestRunnerApp[12268:1459245] PBItemCollectionServicer connection disconnected.
11:50:26.765 12247   <WK>: Remove redundant blocks?
11:50:26.765 12247   <WK>: - node      <DIV 0x1248ac1c0 id='' class=''>
11:50:26.765 12247   <WK>: - rootNode  <DIV 0x1248ac0e0 id='container4' class=''>
11:50:26.765 12247   <WK>: *** REMOVING <DIV 0x1248ac1c0 id='' class=''> ***
11:50:26.765 12247   <WK>: *** REMOVING <DIV 0x1248ac150 id='' class=''> ***
11:50:26.765 12247   <WK>: 
11:50:26.765 12247   (CONSOLE) <WK>: PASS!
11:50:26.765 12247   (CONSOLE) <WK>: ==================

It looks like we never even call into DeleteSelectionCommand::removeRedundantBlocks() in the case where this test fails.
Comment 12 Wenson Hsieh 2020-08-07 12:07:38 PDT
Created attachment 406200 [details]
Repro v3
Comment 13 Wenson Hsieh 2020-08-07 14:07:30 PDT
Created attachment 406210 [details]
Repro v4
Comment 14 Wenson Hsieh 2020-08-07 14:18:52 PDT
Created attachment 406212 [details]
Repro v5 (is this an IPC race?)
Comment 15 Wenson Hsieh 2020-08-07 18:22:18 PDT
(In reply to Wenson Hsieh from comment #14)
> Created attachment 406212 [details]
> Repro v5 (is this an IPC race?)

From: <https://ews-build.webkit.org/#/builders/24/builds/23475/steps/10/logs/stdio>...

> 15:06:44.680 3621   (CONSOLE) <WK>: ==================
> 15:06:44.680 3621   (CONSOLE) <WK>: Starting test (4).
> 15:06:44.680 3621   <WK>: Writing web content to platform pasteboard! (web archive data: 1222 bytes)
> 15:06:44.680 3621   2020-08-07 15:06:44.281 WebKitTestRunnerApp[3643:164560] PBItemCollectionServicer connection disconnected.
> 15:06:44.681 3621   <WK>: Reading pasteboard web content.
> 15:06:44.681 3621   <WK>: => No info
> 15:06:44.681 3621   (CONSOLE) <WK>: FAILURE!
> 15:06:44.681 3621   (CONSOLE) <WK>: - EXP: "<div>Hello</div><div>world</div>"
> 15:06:44.681 3621   (CONSOLE) <WK>: - GOT: "<div><div>Hello</div><div>world</div></div>"
> 15:06:44.681 3621   (CONSOLE) <WK>: ==================

What’s probably happening is that the changeCount is bumped up from underneath us, due to the copy command’s IPC message being handled in the UI process after asking for the changeCount (which is a sync IPC message).
Comment 16 Wenson Hsieh 2020-08-07 22:55:34 PDT
Created attachment 406248 [details]
Repro v6
Comment 17 Wenson Hsieh 2020-08-07 23:57:22 PDT
Created attachment 406252 [details]
Repro v7
Comment 18 Wenson Hsieh 2020-08-08 11:40:29 PDT Comment hidden (obsolete)
Comment 19 Wenson Hsieh 2020-08-08 11:42:45 PDT
Created attachment 406257 [details]
Test a potential fix
Comment 20 Wenson Hsieh 2020-08-08 14:01:05 PDT
Created attachment 406259 [details]
Fixes the bug
Comment 21 Wenson Hsieh 2020-08-08 14:38:20 PDT
Created attachment 406260 [details]
Remove flaky test expectation too
Comment 22 Darin Adler 2020-08-08 15:13:59 PDT
Comment on attachment 406260 [details]
Remove flaky test expectation too

View in context: https://bugs.webkit.org/attachment.cgi?id=406260&action=review

> Source/WebKit/UIProcess/Cocoa/WebPasteboardProxyCocoa.mm:457
> +    if (auto* process = webProcessProxyForConnection(connection))

Writing just auto instead of auto* seems better for refactoring if we ever make this return a smart pointer instead of a raw pointer.

> Source/WebKit/UIProcess/Cocoa/WebPasteboardProxyCocoa.mm:458
> +        process->send(Messages::WebProcess::DidWriteToPasteboardAsynchronously(pasteboardName), 0);

What gives us test coverage to make sure we didn’t miss any cases? I assume the flaky test only exercises one of these.

> Source/WebKit/WebProcess/WebCoreSupport/WebPlatformStrategies.cpp:173
> +    WebProcess::singleton().waitForPendingPasteboardWritesToFinish(pasteboardName);

What gives us test coverage to make sure we didn’t miss any cases? I assume the flaky test only exercises one of these.

> Source/WebKit/WebProcess/WebProcess.h:658
> +    HashMap<String, unsigned> m_pendingPasteboardWriteCounts;

Should use HashCountedSet instead. It’s designed for this kind of use.

> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:1095
> +    while (m_pendingPasteboardWriteCounts.get(pasteboardName)) {

This does unnecessary extra work and should just check contains. We never have a hash table entry with a 0 in it.

(Will change if you use HashCountedSet.)

> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:1096
> +        if (!parentProcessConnection()->waitForAndDispatchImmediately<Messages::WebProcess::DidWriteToPasteboardAsynchronously>(0, 1_s, IPC::WaitForOption::InterruptWaitingIfSyncMessageArrives))

How did we select a 1 second timeout? Because of the way this is written it will loop forever anyway, so it’s not clear why it’s OK to time out and what we will do in that case. And this case is untested. Sending extra messages seems a bit dangerous? Not sure.

> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:1099
> +    m_pendingPasteboardWriteCounts.remove(pasteboardName);

This seems like a no op. The count should already have been removed by didWriteToPasteboardAsynchronously.
Comment 23 Wenson Hsieh 2020-08-08 15:49:07 PDT
Comment on attachment 406260 [details]
Remove flaky test expectation too

View in context: https://bugs.webkit.org/attachment.cgi?id=406260&action=review

>> Source/WebKit/UIProcess/Cocoa/WebPasteboardProxyCocoa.mm:457
>> +    if (auto* process = webProcessProxyForConnection(connection))
> 
> Writing just auto instead of auto* seems better for refactoring if we ever make this return a smart pointer instead of a raw pointer.

Sounds good — changed to auto here, and in the 3 other places below.

>> Source/WebKit/UIProcess/Cocoa/WebPasteboardProxyCocoa.mm:458
>> +        process->send(Messages::WebProcess::DidWriteToPasteboardAsynchronously(pasteboardName), 0);
> 
> What gives us test coverage to make sure we didn’t miss any cases? I assume the flaky test only exercises one of these.

This flaky test in particular only exercises `writeWebContentToPasteboard`, but there are many other tests in editing/pasteboard that attempt to copy other types of content (such as links and plain text) and immediately paste, which I suspect are also flaky (just much, much more rarely).

>> Source/WebKit/WebProcess/WebCoreSupport/WebPlatformStrategies.cpp:173
>> +    WebProcess::singleton().waitForPendingPasteboardWritesToFinish(pasteboardName);
> 
> What gives us test coverage to make sure we didn’t miss any cases? I assume the flaky test only exercises one of these.

All native pasteboard access on Cocoa platforms is preceded by grabbing the changeCount when creating the Pasteboard object, so I think this should be sufficient.

>> Source/WebKit/WebProcess/WebProcess.h:658
>> +    HashMap<String, unsigned> m_pendingPasteboardWriteCounts;
> 
> Should use HashCountedSet instead. It’s designed for this kind of use.

Nice! Switched to use HashCountedSet.

>> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:1095
>> +    while (m_pendingPasteboardWriteCounts.get(pasteboardName)) {
> 
> This does unnecessary extra work and should just check contains. We never have a hash table entry with a 0 in it.
> 
> (Will change if you use HashCountedSet.)

Changed to use HashCountedSet::contains()!

>> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:1096
>> +        if (!parentProcessConnection()->waitForAndDispatchImmediately<Messages::WebProcess::DidWriteToPasteboardAsynchronously>(0, 1_s, IPC::WaitForOption::InterruptWaitingIfSyncMessageArrives))
> 
> How did we select a 1 second timeout? Because of the way this is written it will loop forever anyway, so it’s not clear why it’s OK to time out and what we will do in that case. And this case is untested. Sending extra messages seems a bit dangerous? Not sure.

The 1 second timeout was (mostly) arbitrarily chosen to match the timeout we use for most of the other places where we waitForAndDispatchImmediately. The behavior I implemented here will attempt to wait for IPC responses from the UI process until the named pasteboard is no longer being written to, but if the IPC message times out or fails to arrive for any reason, instead of looping infinitely, the break statement will bail us out of the while loop and the named pasteboard will be removed from the pending pasteboard writes map below so that we get stuck in a bad state. Unfortunately, I’m not quite sure how I would test this scenario...

>> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:1099
>> +    m_pendingPasteboardWriteCounts.remove(pasteboardName);
> 
> This seems like a no op. The count should already have been removed by didWriteToPasteboardAsynchronously.

This is only a no-op in the case where the IPC responses were received; it is here to handle the scenario where we failed when waiting above. I’ll move it into the break statement instead, and make it use removeAll().
Comment 24 Wenson Hsieh 2020-08-08 18:26:59 PDT
Created attachment 406262 [details]
Patch
Comment 25 EWS 2020-08-08 20:23:29 PDT
Committed r265410: <https://trac.webkit.org/changeset/265410>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 406262 [details].