Bug 235733 - REGRESSION(288052): editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html makes subsequent test fail, as DumpRenderTree compares to wrong expected result
Summary: REGRESSION(288052): editing/execCommand/paste-as-quotation-disconnected-parag...
Status: RESOLVED CONFIGURATION CHANGED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Mac (Apple Silicon) Unspecified
: P1 Critical
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2022-01-27 14:21 PST by Robert Jenner
Modified: 2022-09-05 20:56 PDT (History)
12 users (show)

See Also:


Attachments
Patch -- check for window.testRunner (3.21 KB, patch)
2022-02-02 00:20 PST, Frédéric Wang (:fredw)
no flags Details | Formatted Diff | Diff
Patch -- check for window.testRunner ; better sync postMessage & execCommands (4.54 KB, patch)
2022-02-02 02:30 PST, Frédéric Wang (:fredw)
no flags Details | Formatted Diff | Diff
Patch (7.11 KB, patch)
2022-02-02 05:01 PST, Frédéric Wang (:fredw)
no flags Details | Formatted Diff | Diff
another attempt... (4.63 KB, patch)
2022-02-04 06:52 PST, Frédéric Wang (:fredw)
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (7.20 KB, patch)
2022-02-04 11:30 PST, Frédéric Wang (:fredw)
no flags Details | Formatted Diff | Diff
Patch -- more attempts (4.21 KB, patch)
2022-02-07 06:22 PST, Frédéric Wang (:fredw)
no flags Details | Formatted Diff | Diff
Patch -- more attempts (5.33 KB, patch)
2022-02-07 06:26 PST, Frédéric Wang (:fredw)
ews-feeder: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Robert Jenner 2022-01-27 14:21:38 PST
Apple Silicon Big Sur and Monterey Release bots running wk1 tests are flaky exiting testing after 500 layout-test failures. 


It doesn't happen every time, but it is happening fairly often. You can see that with the following links showing the history of test runs on those queues:

BigSur:
https://build.webkit.org/#/builders/Apple-BigSur-Release-AppleSilicon-WK1-Tests?numbuilds=50

Monterey:
https://build.webkit.org/#/builders/Apple-Monterey-Release-AppleSilicon-WK1-Tests?numbuilds=50

It is not occurring on Debug wk1, or anywhere on wk2.
Comment 1 Radar WebKit Bug Importer 2022-01-27 16:33:24 PST
<rdar://problem/88159747>
Comment 2 Robert Jenner 2022-01-27 16:43:04 PST
The first revision ran where these failures were first reported appears to be at r288053:

https://trac.webkit.org/changeset/288053/webkit

I am working on a repo case on my Apple Silicon Mac, but so far have yet to come up with anything.
Comment 3 Robert Jenner 2022-01-27 18:11:55 PST
Here's a page that references the 500 layout-tests that are failing:

https://build.webkit.org/results/Apple-Monterey-Release-AppleSilicon-WK1-Tests/r288696%20(1402)/results.html

They all appear to be either editing/execCommand tests, or fast tests.
Comment 4 Truitt Savell 2022-01-28 10:15:17 PST
To me, it looks like this is an issue with the webkit test runner. many of these diffs have - lines that do not match at all to the + lines in theme and expectation. It appears to me that the runner is comparing a tests results to another tests expected output.
Comment 5 Robert Jenner 2022-01-28 10:46:27 PST
I am able to reproduce the failures of the editing/execCommand, but not any of the fast tests. Still, I agree with Truitt's comment above. It looks like there's an issue with WebKit test runner. I was able to reproduce the failures by just running tests only with the editing/execCommand directory with the following command at Monterey Release ToT:

run-webkit-tests --no-build --no-show-results /OpenSource/LayoutTests/editing --no-new-test-results --clobber-old-results --release --debug-rwt-logging --no-retry-failures -1
Comment 6 Robert Jenner 2022-01-28 11:58:36 PST
(In reply to Truitt Savell from comment #4)
> To me, it looks like this is an issue with the webkit test runner. many of
> these diffs have - lines that do not match at all to the + lines in theme
> and expectation. It appears to me that the runner is comparing a tests
> results to another tests expected output.

To Truitt's point, we are seeing a lot of the failures where it looks like there's something going on with the referencing of the test results. Here are a few example diffs:

editing/execCommand/primitive-value-cleanup-minimal.html:
+++ /Volumes/Data/worker/monterey-release-applesilicon-tests-wk1/build/layout-test-results/editing/execCommand/primitive-value-cleanup-minimal-actual.txt
@@ -1,2 +1 @@
-This test checks passes if there if no crash observed.
 

editing/execCommand/remove-formatting-from-iframe-in-button.html:
-PASSED - this test didn't fire ASSERT, bug 121791.
+EDITING DELEGATE: shouldBeginEditingInDOMRange:range from 0 of DIV > BODY > HTML > #document to 2 of DIV > BODY > HTML > #document
+EDITING DELEGATE: webViewDidBeginEditing:WebViewDidBeginEditingNotification
+EDITING DELEGATE: webViewDidChangeSelection:WebViewDidChangeSelectionNotification
+EDITING DELEGATE: webViewDidChangeSelection:WebViewDidChangeSelectionNotification
+EDITING DELEGATE: shouldChangeSelectedDOMRange:range from 4 of #text > DIV > BODY > HTML > #document to 4 of #text > DIV > BODY > HTML > #document toDOMRange:range from 0 of #text > DIV > BODY > HTML > #document to 4 of #text > DIV > BODY > HTML > #document affinity:NSSelectionAffinityDownstream stillSelecting:FALSE
+EDITING DELEGATE: webViewDidChangeSelection:WebViewDidChangeSelectionNotification
+EDITING DELEGATE: webViewDidChange:WebViewDidChangeNotification
+EDITING DELEGATE: webViewDidEndEditing:WebViewDidEndEditingNotification
+This tests that RemoveFormat not only removes style from the selected part of the DOM, but that it also applies the document default style to the selection if that's necessary in order to leave the selected text unstyled.
+| "<#selection-anchor>This<#selection-focus>"
+| " text should look the same as the text above."

fast/backgrounds/background-clip-text-on-body.html:
-Text should remain legible when window is resized. No background artifacts should appear.
+layer at (0,0) size 800x600
+  RenderView at (0,0) size 800x600
+layer at (0,0) size 800x334
+  RenderBlock {HTML} at (0,0) size 800x334
+    RenderBody {BODY} at (8,10) size 784x314
+layer at (18,10) size 502x152 clip at (19,11) size 500x150 scrollHeight 500
+  RenderBlock {DIV} at (10,0) size 502x152 [border: (1px solid #C0C0C0)]
+    RenderBlock {DIV} at (1,1) size 500x500
+layer at (18,172) size 502x152 clip at (19,173) size 500x150 scrollHeight 500
+  RenderBlock {DIV} at (10,162) size 502x152 [border: (1px solid #C0C0C0)]
+    RenderBlock {DIV} at (1,1) size 500x500



All of these test failures appear to have the correct result, but are missing just one single line that usually is just stating something to the effect of "This test passes if it doesn't crash or assert."
Comment 7 Robert Jenner 2022-01-28 18:16:25 PST
I found the test that was causing all the others to fail, and have skipped it here:

https://trac.webkit.org/changeset/288775/webkit
Comment 8 Alexey Proskuryakov 2022-01-29 17:18:05 PST
The test looks super suspicious, as it performs a lot of operations after onload, and doesn't use waitUntilDone.

We probably need to the test AND fix the harness to be resilient to issues of this kind.
Comment 9 Robert Jenner 2022-01-31 17:30:59 PST
My finding above was actually incorrect. After disabling the test and letting it run through the weekend, we still observed the 500+ layout-test failures occurring. After digging deeper, and testing. I believe it is actually this change that is the cause of this issue:

https://trac.webkit.org/changeset/288052/webkit

The timing certainly lines up to when it started happening on the bots. The test introduced there is also in the test list attached to this bug. Running through my test list and removing the “editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html” did stop the failures that I was able to reproduce. 

So that appears to be a more accurate cause than the first test I was blaming.I have updated my expectation to [ skip ] here for both tests:

https://trac.webkit.org/changeset/288861/webkit
Comment 10 Robert Jenner 2022-02-01 11:09:48 PST
After disabling that test, and observing the bots behavior from overnight it does appear that the 500+ layout-test failures have stopped occurring. So it does appear to be that r288052 did cause this issue.
Comment 11 Robert Jenner 2022-02-01 13:18:10 PST
I have assigned this bug to Frédéric Wang, who introduced r288052 that caused this breakage.
Comment 12 Frédéric Wang (:fredw) 2022-02-02 00:05:51 PST
Thanks Robert. I had noticed this issue on windows bot and skipped the test in my original patch.

I had a hard time writing this crash test. It relies on these parallel PasteAsQuotation stuff in

LayoutTests/editing/execCommand/resources/paste-as-quotation-disconnected-paragraph-ancestor-crash-iframe.html

but these two are producing flaky text output. That's why I wrapped the test in a iframe.

Contrary to comment 8 for editing/execCommand/insert-newline-in-quoted-content-crash.html I've tried hard to use waitUntilDone/notifyDone. But tweaking a bit the test makes the crash unreproducible, so it's not easy.

I'll give a try again...
Comment 13 Frédéric Wang (:fredw) 2022-02-02 00:20:15 PST
Created attachment 450624 [details]
Patch -- check for window.testRunner
Comment 14 Frédéric Wang (:fredw) 2022-02-02 02:30:04 PST
Created attachment 450629 [details]
Patch -- check for window.testRunner ; better sync postMessage & execCommands
Comment 15 Frédéric Wang (:fredw) 2022-02-02 05:01:06 PST
Created attachment 450635 [details]
Patch
Comment 16 Frédéric Wang (:fredw) 2022-02-02 05:05:53 PST
(In reply to Frédéric Wang (:fredw) from comment #12)
> I'll give a try again...

I'm still failing to prevent this test expectation mismatch with DumpRenderTree.

Since the original crash requires window.caches to be reproducible, which is not available on DumpRenderTree, I guess it's not a big deal to keep skipping the test anyway.
Comment 17 Robert Jenner 2022-02-02 10:37:52 PST
(In reply to Frédéric Wang (:fredw) from comment #16)
> (In reply to Frédéric Wang (:fredw) from comment #12)
> > I'll give a try again...
> 
> I'm still failing to prevent this test expectation mismatch with
> DumpRenderTree.
> 
> Since the original crash requires window.caches to be reproducible, which is
> not available on DumpRenderTree, I guess it's not a big deal to keep
> skipping the test anyway.

Skipping has stopped the issue, so yeah we can keep it skipped until a fix can be made. Thanks for working on this!
Comment 18 Alexey Proskuryakov 2022-02-02 17:20:05 PST
Comment on attachment 450635 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=450635&action=review

> LayoutTests/ChangeLog:3
> +        REGRESSION(288052): editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html

Ideally, it would be a change to DumpRenderTree so that a test can never cause a catastrophic consequence like this. Is that infeasible for some reason?
Comment 19 Frédéric Wang (:fredw) 2022-02-04 06:52:44 PST
Created attachment 450891 [details]
another attempt...
Comment 20 Frédéric Wang (:fredw) 2022-02-04 11:30:37 PST
Created attachment 450923 [details]
Patch
Comment 21 Frédéric Wang (:fredw) 2022-02-04 11:31:44 PST
Comment on attachment 450635 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=450635&action=review

>> LayoutTests/ChangeLog:3
>> +        REGRESSION(288052): editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html
> 
> Ideally, it would be a change to DumpRenderTree so that a test can never cause a catastrophic consequence like this. Is that infeasible for some reason?

I'm not really familiar with DumpRenderTree so someone else would be in a better position to reply. AFAIK, with my latest changes the test cannot perform any further operations after notifyDone() is called so I have no idea why DumpRenderTree is shifting all the results that way...
Comment 22 Darin Adler 2022-02-05 21:57:56 PST
Comment on attachment 450923 [details]
Patch

I’m not sure how comfortable I am with this workaround. Here we have a test that does not work in legacy WebKit. That would normally be dealt with by adding a skip expectation in a legacy-WebKit-specific test expectation file, grouping it with all other tests that don’t work in legacy WebKit, particularly the ones that have bad enough problems that they screw up subsequent tests. But this change instead makes the test silently skip itself, but write output as if it passed. That seems like a hacky pattern. But I suppose we can do it if it’s a lot easier to do correctly than something more straightforward.

I think I would feel a little better if the test wrote out output indicating it had failed, but that would make this patch bigger because we’d have to check in the expected failures.
Comment 23 Alexey Proskuryakov 2022-02-05 22:58:26 PST
> AFAIK, with my latest changes the test cannot
> perform any further operations after notifyDone() is called so I have no
> idea why DumpRenderTree is shifting all the results that way...

I think that the next step would be to run it in DumpRenderTree without run-webkit-tests, to see what it actually prints out. It's not something that anyone is deeply familiar with at this point, this code hasn't been modified recently AFAIK.
Comment 24 Alexey Proskuryakov 2022-02-05 23:51:10 PST
When running a test 10 times in a row, one would expect Content-Type and EOF pairs to be dumped 10 times. But I got 12 on first attempt, and 15 on the second attempt.

It's probably dispatching the load event multiple times, and calling notifyDone multiple times - wouldn't be too surprising, given all that the subframe does. Seems like a condition that must be possible to detect in the harness.

DYLD_LIBRARY_PATH=$PWD DYLD_FRAMEWORK_PATH=$PWD ./DumpRenderTree  ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html 2>/dev/null
CONSOLE MESSAGE: This test PASS if it does not crash.
Content-Type: text/plain
DumpMalloc: 133988352

#EOF
#EOF
CONSOLE MESSAGE: This test PASS if it does not crash.
Content-Type: text/plain
DumpMalloc: 137117696

#EOF
#EOF
Content-Type: text/plain
DumpMalloc: 137166848

#EOF
#EOF
CONSOLE MESSAGE: This test PASS if it does not crash.
Content-Type: text/plain
DumpMalloc: 138838016

#EOF
#EOF
CONSOLE MESSAGE: This test PASS if it does not crash.
Content-Type: text/plain
DumpMalloc: 140525568

#EOF
#EOF
CONSOLE MESSAGE: This test PASS if it does not crash.
Content-Type: text/plain
DumpMalloc: 142180352

#EOF
#EOF
CONSOLE MESSAGE: This test PASS if it does not crash.
Content-Type: text/plain
DumpMalloc: 143523840

#EOF
#EOF
CONSOLE MESSAGE: This test PASS if it does not crash.
Content-Type: text/plain
DumpMalloc: 143572992

#EOF
#EOF
CONSOLE MESSAGE: This test PASS if it does not crash.
Content-Type: text/plain
DumpMalloc: 143572992

#EOF
#EOF
Content-Type: text/plain
DumpMalloc: 143572992

#EOF
#EOF
CONSOLE MESSAGE: This test PASS if it does not crash.
Content-Type: text/plain
DumpMalloc: 143671296

#EOF
#EOF
CONSOLE MESSAGE: This test PASS if it does not crash.
Content-Type: text/plain
DumpMalloc: 143818752

#EOF
#EOF
Comment 25 Frédéric Wang (:fredw) 2022-02-07 06:22:58 PST
Created attachment 451086 [details]
Patch -- more attempts
Comment 26 Frédéric Wang (:fredw) 2022-02-07 06:25:03 PST
(In reply to Alexey Proskuryakov from comment #24)
> When running a test 10 times in a row, one would expect Content-Type and EOF
> pairs to be dumped 10 times. But I got 12 on first attempt, and 15 on the
> second attempt.

I cannot reproduce this with DumpRenderTree. I always get 10 Content-Type and 20 EOF (even testing without the further tweaks from my patch).

> It's probably dispatching the load event multiple times, and calling
> notifyDone multiple times - wouldn't be too surprising, given all that the
> subframe does. Seems like a condition that must be possible to detect in the
> harness.

I don't quite follow the connection between the things that the subframe does and that the possibility that the load event is dispatched several times. Anyway, it's easy to make the load listeners {once:true} and I verified this still makes the original crash reproducible.

My other guess was that <script> element were copied an executed. So I'm also trying to avoid that by resetting their text before copy & paste operations.

Finally, I was thinking that maybe async/await has issues in WK1. However,  using equivalent promises does not seem to allow reproducing the original crash.

Anyway, I uploaded a new patch, let's see if that makes EWS happy.
Comment 27 Frédéric Wang (:fredw) 2022-02-07 06:26:07 PST
Created attachment 451088 [details]
Patch -- more attempts
Comment 28 Alexey Proskuryakov 2022-02-07 08:59:10 PST
I don't understand what makes tweaking the test valuable. If it's not useful on WK1, it can be just skipped, and there is no reason to modify it. Am I missing something here?

What is important about this issue is that it can be prevented for good by fixing DumpRenderTree, so that catastrophic failures of this kind don't happen in the future.
Comment 29 Frédéric Wang (:fredw) 2022-02-09 02:12:15 PST
I think people may have different perspectives here. I'm the one who initially wrote this test for a security bug. It has been suggested that the test was maybe not properly written, so I thought it would be important to fix any of these mistakes (or at least try to), and not just waiting that DumpRenderTree becomes more resilient. After all, this could well hide flakiness on other ports too... 

Anyway, after all the tweaks to make the test more stable in attachment 451088 [details], I still see the same failure on wk1 bot, so from now on I'll consider the issue is not my fault and unassign myself from this bug. Since the original bug is not reproducible on wk1, it's not a big deal to keep it skipped. If anyone is interested in fixing the issue related to the test infrastructure, then they should feel free to work on this...
Comment 30 Brent Fulgham 2022-06-23 15:01:30 PDT
Robert Jenner confirmed this is no longer an issue as of 2/25/2022.
Comment 31 Alexey Proskuryakov 2022-09-05 15:01:39 PDT
Looks like it's happening again, see bug 244395.

> I think people may have different perspectives here.

Some of the perspectives would be clearly counter-productive. Destabilizing tests for everyone because a new test ran into a pre-existing problem is not OK.
Comment 32 Frédéric Wang (:fredw) 2022-09-05 20:56:23 PDT
(In reply to Alexey Proskuryakov from comment #31)
> Looks like it's happening again, see bug 244395.
> 
> > I think people may have different perspectives here.
> 
> Some of the perspectives would be clearly counter-productive. Destabilizing
> tests for everyone because a new test ran into a pre-existing problem is not
> OK.

I don't think I said the contrary? I was replying to your question about why it's worth trying to tweak the test to make it work on wk1. But AFAIK, I didn't land this patch and said it was ok for me to keep it skipped on wk1, so not sure who re-introduced the problem...