235733 – REGRESSION(288052): editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html makes subsequent test fail, as DumpRenderTree compares to wrong expected result

RESOLVED CONFIGURATION CHANGED 235733

REGRESSION(288052): editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html makes subsequent test fail, as DumpRenderTree compares to wrong expected result

https://bugs.webkit.org/show_bug.cgi?id=235733

Summary REGRESSION(288052): editing/execCommand/paste-as-quotation-disconnected-parag...

Robert Jenner

Reported 2022-01-27 14:21:38 PST

Apple Silicon Big Sur and Monterey Release bots running wk1 tests are flaky exiting testing after 500 layout-test failures. It doesn't happen every time, but it is happening fairly often. You can see that with the following links showing the history of test runs on those queues: BigSur: https://build.webkit.org/#/builders/Apple-BigSur-Release-AppleSilicon-WK1-Tests?numbuilds=50 Monterey: https://build.webkit.org/#/builders/Apple-Monterey-Release-AppleSilicon-WK1-Tests?numbuilds=50 It is not occurring on Debug wk1, or anywhere on wk2.

Attachments
Patch -- check for window.testRunner (3.21 KB, patch) 2022-02-02 00:20 PST, Frédéric Wang (:fredw)	no flags	Details Formatted Diff Diff
Patch -- check for window.testRunner ; better sync postMessage & execCommands (4.54 KB, patch) 2022-02-02 02:30 PST, Frédéric Wang (:fredw)	no flags	Details Formatted Diff Diff
Patch (7.11 KB, patch) 2022-02-02 05:01 PST, Frédéric Wang (:fredw)	no flags	Details Formatted Diff Diff
another attempt... (4.63 KB, patch) 2022-02-04 06:52 PST, Frédéric Wang (:fredw)	ews-feeder: commit-queue-	Details Formatted Diff Diff
Patch (7.20 KB, patch) 2022-02-04 11:30 PST, Frédéric Wang (:fredw)	no flags	Details Formatted Diff Diff
Patch -- more attempts (4.21 KB, patch) 2022-02-07 06:22 PST, Frédéric Wang (:fredw)	no flags	Details Formatted Diff Diff
Patch -- more attempts (5.33 KB, patch) 2022-02-07 06:26 PST, Frédéric Wang (:fredw)	ews-feeder: commit-queue-	Details Formatted Diff Diff
Show Obsolete (6) View All Add attachment proposed patch, testcase, etc.

Radar WebKit Bug Importer

Comment 1 2022-01-27 16:33:24 PST

<rdar://problem/88159747>

Robert Jenner

Comment 2 2022-01-27 16:43:04 PST

The first revision ran where these failures were first reported appears to be at r288053: https://trac.webkit.org/changeset/288053/webkit I am working on a repo case on my Apple Silicon Mac, but so far have yet to come up with anything.

Robert Jenner

Comment 3 2022-01-27 18:11:55 PST

Here's a page that references the 500 layout-tests that are failing: https://build.webkit.org/results/Apple-Monterey-Release-AppleSilicon-WK1-Tests/r288696%20(1402)/results.html They all appear to be either editing/execCommand tests, or fast tests.

Truitt Savell

Comment 4 2022-01-28 10:15:17 PST

To me, it looks like this is an issue with the webkit test runner. many of these diffs have - lines that do not match at all to the + lines in theme and expectation. It appears to me that the runner is comparing a tests results to another tests expected output.

Robert Jenner

Comment 5 2022-01-28 10:46:27 PST

I am able to reproduce the failures of the editing/execCommand, but not any of the fast tests. Still, I agree with Truitt's comment above. It looks like there's an issue with WebKit test runner. I was able to reproduce the failures by just running tests only with the editing/execCommand directory with the following command at Monterey Release ToT: run-webkit-tests --no-build --no-show-results /OpenSource/LayoutTests/editing --no-new-test-results --clobber-old-results --release --debug-rwt-logging --no-retry-failures -1

Robert Jenner

Comment 6 2022-01-28 11:58:36 PST

(In reply to Truitt Savell from comment #4) > To me, it looks like this is an issue with the webkit test runner. many of > these diffs have - lines that do not match at all to the + lines in theme > and expectation. It appears to me that the runner is comparing a tests > results to another tests expected output. To Truitt's point, we are seeing a lot of the failures where it looks like there's something going on with the referencing of the test results. Here are a few example diffs: editing/execCommand/primitive-value-cleanup-minimal.html: +++ /Volumes/Data/worker/monterey-release-applesilicon-tests-wk1/build/layout-test-results/editing/execCommand/primitive-value-cleanup-minimal-actual.txt @@ -1,2 +1 @@ -This test checks passes if there if no crash observed. editing/execCommand/remove-formatting-from-iframe-in-button.html: -PASSED - this test didn't fire ASSERT, bug 121791. +EDITING DELEGATE: shouldBeginEditingInDOMRange:range from 0 of DIV > BODY > HTML > #document to 2 of DIV > BODY > HTML > #document +EDITING DELEGATE: webViewDidBeginEditing:WebViewDidBeginEditingNotification +EDITING DELEGATE: webViewDidChangeSelection:WebViewDidChangeSelectionNotification +EDITING DELEGATE: webViewDidChangeSelection:WebViewDidChangeSelectionNotification +EDITING DELEGATE: shouldChangeSelectedDOMRange:range from 4 of #text > DIV > BODY > HTML > #document to 4 of #text > DIV > BODY > HTML > #document toDOMRange:range from 0 of #text > DIV > BODY > HTML > #document to 4 of #text > DIV > BODY > HTML > #document affinity:NSSelectionAffinityDownstream stillSelecting:FALSE +EDITING DELEGATE: webViewDidChangeSelection:WebViewDidChangeSelectionNotification +EDITING DELEGATE: webViewDidChange:WebViewDidChangeNotification +EDITING DELEGATE: webViewDidEndEditing:WebViewDidEndEditingNotification +This tests that RemoveFormat not only removes style from the selected part of the DOM, but that it also applies the document default style to the selection if that's necessary in order to leave the selected text unstyled. +| "<#selection-anchor>This<#selection-focus>" +| " text should look the same as the text above." fast/backgrounds/background-clip-text-on-body.html: -Text should remain legible when window is resized. No background artifacts should appear. +layer at (0,0) size 800x600 + RenderView at (0,0) size 800x600 +layer at (0,0) size 800x334 + RenderBlock {HTML} at (0,0) size 800x334 + RenderBody {BODY} at (8,10) size 784x314 +layer at (18,10) size 502x152 clip at (19,11) size 500x150 scrollHeight 500 + RenderBlock {DIV} at (10,0) size 502x152 [border: (1px solid #C0C0C0)] + RenderBlock {DIV} at (1,1) size 500x500 +layer at (18,172) size 502x152 clip at (19,173) size 500x150 scrollHeight 500 + RenderBlock {DIV} at (10,162) size 502x152 [border: (1px solid #C0C0C0)] + RenderBlock {DIV} at (1,1) size 500x500 All of these test failures appear to have the correct result, but are missing just one single line that usually is just stating something to the effect of "This test passes if it doesn't crash or assert."

Robert Jenner

Comment 7 2022-01-28 18:16:25 PST

I found the test that was causing all the others to fail, and have skipped it here: https://trac.webkit.org/changeset/288775/webkit

Alexey Proskuryakov

Comment 8 2022-01-29 17:18:05 PST

The test looks super suspicious, as it performs a lot of operations after onload, and doesn't use waitUntilDone. We probably need to the test AND fix the harness to be resilient to issues of this kind.

Robert Jenner

Comment 9 2022-01-31 17:30:59 PST

My finding above was actually incorrect. After disabling the test and letting it run through the weekend, we still observed the 500+ layout-test failures occurring. After digging deeper, and testing. I believe it is actually this change that is the cause of this issue: https://trac.webkit.org/changeset/288052/webkit The timing certainly lines up to when it started happening on the bots. The test introduced there is also in the test list attached to this bug. Running through my test list and removing the “editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html” did stop the failures that I was able to reproduce. So that appears to be a more accurate cause than the first test I was blaming.I have updated my expectation to [ skip ] here for both tests: https://trac.webkit.org/changeset/288861/webkit

Robert Jenner

Comment 10 2022-02-01 11:09:48 PST

After disabling that test, and observing the bots behavior from overnight it does appear that the 500+ layout-test failures have stopped occurring. So it does appear to be that r288052 did cause this issue.

Robert Jenner

Comment 11 2022-02-01 13:18:10 PST

I have assigned this bug to Frédéric Wang, who introduced r288052 that caused this breakage.

Frédéric Wang (:fredw)

Comment 12 2022-02-02 00:05:51 PST

Thanks Robert. I had noticed this issue on windows bot and skipped the test in my original patch. I had a hard time writing this crash test. It relies on these parallel PasteAsQuotation stuff in LayoutTests/editing/execCommand/resources/paste-as-quotation-disconnected-paragraph-ancestor-crash-iframe.html but these two are producing flaky text output. That's why I wrapped the test in a iframe. Contrary to comment 8 for editing/execCommand/insert-newline-in-quoted-content-crash.html I've tried hard to use waitUntilDone/notifyDone. But tweaking a bit the test makes the crash unreproducible, so it's not easy. I'll give a try again...

Frédéric Wang (:fredw)

Comment 13 2022-02-02 00:20:15 PST

Created attachment 450624 [details] Patch -- check for window.testRunner

Frédéric Wang (:fredw)

Comment 14 2022-02-02 02:30:04 PST

Created attachment 450629 [details] Patch -- check for window.testRunner ; better sync postMessage & execCommands

Frédéric Wang (:fredw)

Comment 15 2022-02-02 05:01:06 PST

Created attachment 450635 [details] Patch

Frédéric Wang (:fredw)

Comment 16 2022-02-02 05:05:53 PST

(In reply to Frédéric Wang (:fredw) from comment #12) > I'll give a try again... I'm still failing to prevent this test expectation mismatch with DumpRenderTree. Since the original crash requires window.caches to be reproducible, which is not available on DumpRenderTree, I guess it's not a big deal to keep skipping the test anyway.

Robert Jenner

Comment 17 2022-02-02 10:37:52 PST

(In reply to Frédéric Wang (:fredw) from comment #16) > (In reply to Frédéric Wang (:fredw) from comment #12) > > I'll give a try again... > > I'm still failing to prevent this test expectation mismatch with > DumpRenderTree. > > Since the original crash requires window.caches to be reproducible, which is > not available on DumpRenderTree, I guess it's not a big deal to keep > skipping the test anyway. Skipping has stopped the issue, so yeah we can keep it skipped until a fix can be made. Thanks for working on this!

Alexey Proskuryakov

Comment 18 2022-02-02 17:20:05 PST

Comment on attachment 450635 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=450635&action=review > LayoutTests/ChangeLog:3 > + REGRESSION(288052): editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html Ideally, it would be a change to DumpRenderTree so that a test can never cause a catastrophic consequence like this. Is that infeasible for some reason?

Frédéric Wang (:fredw)

Comment 19 2022-02-04 06:52:44 PST

Created attachment 450891 [details] another attempt...

Frédéric Wang (:fredw)

Comment 20 2022-02-04 11:30:37 PST

Created attachment 450923 [details] Patch

Frédéric Wang (:fredw)

Comment 21 2022-02-04 11:31:44 PST

Comment on attachment 450635 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=450635&action=review >> LayoutTests/ChangeLog:3 >> + REGRESSION(288052): editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html > > Ideally, it would be a change to DumpRenderTree so that a test can never cause a catastrophic consequence like this. Is that infeasible for some reason? I'm not really familiar with DumpRenderTree so someone else would be in a better position to reply. AFAIK, with my latest changes the test cannot perform any further operations after notifyDone() is called so I have no idea why DumpRenderTree is shifting all the results that way...

Darin Adler

Comment 22 2022-02-05 21:57:56 PST

Comment on attachment 450923 [details] Patch I’m not sure how comfortable I am with this workaround. Here we have a test that does not work in legacy WebKit. That would normally be dealt with by adding a skip expectation in a legacy-WebKit-specific test expectation file, grouping it with all other tests that don’t work in legacy WebKit, particularly the ones that have bad enough problems that they screw up subsequent tests. But this change instead makes the test silently skip itself, but write output as if it passed. That seems like a hacky pattern. But I suppose we can do it if it’s a lot easier to do correctly than something more straightforward. I think I would feel a little better if the test wrote out output indicating it had failed, but that would make this patch bigger because we’d have to check in the expected failures.

Alexey Proskuryakov

Comment 23 2022-02-05 22:58:26 PST

> AFAIK, with my latest changes the test cannot > perform any further operations after notifyDone() is called so I have no > idea why DumpRenderTree is shifting all the results that way... I think that the next step would be to run it in DumpRenderTree without run-webkit-tests, to see what it actually prints out. It's not something that anyone is deeply familiar with at this point, this code hasn't been modified recently AFAIK.

Alexey Proskuryakov

Comment 24 2022-02-05 23:51:10 PST

When running a test 10 times in a row, one would expect Content-Type and EOF pairs to be dumped 10 times. But I got 12 on first attempt, and 15 on the second attempt. It's probably dispatching the load event multiple times, and calling notifyDone multiple times - wouldn't be too surprising, given all that the subframe does. Seems like a condition that must be possible to detect in the harness. DYLD_LIBRARY_PATH=$PWD DYLD_FRAMEWORK_PATH=$PWD ./DumpRenderTree ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html ~/Safari/OpenSource/LayoutTests/editing/execCommand/paste-as-quotation-disconnected-paragraph-ancestor-crash.html 2>/dev/null CONSOLE MESSAGE: This test PASS if it does not crash. Content-Type: text/plain DumpMalloc: 133988352 #EOF #EOF CONSOLE MESSAGE: This test PASS if it does not crash. Content-Type: text/plain DumpMalloc: 137117696 #EOF #EOF Content-Type: text/plain DumpMalloc: 137166848 #EOF #EOF CONSOLE MESSAGE: This test PASS if it does not crash. Content-Type: text/plain DumpMalloc: 138838016 #EOF #EOF CONSOLE MESSAGE: This test PASS if it does not crash. Content-Type: text/plain DumpMalloc: 140525568 #EOF #EOF CONSOLE MESSAGE: This test PASS if it does not crash. Content-Type: text/plain DumpMalloc: 142180352 #EOF #EOF CONSOLE MESSAGE: This test PASS if it does not crash. Content-Type: text/plain DumpMalloc: 143523840 #EOF #EOF CONSOLE MESSAGE: This test PASS if it does not crash. Content-Type: text/plain DumpMalloc: 143572992 #EOF #EOF CONSOLE MESSAGE: This test PASS if it does not crash. Content-Type: text/plain DumpMalloc: 143572992 #EOF #EOF Content-Type: text/plain DumpMalloc: 143572992 #EOF #EOF CONSOLE MESSAGE: This test PASS if it does not crash. Content-Type: text/plain DumpMalloc: 143671296 #EOF #EOF CONSOLE MESSAGE: This test PASS if it does not crash. Content-Type: text/plain DumpMalloc: 143818752 #EOF #EOF

Frédéric Wang (:fredw)

Comment 25 2022-02-07 06:22:58 PST

Created attachment 451086 [details] Patch -- more attempts

Frédéric Wang (:fredw)

Comment 26 2022-02-07 06:25:03 PST

(In reply to Alexey Proskuryakov from comment #24) > When running a test 10 times in a row, one would expect Content-Type and EOF > pairs to be dumped 10 times. But I got 12 on first attempt, and 15 on the > second attempt. I cannot reproduce this with DumpRenderTree. I always get 10 Content-Type and 20 EOF (even testing without the further tweaks from my patch). > It's probably dispatching the load event multiple times, and calling > notifyDone multiple times - wouldn't be too surprising, given all that the > subframe does. Seems like a condition that must be possible to detect in the > harness. I don't quite follow the connection between the things that the subframe does and that the possibility that the load event is dispatched several times. Anyway, it's easy to make the load listeners {once:true} and I verified this still makes the original crash reproducible. My other guess was that <script> element were copied an executed. So I'm also trying to avoid that by resetting their text before copy & paste operations. Finally, I was thinking that maybe async/await has issues in WK1. However, using equivalent promises does not seem to allow reproducing the original crash. Anyway, I uploaded a new patch, let's see if that makes EWS happy.

Frédéric Wang (:fredw)

Comment 27 2022-02-07 06:26:07 PST

Created attachment 451088 [details] Patch -- more attempts

Alexey Proskuryakov

Comment 28 2022-02-07 08:59:10 PST

I don't understand what makes tweaking the test valuable. If it's not useful on WK1, it can be just skipped, and there is no reason to modify it. Am I missing something here? What is important about this issue is that it can be prevented for good by fixing DumpRenderTree, so that catastrophic failures of this kind don't happen in the future.

Frédéric Wang (:fredw)

Comment 29 2022-02-09 02:12:15 PST

I think people may have different perspectives here. I'm the one who initially wrote this test for a security bug. It has been suggested that the test was maybe not properly written, so I thought it would be important to fix any of these mistakes (or at least try to), and not just waiting that DumpRenderTree becomes more resilient. After all, this could well hide flakiness on other ports too... Anyway, after all the tweaks to make the test more stable in attachment 451088 [details], I still see the same failure on wk1 bot, so from now on I'll consider the issue is not my fault and unassign myself from this bug. Since the original bug is not reproducible on wk1, it's not a big deal to keep it skipped. If anyone is interested in fixing the issue related to the test infrastructure, then they should feel free to work on this...

Brent Fulgham

Comment 30 2022-06-23 15:01:30 PDT

Robert Jenner confirmed this is no longer an issue as of 2/25/2022.

Alexey Proskuryakov

Comment 31 2022-09-05 15:01:39 PDT

Looks like it's happening again, see bug 244395. > I think people may have different perspectives here. Some of the perspectives would be clearly counter-productive. Destabilizing tests for everyone because a new test ran into a pre-existing problem is not OK.

Frédéric Wang (:fredw)

Comment 32 2022-09-05 20:56:23 PDT

(In reply to Alexey Proskuryakov from comment #31) > Looks like it's happening again, see bug 244395. > > > I think people may have different perspectives here. > > Some of the perspectives would be clearly counter-productive. Destabilizing > tests for everyone because a new test ran into a pre-existing problem is not > OK. I don't think I said the contrary? I was replying to your question about why it's worth trying to tweak the test to make it work on wk1. But AFAIK, I didn't land this patch and said it was ok for me to keep it skipped on wk1, so not sure who re-introduced the problem...

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution CONFIGURATION CHANGED

Priority P1

Severity Critical

Classification Unclassified

Version WebKit Nightly Build

Hardware Mac (Apple Silicon)

OS Unspecified

Product WebKit

Component New Bugs

Assignee

Nobody

Reported

2022-01-27 14:21 PST

Modified

2024-05-17 14:18 PDT History

CC List

12 users Show

URL

Keywords InRadar

Depends on

Blocks