RESOLVED FIXED Bug 230513
Broken pipes during iOS simulator testing
https://bugs.webkit.org/show_bug.cgi?id=230513
Summary Broken pipes during iOS simulator testing
ayumi_kojima
Reported 2021-09-20 14:46:40 PDT
fast/viewport/scroll-delegates-switch-on-page-with-no-composition-mode-asserts.html Is flaky crashing on iOS. History: https://results.webkit.org/?suite=layout-tests&test=fast%2Fviewport%2Fscroll-delegates-switch-on-page-with-no-composition-mode-asserts.html Crash log: No crash log found for WebKitTestRunnerApp:39389.
Attachments
test lists (90.77 KB, text/plain)
2021-09-21 18:07 PDT, ayumi_kojima
no flags
Revert Patch (7.73 KB, patch)
2021-09-23 13:54 PDT, ayumi_kojima
no flags
Patch (4.31 KB, patch)
2022-02-14 08:21 PST, Jonathan Bedard
no flags
Patch (1.58 KB, patch)
2022-02-14 11:09 PST, Jonathan Bedard
no flags
Patch (1.92 KB, patch)
2022-02-15 16:50 PST, Jonathan Bedard
ews-feeder: commit-queue-
Radar WebKit Bug Importer
Comment 1 2021-09-20 14:46:57 PDT
ayumi_kojima
Comment 2 2021-09-20 14:51:58 PDT
Alexey Proskuryakov
Comment 3 2021-09-21 13:55:49 PDT
13:33:36.641 57433 worker/4 fast/url/user-visible/rf.html passed 13:33:36.760 57433 worker/4 fast/url/user-visible/srb.html passed 13:33:36.760 57433 worker/4 finished test group 13:33:36.887 57433 worker/4 killed pid 60933 13:33:36.888 57433 worker/4 This test marked as a crash because of a broken pipe when writing to stdin of the server process. 13:34:16.905 57433 worker/4 worker/4 fast/viewport/scroll-delegates-switch-on-page-with-no-composition-mode-asserts.html crashed, (no stderr) 13:34:16.906 57433 worker/4 killing driver 13:34:16.918 57433 worker/4 fast/viewport/scroll-delegates-switch-on-page-with-no-composition-mode-asserts.html failed: 13:34:16.918 57433 worker/4 WebKitTestRunnerApp crashed [pid=60933]
ayumi_kojima
Comment 4 2021-09-21 18:06:52 PDT
Though, it might not be able to reproduce the issue locally, tried it to make sure. First, I tried to reproduce it by running all tests with multiple workers (run-webkit-tests --force --exit-after-n-crashes-or-timeouts 1000 --exit-after-n-failures 1000 --no-build --no-show-results --no-new-test-results --clobber-old-results --ios-simulator --debug --child-processes 8), but got (most likely) unrelated error, "OSError: [Errno 9] Bad file descriptor (from worker/6)", and it stopped running tests. Next, tried with test lists (See attached list) with a single worker. Got the same error (OSError: [Errno 9] Bad file descriptor). Next, tried with test lists with multiple workers. Tests ran successfully, but the test in question (fast/viewport/scroll-delegates-switch-on-page-with-no-composition-mode-asserts.html) didn't "crash".
ayumi_kojima
Comment 5 2021-09-21 18:07:18 PDT
Created attachment 438898 [details] test lists
ayumi_kojima
Comment 6 2021-09-22 09:50:56 PDT
Another tests that "crashed" with broken pipes in worker/3,4,5: fast/shadow-dom/DocumentOrShadowRoot-prototype-elementFromPoint.html imported/blink/fast/multicol/basic-rtl.html imported/w3c/i18n/bidi/bidi-isolate-override-004.html imported/w3c/web-platform-tests/css/css-backgrounds/background-size/vector/background-size-vector-001.html imported/w3c/web-platform-tests/css/css-backgrounds/background-size/vector/zero-height-ratio-contain.html imported/w3c/web-platform-tests/css/css-sizing/aspect-ratio/abspos-001.html imported/w3c/web-platform-tests/css/css-sizing/aspect-ratio/grid-aspect-ratio-035.html imported/w3c/web-platform-tests/css/css-transforms/transform-input-012.html imported/w3c/web-platform-tests/css/css-ui/appearance-auto-001.html imported/w3c/web-platform-tests/fetch/api/redirect/redirect-mode.any.worker.html imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/module/currentScript-null.html https://build.webkit.org/#/builders/28/builds/3310
ayumi_kojima
Comment 7 2021-09-22 09:51:19 PDT
Another one: worker/0,1,3 fast/shadow-dom/DocumentOrShadowRoot-prototype-elementFromPoint.html imported/w3c/web-platform-tests/IndexedDB/abort-in-initial-upgradeneeded.html imported/w3c/web-platform-tests/css/css-transforms/preserve3d-and-filter-no-perspective.html imported/w3c/web-platform-tests/css/css-transforms/transform-inherit-001.html https://build.webkit.org/#/builders/28/builds/3295
ayumi_kojima
Comment 8 2021-09-22 11:10:57 PDT
*** Bug 230099 has been marked as a duplicate of this bug. ***
ayumi_kojima
Comment 9 2021-09-22 11:12:27 PDT
*** Bug 230060 has been marked as a duplicate of this bug. ***
ayumi_kojima
Comment 10 2021-09-22 11:14:20 PDT
*** Bug 230418 has been marked as a duplicate of this bug. ***
ayumi_kojima
Comment 11 2021-09-22 11:14:33 PDT
*** Bug 230325 has been marked as a duplicate of this bug. ***
ayumi_kojima
Comment 12 2021-09-22 11:14:43 PDT
Pasting comments from Bug 230060 Alexey Proskuryakov 2021-09-12 13:55:30 PDT Broken pipe is weird, but also not sure why it took 40 seconds between "This test marked as a crash because of a broken pipe" and "crashed". Simon Fraser (smfr) 2021-09-22 10:24:58 PDT Seems like some kinds of process unresponsiveness/lack of output result in us calling something a "crash" when it's not. Would be nice to fix this.
ayumi_kojima
Comment 13 2021-09-23 13:53:26 PDT
I was able to reproduce the bug by the following steps: 1. Apply the attached patch locally that reverts all of the protection placed for the OSError exception. 2. Run all tests using Python2 (python OpenSource/Tools/Scripts/run-webkit-tests --exit-after-n-crashes-or-timeouts 1000 --exit-after-n-failures 1000 --no-build --no-show-results --no-new-test-results --clobber-old-results --ios-simulator --debug --debug-rwt-logging --child-processes 8 ;sudo sysdiagnose -l) 3. Test run stops at around fast/css-intrinsic-dimensions. Observe many timed-out tests. 4. Run tests again using run-webkit-tests <same build> --ios-simulator fast 5. Receives stdout: [1/12583] fast/check-layout-error-no-attributes.html failed unexpectedly (test timed out, no output from test) [2/12583] fast/js-promise-from-detached-iframe.html failed unexpectedly (test timed out, no output from test) [3/12583] fast/animation/animation-display-style-adjustment.html failed unexpectedly (test timed out) OSError raised: [Errno 9] Bad file descriptor ...
ayumi_kojima
Comment 14 2021-09-23 13:54:53 PDT
Created attachment 439088 [details] Revert Patch
Jonathan Bedard
Comment 15 2022-02-10 11:33:32 PST
This is caused by some of our http servers, trying to determine which ones
Jonathan Bedard
Comment 16 2022-02-11 13:15:12 PST
It's the WebSocket server, and we don't even need to use the server to run into the problem.
Jonathan Bedard
Comment 17 2022-02-11 13:25:16 PST
This is basically https://bugs.webkit.org/show_bug.cgi?id=206546. Apparently an explicit `python2` invocation is what breaks things.
Jonathan Bedard
Comment 18 2022-02-11 14:34:10 PST
Was able to get a reproduction of this without any web servers. I think we've found some problems with our websocket server, but it doesn't appear to be the only problem.
Jonathan Bedard
Comment 19 2022-02-14 08:21:27 PST
Jonathan Bedard
Comment 20 2022-02-14 11:09:12 PST
Jonathan Bedard
Comment 21 2022-02-14 11:24:39 PST
(In reply to Jonathan Bedard from comment #20) > Created attachment 451925 [details] > Patch I borrowed one of our bots and was able to reliably reproduce this issue before this change, and unable to do so after. I don't know what I was thinking with the comment I removed, all of my new testing indicates that comment is just wrong.
dewei_zhu
Comment 22 2022-02-15 15:23:21 PST
Comment on attachment 451925 [details] Patch r=me
Alexey Proskuryakov
Comment 23 2022-02-15 16:33:41 PST
Comment on attachment 451925 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=451925&action=review > Tools/Scripts/webkitpy/port/simulator_process.py:76 > # If the file descriptor is bad, we don't have to worry about closing it Don't we still need to close the socket when self._file.close() raises an exception?
Jonathan Bedard
Comment 24 2022-02-15 16:36:10 PST
Comment on attachment 451925 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=451925&action=review >> Tools/Scripts/webkitpy/port/simulator_process.py:76 >> # If the file descriptor is bad, we don't have to worry about closing it > > Don't we still need to close the socket when self._file.close() raises an exception? Although hopefully a rare circumstance, yes, we probably should.
Jonathan Bedard
Comment 25 2022-02-15 16:50:13 PST
Sam Sneddon [:gsnedders]
Comment 26 2022-02-16 07:33:38 PST
Comment on attachment 452111 [details] Patch Can you also fix the comment above (L59, referencing Python 2?). That said, I think this might still be unsafe: I think we need to ensure it's unbuffered otherwise the non-blocking behaviour might end up with nonsense being returned from the buffer? (i.e., buffered=0 to fdopen)
Jonathan Bedard
Comment 27 2022-02-16 08:23:50 PST
(In reply to Sam Sneddon [:gsnedders] from comment #26) > Comment on attachment 452111 [details] > Patch > > Can you also fix the comment above (L59, referencing Python 2?). Will fix it! > > That said, I think this might still be unsafe: I think we need to ensure > it's unbuffered otherwise the non-blocking behaviour might end up with > nonsense being returned from the buffer? (i.e., buffered=0 to fdopen) Third argument to `fdopen` is `buffering`, which is set to 0 (I think for the reason you've outlined, but it's been years since I looked at this code, so I don't totally remember)
EWS
Comment 28 2022-02-16 18:01:10 PST
Committed r289988 (247374@main): <https://commits.webkit.org/247374@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 452111 [details].
Note You need to log in before you can comment on or make changes to this bug.