RESOLVED CONFIGURATION CHANGED 224523
[WinCairo][curl] flaky http test failures due to the connection limit on 8 cores CPU
https://bugs.webkit.org/show_bug.cgi?id=224523
Fujii Hironori
Reported 2021-04-13 19:27:44 PDT
[WinCairo][curl] flaky http test failures under heavy CPU load

HTTP tests are randomly failing for WinCairo WebKit1 and WebKit2. This issue can be reproduced with the following command.

> python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-new-test-results --no-retry-failures http/tests/css/border-image-loading.html --iterations=10000 --exit-after-n-failures=20 -f --no-show-results

[11/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[11/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[12/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[15/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[15/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[27/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[31/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[32/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[32/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[32/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[39/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[44/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[52/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[57/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[58/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[60/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[61/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[65/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[67/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[69/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
Exiting early after 20 failures. 54 tests run.
Fujii Hironori
Comment 1 2021-04-14 00:42:58 PDT
Running a single DRT/WTR is stable, with no flaky failures.

> python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-retry-failures --iterations=10000 --exit-after-n-failures=20 --no-show-results http/tests/css/border-image-loading.html

However, running multiple curl commands in the background to stress the httpd makes the WTR fail.

> seq 10000 | xargs -n 1 -P 16 curl -s http://127.0.0.1:8000/css/border-image-loading.html -o

In this case, all generated curl output files are exactly the same. I think this is not an Apache bug.
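The "all output files are exactly the same" check can also be done without writing files. The following is a minimal Python sketch, not part of the original report: it fetches one URL many times from a thread pool and collects the distinct SHA-256 digests of the response bodies. The function names, the request count and the worker count are assumptions chosen to mirror the `xargs -P 16` invocation.

```python
import hashlib
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Opener with an empty proxy map, so loopback requests never go through
# a proxy configured in the environment.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_digest(url, timeout=10.0):
    """Fetch url once and return the SHA-256 hex digest of the body."""
    with _OPENER.open(url, timeout=timeout) as response:
        return hashlib.sha256(response.read()).hexdigest()


def distinct_digests(url, requests=100, workers=16):
    """Fetch url `requests` times on `workers` threads.

    Returns the set of distinct body digests. A connection failure is
    not swallowed: it propagates as an exception (urllib.error.URLError).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return set(pool.map(lambda _: fetch_digest(url), range(requests)))
```

Calling `distinct_digests("http://127.0.0.1:8000/css/border-image-loading.html")` against a running httpd should return a set with a single element if every response body is identical, which is the observation that points away from Apache.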
Fujii Hironori
Comment 2 2021-04-14 14:27:39 PDT
This issue can be reproduced with MiniBrowser.

1. Start MiniBrowser in a debugger
   devenv -debugexe .\WebKitBuild\Debug\bin64\MiniBrowser.exe --wk1 http://127.0.0.1:8000/css/border-image-loading.html
2. Put a breakpoint in the error case of CurlRequest::didCompleteTransfer
   https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/network/curl/CurlRequest.cpp#L475
3. Invoke run-webkit-httpd as admin
4. Invoke multiple curl commands to stress the httpd
   yes http://127.0.0.1:8000/css/border-image-loading.html | xargs -n 1 -P 16 curl -s -O
5. Keep clicking the 'Reload' button until the breakpoint is hit

'result' was CURLE_COULDNT_CONNECT (0x00000007) in that case.
Fujii Hironori
Comment 3 2021-04-14 14:44:44 PDT
Is it this issue?

CURLE_COULDNT_CONNECT during heavy workload: tens of thousands of requests per minute - Stack Overflow
https://stackoverflow.com/q/32212207

However, enabling CURLOPT_FORBID_REUSE did not help.

diff --git a/Source/WebCore/platform/network/curl/CurlContext.cpp b/Source/WebCore/platform/network/curl/CurlContext.cpp
index 2f6556ee05b1..a4df0b19b595 100644
--- a/Source/WebCore/platform/network/curl/CurlContext.cpp
+++ b/Source/WebCore/platform/network/curl/CurlContext.cpp
@@ -399,6 +399,8 @@ void CurlHandle::setUrl(const URL& url)
     // url is in ASCII so latin1() will only convert it to char* without character translation.
     curl_easy_setopt(m_handle, CURLOPT_URL, curlUrl.string().latin1().data());
 
+    curl_easy_setopt(m_handle, CURLOPT_FORBID_REUSE, 1L);
+
     if (url.protocolIs("https"))
         enableSSLForHost(m_url.host().toString());
 }
Fujii Hironori
Comment 4 2021-04-14 17:18:57 PDT
I'm using an Intel Core i9 (8 cores, 16 threads). This issue can be reproduced just by running 16 WTR instances for 16 iterations simultaneously.

> python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-retry-failures --iterations=16 --no-show-results http/tests/css/border-image-loading.html -f

Running 16 WebKitTestRunners in parallel.
[10/16] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[15/16] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[16/16] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
Fujii Hironori
Comment 5 2021-04-14 19:52:47 PDT
Another test case that is flaky under heavy load:

> python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-retry-failures --iterations=10000 --exit-after-n-failures=20 -f http/tests/xmlhttprequest/send-data-view.html
Fujii Hironori
Comment 6 2021-04-14 22:39:16 PDT
Oops. http/tests/css/border-image-loading.html isn't an example of this issue. That test can't work when run in parallel.
Fujii Hironori
Comment 7 2021-04-14 23:09:10 PDT
Other test cases reproducing this issue:

http/tests/xmlhttprequest/supported-xml-content-types.html
http/tests/xmlhttprequest/upload-onloadend-event-after-sync-requests.html
Fujii Hironori
Comment 8 2021-04-15 17:47:57 PDT
This issue can be reproduced with Windows curl.exe, curl on WSL1 and wget on WSL1.

1. Run 16 DRT instances in the background
   python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-retry-failures --iterations=30000 -1 -f http/tests/xmlhttprequest/send-data-view.html
2. Invoke Windows curl.exe repeatedly
   1..1000 |% { curl.exe -O http://127.0.0.1:8000/xmlhttprequest/send-data-view.html }

Connections intermittently failed.

>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to 127.0.0.1 port 8000: Address already in use
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to 127.0.0.1 port 8000: Address already in use
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100   630  100   630    0     0    630      0  0:00:01 --:--:--  0:00:01 10161
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to 127.0.0.1 port 8000: Address already in use
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100   630  100   630    0     0    630      0  0:00:01 --:--:--  0:00:01  6702

I observed the intermittent connection failures with curl and wget on WSL1.

repeat 1000 curl http://127.0.0.1:8000/xmlhttprequest/send-data-view.html
> curl: (7) Failed to connect to 127.0.0.1 port 8000: Connection refused

repeat 1000 wget http://127.0.0.1:8000/xmlhttprequest/send-data-view.html
> Connecting to 127.0.0.1:8000... failed: Address already in use.

WinCairo MiniBrowser also reproduced the connection failures by repeatedly reloading http://127.0.0.1:8000/xmlhttprequest/send-data-view.html
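The repeated-connect loop above can be sketched in Python to count how often each failure occurs instead of reading the curl output by eye. This is an illustrative sketch only: `probe` is a hypothetical helper, it opens a bare TCP connection rather than issuing an HTTP request, and the symbolic error names it reports depend on the platform.

```python
import errno
import socket
from collections import Counter


def probe(host, port, attempts=1000, timeout=5.0):
    """Open and close `attempts` TCP connections one after another.

    Returns a Counter keyed by "ok" for successful connects and by the
    symbolic errno name (or the exception class name when no errno is
    set, e.g. on a timeout) for failures.
    """
    outcomes = Counter()
    for _ in range(attempts):
        try:
            # Each connect takes a fresh local port from the dynamic range.
            with socket.create_connection((host, port), timeout=timeout):
                outcomes["ok"] += 1
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, type(exc).__name__)
            outcomes[name] += 1
    return outcomes
```

Running `probe("127.0.0.1", 8000)` while the DRT instances are active should show a mix of "ok" and address-in-use style errors if the machine is hitting the same limit; on an idle machine it would be expected to report only "ok".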
Fujii Hironori
Comment 9 2021-04-15 20:54:54 PDT
Increasing the user (dynamic) port range solves this issue.

> netsh int ipv4 set dynamicport tcp start=1025 num=64511

Settings that can be Modified to Improve Network Performance - BizTalk Server | Microsoft Docs
https://docs.microsoft.com/en-us/biztalk/technical-guides/settings-that-can-be-modified-to-improve-network-performance
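To see why the wider range helps, here is a rough back-of-the-envelope sketch in Python. It is not from the report. It assumes the commonly documented Windows default dynamic range (start=49152, num=16384), which this thread does not state, and it uses a hypothetical TIME_WAIT duration as a parameter; the real value on the test machine was not measured. The model is deliberately crude: each short-lived connection to 127.0.0.1:8000 is treated as holding one local port for the whole TIME_WAIT period.

```python
def port_range(start, num):
    """Return (first_port, last_port) for a netsh-style start/num pair."""
    last = start + num - 1
    if not (1 <= start <= last <= 65535):
        raise ValueError("range must lie within 1..65535")
    return start, last


def sustainable_rate(num_ports, time_wait_seconds):
    """Upper bound on new connections per second to a single destination.

    Crude model: every closed connection keeps its local port
    unavailable for `time_wait_seconds`.
    """
    if time_wait_seconds <= 0:
        raise ValueError("time_wait_seconds must be positive")
    return num_ports / time_wait_seconds


# Assumed default range (not stated in this thread) vs. the widened range.
ASSUMED_DEFAULT = (49152, 16384)
WIDENED = (1025, 64511)
```

`port_range(1025, 64511)` gives (1025, 65535), so the command above opens nearly the whole non-reserved port space, about four times as many ports as the assumed default. With a purely illustrative TIME_WAIT of 120 seconds, `sustainable_rate` gives about 136 connections per second for 16384 ports versus about 537 for 64511, which is in line with 16 parallel test runners exhausting the smaller range.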