Bug 224523
Summary: | [WinCairo][curl] flaky http test failures due to the connection limit on 8 cores CPU | ||
---|---|---|---|
Product: | WebKit | Reporter: | Fujii Hironori <Hironori.Fujii> |
Component: | Platform | Assignee: | Nobody <webkit-unassigned> |
Status: | RESOLVED CONFIGURATION CHANGED | ||
Severity: | Normal | ||
Priority: | P2 | ||
Version: | WebKit Nightly Build | ||
Hardware: | Unspecified | ||
OS: | Unspecified |
Fujii Hironori
[WinCairo][curl] flaky http test failures under heavy CPU load
http tests are randomly failing for WinCairo WebKit1 and WebKit2.
This issue can be reproduced by the following command.
> python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-new-test-results --no-retry-failures http/tests/css/border-image-loading.html --iterations=10000 --exit-after-n-failures=20 -f --no-show-results
[11/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[11/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[12/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[15/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[15/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[27/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[31/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[32/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[32/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[32/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[39/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[44/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[52/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[57/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[58/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[60/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[61/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[65/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[67/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[69/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
Exiting early after 20 failures. 54 tests run.
Fujii Hironori
Running a single DRT/WTR instance is stable; there are no flaky failures.
> python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-retry-failures --iterations=10000 --exit-after-n-failures=20 --no-show-results http/tests/css/border-image-loading.html
However, running multiple curl commands in the background to stress the httpd makes the WTR fail.
> seq 10000 | xargs -n 1 -P 16 curl -s http://127.0.0.1:8000/css/border-image-loading.html -o
In this case, all generated curl output files are exactly the same, so I don't think this is an Apache bug.
Fujii Hironori
This issue can be reproduced with MiniBrowser.
1. Start MiniBrowser in a debugger
devenv -debugexe .\WebKitBuild\Debug\bin64\MiniBrowser.exe --wk1 http://127.0.0.1:8000/css/border-image-loading.html
2. Put a breakpoint in the error case of CurlRequest::didCompleteTransfer
https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/network/curl/CurlRequest.cpp#L475
3. Invoke run-webkit-httpd as admin
4. Invoke multiple curl commands to stress the httpd
yes http://127.0.0.1:8000/css/border-image-loading.html | xargs -n 1 -P 16 curl -s -O
5. Click the 'Reload' button repeatedly until the breakpoint is hit
'result' was CURLE_COULDNT_CONNECT (0x00000007) in that case.
Fujii Hironori
Could it be this issue?
CURLE_COULDNT_CONNECT during heavy workload: tens of thousands of requests per minute - Stack Overflow
https://stackoverflow.com/q/32212207
However, enabling CURLOPT_FORBID_REUSE didn't help.
diff --git a/Source/WebCore/platform/network/curl/CurlContext.cpp b/Source/WebCore/platform/network/curl/CurlContext.cpp
index 2f6556ee05b1..a4df0b19b595 100644
--- a/Source/WebCore/platform/network/curl/CurlContext.cpp
+++ b/Source/WebCore/platform/network/curl/CurlContext.cpp
@@ -399,6 +399,8 @@ void CurlHandle::setUrl(const URL& url)
// url is in ASCII so latin1() will only convert it to char* without character translation.
curl_easy_setopt(m_handle, CURLOPT_URL, curlUrl.string().latin1().data());
+ curl_easy_setopt(m_handle, CURLOPT_FORBID_REUSE, 1L);
+
if (url.protocolIs("https"))
enableSSLForHost(m_url.host().toString());
}
Fujii Hironori
I'm using Intel Core i9 (8 cores, 16 threads).
This issue can be reproduced just by running 16 WTR instances for 16 iterations simultaneously.
> python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-retry-failures --iterations=16 --no-show-results http/tests/css/border-image-loading.html -f
Running 16 WebKitTestRunners in parallel.
[10/16] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[15/16] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[16/16] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
Fujii Hironori
Another test case that is flaky under heavy load:
python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-retry-failures --iterations=10000 --exit-after-n-failures=20 -f http/tests/xmlhttprequest/send-data-view.html
Fujii Hironori
Oops. http/tests/css/border-image-loading.html isn't an example of this issue. That test doesn't work under parallel load.
Fujii Hironori
Other test cases reproducing this issue:
http/tests/xmlhttprequest/supported-xml-content-types.html
http/tests/xmlhttprequest/upload-onloadend-event-after-sync-requests.html
Fujii Hironori
This issue can be reproduced with Windows curl.exe, curl on WSL1 and wget on WSL1.
1. Run 16 DRT instances in the background
python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-retry-failures --iterations=30000 -1 -f http/tests/xmlhttprequest/send-data-view.html
2. Invoke Windows curl.exe repeatedly
1..1000 |% { curl.exe -O http://127.0.0.1:8000/xmlhttprequest/send-data-view.html }
Connections intermittently failed.
> % Total % Received % Xferd Average Speed Time Time Time Current
> Dload Upload Total Spent Left Speed
> 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to 127.0.0.1 port 8000: Address already in use
> % Total % Received % Xferd Average Speed Time Time Time Current
> Dload Upload Total Spent Left Speed
> 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to 127.0.0.1 port 8000: Address already in use
> % Total % Received % Xferd Average Speed Time Time Time Current
> Dload Upload Total Spent Left Speed
> 100 630 100 630 0 0 630 0 0:00:01 --:--:-- 0:00:01 10161
> % Total % Received % Xferd Average Speed Time Time Time Current
> Dload Upload Total Spent Left Speed
> 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to 127.0.0.1 port 8000: Address already in use
> % Total % Received % Xferd Average Speed Time Time Time Current
> Dload Upload Total Spent Left Speed
> 100 630 100 630 0 0 630 0 0:00:01 --:--:-- 0:00:01 6702
I observed the intermittent connection failures with curl and wget on WSL1.
repeat 1000 curl http://127.0.0.1:8000/xmlhttprequest/send-data-view.html
> curl: (7) Failed to connect to 127.0.0.1 port 8000: Connection refused
repeat 1000 wget http://127.0.0.1:8000/xmlhttprequest/send-data-view.html
> Connecting to 127.0.0.1:8000... failed: Address already in use.
WinCairo MiniBrowser also reproduced the connection failures by repeatedly reloading http://127.0.0.1:8000/xmlhttprequest/send-data-view.html
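The repeated connection attempts above can be approximated without curl or wget. The following is a minimal sketch, not part of the WebKit tooling: `probe` is a hypothetical helper that opens and closes TCP connections one after another and tallies the outcomes by errno name, so that "Address already in use" and "Connection refused" failures show up as separate counts.

```python
import collections
import errno
import socket


def probe(host, port, attempts, timeout=5.0):
    """Open and close `attempts` TCP connections; tally outcomes by errno name.

    Hypothetical helper for illustration only. Each attempt uses a fresh
    socket, so every attempt consumes one local (dynamic) port.
    """
    outcomes = collections.Counter()
    for _ in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                outcomes["ok"] += 1
        except OSError as exc:
            # exc.errno can be None (e.g. on a timeout); fall back to the
            # exception class name in that case.
            name = errno.errorcode.get(exc.errno, type(exc).__name__)
            outcomes[name] += 1
    return outcomes
```

Calling `probe("127.0.0.1", 8000, 1000)` while the 16 DRT instances are running should, if the behavior matches the curl.exe output above, report a mix of `ok` and address-in-use entries; the exact errno names depend on the platform.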
Fujii Hironori
Increasing the user (dynamic) port range solves this issue.
> netsh int ipv4 set dynamicport tcp start=1025 num=64511
Settings that can be Modified to Improve Network Performance - BizTalk Server | Microsoft Docs
https://docs.microsoft.com/en-us/biztalk/technical-guides/settings-that-can-be-modified-to-improve-network-performance
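As a rough illustration of why the wider range helps, the arithmetic can be sketched as below. The default Windows range (start=49152, num=16384) and the 120-second TIME_WAIT interval are assumptions used for the example, not values measured in this bug; `dynamic_port_range` and `sustainable_rate` are hypothetical helper names.

```python
def dynamic_port_range(start, num):
    """Return the inclusive (first, last) ports for a netsh-style start/num pair."""
    last = start + num - 1
    if start < 1 or num < 1 or last > 65535:
        raise ValueError("port range must stay within 1..65535")
    return start, last


def sustainable_rate(num_ports, time_wait_seconds):
    """Rough upper bound on new outgoing connections per second.

    Assumes every closed connection holds its local port for the whole
    TIME_WAIT interval and that nothing else uses the dynamic range.
    """
    return num_ports / time_wait_seconds


# Setting from the netsh command above.
widened = dynamic_port_range(1025, 64511)
# Assumed Windows default, for comparison.
default = dynamic_port_range(49152, 16384)

print("widened range:", widened, "rate:", sustainable_rate(64511, 120.0))
print("default range:", default, "rate:", sustainable_rate(16384, 120.0))
```

Under these assumptions the widened range offers roughly four times as many local ports, which would explain why 16 parallel test runners plus a curl loop stop hitting "Address already in use".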