Bug 224523 - [WinCairo][curl] flaky http test failures due to the connection limit on 8 cores CPU
Status: RESOLVED CONFIGURATION CHANGED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Platform
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
 
Reported: 2021-04-13 19:27 PDT by Fujii Hironori
Modified: 2021-04-15 20:55 PDT

Description Fujii Hironori 2021-04-13 19:27:44 PDT
[WinCairo][curl] flaky http test failures under heavy CPU load

http tests are randomly failing for WinCairo WebKit1 and WebKit2.

This issue can be reproduced with the following command:

> python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release  --no-new-test-results --no-retry-failures http/tests/css/border-image-loading.html --iterations=10000 --exit-after-n-failures=20 -f --no-show-results

[11/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[11/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[12/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[15/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[15/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[27/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[31/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[32/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[32/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[32/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[39/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[44/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[52/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[57/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[58/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[60/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[61/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[65/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[67/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[69/10000] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
Exiting early after 20 failures. 54 tests run.
Comment 1 Fujii Hironori 2021-04-14 00:42:58 PDT
Running a single DRT/WTR instance is stable, with no flaky failures.

> python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-retry-failures --iterations=10000 --exit-after-n-failures=20 --no-show-results http/tests/css/border-image-loading.html

However, running multiple curl commands in the background to stress the httpd makes the WTR fail.

> seq 10000 | xargs -n 1 -P 16 curl -s http://127.0.0.1:8000/css/border-image-loading.html -o

In this case, all of the generated curl output files are identical, so I don't think this is an Apache bug.
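
For reference, a rough Python equivalent of the xargs load above, which also checks the observation that every response body is identical. This is only a sketch: the names `fetch_digest` and `stress` are made up for illustration, and it assumes the page is being served by run-webkit-httpd at the URL used in the command above.

```python
import concurrent.futures
import hashlib
import urllib.request


def fetch_digest(url, timeout=10.0):
    """Fetch url once and return the SHA-256 hex digest of the body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return hashlib.sha256(response.read()).hexdigest()


def stress(url, requests, workers=16):
    """Issue `requests` GETs from `workers` threads; return (digests, errors)."""
    digests, errors = set(), []
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fetch_digest, url) for _ in range(requests)]
        for future in concurrent.futures.as_completed(futures):
            try:
                digests.add(future.result())
            except OSError as exc:  # URLError and socket errors derive from OSError
                errors.append(exc)
    return digests, errors
```

Calling `stress("http://127.0.0.1:8000/css/border-image-loading.html", 10000)` should return a single digest if all responses match. Any connection failures end up in the second list.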
Comment 2 Fujii Hironori 2021-04-14 14:27:39 PDT
This issue can be reproduced with MiniBrowser.

1. Start MiniBrowser in a debugger
   devenv -debugexe .\WebKitBuild\Debug\bin64\MiniBrowser.exe --wk1 http://127.0.0.1:8000/css/border-image-loading.html
2. Put a breakpoint in the error case of CurlRequest::didCompleteTransfer
   https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/network/curl/CurlRequest.cpp#L475
3. Invoke run-webkit-httpd as admin
4. Invoke multiple curl commands to stress the httpd
   yes http://127.0.0.1:8000/css/border-image-loading.html | xargs -n 1 -P 16 curl -s -O
5. Click the 'Reload' button repeatedly until the breakpoint is hit

'result' was CURLE_COULDNT_CONNECT (0x00000007) in that case.
Comment 3 Fujii Hironori 2021-04-14 14:44:44 PDT
Could it be this issue?

CURLE_COULDNT_CONNECT during heavy workload: tens of thousands of requests per minute - Stack Overflow
https://stackoverflow.com/q/32212207


However, enabling CURLOPT_FORBID_REUSE didn't help.

diff --git a/Source/WebCore/platform/network/curl/CurlContext.cpp b/Source/WebCore/platform/network/curl/CurlContext.cpp
index 2f6556ee05b1..a4df0b19b595 100644
--- a/Source/WebCore/platform/network/curl/CurlContext.cpp
+++ b/Source/WebCore/platform/network/curl/CurlContext.cpp
@@ -399,6 +399,8 @@ void CurlHandle::setUrl(const URL& url)
     // url is in ASCII so latin1() will only convert it to char* without character translation.
     curl_easy_setopt(m_handle, CURLOPT_URL, curlUrl.string().latin1().data());
 
+    curl_easy_setopt(m_handle, CURLOPT_FORBID_REUSE, 1L);
+
     if (url.protocolIs("https"))
         enableSSLForHost(m_url.host().toString());
 }
Comment 4 Fujii Hironori 2021-04-14 17:18:57 PDT
I'm using Intel Core i9 (8 cores, 16 threads).
This issue can be reproduced just by running 16 WTR instances simultaneously for 16 iterations.

> python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-retry-failures --iterations=16 --no-show-results http/tests/css/border-image-loading.html -f

Running 16 WebKitTestRunners in parallel.

[10/16] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[15/16] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
[16/16] http/tests/css/border-image-loading.html failed unexpectedly (text diff)
Comment 5 Fujii Hironori 2021-04-14 19:52:47 PDT
Another test case that is flaky under heavy load:

python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-retry-failures --iterations=10000 --exit-after-n-failures=20 -f http/tests/xmlhttprequest/send-data-view.html
Comment 6 Fujii Hironori 2021-04-14 22:39:16 PDT
Oops. http/tests/css/border-image-loading.html isn't an example of this issue. That test can't work when run in parallel.
Comment 7 Fujii Hironori 2021-04-14 23:09:10 PDT
Other test cases reproducing this issue:

http/tests/xmlhttprequest/supported-xml-content-types.html
http/tests/xmlhttprequest/upload-onloadend-event-after-sync-requests.html
Comment 8 Fujii Hironori 2021-04-15 17:47:57 PDT
This issue can be reproduced with Windows curl.exe, curl on WSL1 and wget on WSL1.

1. Run 16 DRT instances in the background
 python.exe ./Tools/Scripts/run-webkit-tests --wincairo --release --no-retry-failures --iterations=30000 -1 -f http/tests/xmlhttprequest/send-data-view.html
2. Invoke Windows curl.exe repeatedly
 1..1000 |% { curl.exe -O http://127.0.0.1:8000/xmlhttprequest/send-data-view.html }

Connections intermittently failed.

>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to 127.0.0.1 port 8000: Address already in use
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to 127.0.0.1 port 8000: Address already in use
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100   630  100   630    0     0    630      0  0:00:01 --:--:--  0:00:01 10161
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to 127.0.0.1 port 8000: Address already in use
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100   630  100   630    0     0    630      0  0:00:01 --:--:--  0:00:01  6702

I also observed intermittent connection failures with curl and wget on WSL1.

repeat 1000 curl http://127.0.0.1:8000/xmlhttprequest/send-data-view.html

> curl: (7) Failed to connect to 127.0.0.1 port 8000: Connection refused

repeat 1000 wget http://127.0.0.1:8000/xmlhttprequest/send-data-view.html

> Connecting to 127.0.0.1:8000... failed: Address already in use.

The connection failures also reproduced in WinCairo MiniBrowser when repeatedly reloading http://127.0.0.1:8000/xmlhttprequest/send-data-view.html
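
The same probing can be done without curl or wget. The following is a minimal sketch that opens and closes TCP connections in a loop and tallies the outcomes by errno name. The function name `probe_connections` is hypothetical, and the exact error names reported will depend on the platform.

```python
import collections
import errno
import socket


def probe_connections(host, port, attempts, timeout=5.0):
    """Open and close `attempts` TCP connections, tallying outcomes by errno name."""
    outcomes = collections.Counter()
    for _ in range(attempts):
        try:
            # A new socket per attempt, so the OS assigns a local
            # (dynamic) port for each connection.
            with socket.create_connection((host, port), timeout=timeout):
                outcomes["ok"] += 1
        except OSError as exc:
            # Fall back to the exception class name when errno is unset
            # (e.g. a timeout) or not in the table.
            name = errno.errorcode.get(exc.errno, type(exc).__name__)
            outcomes[name] += 1
    return outcomes
```

With run-webkit-httpd serving on port 8000 and the DRT instances running in the background, `probe_connections("127.0.0.1", 8000, 1000)` should show a mix of "ok" and connect errors if the failures above reproduce.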
Comment 9 Fujii Hironori 2021-04-15 20:54:54 PDT
Increasing the user (dynamic) port range solves this issue.

> netsh int ipv4 set dynamicport tcp start=1025 num=64511

Settings that can be Modified to Improve Network Performance - BizTalk Server | Microsoft Docs
https://docs.microsoft.com/en-us/biztalk/technical-guides/settings-that-can-be-modified-to-improve-network-performance
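
A back-of-the-envelope sketch of why widening the range helps, under a simple model in which every closed connection keeps its local port unavailable for a fixed TIME_WAIT period. The 120-second value is an assumption rather than something measured in this bug, and the comparison assumes the machine was using the Windows default dynamic range (49152-65535) before the change. The helper names are made up for illustration.

```python
# Rough model: each closed connection keeps its local port unavailable
# for time_wait_seconds, so the pool drains once the connection rate
# exceeds num_ports / time_wait_seconds.

WINDOWS_DEFAULT_NUM_PORTS = 16384  # Windows default dynamic range, 49152-65535
WIDENED_NUM_PORTS = 64511          # num=64511 from the netsh command above
ASSUMED_TIME_WAIT_SECONDS = 120.0  # assumption, not measured in this bug


def sustainable_rate(num_ports, time_wait_seconds):
    """Approximate max new connections per second before ports run out."""
    return num_ports / time_wait_seconds


def last_port(start, num):
    """Last port number of a range given as start/num, the way netsh takes it."""
    return start + num - 1
```

Under these assumptions the default range sustains about 137 new connections per second and the widened range about 538, which would explain why 16 parallel test runners plus a curl loop can exhaust the default pool. `last_port(1025, 64511)` is 65535, so the netsh command extends the range down to port 1025 while keeping the same upper end.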