Bug 227118

Summary: [Curl] Add curl option CURLOPT_NOSIGNAL to omit numerous sigaction calls
Product: WebKit
Reporter: Souju TANAKA <sojulibra>
Component: WebCore Misc.
Assignee: Nobody <webkit-unassigned>
Severity: Minor
CC: Basuke.Suzuki, chris.reid, don.olmstead, ews-watchlist, galpeter, Hironori.Fujii, stephan.szabo, takashi.komori, webkit-bug-importer
Priority: P2
Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Other   
OS: Other   
Description Flags
Patch none

Description Souju TANAKA 2021-06-17 04:53:02 PDT
When a large number of files are added to the download queue at once, the download speed becomes significantly slower on a platform we are working on. We found that most of the time is spent in the sigaction system call in curl.
- https://github.com/curl/curl/blob/84d2839740ca78041ac7419d9aaeac55c1e1c729/lib/sigpipe.h#L52
- https://github.com/curl/curl/blob/84d2839740ca78041ac7419d9aaeac55c1e1c729/lib/sigpipe.h#L56
- https://github.com/curl/curl/blob/84d2839740ca78041ac7419d9aaeac55c1e1c729/lib/sigpipe.h#L69

The number of sigaction calls grows enormously as we increase the number of file handles registered with curl_multi_add_handle(). Here are the test results showing the total count of sigaction calls for different numbers of file handles. Each file is 64KB.
  * 512 files (32MB in total): 348144 times (of sigaction call)
  * 1024 files (64MB in total): 1344192 times
  * 2048 files (128MB in total): 5300697 times
  * 4096 files (256MB in total): 21088311 times
The number of sigaction calls appears to grow as O(N^2), where N is the number of file handles: each time N doubles, the count roughly quadruples.
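As a quick check of the O(N^2) reading, the sketch below recomputes two quantities from the counts listed above: the ratio between consecutive measurements (which should approach 4 each time N doubles) and calls / N^2 (which should stay roughly constant). The names growth_ratio and calls_per_n_squared are made up for illustration and are not part of curl or WebKit.

```c
#include <stddef.h>

/* Measurements copied from the list above: handle count and sigaction calls. */
static const double kHandles[] = { 512.0, 1024.0, 2048.0, 4096.0 };
static const double kCalls[]   = { 348144.0, 1344192.0, 5300697.0, 21088311.0 };

/*
 * Ratio of sigaction calls between measurement i and i + 1.
 * N doubles between rows, so quadratic growth predicts a value near 4.
 * Valid for i = 0..2.
 */
double growth_ratio(size_t i)
{
    return kCalls[i + 1] / kCalls[i];
}

/*
 * Calls divided by N^2 for measurement i.
 * Quadratic growth predicts a roughly constant value. Valid for i = 0..3.
 */
double calls_per_n_squared(size_t i)
{
    return kCalls[i] / (kHandles[i] * kHandles[i]);
}
```

With the figures above, the consecutive ratios come out at about 3.86, 3.94 and 3.98, and calls / N^2 stays between roughly 1.26 and 1.33, which is consistent with quadratic growth.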

The sigaction calls are there to ignore SIGPIPE, which is raised when the other end of a socket is closed. However, on the platform we are targeting, we have little concern about unexpected signals, since the sockets are configured with the SO_NOSIGPIPE option via setsockopt(), and send() is used with MSG_NOSIGNAL as its fourth argument.
- https://github.com/curl/curl/blob/ee97f176970c9667bfbdaab89bc637e77099d1df/lib/connect.c#L1085-L1086
- https://github.com/curl/curl/blob/ee97f176970c9667bfbdaab89bc637e77099d1df/lib/curl_setup_once.h#L116
Also, curl built with c-ares or the threaded asynchronous DNS resolver doesn't rely on the SIGALRM signal to detect timeouts.

The cost of a system call is relatively high because of user-kernel context switching. These sigaction calls can be avoided with a curl option, CURLOPT_NOSIGNAL (https://curl.se/libcurl/c/CURLOPT_NOSIGNAL.html). With the option enabled, curl does not install signal handlers, on the assumption that no signals will be raised for the process.
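In libcurl terms, the option is set per easy handle, before the handle is registered with curl_multi_add_handle(). A minimal fragment, assuming a handle obtained from curl_easy_init(); the helper name disable_curl_signals is hypothetical and this is not the attached patch:

```c
#include <curl/curl.h>

/*
 * Hypothetical helper: asks libcurl not to install or use signal
 * handlers for this easy handle. CURLOPT_NOSIGNAL takes a long.
 */
CURLcode disable_curl_signals(CURL *handle)
{
    return curl_easy_setopt(handle, CURLOPT_NOSIGNAL, 1L);
}
```

A return value of CURLE_OK means the option was accepted. Whether this is safe depends on the platform suppressing SIGPIPE by other means, as described above.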
Comment 1 Souju TANAKA 2021-06-17 23:12:25 PDT
Created attachment 431756 [details]
Comment 2 Fujii Hironori 2021-06-18 13:17:52 PDT
Why does curl block SIGPIPE even on platforms supporting MSG_NOSIGNAL and SO_NOSIGPIPE?
An O(N^2) increase in sigaction calls seems like a curl bug.
Comment 3 EWS 2021-06-18 13:41:44 PDT
Committed r279046 (238966@main): <https://commits.webkit.org/238966@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 431756 [details].
Comment 4 Radar WebKit Bug Importer 2021-06-18 13:42:31 PDT
Comment 5 Souju TANAKA 2021-06-20 18:18:51 PDT
Not all systems provide a sufficient way to avoid SIGPIPE. The SO_NOSIGPIPE socket option is available on BSD-based OSes. The MSG_NOSIGNAL flag has been available since Linux 2.2.

Thank you for your review.