RESOLVED FIXED 187427
[Curl] Move HTTP Setup logic from CurlRequest to CurlHandle for reuse.
https://bugs.webkit.org/show_bug.cgi?id=187427
Summary [Curl] Move HTTP Setup logic from CurlRequest to CurlHandle for reuse.
Basuke Suzuki
Reported 2018-07-06 21:14:24 PDT
Move logic to setup CurlHandle for HTTP/HTTPS communication from CurlRequest to CurlHandle. Upcoming WebSocket impementation doesn't use CurlRequest which is for regular HTTP transaction.
Attachments
PATCH (20.86 KB, patch)
2018-07-09 15:24 PDT, Basuke Suzuki
no flags
Fix style (21.17 KB, patch)
2018-07-09 15:33 PDT, Basuke Suzuki
no flags
FIX (20.77 KB, patch)
2018-07-09 16:58 PDT, Basuke Suzuki
no flags
FIX (21.55 KB, patch)
2018-07-12 10:47 PDT, Basuke Suzuki
no flags
Basuke Suzuki
Comment 1 2018-07-09 15:24:22 PDT
EWS Watchlist
Comment 2 2018-07-09 15:27:01 PDT
Attachment 344628 [details] did not pass style-queue: ERROR: Source/WebCore/ChangeLog:8: You should remove the 'No new tests' and either add and list tests, or explain why no new tests were possible. [changelog/nonewtests] [5] Total errors found: 1 in 6 files If any of these errors are false positives, please file a bug against check-webkit-style.
Basuke Suzuki
Comment 3 2018-07-09 15:28:23 PDT
Comment on attachment 344628 [details] PATCH View in context: https://bugs.webkit.org/attachment.cgi?id=344628&action=review > Source/WebCore/platform/network/curl/CurlContext.cpp:808 > + curl_easy_setopt(m_handle, CURLOPT_STDERR, log); This is a bug fix.
Basuke Suzuki
Comment 4 2018-07-09 15:33:01 PDT
Created attachment 344630 [details] Fix style
Basuke Suzuki
Comment 5 2018-07-09 16:58:48 PDT
Fujii Hironori
Comment 6 2018-07-09 22:55:47 PDT
Comment on attachment 344640 [details] FIX View in context: https://bugs.webkit.org/attachment.cgi?id=344640&action=review I'm not a reviewer. This is an informal review. > Source/WebCore/ChangeLog:9 > + in CurlRequest, which is only for regular HTTP/HTTPS transacction. This patch transacction → transaction > Source/WebCore/platform/network/curl/CurlContext.h:286 > + std::optional<CertificateInfo> certificateInfo() const; This returns a copy of CertificateInfo. In general, getters return the reference of memeber variables. Do you need to copy? std::optional<CertificateInfo>& certificateInfo() const; > Source/WebCore/platform/network/curl/CurlRequest.cpp:149 > + if (CurlRequestClient* client = m_client) Do you nee this variable? if (m_client) > Source/WebCore/platform/network/curl/CurlRequest.cpp:309 > + m_certificateInfo = *info; This code copies CertificateInfo. I think CurlRequest don't need to have m_certificateInfo because m_curlHandle has m_sslVerifier and m_sslVerifier has CertificateInfo. > Source/WebCore/platform/network/curl/CurlSSLVerifier.cpp:77 > + bool ok = CurlContext::singleton().sslHandle().isAllowedHTTPSCertificateHost(m_curlHandle.url().host().toString()); This callback is called in the curl thread. Is this thread-safe to access m_curlHandle.url()? > Source/WebCore/platform/network/curl/CurlSSLVerifier.cpp:79 > + bool ok = CurlContext::singleton().sslHandle().canIgnoredHTTPSCertificate(m_curlHandle.url().host().toString(), m_certificateInfo.certificateChain()); Ditto.
Basuke Suzuki
Comment 7 2018-07-11 14:43:58 PDT
Comment on attachment 344640 [details] FIX View in context: https://bugs.webkit.org/attachment.cgi?id=344640&action=review Thanks for comment. >> Source/WebCore/platform/network/curl/CurlContext.h:286 >> + std::optional<CertificateInfo> certificateInfo() const; > > This returns a copy of CertificateInfo. > In general, getters return the reference of memeber variables. > Do you need to copy? > std::optional<CertificateInfo>& certificateInfo() const; That's right. Actually as you mentioned in other bug, In the upcoming patch, CertificateInfo will be moved to ResourceError and ResourceResponse. I'll open the bug right now for that improvement and wanna keep this part as is. The main purpose of this patch is moving the logic to share with others. >> Source/WebCore/platform/network/curl/CurlRequest.cpp:149 >> + if (CurlRequestClient* client = m_client) > > Do you nee this variable? > if (m_client) This is std::atomic<CurlRequestClient*> so that this avoid fetching data from storage twice. I see this worth doing. https://godbolt.org/g/UNQ3A5 >> Source/WebCore/platform/network/curl/CurlSSLVerifier.cpp:77 >> + bool ok = CurlContext::singleton().sslHandle().isAllowedHTTPSCertificateHost(m_curlHandle.url().host().toString()); > > This callback is called in the curl thread. Is this thread-safe to access m_curlHandle.url()? It is isolatedCopied by setUrl, which runs in curl thread.
Fujii Hironori
Comment 8 2018-07-12 02:45:15 PDT
Comment on attachment 344640 [details] FIX View in context: https://bugs.webkit.org/attachment.cgi?id=344640&action=review >>> Source/WebCore/platform/network/curl/CurlRequest.cpp:149 >>> + if (CurlRequestClient* client = m_client) >> >> Do you nee this variable? >> if (m_client) > > This is std::atomic<CurlRequestClient*> so that this avoid fetching data from storage twice. I see this worth doing. https://godbolt.org/g/UNQ3A5 I didn't know m_client is an atomic. This is a bad pattern. You can't use atomic in this case. If m_client is modified in other thread concurrently, it would be destructed after the null-check and before accessing. As far as I read the code, m_client is set only in the main thread. You shouldn't use atomic for m_client. >>> Source/WebCore/platform/network/curl/CurlSSLVerifier.cpp:77 >>> + bool ok = CurlContext::singleton().sslHandle().isAllowedHTTPSCertificateHost(m_curlHandle.url().host().toString()); >> >> This callback is called in the curl thread. Is this thread-safe to access m_curlHandle.url()? > > It is isolatedCopied by setUrl, which runs in curl thread. It seems that CurlHandle::setUrl is called both in the main thread and the worker thread. In the main thread CurlRequest::start → CurlRequest::setupTransfer → CurlHandle::setUrl In the worker thread CurlRequestScheduler::executeTasks → lambda of CurlRequestScheduler::startTransfer → CurlRequest::setupTransfer → CurlHandle::setUrl It is a synchronization problem if CurlHandle::m_url is written in one thread, and it is read in other thread. URL::isolatedCopied doesn't help in this case. But, I didn't check carefully. It might not be a problem due to synchronization piggybacking.
Basuke Suzuki
Comment 9 2018-07-12 10:34:55 PDT
Comment on attachment 344640 [details] FIX View in context: https://bugs.webkit.org/attachment.cgi?id=344640&action=review >>>> Source/WebCore/platform/network/curl/CurlRequest.cpp:149 >>>> + if (CurlRequestClient* client = m_client) >>> >>> Do you nee this variable? >>> if (m_client) >> >> This is std::atomic<CurlRequestClient*> so that this avoid fetching data from storage twice. I see this worth doing. https://godbolt.org/g/UNQ3A5 > > I didn't know m_client is an atomic. > This is a bad pattern. You can't use atomic in this case. > If m_client is modified in other thread concurrently, it would be destructed after the null-check and before accessing. > As far as I read the code, m_client is set only in the main thread. You shouldn't use atomic for m_client. This code is not for thread safety. Thread safety is guarantied by the convention. It is passed as an argument of constructor and can be invaldated vie invalidClient() method. All usage is in this callClient which is executed in main thread (thanks to runOnMainThread). All access is serialized in main thread. That's the design. I don't know why it's stored in std::::atomic. I guess that came from old implementation by changing its name and location. I agree to remove atomic at this moment. And also noticed I forgot to add main thread assertion in invalidateClient() method. Thank you. >>>> Source/WebCore/platform/network/curl/CurlSSLVerifier.cpp:77 >>>> + bool ok = CurlContext::singleton().sslHandle().isAllowedHTTPSCertificateHost(m_curlHandle.url().host().toString()); >>> >>> This callback is called in the curl thread. Is this thread-safe to access m_curlHandle.url()? >> >> It is isolatedCopied by setUrl, which runs in curl thread. > > It seems that CurlHandle::setUrl is called both in the main thread and the worker thread. > > In the main thread > CurlRequest::start > → CurlRequest::setupTransfer > → CurlHandle::setUrl > > In the worker thread > CurlRequestScheduler::executeTasks > → lambda of CurlRequestScheduler::startTransfer > → CurlRequest::setupTransfer > → CurlHandle::setUrl > > It is a synchronization problem if CurlHandle::m_url is written in one thread, and it is read in other thread. > URL::isolatedCopied doesn't help in this case. > > But, I didn't check carefully. It might not be a problem due to synchronization piggybacking. Right those difference happen in sync and async difference. For sync transfer, it stays in main thread. In async transfer it set in curl thread by the scheduler and used entirely in that thread.
Basuke Suzuki
Comment 10 2018-07-12 10:47:56 PDT
Fujii Hironori
Comment 11 2018-07-13 05:00:15 PDT
(In reply to Basuke Suzuki from comment #9) > Right those difference happen in sync and async difference. For sync > transfer, it stays in main thread. In async transfer it set in curl thread > by the scheduler and used entirely in that thread. Thank you for the explanation. I get the idea. IMO, this is badly designed. CurlContext and CurlSSLHandle, which is actually a context even thought it's named 'Handle', are shared among the main thread and the worker thread without locking mutex. For example, CurlRequest::didReceiveHeader has a following code: > CurlContext::singleton().setProxyAuthMethod(m_response.availableProxyAuth); This setter might be called concurrently in other thread.No mutex locked. I think the main thread should post a sync message to the worker thread for sync requests.
Fujii Hironori
Comment 12 2018-07-13 05:15:04 PDT
Comment on attachment 344860 [details] FIX View in context: https://bugs.webkit.org/attachment.cgi?id=344860&action=review > Source/WebCore/platform/network/curl/CurlContext.cpp:360 > + m_url = url.isolatedCopy(); IIUC, This isolatedCopy is not needed because CurlRequest::m_request is an isolated copy. > Source/WebCore/platform/network/curl/CurlSSLVerifier.cpp:85 > + m_curlHandle.setSslVerifyPeer(CurlHandle::VerifyPeer::Disable); Just out of curiosity, Why is this code needed?
Basuke Suzuki
Comment 13 2018-07-13 10:14:38 PDT
Thanks for r+ and congrats for being a reviewer! (In reply to Fujii Hironori from comment #12) > Comment on attachment 344860 [details] > FIX > > View in context: > https://bugs.webkit.org/attachment.cgi?id=344860&action=review > > > Source/WebCore/platform/network/curl/CurlContext.cpp:360 > > + m_url = url.isolatedCopy(); > > IIUC, This isolatedCopy is not needed because CurlRequest::m_request is an > isolated copy. That's right. Our intention is CurlHandle should be isolated from other implementation detail and should hold information required to be used. Next step may be moving out url from CurlRequest. > > Source/WebCore/platform/network/curl/CurlSSLVerifier.cpp:85 > > + m_curlHandle.setSslVerifyPeer(CurlHandle::VerifyPeer::Disable); > > Just out of curiosity, Why is this code needed? It came from old code and I didn't sure the intent at this moment. While I was figuring out the mis-implementation of Curl port at this bug https://bugs.webkit.org/show_bug.cgi?id=187611, It was a left over of old implementation that raw curl handle was setup to handle redirect by itself. If redirect happens, following peer check was bypassed by setting this flag. Current Curl port handle redirect in WebCore and never reuse curl handle. So this code can be removed safely. I'll do that on the patch.
Basuke Suzuki
Comment 14 2018-07-13 10:21:51 PDT
(In reply to Fujii Hironori from comment #11) > (In reply to Basuke Suzuki from comment #9) > > Right those difference happen in sync and async difference. For sync > > transfer, it stays in main thread. In async transfer it set in curl thread > > by the scheduler and used entirely in that thread. > > Thank you for the explanation. I get the idea. > > IMO, this is badly designed. CurlContext and CurlSSLHandle, which is > actually a context even thought it's named 'Handle', are shared > among the main thread and the worker thread without locking mutex. > > For example, CurlRequest::didReceiveHeader has a following code: > > > CurlContext::singleton().setProxyAuthMethod(m_response.availableProxyAuth); > > This setter might be called concurrently in other thread.No mutex locked. > > I think the main thread should post a sync message to the worker thread for > sync requests. This part always makes us uncomfortable. Yes, it's very hard to design the architecture for both sync and async. We thought about your idea, but main thread has to handle callback request while waiting the result. With ResourceHandle implementation, that's too complected than doing everything in main thread. But we see there must be better way to make the code simpler. We will figure out the way soon.
WebKit Commit Bot
Comment 15 2018-07-13 10:28:37 PDT
Comment on attachment 344860 [details] FIX Clearing flags on attachment: 344860 Committed r233803: <https://trac.webkit.org/changeset/233803>
WebKit Commit Bot
Comment 16 2018-07-13 10:28:39 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 17 2018-07-13 10:29:19 PDT
Fujii Hironori
Comment 18 2018-07-13 16:19:29 PDT
(In reply to Basuke Suzuki from comment #14) > We thought about your idea, but main thread has to handle callback request > while waiting the result. With ResourceHandle implementation, that's too > complected than doing everything in main thread. You can process other (restricted) messages while waiting sync-reply. > But we see there must be better way to make the code simpler. We will figure > out the way soon. Great.
Note You need to log in before you can comment on or make changes to this bug.