Bug 187984

Summary: [Curl] Use shared single thread for WebSocket connections
Product: WebKit Reporter: Basuke Suzuki <basuke>
Component: PlatformAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: annulen, basuke, commit-queue, ews-watchlist, galpeter, gyuyoung.kim, Hironori.Fujii, ryuan.choi, sergio, takashi.komori, toyoshim, webkit-bug-importer, yutak
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=173449
Bug Depends on: 172630    
Bug Blocks:    
Attachments:
Description Flags
Handling connections by one thread.
none
Handling connections by one thread.
none
Handling connections by one thread.
none
Handling connections by one thread.
none
Handling connections by one thread.
none
Patch
none
Patch
none
Patch none

Basuke Suzuki
Reported 2018-07-24 19:14:40 PDT
Current implementation launches new thread for each connection. Create a new scheduler for socket connection. It's not possible to share with normal scheduler for HTTP connection because of the difference of the architecture. One for HTTP and another for WebSocket.
Attachments
Handling connections by one thread. (50.51 KB, patch)
2020-01-28 15:42 PST, Takashi Komori
no flags
Handling connections by one thread. (35.65 KB, patch)
2020-01-30 00:28 PST, Takashi Komori
no flags
Handling connections by one thread. (34.98 KB, patch)
2020-02-02 14:15 PST, Takashi Komori
no flags
Handling connections by one thread. (35.34 KB, patch)
2020-02-05 05:40 PST, Takashi Komori
no flags
Handling connections by one thread. (35.11 KB, patch)
2020-02-11 16:38 PST, Takashi Komori
no flags
Patch (39.39 KB, patch)
2020-02-12 22:59 PST, Takashi Komori
no flags
Patch (39.30 KB, patch)
2020-02-17 00:29 PST, Takashi Komori
no flags
Patch (39.26 KB, patch)
2020-02-17 02:50 PST, Takashi Komori
no flags
Fujii Hironori
Comment 1 2019-05-27 06:39:51 PDT
You need to address Bug 173449.
Takashi Komori
Comment 2 2020-01-28 15:42:32 PST
Created attachment 389079 [details] Handling connections by one thread. This patch is for proof of concept.
Takashi Komori
Comment 3 2020-01-28 15:43:38 PST
On curl port SocketStreamHanle creates a thread for handling each connection. (SocketStreamHandleImplCurl.cpp) But this implementation is not resource friendly. This patch aims to suppress creating threads. The main idea is handling all connections by one thread. For this purpose we propose introducing a manager class 'CurlSocketStreamWorkerManager' in this patch.
Basuke Suzuki
Comment 4 2020-01-28 19:23:17 PST
Comment on attachment 389079 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389079&action=review This is an informal review. So to describe the overview of the changes, - CurlSocketHandle code was moved to CurlSocketStreamWorker, - SocketStreamHandleImpl code was moved to CurlSocketWorkerManager and CurlSocketStreamHandle The instance of CurlSocketStreamHandle exists only in main thread and those of CurlSocketStreamWorker are in worker thread. This is a good separation and very effective way to prevent thread related memory issue. I think CurlSocketStreamHandle do a little bit too much. It should be thinner layer as name implies and tasks should be asked to WorkerManager with an argument of Handle. Stop registering itself to manager, but let manager create Handle and register it by itself. Last thing, class names are too long and hard to get the meaning. I suggest renaming like this: - CurlSocketStreamWorkerManager -> CurlStreamScheduler (corresponding to CurlRequestSchedular) - CurlSocketStreamWorker -> CurlStream - CurlSocketStreamHandle -> CurlStream::Handle - CurlSocketStreamHandleClient -> CurlStream::Client Because this is inside curl network layer implementation, there's nothing to point to Stream since today. We can safely use simple term "Stream" for the meaning of SocketStream. Also as WinCairo also has an implementation of RemoteInspectorSocket and the term "Socket" is used everywhere around that, I feel better not seeing that term in Curl layer. > Source/WebCore/platform/network/curl/CurlContext.cpp:120 > + m_socketStreamWorkerManager = makeUnique<CurlSocketStreamWorkerManager>(); It should be lazy initialization. > Source/WebCore/platform/network/curl/CurlContext.cpp:259 > +CURLMcode CurlMultiHandle::poll(curl_waitfd* extraFds, unsigned numOfExtraFds, int timeoutMS, int& numFds) This isn't used in this patch. Please remove it. > Source/WebCore/platform/network/curl/CurlContext.cpp:-874 > -#ifndef NDEBUG These changes happens because of the layout change of functions. Please try minimizing the diff size. > Source/WebCore/platform/network/curl/CurlContext.cpp:886 > + errorHandler(errorCode); errorHandler is used just for returning errorCode. There must be better way to do that. i.e. WTF::Expected > Source/WebCore/platform/network/curl/CurlSocketStreamHandleClient.h:32 > +class CurlSocketStreamHandleClient { This class is very close to CurlSocketStreamHandle and recent implementation tend to put that as a embedded class. Let's move it inside CurlSocketStreamHandle . like: class CurlSocketStreamHandle { ... public: ... class Client { ... } Then this file can be deleted and the reader can understand the relationship easily afterword. > Source/WebCore/platform/network/curl/CurlSocketStreamWorkerClient.h:35 > +class CurlSocketStreamWorkerClient { I don't think this abstraction is required. It's too much. WorkerManager should handle Workers directly. > Source/WebCore/platform/network/curl/SocketStreamHandleImpl.h:-82 > - bool m_hasPendingWriteData { false }; I understand this patch tries to move the logic from this class to CurlSocketStreamXXX, but it seems a little bit too much. m_hasPendingWriteData is actually the mechanizm of WebKit side. It is okay to leave them here and make the change less in the platform layer side.
Takashi Komori
Comment 5 2020-01-30 00:28:41 PST
Created attachment 389232 [details] Handling connections by one thread.
Takashi Komori
Comment 6 2020-01-30 00:49:26 PST
(In reply to Basuke Suzuki from comment #4) > Comment on attachment 389079 [details] > Handling connections by one thread. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=389079&action=review > > This is an informal review. So to describe the overview of the changes, > - CurlSocketHandle code was moved to CurlSocketStreamWorker, > - SocketStreamHandleImpl code was moved to CurlSocketWorkerManager and > CurlSocketStreamHandle > > The instance of CurlSocketStreamHandle exists only in main thread and those > of CurlSocketStreamWorker are in worker thread. This is a good separation > and very effective way to prevent thread related memory issue. > > I think CurlSocketStreamHandle do a little bit too much. It should be > thinner layer as name implies and tasks should be asked to WorkerManager > with an argument of Handle. Stop registering itself to manager, but let > manager create Handle and register it by itself. > > Last thing, class names are too long and hard to get the meaning. I suggest > renaming like this: > - CurlSocketStreamWorkerManager -> CurlStreamScheduler (corresponding to > CurlRequestSchedular) > - CurlSocketStreamWorker -> CurlStream > - CurlSocketStreamHandle -> CurlStream::Handle > - CurlSocketStreamHandleClient -> CurlStream::Client > > Because this is inside curl network layer implementation, there's nothing to > point to Stream since today. We can safely use simple term "Stream" for the > meaning of SocketStream. > > Also as WinCairo also has an implementation of RemoteInspectorSocket and the > term "Socket" is used everywhere around that, I feel better not seeing that > term in Curl layer. > > > Source/WebCore/platform/network/curl/CurlContext.cpp:120 > > + m_socketStreamWorkerManager = makeUnique<CurlSocketStreamWorkerManager>(); > > It should be lazy initialization. FIXED. > > > Source/WebCore/platform/network/curl/CurlContext.cpp:259 > > +CURLMcode CurlMultiHandle::poll(curl_waitfd* extraFds, unsigned numOfExtraFds, int timeoutMS, int& numFds) > > This isn't used in this patch. Please remove it. REMOVED. > > > Source/WebCore/platform/network/curl/CurlContext.cpp:-874 > > -#ifndef NDEBUG > > These changes happens because of the layout change of functions. Please try > minimizing the diff size. FIXED. > > > Source/WebCore/platform/network/curl/CurlContext.cpp:886 > > + errorHandler(errorCode); > > errorHandler is used just for returning errorCode. There must be better way > to do that. i.e. WTF::Expected FIXED. > > > Source/WebCore/platform/network/curl/CurlSocketStreamHandleClient.h:32 > > +class CurlSocketStreamHandleClient { > > This class is very close to CurlSocketStreamHandle and recent implementation > tend to put that as a embedded class. Let's move it inside > CurlSocketStreamHandle . like: > class CurlSocketStreamHandle { > ... > public: > ... > class Client { > ... > } > > Then this file can be deleted and the reader can understand the relationship > easily afterword. > > > Source/WebCore/platform/network/curl/CurlSocketStreamWorkerClient.h:35 > > +class CurlSocketStreamWorkerClient { > > I don't think this abstraction is required. It's too much. WorkerManager > should handle Workers directly. Reduced files by introducing CurlStream::Client class.
Fujii Hironori
Comment 7 2020-01-30 02:03:03 PST
Comment on attachment 389232 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review > Source/WebCore/platform/network/curl/CurlStream.cpp:82 > +void CurlStream::send(UniqueArray<uint8_t>&& buffer, size_t length) 'length' is redundant. buffer has the size. > Source/WebCore/platform/network/curl/CurlStream.cpp:191 > +void CurlStream::notifyFailWithError(CURLcode errorCode) notifyFailWithError should be renamed because 'fail' is a verb, the phase 'notify fail' sounds weird. I think you refereed 'curlDidFailWithError'. the phase 'did fail' is not weird. > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:172 > + struct timeval timeout { 0, selectTimeoutMS * 1000}; You should implement self-piping technique here to wake up. (Bug 173449) Then, you can sleep forever here instead of waking up every 20ms.
Fujii Hironori
Comment 8 2020-01-30 02:27:59 PST
Comment on attachment 389232 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:48 > + m_nextStreamID = (m_nextStreamID + 1 != invalidCurlStreamID) ? m_nextStreamID + 1 : 1; You don't check m_nextStreamID is not used. Can you use the pointer of client instead of m_nextStreamID? > Source/WebCore/platform/network/curl/CurlStreamScheduler.h:42 > + void destory(CurlStreamID); 'create' is typically used to create the instance of the class. Rename createStream and destroyStream, for example.
Fujii Hironori
Comment 9 2020-01-30 02:34:40 PST
Comment on attachment 389232 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review > Source/WebCore/platform/network/curl/CurlStream.cpp:168 > + while (m_writeBufferOffset < m_writeBufferSize) { Do you know the reason why 'while' loop is used here even though 'tryToRead' doesn't?
Fujii Hironori
Comment 10 2020-01-30 02:38:01 PST
Comment on attachment 389232 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:137 > +void CurlStreamScheduler::stopThread() Is stopThread called from anywhere? > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:139 > + m_runThread = false; Don't you need to lock m_mutex?
Basuke Suzuki
Comment 11 2020-01-30 16:04:18 PST
Comment on attachment 389232 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review This is informal review. > Source/WebCore/platform/network/curl/CurlStream.cpp:54 > + auto errorCode = m_curlHandle->perform(); You cannot try connecting here. It blocks until it connect. > Source/WebCore/platform/network/curl/CurlStream.cpp:87 > + if (!m_curlHandle || m_writeBuffer) This is wrong. If m_writeBuffer is not null, the contents of buffer is gone. And there's no way the client can know whether it can send or not. Add canSend() method or append data in the end of writeBuffer. Or m_writeBuffer can be Vector<std::tuple<UniqueArray<uint_t>, size_t>>, then appending is just a simple task. > Source/WebCore/platform/network/curl/CurlStream.cpp:153 > + destroyHandle(); Don't you need to start any closing sequence here? >> Source/WebCore/platform/network/curl/CurlStream.cpp:168 >> + while (m_writeBufferOffset < m_writeBufferSize) { > > Do you know the reason why 'while' loop is used here even though 'tryToRead' doesn't? Yeah, this is wrong. The write() will block if send buffer is full. It was okay in a dedicate thread mode, but not okay in this patch. We can try just one timing after we get a permission to write after status check. >> Source/WebCore/platform/network/curl/CurlStream.cpp:191 >> +void CurlStream::notifyFailWithError(CURLcode errorCode) > > notifyFailWithError should be renamed because 'fail' is a verb, the phase 'notify fail' sounds weird. > I think you refereed 'curlDidFailWithError'. the phase 'did fail' is not weird. The verb `notify` is important for this method, simply using noun `Failure` and `notifyFailure` is enough, isn't it? `WithError` is apparent from the parameter. > Source/WebCore/platform/network/curl/CurlStream.cpp:199 > + client.didFailWithError(SocketStreamError(errorCode, url, CurlHandle::errorDescription(errorCode))); Is this CurlScheduler's responsibility to instantiate SocketStreamError? It is good idea to be as ignorant as possible for WebKit implementation. Just return errorCode and url is enough, I think. > Source/WebCore/platform/network/curl/CurlStream.cpp:203 > +void CurlStream::callClient(Function<void(Client&)>&& task) Do we need this method? Each occasion to call client callback can call m_scheduler's method directly. That makes code a little bit busy, but more straight forward. > Source/WebCore/platform/network/curl/CurlStream.h:46 > + virtual void didOpen() = 0; These interface method should have caller's instance as a first argument. virtual void didOpen(CurlStream&) = 0; That is similar to Cocoa's delegation pattern. Adding that to each method gives more context to implementation side, i.e. who will require them. Also it may solve future method name conflict. >> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:48 >> + m_nextStreamID = (m_nextStreamID + 1 != invalidCurlStreamID) ? m_nextStreamID + 1 : 1; > > You don't check m_nextStreamID is not used. > Can you use the pointer of client instead of m_nextStreamID? The timing of issuing ID and object allocation is different. We cannot simply use the address as an id. Also memory region can be reused for same object type and it is safe not to use address. But existence check is required. > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:52 > + auto task = [this, streamID, url = url.isolatedCopy()]() mutable { This might be just a preference, but for me, assigning to temporal variable and use it in next line is not good rhythm to read. Calling `callOnWorkerThread()` with direct lambda is easy to read. > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:70 > + ASSERT(!isMainThread()); These assertion should be responsibility of callOnWorkerThead() or executeTask() method and no need to check in every places. > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:97 > +void CurlStreamScheduler::callOnMainThread(CurlStreamID streamID, WTF::Function<void(CurlStream::Client&)>&& task) This is not general purpose invocation method. This function do more. How about callClientOnMainThread? > Source/WebCore/platform/network/curl/CurlStreamScheduler.h:65 > + HashMap<CurlStreamID, CurlStream::Client*> m_clientLists; The value of item cannot be nullptr, so is should be CurlStream::Client&
Takashi Komori
Comment 12 2020-02-02 14:15:39 PST
Created attachment 389481 [details] Handling connections by one thread.
Takashi Komori
Comment 13 2020-02-02 14:36:10 PST
(In reply to Fujii Hironori from comment #7) > Comment on attachment 389232 [details] > Handling connections by one thread. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=389232&action=review > > > Source/WebCore/platform/network/curl/CurlStream.cpp:82 > > +void CurlStream::send(UniqueArray<uint8_t>&& buffer, size_t length) > > 'length' is redundant. buffer has the size. UniqueArray doesn't have length property. > > Source/WebCore/platform/network/curl/CurlStream.cpp:191 > > +void CurlStream::notifyFailWithError(CURLcode errorCode) > > notifyFailWithError should be renamed because 'fail' is a verb, the phase > 'notify fail' sounds weird. > I think you refereed 'curlDidFailWithError'. the phase 'did fail' is not > weird. Changed to notifyFailure. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:172 > > + struct timeval timeout { 0, selectTimeoutMS * 1000}; > > You should implement self-piping technique here to wake up. (Bug 173449) > Then, you can sleep forever here instead of waking up every 20ms. We feel like implementing that in another ticket, because already we have had many changes. And we think that we might be able to implement similar function by using curl_multi_poll and curl_multi_wakeup which are supported above curl 7.68.0 How do you think about using curl_multi_poo/curl_multi_wakeup?
Takashi Komori
Comment 14 2020-02-02 14:38:59 PST
(In reply to Fujii Hironori from comment #9) > Comment on attachment 389232 [details] > Handling connections by one thread. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=389232&action=review > > > Source/WebCore/platform/network/curl/CurlStream.cpp:168 > > + while (m_writeBufferOffset < m_writeBufferSize) { > > Do you know the reason why 'while' loop is used here even though 'tryToRead' > doesn't? We haven't fixed it, we will start soon.
Takashi Komori
Comment 15 2020-02-02 14:40:04 PST
(In reply to Fujii Hironori from comment #10) > Comment on attachment 389232 [details] > Handling connections by one thread. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=389232&action=review > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:137 > > +void CurlStreamScheduler::stopThread() > > Is stopThread called from anywhere? Removed. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:139 > > + m_runThread = false; > > Don't you need to lock m_mutex? Fixed.
Takashi Komori
Comment 16 2020-02-02 14:50:18 PST
(In reply to Fujii Hironori from comment #8) > Comment on attachment 389232 [details] > Handling connections by one thread. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=389232&action=review > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:48 > > + m_nextStreamID = (m_nextStreamID + 1 != invalidCurlStreamID) ? m_nextStreamID + 1 : 1; > > You don't check m_nextStreamID is not used. > Can you use the pointer of client instead of m_nextStreamID? Fixed not to use in-use IDs. As stand in comment11 we cannot substitute IDs by pointers. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.h:42 > > + void destory(CurlStreamID); > > 'create' is typically used to create the instance of the class. > Rename createStream and destroyStream, for example. Fixed.
Takashi Komori
Comment 17 2020-02-02 15:37:39 PST
(In reply to Basuke Suzuki from comment #11) > Comment on attachment 389232 [details] > Handling connections by one thread. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=389232&action=review > > This is informal review. > > > Source/WebCore/platform/network/curl/CurlStream.cpp:54 > > + auto errorCode = m_curlHandle->perform(); > > You cannot try connecting here. It blocks until it connect. We tried to avoid blocking by using curl_multi_perform, but it causes some failures of layout tests (timeout). Connections made by curl_multi_perform seem to fail detecting server close. If you know about this, please let us know. > > > Source/WebCore/platform/network/curl/CurlStream.cpp:87 > > + if (!m_curlHandle || m_writeBuffer) > > This is wrong. If m_writeBuffer is not null, the contents of buffer is gone. > And there's no way the client can know whether it can send or not. Add > canSend() method or append data in the end of writeBuffer. > > Or m_writeBuffer can be Vector<std::tuple<UniqueArray<uint_t>, size_t>>, > then appending is just a simple task. We haven't fixed. Client can know if there is pending data by checking SocketStreamHandleImpl::m_hasPendingSendData. Do you mean that SocketStreamHandleImpl::platformSendInternal should have vector of sending buffer and accept buffer regardless state of m_hasPendingSendData? > > > Source/WebCore/platform/network/curl/CurlStream.cpp:153 > > + destroyHandle(); > > Don't you need to start any closing sequence here? We don't need to start closing here. When WebSocketChannel::didReceiveSocketStreamData receives 0 byte buffer, it starts disconnecting. > >> Source/WebCore/platform/network/curl/CurlStream.cpp:191 > >> +void CurlStream::notifyFailWithError(CURLcode errorCode) > > > > notifyFailWithError should be renamed because 'fail' is a verb, the phase 'notify fail' sounds weird. > > I think you refereed 'curlDidFailWithError'. the phase 'did fail' is not weird. > > The verb `notify` is important for this method, simply using noun `Failure` > and `notifyFailure` is enough, isn't it? `WithError` is apparent from the > parameter. > Fixed. > > Source/WebCore/platform/network/curl/CurlStream.cpp:199 > > + client.didFailWithError(SocketStreamError(errorCode, url, CurlHandle::errorDescription(errorCode))); > > Is this CurlScheduler's responsibility to instantiate SocketStreamError? It > is good idea to be as ignorant as possible for WebKit implementation. Just > return errorCode and url is enough, I think. Fixed. > > > Source/WebCore/platform/network/curl/CurlStream.cpp:203 > > +void CurlStream::callClient(Function<void(Client&)>&& task) > > Do we need this method? Each occasion to call client callback can call > m_scheduler's method directly. That makes code a little bit busy, but more > straight forward. Fixed.
Takashi Komori
Comment 18 2020-02-02 16:28:56 PST
(In reply to Basuke Suzuki from comment #11) > > > Source/WebCore/platform/network/curl/CurlStream.h:46 > > + virtual void didOpen() = 0; > > These interface method should have caller's instance as a first argument. > > virtual void didOpen(CurlStream&) = 0; > > That is similar to Cocoa's delegation pattern. Adding that to each method > gives more context to implementation side, i.e. who will require them. Also > it may solve future method name conflict. CurlStreamHandleImpl's lifetime is different from CurlStream's, so it is possible that when CurlStreamHandleImple::didFail is invoked, the CurlStream instance already had been destructed. To implement that we should consider not to access destructed instances too. > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:52 > > + auto task = [this, streamID, url = url.isolatedCopy()]() mutable { > > This might be just a preference, but for me, assigning to temporal variable > and use it in next line is not good rhythm to read. Calling > `callOnWorkerThread()` with direct lambda is easy to read. Fixed. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:70 > > + ASSERT(!isMainThread()); > > These assertion should be responsibility of callOnWorkerThead() or > executeTask() method and no need to check in every places. Fixed. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:97 > > +void CurlStreamScheduler::callOnMainThread(CurlStreamID streamID, WTF::Function<void(CurlStream::Client&)>&& task) > > This is not general purpose invocation method. This function do more. How > about callClientOnMainThread? Fixed. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.h:65 > > + HashMap<CurlStreamID, CurlStream::Client*> m_clientLists; > > The value of item cannot be nullptr, so is should be CurlStream::Client& We can't use reference here because CurlStream::Client is abstract class as we can't get the instance by calling m_clientLists.get
Basuke Suzuki
Comment 19 2020-02-03 16:16:38 PST
(In reply to Takashi Komori from comment #13) > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:172 > > > + struct timeval timeout { 0, selectTimeoutMS * 1000}; > > > > You should implement self-piping technique here to wake up. (Bug 173449) > > Then, you can sleep forever here instead of waking up every 20ms. > > > We feel like implementing that in another ticket, because already we have > had many changes. > > And we think that we might be able to implement similar function by using > curl_multi_poll and curl_multi_wakeup which are supported above curl 7.68.0 > How do you think about using curl_multi_poo/curl_multi_wakeup? I feel that is very right direction. I agree with postponing this improvement later.
Basuke Suzuki
Comment 20 2020-02-03 16:35:12 PST
(In reply to Takashi Komori from comment #17) > > You cannot try connecting here. It blocks until it connect. > > We tried to avoid blocking by using curl_multi_perform, but it causes some > failures of layout tests (timeout). > Connections made by curl_multi_perform seem to fail detecting server close. > If you know about this, please let us know. > > > > > Source/WebCore/platform/network/curl/CurlStream.cpp:153 > > > + destroyHandle(); > > > > Don't you need to start any closing sequence here? > > We don't need to start closing here. > When WebSocketChannel::didReceiveSocketStreamData receives 0 byte buffer, it > starts disconnecting. Got it.
Basuke Suzuki
Comment 21 2020-02-03 16:40:14 PST
(In reply to Takashi Komori from comment #18) > > That is similar to Cocoa's delegation pattern. Adding that to each method > > gives more context to implementation side, i.e. who will require them. Also > > it may solve future method name conflict. > > CurlStreamHandleImpl's lifetime is different from CurlStream's, so it is > possible that when CurlStreamHandleImple::didFail is invoked, the CurlStream > instance already had been destructed. > To implement that we should consider not to access destructed instances too. Good point. Then use CurlStreamId instead. That represents the CurlStream itself and it's just an identifier, but client code can be distinguishable by that argument. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.h:65 > > > + HashMap<CurlStreamID, CurlStream::Client*> m_clientLists; > > > > The value of item cannot be nullptr, so is should be CurlStream::Client& > > We can't use reference here because CurlStream::Client is abstract class as > we can't get the instance by calling m_clientLists.get Ah, okay. Sorry about this.
Fujii Hironori
Comment 22 2020-02-03 19:25:52 PST
Comment on attachment 389232 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review >>>> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:48 >>>> + m_nextStreamID = (m_nextStreamID + 1 != invalidCurlStreamID) ? m_nextStreamID + 1 : 1; >>> >>> You don't check m_nextStreamID is not used. >>> Can you use the pointer of client instead of m_nextStreamID? >> >> The timing of issuing ID and object allocation is different. We cannot simply use the address as an id. Also memory region can be reused for same object type and it is safe not to use address. But existence check is required. > > Fixed not to use in-use IDs. > As stand in comment11 we cannot substitute IDs by pointers. I don't agree on this. SocketStreamHandleImpl always lives longer than CurlStream. Even if CurlStream would be alive longer than SocketStreamHandleImpl, I think using WeakPtr is better because HashMap is expensive.
Basuke Suzuki
Comment 23 2020-02-04 11:10:26 PST
Comment on attachment 389232 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review >>>>> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:48 >>>>> + m_nextStreamID = (m_nextStreamID + 1 != invalidCurlStreamID) ? m_nextStreamID + 1 : 1; >>>> >>>> You don't check m_nextStreamID is not used. >>>> Can you use the pointer of client instead of m_nextStreamID? >>> >>> The timing of issuing ID and object allocation is different. We cannot simply use the address as an id. Also memory region can be reused for same object type and it is safe not to use address. But existence check is required. >> >> Fixed not to use in-use IDs. >> As stand in comment11 we cannot substitute IDs by pointers. > > I don't agree on this. SocketStreamHandleImpl always lives longer than CurlStream. > Even if CurlStream would be alive longer than SocketStreamHandleImpl, I think using WeakPtr is better because HashMap is expensive. No, he's talking about the timing difference between issuing ID and object instantiation. ID has to be issued at the timing of CurlStreamScheduler::create() and CurStream object will be created in the worker thread. That's the design of this patch. As of WeakPtr, it has to be created in main thread and that's also the approach we cannot take.
Takashi Komori
Comment 24 2020-02-05 05:40:23 PST
Created attachment 389797 [details] Handling connections by one thread.
Takashi Komori
Comment 25 2020-02-05 05:40:54 PST
(In reply to Takashi Komori from comment #14) > (In reply to Fujii Hironori from comment #9) > > Comment on attachment 389232 [details] > > Handling connections by one thread. > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=389232&action=review > > > > > Source/WebCore/platform/network/curl/CurlStream.cpp:168 > > > + while (m_writeBufferOffset < m_writeBufferSize) { > > > > Do you know the reason why 'while' loop is used here even though 'tryToRead' > > doesn't? > > We haven't fixed it, we will start soon. Fixed.
Takashi Komori
Comment 26 2020-02-05 05:44:49 PST
(In reply to Basuke Suzuki from comment #11) > Comment on attachment 389232 [details] > Handling connections by one thread. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=389232&action=review > > This is informal review. > > > Source/WebCore/platform/network/curl/CurlStream.cpp:54 > > + auto errorCode = m_curlHandle->perform(); > > You cannot try connecting here. It blocks until it connect. > > > Source/WebCore/platform/network/curl/CurlStream.cpp:87 > > + if (!m_curlHandle || m_writeBuffer) > > This is wrong. If m_writeBuffer is not null, the contents of buffer is gone. > And there's no way the client can know whether it can send or not. Add > canSend() method or append data in the end of writeBuffer. > > Or m_writeBuffer can be Vector<std::tuple<UniqueArray<uint_t>, size_t>>, > then appending is just a simple task. > Simplified m_writeBuffer by using Vector<std::tuple<UniqueArray<uint_t>, size_t>> m_writeBuffer -> m_sendBuffer
Takashi Komori
Comment 27 2020-02-05 05:45:37 PST
(In reply to Basuke Suzuki from comment #21) > (In reply to Takashi Komori from comment #18) > > > That is similar to Cocoa's delegation pattern. Adding that to each method > > > gives more context to implementation side, i.e. who will require them. Also > > > it may solve future method name conflict. > > > > CurlStreamHandleImpl's lifetime is different from CurlStream's, so it is > > possible that when CurlStreamHandleImple::didFail is invoked, the CurlStream > > instance already had been destructed. > > To implement that we should consider not to access destructed instances too. > > Good point. Then use CurlStreamId instead. That represents the CurlStream > itself and it's just an identifier, but client code can be distinguishable > by that argument. > We tried to let CurlStream::Client::didXXX accept CurlStreamID as the first argument. But this is not used at all for now. I think we could make this change when we really need.
Basuke Suzuki
Comment 28 2020-02-07 12:25:11 PST
Comment on attachment 389797 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389797&action=review This is another informal review from the previous implementer of this logic. This time, checking the thread lock area. > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:118 > + auto locker = holdLock(m_mutex); Just after waitForCompletion(), it is ensured the thread is not running here. No need to lock. > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:126 > + m_runThread = false; workerThread() never return from looping until m_runThread == false. I think there's no need to set the variable here. This makes entire life cycle unclear and very confusing. > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:131 > +{ Add assert to ensure called in worker thread. And let code reader (me) know about that explicitly. > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:133 > + if (m_taskQueue.size() || m_streamLists.size()) Move m_streamLists.size() check before locking to avoid unneeded lock. > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:162 > + auto locker = holdLock(m_mutex); m_runThread is set to false only within worker thread. There's no need to lock. > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:163 > + if (!m_runThread) Then this check can be move into `while`'s condition.
Basuke Suzuki
Comment 29 2020-02-07 12:40:24 PST
Comment on attachment 389797 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389797&action=review (In reply to Takashi Komori from comment #27) > Source/WebCore/platform/network/curl/SocketStreamHandleImpl.h:65 > + void didFail(CurlStreamID, CURLcode, URL&&) final; > We tried to let CurlStream::Client::didXXX accept CurlStreamID as the first argument. > But this is not used at all for now. I think we could make this change when we really need. You said this is not used, but that's not true. This signature express that these are implemented in interfaces related to CurlStream. That is very important to read a code by others. That's the thing I feel very important.
Basuke Suzuki
Comment 30 2020-02-10 13:15:06 PST
Comment on attachment 389797 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389797&action=review > Source/WebCore/platform/network/curl/CurlStreamScheduler.h:64 > + HashMap<CurlStreamID, std::unique_ptr<CurlStream>> m_streamLists; If HashMap is the concern because of the cost, we can switch to StdMap. If that's not enough, use Vector<std::tuple<CurlStreamID, CurlStream::Client*>>.
Fujii Hironori
Comment 31 2020-02-10 14:08:29 PST
Comment on attachment 389797 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389797&action=review >> Source/WebCore/platform/network/curl/CurlStreamScheduler.h:64 >> + HashMap<CurlStreamID, std::unique_ptr<CurlStream>> m_streamLists; > > If HashMap is the concern because of the cost, we can switch to StdMap. If that's not enough, use Vector<std::tuple<CurlStreamID, CurlStream::Client*>>. Why do you prefer the inefficient approach? m_streamLists simply can be removed.
Basuke Suzuki
Comment 32 2020-02-10 23:47:16 PST
Comment on attachment 389797 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389797&action=review >>> Source/WebCore/platform/network/curl/CurlStreamScheduler.h:64 >>> + HashMap<CurlStreamID, std::unique_ptr<CurlStream>> m_streamLists; >> >> If HashMap is the concern because of the cost, we can switch to StdMap. If that's not enough, use Vector<std::tuple<CurlStreamID, CurlStream::Client*>>. > > Why do you prefer the inefficient approach? m_streamLists simply can be removed. Can you provide how to do that? We need bi-directional reference from Client to Stream and Stream to Client. I don't understand what you are proposing.
Takashi Komori
Comment 33 2020-02-11 16:38:23 PST
Created attachment 390463 [details] Handling connections by one thread. The former patch had a flaw (CurlStream::send wrongly resets m_sendBufferOffset) This patch also aims to fix this.
Takashi Komori
Comment 34 2020-02-11 16:40:38 PST
(In reply to Basuke Suzuki from comment #28) > Comment on attachment 389797 [details] > Handling connections by one thread. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=389797&action=review > > This is another informal review from the previous implementer of this logic. > This time, checking the thread lock area. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:118 > > + auto locker = holdLock(m_mutex); > > Just after waitForCompletion(), it is ensured the thread is not running > here. No need to lock. Removed. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:126 > > + m_runThread = false; > > workerThread() never return from looping until m_runThread == false. I think > there's no need to set the variable here. This makes entire life cycle > unclear and very confusing. Fixed. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:131 > > +{ > > Add assert to ensure called in worker thread. And let code reader (me) know > about that explicitly. Added. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:133 > > + if (m_taskQueue.size() || m_streamLists.size()) > > Move m_streamLists.size() check before locking to avoid unneeded lock. Moved. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:162 > > + auto locker = holdLock(m_mutex); > > m_runThread is set to false only within worker thread. There's no need to > lock. > > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:163 > > + if (!m_runThread) > > Then this check can be move into `while`'s condition. Fixed.
Fujii Hironori
Comment 35 2020-02-11 19:11:58 PST
Comment on attachment 389797 [details] Handling connections by one thread. View in context: https://bugs.webkit.org/attachment.cgi?id=389797&action=review >>>> Source/WebCore/platform/network/curl/CurlStreamScheduler.h:64 >>>> + HashMap<CurlStreamID, std::unique_ptr<CurlStream>> m_streamLists; >>> >>> If HashMap is the concern because of the cost, we can switch to StdMap. If that's not enough, use Vector<std::tuple<CurlStreamID, CurlStream::Client*>>. >> >> Why do you prefer the inefficient approach? m_streamLists simply can be removed. > > Can you provide how to do that? We need bi-directional reference from Client to Stream and Stream to Client. I don't understand what you are proposing. By having an internal discussion, I understand why the raw address of SocketStreamHandleImpl can't be use as the ID. But, I still think using WeakPtr or making SocketStreamHandleImpl ThreadSafeRefCounted is better than using double HashMap. Do you know WeakPtrImpl is a ThreadSafeRefCounted<WeakPtrImpl>? it can be safely increment the ref-counter in the worker thread. Anyway, you are the curl port architecture, I buy your architectural decision.
Takashi Komori
Comment 36 2020-02-12 22:59:53 PST
Fujii Hironori
Comment 37 2020-02-14 04:08:39 PST
Comment on attachment 390619 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=390619&action=review > Source/WebCore/platform/network/curl/CurlStream.cpp:161 > + auto& sendBuffer = m_sendBuffers.first(); auto& [buffer, length] = m_sendBuffers.first(); > Source/WebCore/platform/network/curl/CurlStream.cpp:188 > + client.didFail(streamID, errorCode, WTFMove(url)); Remove url of didFail becasue SocketStreamHandleImpl knows own m_url. m_url.isolatedCopy() is redundant.
Fujii Hironori
Comment 38 2020-02-14 07:07:51 PST
Comment on attachment 390619 [details] Patch What will happen in the invaid cert scenario?
Takashi Komori
Comment 39 2020-02-17 00:29:32 PST
Takashi Komori
Comment 40 2020-02-17 00:30:39 PST
(In reply to Fujii Hironori from comment #37) > Comment on attachment 390619 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=390619&action=review > > > Source/WebCore/platform/network/curl/CurlStream.cpp:161 > > + auto& sendBuffer = m_sendBuffers.first(); > > auto& [buffer, length] = m_sendBuffers.first(); Fixed. > > > Source/WebCore/platform/network/curl/CurlStream.cpp:188 > > + client.didFail(streamID, errorCode, WTFMove(url)); > > Remove url of didFail becasue SocketStreamHandleImpl knows own m_url. > m_url.isolatedCopy() is redundant. Removed.
Takashi Komori
Comment 41 2020-02-17 00:31:23 PST
(In reply to Fujii Hironori from comment #38) > Comment on attachment 390619 [details] > Patch > > What will happen in the invaid cert scenario? CURLE_PEER_FAILED_VERIFICATION error occurs in libcurl, and the connection is terminated. NetworkSocketStream doesn't have a mechanism like handling 'Authentication Challenge' in NetworkDataTask.
Fujii Hironori
Comment 42 2020-02-17 01:31:09 PST
Comment on attachment 390905 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=390905&action=review > Source/WebCore/platform/network/curl/CurlStream.cpp:39 > + , m_url(WTFMove(url)) m_url is not used outside of the ctor. Remove it.
Takashi Komori
Comment 43 2020-02-17 02:50:41 PST
Takashi Komori
Comment 44 2020-02-17 02:51:30 PST
(In reply to Fujii Hironori from comment #42) > Comment on attachment 390905 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=390905&action=review > > > Source/WebCore/platform/network/curl/CurlStream.cpp:39 > > + , m_url(WTFMove(url)) > > m_url is not used outside of the ctor. Remove it. Removed.
WebKit Commit Bot
Comment 45 2020-02-17 04:06:36 PST
The commit-queue encountered the following flaky tests while processing attachment 390907 [details]: editing/spelling/spellcheck-async-remove-frame.html bug 158401 (authors: morrita@google.com, rniwa@webkit.org, and tony@chromium.org) The commit-queue is continuing to process your patch.
WebKit Commit Bot
Comment 46 2020-02-17 04:07:14 PST
Comment on attachment 390907 [details] Patch Clearing flags on attachment: 390907 Committed r256728: <https://trac.webkit.org/changeset/256728>
WebKit Commit Bot
Comment 47 2020-02-17 04:07:16 PST
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 48 2020-02-17 04:08:18 PST
Note You need to log in before you can comment on or make changes to this bug.