Bug 187984

Summary: [Curl] Use shared single thread for WebSocket connections
Product: WebKit Reporter: Basuke Suzuki <Basuke.Suzuki>
Component: PlatformAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: annulen, Basuke.Suzuki, commit-queue, ews-watchlist, galpeter, gyuyoung.kim, Hironori.Fujii, ryuan.choi, sergio, takashi.komori, toyoshim, webkit-bug-importer, yutak
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=173449
Bug Depends on: 172630    
Bug Blocks:    
Attachments:
Description Flags
Handling connections by one thread.
none
Handling connections by one thread.
none
Handling connections by one thread.
none
Handling connections by one thread.
none
Handling connections by one thread.
none
Patch
none
Patch
none
Patch none

Description Basuke Suzuki 2018-07-24 19:14:40 PDT
Current implementation launches new thread for each connection. Create a new scheduler for socket connection. It's not possible to share with normal scheduler for HTTP connection because of the difference of the architecture. One for HTTP and another for WebSocket.
Comment 1 Fujii Hironori 2019-05-27 06:39:51 PDT
You need to address Bug 173449.
Comment 2 Takashi Komori 2020-01-28 15:42:32 PST
Created attachment 389079 [details]
Handling connections by one thread.

This patch is for proof of concept.
Comment 3 Takashi Komori 2020-01-28 15:43:38 PST
On curl port SocketStreamHanle creates a thread for handling each connection. (SocketStreamHandleImplCurl.cpp)
But this implementation is not resource friendly.

This patch aims to suppress creating threads.
The main idea is handling all connections by one thread.
For this purpose we propose introducing a manager class 'CurlSocketStreamWorkerManager' in this patch.
Comment 4 Basuke Suzuki 2020-01-28 19:23:17 PST
Comment on attachment 389079 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389079&action=review

This is an informal review. So to describe the overview of the changes,
- CurlSocketHandle code was moved to CurlSocketStreamWorker,
- SocketStreamHandleImpl code was moved to CurlSocketWorkerManager and CurlSocketStreamHandle

The instance of CurlSocketStreamHandle exists only in main thread and those of CurlSocketStreamWorker are in worker thread. This is a good separation and very effective way to prevent thread related memory issue.

I think CurlSocketStreamHandle do a little bit too much. It should be thinner layer as name implies and tasks should be asked to WorkerManager with an argument of Handle. Stop registering itself to manager, but let manager create Handle and register it by itself.

Last thing, class names are too long and hard to get the meaning. I suggest renaming like this:
- CurlSocketStreamWorkerManager -> CurlStreamScheduler (corresponding to CurlRequestSchedular)
- CurlSocketStreamWorker -> CurlStream
- CurlSocketStreamHandle -> CurlStream::Handle
- CurlSocketStreamHandleClient -> CurlStream::Client

Because this is inside curl network layer implementation, there's nothing to point to Stream since today. We can safely use simple term "Stream" for the meaning of SocketStream.

Also as WinCairo also has an implementation of RemoteInspectorSocket and the term "Socket" is used everywhere around that, I feel better not seeing that term  in Curl layer.

> Source/WebCore/platform/network/curl/CurlContext.cpp:120
> +    m_socketStreamWorkerManager = makeUnique<CurlSocketStreamWorkerManager>();

It should be lazy initialization.

> Source/WebCore/platform/network/curl/CurlContext.cpp:259
> +CURLMcode CurlMultiHandle::poll(curl_waitfd* extraFds, unsigned numOfExtraFds, int timeoutMS, int& numFds)

This isn't used in this patch. Please remove it.

> Source/WebCore/platform/network/curl/CurlContext.cpp:-874
> -#ifndef NDEBUG

These changes happens because of the layout change of functions. Please try minimizing the diff size.

> Source/WebCore/platform/network/curl/CurlContext.cpp:886
> +        errorHandler(errorCode);

errorHandler is used just for returning errorCode. There must be better way to do that. i.e. WTF::Expected

> Source/WebCore/platform/network/curl/CurlSocketStreamHandleClient.h:32
> +class CurlSocketStreamHandleClient {

This class is very close to CurlSocketStreamHandle and recent implementation tend to put that as a embedded class. Let's move it inside CurlSocketStreamHandle . like:
class CurlSocketStreamHandle {
...
public:
...
    class Client {
        ...
    }

Then this file can be deleted and the reader can understand the relationship easily afterword.

> Source/WebCore/platform/network/curl/CurlSocketStreamWorkerClient.h:35
> +class CurlSocketStreamWorkerClient {

I don't think this abstraction is required. It's too much. WorkerManager should handle Workers directly.

> Source/WebCore/platform/network/curl/SocketStreamHandleImpl.h:-82
> -    bool m_hasPendingWriteData { false };

I understand this patch tries to move the logic from this class to CurlSocketStreamXXX, but it seems a little bit too much. m_hasPendingWriteData is actually the mechanizm of WebKit side. It is okay to leave them here and make the change less in the platform layer side.
Comment 5 Takashi Komori 2020-01-30 00:28:41 PST
Created attachment 389232 [details]
Handling connections by one thread.
Comment 6 Takashi Komori 2020-01-30 00:49:26 PST

(In reply to Basuke Suzuki from comment #4)
> Comment on attachment 389079 [details]
> Handling connections by one thread.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=389079&action=review
> 
> This is an informal review. So to describe the overview of the changes,
> - CurlSocketHandle code was moved to CurlSocketStreamWorker,
> - SocketStreamHandleImpl code was moved to CurlSocketWorkerManager and
> CurlSocketStreamHandle
> 
> The instance of CurlSocketStreamHandle exists only in main thread and those
> of CurlSocketStreamWorker are in worker thread. This is a good separation
> and very effective way to prevent thread related memory issue.
> 
> I think CurlSocketStreamHandle do a little bit too much. It should be
> thinner layer as name implies and tasks should be asked to WorkerManager
> with an argument of Handle. Stop registering itself to manager, but let
> manager create Handle and register it by itself.
> 
> Last thing, class names are too long and hard to get the meaning. I suggest
> renaming like this:
> - CurlSocketStreamWorkerManager -> CurlStreamScheduler (corresponding to
> CurlRequestSchedular)
> - CurlSocketStreamWorker -> CurlStream
> - CurlSocketStreamHandle -> CurlStream::Handle
> - CurlSocketStreamHandleClient -> CurlStream::Client
> 
> Because this is inside curl network layer implementation, there's nothing to
> point to Stream since today. We can safely use simple term "Stream" for the
> meaning of SocketStream.
> 
> Also as WinCairo also has an implementation of RemoteInspectorSocket and the
> term "Socket" is used everywhere around that, I feel better not seeing that
> term  in Curl layer.
> 
> > Source/WebCore/platform/network/curl/CurlContext.cpp:120
> > +    m_socketStreamWorkerManager = makeUnique<CurlSocketStreamWorkerManager>();
> 
> It should be lazy initialization.

FIXED.

> 
> > Source/WebCore/platform/network/curl/CurlContext.cpp:259
> > +CURLMcode CurlMultiHandle::poll(curl_waitfd* extraFds, unsigned numOfExtraFds, int timeoutMS, int& numFds)
> 
> This isn't used in this patch. Please remove it.

REMOVED.

> 
> > Source/WebCore/platform/network/curl/CurlContext.cpp:-874
> > -#ifndef NDEBUG
> 
> These changes happens because of the layout change of functions. Please try
> minimizing the diff size.

FIXED.

> 
> > Source/WebCore/platform/network/curl/CurlContext.cpp:886
> > +        errorHandler(errorCode);
> 
> errorHandler is used just for returning errorCode. There must be better way
> to do that. i.e. WTF::Expected

FIXED.

> 
> > Source/WebCore/platform/network/curl/CurlSocketStreamHandleClient.h:32
> > +class CurlSocketStreamHandleClient {
> 
> This class is very close to CurlSocketStreamHandle and recent implementation
> tend to put that as a embedded class. Let's move it inside
> CurlSocketStreamHandle . like:
> class CurlSocketStreamHandle {
> ...
> public:
> ...
>     class Client {
>         ...
>     }
> 
> Then this file can be deleted and the reader can understand the relationship
> easily afterword.
> 
> > Source/WebCore/platform/network/curl/CurlSocketStreamWorkerClient.h:35
> > +class CurlSocketStreamWorkerClient {
> 
> I don't think this abstraction is required. It's too much. WorkerManager
> should handle Workers directly.

Reduced files by introducing CurlStream::Client class.
Comment 7 Fujii Hironori 2020-01-30 02:03:03 PST
Comment on attachment 389232 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review

> Source/WebCore/platform/network/curl/CurlStream.cpp:82
> +void CurlStream::send(UniqueArray<uint8_t>&& buffer, size_t length)

'length' is redundant. buffer has the size.

> Source/WebCore/platform/network/curl/CurlStream.cpp:191
> +void CurlStream::notifyFailWithError(CURLcode errorCode)

notifyFailWithError should be renamed because 'fail' is a verb, the phase 'notify fail' sounds weird.
I think you refereed 'curlDidFailWithError'. the phase 'did fail' is not weird.

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:172
> +    struct timeval timeout { 0, selectTimeoutMS * 1000};

You should implement self-piping technique here to wake up. (Bug 173449)
Then, you can sleep forever here instead of waking up every 20ms.
Comment 8 Fujii Hironori 2020-01-30 02:27:59 PST
Comment on attachment 389232 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:48
> +    m_nextStreamID = (m_nextStreamID + 1 != invalidCurlStreamID) ? m_nextStreamID + 1 : 1;

You don't check m_nextStreamID is not used.
Can you use the pointer of client instead of m_nextStreamID?

> Source/WebCore/platform/network/curl/CurlStreamScheduler.h:42
> +    void destory(CurlStreamID);

'create' is typically used to create the instance of the class.
Rename createStream and destroyStream, for example.
Comment 9 Fujii Hironori 2020-01-30 02:34:40 PST
Comment on attachment 389232 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review

> Source/WebCore/platform/network/curl/CurlStream.cpp:168
> +    while (m_writeBufferOffset < m_writeBufferSize) {

Do you know the reason why 'while' loop is used here even though 'tryToRead' doesn't?
Comment 10 Fujii Hironori 2020-01-30 02:38:01 PST
Comment on attachment 389232 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:137
> +void CurlStreamScheduler::stopThread()

Is stopThread called from anywhere?

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:139
> +    m_runThread = false;

Don't you need to lock m_mutex?
Comment 11 Basuke Suzuki 2020-01-30 16:04:18 PST
Comment on attachment 389232 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review

This is informal review.

> Source/WebCore/platform/network/curl/CurlStream.cpp:54
> +    auto errorCode = m_curlHandle->perform();

You cannot try connecting here. It blocks until it connect.

> Source/WebCore/platform/network/curl/CurlStream.cpp:87
> +    if (!m_curlHandle || m_writeBuffer)

This is wrong. If m_writeBuffer is not null, the contents of buffer is gone. And there's no way the client can know whether it can send or not. Add canSend() method or append data in the end of writeBuffer.

Or m_writeBuffer can be Vector<std::tuple<UniqueArray<uint_t>, size_t>>, then appending is just a simple task.

> Source/WebCore/platform/network/curl/CurlStream.cpp:153
> +        destroyHandle();

Don't you need to start any closing sequence here?

>> Source/WebCore/platform/network/curl/CurlStream.cpp:168
>> +    while (m_writeBufferOffset < m_writeBufferSize) {
> 
> Do you know the reason why 'while' loop is used here even though 'tryToRead' doesn't?

Yeah, this is wrong. The write() will block if send buffer is full. It was okay in a dedicate thread mode, but not okay in this patch. We can try just one timing after we get a permission to write after status check.

>> Source/WebCore/platform/network/curl/CurlStream.cpp:191
>> +void CurlStream::notifyFailWithError(CURLcode errorCode)
> 
> notifyFailWithError should be renamed because 'fail' is a verb, the phase 'notify fail' sounds weird.
> I think you refereed 'curlDidFailWithError'. the phase 'did fail' is not weird.

The verb `notify` is important for this method, simply using noun `Failure` and `notifyFailure` is enough, isn't it? `WithError` is apparent from the parameter.

> Source/WebCore/platform/network/curl/CurlStream.cpp:199
> +        client.didFailWithError(SocketStreamError(errorCode, url, CurlHandle::errorDescription(errorCode)));

Is this CurlScheduler's responsibility to instantiate SocketStreamError? It is good idea to be as ignorant as possible for WebKit implementation. Just return errorCode and url is enough, I think.

> Source/WebCore/platform/network/curl/CurlStream.cpp:203
> +void CurlStream::callClient(Function<void(Client&)>&& task)

Do we need this method? Each occasion to call client callback can call m_scheduler's method directly. That makes code a little bit busy, but more straight forward.

> Source/WebCore/platform/network/curl/CurlStream.h:46
> +        virtual void didOpen() = 0;

These interface method should have caller's instance as a first argument.

    virtual void didOpen(CurlStream&) = 0;

That is similar to Cocoa's delegation pattern. Adding that to each method gives more context to implementation side, i.e. who will require them. Also it may solve future method name conflict.

>> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:48
>> +    m_nextStreamID = (m_nextStreamID + 1 != invalidCurlStreamID) ? m_nextStreamID + 1 : 1;
> 
> You don't check m_nextStreamID is not used.
> Can you use the pointer of client instead of m_nextStreamID?

The timing of issuing ID and object allocation is different. We cannot simply use the address as an id. Also memory region can be reused for same object type and it is safe not to use address. But existence check is required.

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:52
> +    auto task = [this, streamID, url = url.isolatedCopy()]() mutable {

This might be just a preference, but for me, assigning to temporal variable and use it in next line is not good rhythm to read. Calling `callOnWorkerThread()` with direct lambda is easy to read.

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:70
> +        ASSERT(!isMainThread());

These assertion should be responsibility of callOnWorkerThead() or executeTask() method and no need to check in every places.

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:97
> +void CurlStreamScheduler::callOnMainThread(CurlStreamID streamID, WTF::Function<void(CurlStream::Client&)>&& task)

This is not general purpose invocation method. This function do more. How about callClientOnMainThread?

> Source/WebCore/platform/network/curl/CurlStreamScheduler.h:65
> +    HashMap<CurlStreamID, CurlStream::Client*> m_clientLists;

The value of item cannot be nullptr, so is should be CurlStream::Client&
Comment 12 Takashi Komori 2020-02-02 14:15:39 PST
Created attachment 389481 [details]
Handling connections by one thread.
Comment 13 Takashi Komori 2020-02-02 14:36:10 PST
(In reply to Fujii Hironori from comment #7)
> Comment on attachment 389232 [details]
> Handling connections by one thread.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=389232&action=review
> 
> > Source/WebCore/platform/network/curl/CurlStream.cpp:82
> > +void CurlStream::send(UniqueArray<uint8_t>&& buffer, size_t length)
> 
> 'length' is redundant. buffer has the size.

UniqueArray doesn't have length property.
 
> > Source/WebCore/platform/network/curl/CurlStream.cpp:191
> > +void CurlStream::notifyFailWithError(CURLcode errorCode)
> 
> notifyFailWithError should be renamed because 'fail' is a verb, the phase
> 'notify fail' sounds weird.
> I think you refereed 'curlDidFailWithError'. the phase 'did fail' is not
> weird.

Changed to notifyFailure.

> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:172
> > +    struct timeval timeout { 0, selectTimeoutMS * 1000};
> 
> You should implement self-piping technique here to wake up. (Bug 173449)
> Then, you can sleep forever here instead of waking up every 20ms.


We feel like implementing that in another ticket, because already we have had many changes.

And we think that we might be able to implement similar function by using curl_multi_poll and curl_multi_wakeup which are supported above curl 7.68.0
How do you think about using curl_multi_poo/curl_multi_wakeup?
Comment 14 Takashi Komori 2020-02-02 14:38:59 PST
(In reply to Fujii Hironori from comment #9)
> Comment on attachment 389232 [details]
> Handling connections by one thread.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=389232&action=review
> 
> > Source/WebCore/platform/network/curl/CurlStream.cpp:168
> > +    while (m_writeBufferOffset < m_writeBufferSize) {
> 
> Do you know the reason why 'while' loop is used here even though 'tryToRead'
> doesn't?

We haven't fixed it, we will start soon.
Comment 15 Takashi Komori 2020-02-02 14:40:04 PST
(In reply to Fujii Hironori from comment #10)
> Comment on attachment 389232 [details]
> Handling connections by one thread.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=389232&action=review
> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:137
> > +void CurlStreamScheduler::stopThread()
> 
> Is stopThread called from anywhere?

Removed.

> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:139
> > +    m_runThread = false;
> 
> Don't you need to lock m_mutex?

Fixed.
Comment 16 Takashi Komori 2020-02-02 14:50:18 PST
(In reply to Fujii Hironori from comment #8)
> Comment on attachment 389232 [details]
> Handling connections by one thread.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=389232&action=review
> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:48
> > +    m_nextStreamID = (m_nextStreamID + 1 != invalidCurlStreamID) ? m_nextStreamID + 1 : 1;
> 
> You don't check m_nextStreamID is not used.
> Can you use the pointer of client instead of m_nextStreamID?

Fixed not to use in-use IDs.
As stand in comment11 we cannot substitute IDs by pointers.

> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.h:42
> > +    void destory(CurlStreamID);
> 
> 'create' is typically used to create the instance of the class.
> Rename createStream and destroyStream, for example.

Fixed.
Comment 17 Takashi Komori 2020-02-02 15:37:39 PST
(In reply to Basuke Suzuki from comment #11)
> Comment on attachment 389232 [details]
> Handling connections by one thread.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=389232&action=review
> 
> This is informal review.
> 
> > Source/WebCore/platform/network/curl/CurlStream.cpp:54
> > +    auto errorCode = m_curlHandle->perform();
> 
> You cannot try connecting here. It blocks until it connect.

We tried to avoid blocking by using curl_multi_perform, but it causes some failures of layout tests (timeout).
Connections made by curl_multi_perform seem to fail detecting server close.
If you know about this, please let us know.

> 
> > Source/WebCore/platform/network/curl/CurlStream.cpp:87
> > +    if (!m_curlHandle || m_writeBuffer)
> 
> This is wrong. If m_writeBuffer is not null, the contents of buffer is gone.
> And there's no way the client can know whether it can send or not. Add
> canSend() method or append data in the end of writeBuffer.
> 
> Or m_writeBuffer can be Vector<std::tuple<UniqueArray<uint_t>, size_t>>,
> then appending is just a simple task.

We haven't fixed.
Client can know if there is pending data by checking SocketStreamHandleImpl::m_hasPendingSendData.
Do you mean that SocketStreamHandleImpl::platformSendInternal should have vector of sending buffer and accept buffer regardless state of m_hasPendingSendData?


> 
> > Source/WebCore/platform/network/curl/CurlStream.cpp:153
> > +        destroyHandle();
> 
> Don't you need to start any closing sequence here?

We don't need to start closing here.
When WebSocketChannel::didReceiveSocketStreamData receives 0 byte buffer, it starts disconnecting.

> >> Source/WebCore/platform/network/curl/CurlStream.cpp:191
> >> +void CurlStream::notifyFailWithError(CURLcode errorCode)
> > 
> > notifyFailWithError should be renamed because 'fail' is a verb, the phase 'notify fail' sounds weird.
> > I think you refereed 'curlDidFailWithError'. the phase 'did fail' is not weird.
> 
> The verb `notify` is important for this method, simply using noun `Failure`
> and `notifyFailure` is enough, isn't it? `WithError` is apparent from the
> parameter.
> 

Fixed.

> > Source/WebCore/platform/network/curl/CurlStream.cpp:199
> > +        client.didFailWithError(SocketStreamError(errorCode, url, CurlHandle::errorDescription(errorCode)));
> 
> Is this CurlScheduler's responsibility to instantiate SocketStreamError? It
> is good idea to be as ignorant as possible for WebKit implementation. Just
> return errorCode and url is enough, I think.

Fixed.

> 
> > Source/WebCore/platform/network/curl/CurlStream.cpp:203
> > +void CurlStream::callClient(Function<void(Client&)>&& task)
> 
> Do we need this method? Each occasion to call client callback can call
> m_scheduler's method directly. That makes code a little bit busy, but more
> straight forward.

Fixed.
Comment 18 Takashi Komori 2020-02-02 16:28:56 PST
(In reply to Basuke Suzuki from comment #11)
> 
> > Source/WebCore/platform/network/curl/CurlStream.h:46
> > +        virtual void didOpen() = 0;
> 
> These interface method should have caller's instance as a first argument.
> 
>     virtual void didOpen(CurlStream&) = 0;
> 
> That is similar to Cocoa's delegation pattern. Adding that to each method
> gives more context to implementation side, i.e. who will require them. Also
> it may solve future method name conflict.

CurlStreamHandleImpl's lifetime is different from CurlStream's, so it is possible that when CurlStreamHandleImple::didFail is invoked, the CurlStream instance already had been destructed.
To implement that we should consider not to access destructed instances too.


> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:52
> > +    auto task = [this, streamID, url = url.isolatedCopy()]() mutable {
> 
> This might be just a preference, but for me, assigning to temporal variable
> and use it in next line is not good rhythm to read. Calling
> `callOnWorkerThread()` with direct lambda is easy to read.

Fixed.

>
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:70
> > +        ASSERT(!isMainThread());
> 
> These assertion should be responsibility of callOnWorkerThead() or
> executeTask() method and no need to check in every places.

Fixed.

> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:97
> > +void CurlStreamScheduler::callOnMainThread(CurlStreamID streamID, WTF::Function<void(CurlStream::Client&)>&& task)
> 
> This is not general purpose invocation method. This function do more. How
> about callClientOnMainThread?

Fixed.

> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.h:65
> > +    HashMap<CurlStreamID, CurlStream::Client*> m_clientLists;
> 
> The value of item cannot be nullptr, so is should be CurlStream::Client&

We can't use reference here because CurlStream::Client is abstract class as we can't get the instance by calling m_clientLists.get
Comment 19 Basuke Suzuki 2020-02-03 16:16:38 PST
(In reply to Takashi Komori from comment #13)
> > > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:172
> > > +    struct timeval timeout { 0, selectTimeoutMS * 1000};
> > 
> > You should implement self-piping technique here to wake up. (Bug 173449)
> > Then, you can sleep forever here instead of waking up every 20ms.
> 
> 
> We feel like implementing that in another ticket, because already we have
> had many changes.
> 
> And we think that we might be able to implement similar function by using
> curl_multi_poll and curl_multi_wakeup which are supported above curl 7.68.0
> How do you think about using curl_multi_poo/curl_multi_wakeup?

I feel that is very right direction. I agree with postponing this improvement later.
Comment 20 Basuke Suzuki 2020-02-03 16:35:12 PST
(In reply to Takashi Komori from comment #17)
> > You cannot try connecting here. It blocks until it connect.
> 
> We tried to avoid blocking by using curl_multi_perform, but it causes some
> failures of layout tests (timeout).
> Connections made by curl_multi_perform seem to fail detecting server close.
> If you know about this, please let us know.


> > 
> > > Source/WebCore/platform/network/curl/CurlStream.cpp:153
> > > +        destroyHandle();
> > 
> > Don't you need to start any closing sequence here?
> 
> We don't need to start closing here.
> When WebSocketChannel::didReceiveSocketStreamData receives 0 byte buffer, it
> starts disconnecting.

Got it.
Comment 21 Basuke Suzuki 2020-02-03 16:40:14 PST
(In reply to Takashi Komori from comment #18)
> > That is similar to Cocoa's delegation pattern. Adding that to each method
> > gives more context to implementation side, i.e. who will require them. Also
> > it may solve future method name conflict.
> 
> CurlStreamHandleImpl's lifetime is different from CurlStream's, so it is
> possible that when CurlStreamHandleImple::didFail is invoked, the CurlStream
> instance already had been destructed.
> To implement that we should consider not to access destructed instances too.

Good point. Then use CurlStreamId instead. That represents the CurlStream itself and it's just an identifier, but client code can be distinguishable by that argument.

> > > Source/WebCore/platform/network/curl/CurlStreamScheduler.h:65
> > > +    HashMap<CurlStreamID, CurlStream::Client*> m_clientLists;
> > 
> > The value of item cannot be nullptr, so is should be CurlStream::Client&
> 
> We can't use reference here because CurlStream::Client is abstract class as
> we can't get the instance by calling m_clientLists.get

Ah, okay. Sorry about this.
Comment 22 Fujii Hironori 2020-02-03 19:25:52 PST
Comment on attachment 389232 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review

>>>> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:48
>>>> +    m_nextStreamID = (m_nextStreamID + 1 != invalidCurlStreamID) ? m_nextStreamID + 1 : 1;
>>> 
>>> You don't check m_nextStreamID is not used.
>>> Can you use the pointer of client instead of m_nextStreamID?
>> 
>> The timing of issuing ID and object allocation is different. We cannot simply use the address as an id. Also memory region can be reused for same object type and it is safe not to use address. But existence check is required.
> 
> Fixed not to use in-use IDs.
> As stand in comment11 we cannot substitute IDs by pointers.

I don't agree on this. SocketStreamHandleImpl always lives longer than CurlStream.
Even if CurlStream would be alive longer than SocketStreamHandleImpl, I think using WeakPtr is better because HashMap is expensive.
Comment 23 Basuke Suzuki 2020-02-04 11:10:26 PST
Comment on attachment 389232 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389232&action=review

>>>>> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:48
>>>>> +    m_nextStreamID = (m_nextStreamID + 1 != invalidCurlStreamID) ? m_nextStreamID + 1 : 1;
>>>> 
>>>> You don't check m_nextStreamID is not used.
>>>> Can you use the pointer of client instead of m_nextStreamID?
>>> 
>>> The timing of issuing ID and object allocation is different. We cannot simply use the address as an id. Also memory region can be reused for same object type and it is safe not to use address. But existence check is required.
>> 
>> Fixed not to use in-use IDs.
>> As stand in comment11 we cannot substitute IDs by pointers.
> 
> I don't agree on this. SocketStreamHandleImpl always lives longer than CurlStream.
> Even if CurlStream would be alive longer than SocketStreamHandleImpl, I think using WeakPtr is better because HashMap is expensive.

No, he's talking about the timing difference between issuing ID and object instantiation. ID has to be issued at the timing of CurlStreamScheduler::create() and CurStream object will be created in the worker thread. That's the design of this patch. 

As of WeakPtr, it has to be created in main thread and that's also the approach we cannot take.
Comment 24 Takashi Komori 2020-02-05 05:40:23 PST
Created attachment 389797 [details]
Handling connections by one thread.
Comment 25 Takashi Komori 2020-02-05 05:40:54 PST
(In reply to Takashi Komori from comment #14)
> (In reply to Fujii Hironori from comment #9)
> > Comment on attachment 389232 [details]
> > Handling connections by one thread.
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=389232&action=review
> > 
> > > Source/WebCore/platform/network/curl/CurlStream.cpp:168
> > > +    while (m_writeBufferOffset < m_writeBufferSize) {
> > 
> > Do you know the reason why 'while' loop is used here even though 'tryToRead'
> > doesn't?
> 
> We haven't fixed it, we will start soon.

Fixed.
Comment 26 Takashi Komori 2020-02-05 05:44:49 PST
(In reply to Basuke Suzuki from comment #11)
> Comment on attachment 389232 [details]
> Handling connections by one thread.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=389232&action=review
> 
> This is informal review.
> 
> > Source/WebCore/platform/network/curl/CurlStream.cpp:54
> > +    auto errorCode = m_curlHandle->perform();
> 
> You cannot try connecting here. It blocks until it connect.
> 
> > Source/WebCore/platform/network/curl/CurlStream.cpp:87
> > +    if (!m_curlHandle || m_writeBuffer)
> 
> This is wrong. If m_writeBuffer is not null, the contents of buffer is gone.
> And there's no way the client can know whether it can send or not. Add
> canSend() method or append data in the end of writeBuffer.
> 
> Or m_writeBuffer can be Vector<std::tuple<UniqueArray<uint_t>, size_t>>,
> then appending is just a simple task.
> 

Simplified m_writeBuffer by using Vector<std::tuple<UniqueArray<uint_t>, size_t>>
m_writeBuffer -> m_sendBuffer
Comment 27 Takashi Komori 2020-02-05 05:45:37 PST
(In reply to Basuke Suzuki from comment #21)
> (In reply to Takashi Komori from comment #18)
> > > That is similar to Cocoa's delegation pattern. Adding that to each method
> > > gives more context to implementation side, i.e. who will require them. Also
> > > it may solve future method name conflict.
> > 
> > CurlStreamHandleImpl's lifetime is different from CurlStream's, so it is
> > possible that when CurlStreamHandleImple::didFail is invoked, the CurlStream
> > instance already had been destructed.
> > To implement that we should consider not to access destructed instances too.
> 
> Good point. Then use CurlStreamId instead. That represents the CurlStream
> itself and it's just an identifier, but client code can be distinguishable
> by that argument.
> 

We tried to let CurlStream::Client::didXXX accept CurlStreamID as the first argument.
But this is not used at all for now. I think we could make this change when we really need.
Comment 28 Basuke Suzuki 2020-02-07 12:25:11 PST
Comment on attachment 389797 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389797&action=review

This is another informal review from the previous implementer of this logic. This time, checking the thread lock area.

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:118
> +        auto locker = holdLock(m_mutex);

Just after waitForCompletion(), it is ensured the thread is not running here. No need to lock.

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:126
> +        m_runThread = false;

workerThread() never return from looping until m_runThread == false. I think there's no need to set the variable here. This makes entire life cycle unclear and very confusing.

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:131
> +{

Add assert to ensure called in worker thread. And let code reader (me) know about that explicitly.

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:133
> +    if (m_taskQueue.size() || m_streamLists.size())

Move m_streamLists.size() check before locking to avoid unneeded lock.

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:162
> +            auto locker = holdLock(m_mutex);

m_runThread is set to false only within worker thread. There's no need to lock.

> Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:163
> +            if (!m_runThread)

Then this check can be move into `while`'s condition.
Comment 29 Basuke Suzuki 2020-02-07 12:40:24 PST
Comment on attachment 389797 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389797&action=review

(In reply to Takashi Komori from comment #27)

> Source/WebCore/platform/network/curl/SocketStreamHandleImpl.h:65
> +    void didFail(CurlStreamID, CURLcode, URL&&) final;

> We tried to let CurlStream::Client::didXXX accept CurlStreamID as the first argument.
> But this is not used at all for now. I think we could make this change when we really need.

You said this is not used, but that's not true. This signature express that these are implemented in interfaces related to CurlStream. That is very important to read a code by others. That's the thing I feel very important.
Comment 30 Basuke Suzuki 2020-02-10 13:15:06 PST
Comment on attachment 389797 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389797&action=review

> Source/WebCore/platform/network/curl/CurlStreamScheduler.h:64
> +    HashMap<CurlStreamID, std::unique_ptr<CurlStream>> m_streamLists;

If HashMap is the concern because of the cost, we can switch to StdMap. If that's not enough, use Vector<std::tuple<CurlStreamID, CurlStream::Client*>>.
Comment 31 Fujii Hironori 2020-02-10 14:08:29 PST
Comment on attachment 389797 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389797&action=review

>> Source/WebCore/platform/network/curl/CurlStreamScheduler.h:64
>> +    HashMap<CurlStreamID, std::unique_ptr<CurlStream>> m_streamLists;
> 
> If HashMap is the concern because of the cost, we can switch to StdMap. If that's not enough, use Vector<std::tuple<CurlStreamID, CurlStream::Client*>>.

Why do you prefer the inefficient approach? m_streamLists simply can be removed.
Comment 32 Basuke Suzuki 2020-02-10 23:47:16 PST
Comment on attachment 389797 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389797&action=review

>>> Source/WebCore/platform/network/curl/CurlStreamScheduler.h:64
>>> +    HashMap<CurlStreamID, std::unique_ptr<CurlStream>> m_streamLists;
>> 
>> If HashMap is the concern because of the cost, we can switch to StdMap. If that's not enough, use Vector<std::tuple<CurlStreamID, CurlStream::Client*>>.
> 
> Why do you prefer the inefficient approach? m_streamLists simply can be removed.

Can you provide how to do that? We need bi-directional reference from Client to Stream and Stream to Client. I don't understand what you are proposing.
Comment 33 Takashi Komori 2020-02-11 16:38:23 PST
Created attachment 390463 [details]
Handling connections by one thread.

The former patch had a flaw (CurlStream::send wrongly resets m_sendBufferOffset)
This patch also aims to fix this.
Comment 34 Takashi Komori 2020-02-11 16:40:38 PST
(In reply to Basuke Suzuki from comment #28)
> Comment on attachment 389797 [details]
> Handling connections by one thread.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=389797&action=review
> 
> This is another informal review from the previous implementer of this logic.
> This time, checking the thread lock area.
> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:118
> > +        auto locker = holdLock(m_mutex);
> 
> Just after waitForCompletion(), it is ensured the thread is not running
> here. No need to lock.

Removed.

> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:126
> > +        m_runThread = false;
> 
> workerThread() never return from looping until m_runThread == false. I think
> there's no need to set the variable here. This makes entire life cycle
> unclear and very confusing.

Fixed.

> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:131
> > +{
> 
> Add assert to ensure called in worker thread. And let code reader (me) know
> about that explicitly.

Added.

> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:133
> > +    if (m_taskQueue.size() || m_streamLists.size())
> 
> Move m_streamLists.size() check before locking to avoid unneeded lock.

Moved.

> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:162
> > +            auto locker = holdLock(m_mutex);
> 
> m_runThread is set to false only within worker thread. There's no need to
> lock.
> 
> > Source/WebCore/platform/network/curl/CurlStreamScheduler.cpp:163
> > +            if (!m_runThread)
> 
> Then this check can be move into `while`'s condition.

Fixed.
Comment 35 Fujii Hironori 2020-02-11 19:11:58 PST
Comment on attachment 389797 [details]
Handling connections by one thread.

View in context: https://bugs.webkit.org/attachment.cgi?id=389797&action=review

>>>> Source/WebCore/platform/network/curl/CurlStreamScheduler.h:64
>>>> +    HashMap<CurlStreamID, std::unique_ptr<CurlStream>> m_streamLists;
>>> 
>>> If HashMap is the concern because of the cost, we can switch to StdMap. If that's not enough, use Vector<std::tuple<CurlStreamID, CurlStream::Client*>>.
>> 
>> Why do you prefer the inefficient approach? m_streamLists simply can be removed.
> 
> Can you provide how to do that? We need bi-directional reference from Client to Stream and Stream to Client. I don't understand what you are proposing.

By having an internal discussion, I understand why the raw address of SocketStreamHandleImpl can't be use as the ID.
But, I still think using WeakPtr or making SocketStreamHandleImpl ThreadSafeRefCounted is better than using double HashMap.
Do you know WeakPtrImpl is a ThreadSafeRefCounted<WeakPtrImpl>? it can be safely increment the ref-counter in the worker thread.
Anyway, you are the curl port architecture, I buy your architectural decision.
Comment 36 Takashi Komori 2020-02-12 22:59:53 PST
Created attachment 390619 [details]
Patch
Comment 37 Fujii Hironori 2020-02-14 04:08:39 PST
Comment on attachment 390619 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390619&action=review

> Source/WebCore/platform/network/curl/CurlStream.cpp:161
> +    auto& sendBuffer = m_sendBuffers.first();

auto& [buffer, length] = m_sendBuffers.first();

> Source/WebCore/platform/network/curl/CurlStream.cpp:188
> +        client.didFail(streamID, errorCode, WTFMove(url));

Remove url of didFail becasue SocketStreamHandleImpl knows own m_url.
m_url.isolatedCopy() is redundant.
Comment 38 Fujii Hironori 2020-02-14 07:07:51 PST
Comment on attachment 390619 [details]
Patch

What will happen in the invaid cert scenario?
Comment 39 Takashi Komori 2020-02-17 00:29:32 PST
Created attachment 390905 [details]
Patch
Comment 40 Takashi Komori 2020-02-17 00:30:39 PST
(In reply to Fujii Hironori from comment #37)
> Comment on attachment 390619 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=390619&action=review
> 
> > Source/WebCore/platform/network/curl/CurlStream.cpp:161
> > +    auto& sendBuffer = m_sendBuffers.first();
> 
> auto& [buffer, length] = m_sendBuffers.first();

Fixed.

> 
> > Source/WebCore/platform/network/curl/CurlStream.cpp:188
> > +        client.didFail(streamID, errorCode, WTFMove(url));
> 
> Remove url of didFail becasue SocketStreamHandleImpl knows own m_url.
> m_url.isolatedCopy() is redundant.

Removed.
Comment 41 Takashi Komori 2020-02-17 00:31:23 PST
(In reply to Fujii Hironori from comment #38)
> Comment on attachment 390619 [details]
> Patch
> 
> What will happen in the invaid cert scenario?

CURLE_PEER_FAILED_VERIFICATION error occurs in libcurl, and the connection is terminated.

NetworkSocketStream doesn't have a mechanism like handling 'Authentication Challenge' in NetworkDataTask.
Comment 42 Fujii Hironori 2020-02-17 01:31:09 PST
Comment on attachment 390905 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390905&action=review

> Source/WebCore/platform/network/curl/CurlStream.cpp:39
> +    , m_url(WTFMove(url))

m_url is not used outside of the ctor. Remove it.
Comment 43 Takashi Komori 2020-02-17 02:50:41 PST
Created attachment 390907 [details]
Patch
Comment 44 Takashi Komori 2020-02-17 02:51:30 PST
(In reply to Fujii Hironori from comment #42)
> Comment on attachment 390905 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=390905&action=review
> 
> > Source/WebCore/platform/network/curl/CurlStream.cpp:39
> > +    , m_url(WTFMove(url))
> 
> m_url is not used outside of the ctor. Remove it.

Removed.
Comment 45 WebKit Commit Bot 2020-02-17 04:06:36 PST
The commit-queue encountered the following flaky tests while processing attachment 390907 [details]:

editing/spelling/spellcheck-async-remove-frame.html bug 158401 (authors: morrita@google.com, rniwa@webkit.org, and tony@chromium.org)
The commit-queue is continuing to process your patch.
Comment 46 WebKit Commit Bot 2020-02-17 04:07:14 PST
Comment on attachment 390907 [details]
Patch

Clearing flags on attachment: 390907

Committed r256728: <https://trac.webkit.org/changeset/256728>
Comment 47 WebKit Commit Bot 2020-02-17 04:07:16 PST
All reviewed patches have been landed.  Closing bug.
Comment 48 Radar WebKit Bug Importer 2020-02-17 04:08:18 PST
<rdar://problem/59506869>