Bug 173991 - [Curl] Unify ResourceHandleManager into CurlJobManager.
Summary: [Curl] Unify ResourceHandleManager into CurlJobManager.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Basuke Suzuki
URL:
Keywords:
Depends on:
Blocks: 70229 117300
  Show dependency treegraph
 
Reported: 2017-06-29 14:30 PDT by Basuke Suzuki
Modified: 2017-07-18 11:49 PDT (History)
3 users (show)

See Also:


Attachments
Curl ResourceHandle is now run in background (88.59 KB, patch)
2017-07-14 13:03 PDT, Basuke Suzuki
achristensen: review-
Details | Formatted Diff | Diff
PATCH with reviewed advices (91.22 KB, patch)
2017-07-17 11:08 PDT, Basuke Suzuki
achristensen: review-
Details | Formatted Diff | Diff
PATCH with reviewed advices (91.88 KB, patch)
2017-07-17 17:02 PDT, Basuke Suzuki
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Basuke Suzuki 2017-06-29 14:30:23 PDT
CurlJobManager, based on former CurlDownloadManager, manages job in background using Thread.
ResourceHandleManager manages job in foreground using timer.
Apparently those two have similar responsibility and should be unified into one.
CurlJobManager has better architecture, so usage of ResourceHandleManager will switch to CurlJobManager and it should be gone.

Some missing implementations to manage both download job and ResourceHandle are required.
Comment 1 Basuke Suzuki 2017-07-14 13:03:21 PDT
Created attachment 315477 [details]
Curl ResourceHandle is now run in background
Comment 2 Alex Christensen 2017-07-14 16:01:18 PDT
Comment on attachment 315477 [details]
Curl ResourceHandle is now run in background

View in context: https://bugs.webkit.org/attachment.cgi?id=315477&action=review

Hooray!

> Source/WebCore/ChangeLog:6
> +        Use CurlJobManager to make ResourceHandle run in background.

Hooray!

> Source/WebCore/platform/network/ResourceHandle.h:297
> +    Vector<Vector<char>> m_receivedQueue;

This is...interesting.

> Source/WebCore/platform/network/curl/CurlContext.cpp:302
> +    void* that;

handle

> Source/WebCore/platform/network/curl/CurlContext.cpp:314
> +    return (m_errorCode = curl_easy_perform(m_handle));

unnecessary parentheses?
Please just put on two lines to make it clear that this is an assignment then returning, not a forgotten ==

> Source/WebCore/platform/network/curl/CurlContext.h:174
> +class CurlSList {

Please put this in its own header.

> Source/WebCore/platform/network/curl/CurlDownload.cpp:94
> +            callOnMainThread([this] {

protectedThis = makeRef(*this) to prevent use-after-free bugs if the last reference goes out of scope before the lambda is called on the main thread.

> Source/WebCore/platform/network/curl/CurlJobManager.cpp:45
> +    using Predicate = std::function<bool(const CurlJob&)>;
> +    using Action = std::function<void(CurlJob&)>;

Let's use WTF::Function unless it needs to be copyable.

> Source/WebCore/platform/network/curl/CurlJobManager.cpp:88
> +    std::map<CurlJobTicket, CurlJob> m_jobs;

We typically use WTF data structures in WebKit instead of stl, such as HashMap in this case.  They're more optimized for WebKit, and using the same data structures everywhere makes it easier to put things into different places without copying between WTF/stl types.

> Source/WebCore/platform/network/curl/CurlJobManager.cpp:152
> +void CurlJobManager::stopThreadIfNoMore()

No more what?

> Source/WebCore/platform/network/curl/CurlJobManager.cpp:172
> +template<typename C>

T is more common, or a more descriptive name.

> Source/WebCore/platform/network/curl/CurlJobManager.cpp:199
> +    std::vector<CurlJob> pendingJobs;

Vector

> Source/WebCore/platform/network/curl/CurlJobManager.h:99
> +        return std::count(m_activeJobs.begin(), m_activeJobs.end(), job) > 0;

If you make m_activeJobs is a HashMap, you could probably use contains here.

> Source/WebCore/platform/network/curl/CurlJobManager.h:106
> +    std::set<CurlJobTicket> m_activeJobs;

HashSet

> Source/WebCore/platform/network/curl/CurlJobManager.h:109
> +    std::map<CurlJobTicket, std::vector<CurlJobTask>> m_taskQueue;

MessageQueue?

> Source/WebCore/platform/network/curl/ResourceError.h:27
>  #ifndef ResourceError_h
>  #define ResourceError_h

#pragma once

> Source/WebCore/platform/network/curl/ResourceHandleCurl.cpp:74
> +    CurlJobManager::singleton().add(d->m_curlHandle, [this](CurlJobResult result) {

protectedThis = makeRef(*this)

> Source/WebCore/platform/network/curl/ResourceHandleCurl.cpp:99
> +            callOnMainThread([this] {

protectedThis = makeRef(*this)

> Source/WebCore/platform/network/curl/ResourceHandleCurl.cpp:710
> +    if (splitPos != -1) {

notFound

> Source/WebCore/platform/network/curl/ResourceHandleCurl.cpp:731
> +        if (statusCodePos != -1) {

ditto
Comment 3 Basuke Suzuki 2017-07-14 16:04:38 PDT
Thanks for review!
I've worried about the size of patch, but it seems okay about that. I'll fix'em pretty soon.
Comment 4 Basuke Suzuki 2017-07-14 16:35:27 PDT
(In reply to Alex Christensen from comment #2)
> > Source/WebCore/platform/network/ResourceHandle.h:297
> > +    Vector<Vector<char>> m_receivedQueue;
> 
> This is...interesting.

Ah... actually. I'll change to simple Vector<char> and append new data on it.


> > Source/WebCore/platform/network/curl/CurlContext.h:174
> > +class CurlSList {
> 
> Please put this in its own header.

Okay, but only this one? Or also others, such as CurlHandle, CurlMultiHandle, CurlShareHandle? I prefer every thin wrapper sit inside CurlContext until the purpose of the class is clear. CurlSList will be replaced with standard list or vector to return list of items in really near future.



> > Source/WebCore/platform/network/curl/CurlDownload.cpp:94
> > +            callOnMainThread([this] {
> 
> protectedThis = makeRef(*this) to prevent use-after-free bugs if the last
> reference goes out of scope before the lambda is called on the main thread.

I am not good at this idiom yet. Is this what you meant? Do I need internal protectThis? It seems I cannot pass by copy and passing by reference is also not right, I define same var again from this.
 

bool CurlDownload::start()
{
    m_job = CurlJobManager::singleton().add(m_curlHandle, 
        [this, protectThis = makeRef(*this)](CurlJobResult result) {
        
        switch (result) {
        case CurlJobResult::Done:
            callOnMainThread([this, protectThis = makeRef(*this)] {
                didFinish();
                deref();
            });
            break;
        }
    });
    return true;
}
Comment 5 Alex Christensen 2017-07-17 11:02:25 PDT
(In reply to Basuke Suzuki from comment #4)
> > > Source/WebCore/platform/network/curl/CurlContext.h:174
> > > +class CurlSList {
> > 
> > Please put this in its own header.
> 
> Okay, but only this one? Or also others, such as CurlHandle,
> CurlMultiHandle, CurlShareHandle? I prefer every thin wrapper sit inside
> CurlContext until the purpose of the class is clear. CurlSList will be
> replaced with standard list or vector to return list of items in really near
> future.
We should be moving towards one class per header.  If there's a thin wrapper only used in a certain context, use nested classes.

> > > Source/WebCore/platform/network/curl/CurlDownload.cpp:94
> > > +            callOnMainThread([this] {
> > 
> > protectedThis = makeRef(*this) to prevent use-after-free bugs if the last
> > reference goes out of scope before the lambda is called on the main thread.
> 
> I am not good at this idiom yet. Is this what you meant? Do I need internal
> protectThis? It seems I cannot pass by copy and passing by reference is also
> not right, I define same var again from this.
>  
> 
> bool CurlDownload::start()
> {
>     m_job = CurlJobManager::singleton().add(m_curlHandle, 
>         [this, protectThis = makeRef(*this)](CurlJobResult result) {
>         
>         switch (result) {
>         case CurlJobResult::Done:
>             callOnMainThread([this, protectThis = makeRef(*this)] {
>                 didFinish();
>                 deref();
>             });
>             break;
>         }
>     });
>     return true;
> }
When using callOnMainThread, you save a this pointer but there is no guarantee that when the lambda is actually called, this hasn't been deleted.  As it stands now, it could easily use this memory after its freed, which is *very* bad.  replacing [this] with [this, protectedThis = makeRef(*this)] with RefCounted types makes it so there is guaranteed to be at least one reference to this, so it won't be deleted before the lambda is called.  The object might be disconnected by then, which is why we null out pointers and check if clients still exist when calling client functions.
Comment 6 Basuke Suzuki 2017-07-17 11:08:40 PDT
Created attachment 315682 [details]
PATCH with reviewed advices
Comment 7 Basuke Suzuki 2017-07-17 11:13:20 PDT
> We should be moving towards one class per header.  If there's a thin wrapper
> only used in a certain context, use nested classes.

Got it. Can I make this change as a separate bug? We wanna keep our code base maintainable and easy to improve.
Comment 8 Alex Christensen 2017-07-17 11:26:31 PDT
Comment on attachment 315682 [details]
PATCH with reviewed advices

View in context: https://bugs.webkit.org/attachment.cgi?id=315682&action=review

Cool.  Let's do another iteration.

> Source/WebCore/platform/network/curl/CurlContext.cpp:300
> +CurlHandle* CurlHandle::recovery(CURL* h)

This isn't used, is it?  I also don't think it's well named.  Maybe fromCURL?

> Source/WebCore/platform/network/curl/CurlContext.cpp:348
> +    if (headers.size() > 0) {

I don't think the > 0 is necessary.

> Source/WebCore/platform/network/curl/CurlContext.h:178
> +    using native = struct curl_slist*;

I don't think this alias improves anything.

> Source/WebCore/platform/network/curl/CurlContext.h:184
> +    CurlSList(CurlSList&& other) { std::swap(m_list, other.m_list); }
> +    CurlSList& operator=(CurlSList&& other) { std::swap(m_list, other.m_list); return *this; }

I don't think this is a proper use of std::swap.  We don't need other to get the current pointer from this.

> Source/WebCore/platform/network/curl/CurlDownload.cpp:90
> +    m_job = CurlJobManager::singleton().add(m_curlHandle, 
> +        [protectedThis = makeRef(*this)](CurlJobResult result) mutable {

This can be on one line.

> Source/WebCore/platform/network/curl/CurlDownload.cpp:226
> +    callOnMainThread([protectedThis = makeRef(*this)]() mutable {
> +        if (protectedThis->m_listener)

You can also capture this so you don't need all the protectedThis->

> Source/WebCore/platform/network/curl/CurlDownload.cpp:242
> +        if (protectedThis->m_listener)

ditto

> Source/WebCore/platform/network/curl/CurlJobManager.h:80
> +        std::swap(m_curl, other.m_curl);
> +        std::swap(m_job, other.m_job);

Again, other doesn't need to get the pointer from this.

> Source/WebCore/platform/network/curl/ResourceHandleCurl.cpp:74
> +        [protectedThis = makeRef(*this)](CurlJobResult result) {

Capture this, too

> Source/WebCore/platform/network/curl/ResourceHandleCurl.cpp:156
> +    if (firstRequest().httpHeaderFields().size() > 0) {

> 0 unnecessary

> Source/WebCore/platform/network/curl/ResourceHandleCurl.cpp:898
> +        callOnMainThread([job, httpCode, contentLength] {

It would prevent some use-after-free bugs to make a RefPtr for job, too.  Right now there is no guarantee the ResourceHandle still exists when the lambda is called on the main thread.
job = RefPtr<ResourceHandle>(job)

> Source/WebCore/platform/network/curl/ResourceHandleCurl.cpp:941
> +        callOnMainThread([job] {

ditto
Comment 9 Alex Christensen 2017-07-17 11:27:23 PDT
(In reply to Basuke Suzuki from comment #7)
> > We should be moving towards one class per header.  If there's a thin wrapper
> > only used in a certain context, use nested classes.
> 
> Got it. Can I make this change as a separate bug? We wanna keep our code
> base maintainable and easy to improve.
Sure.
Comment 10 Basuke Suzuki 2017-07-17 12:09:33 PDT
I've observed the crash caused by calling method of the deleted ResourceHandle over the thread boundary. protectedThis = makeRef(*this) helps a lot. Thanks for the very good advice.
Comment 11 Basuke Suzuki 2017-07-17 17:02:39 PDT
Created attachment 315732 [details]
PATCH with reviewed advices
Comment 12 Alex Christensen 2017-07-17 23:52:40 PDT
Comment on attachment 315732 [details]
PATCH with reviewed advices

View in context: https://bugs.webkit.org/attachment.cgi?id=315732&action=review

Let's land this.

> Source/WebCore/platform/network/curl/CurlJobManager.cpp:91
> +        return withJob(ticket, [&callback](JobMap::iterator it) {

Here we just call the callback immediately, but if we needed to do anything interesting with the callback, we would want to move it into the lambda capture list, not just capturing it by reference.
[callback = WTFMove(callback)]
This can be done in a future patch if needed.

> Source/WebCore/platform/network/curl/CurlJobManager.h:60
> +        : m_curl { curl }, m_job { WTFMove(job) } { }

These could be on different lines.

> Source/WebCore/platform/network/curl/CurlJobManager.h:64
> +        other.m_curl = nullptr;

It's probably not necessary to null out a pointer of something that was moved into this constructor.

> Source/WebCore/platform/network/curl/ResourceHandleCurl.cpp:712
> +    int splitPos = header.find(":");

auto or size_t.
Also please rename splitPos to splitPosition in a followup patch.
Comment 13 WebKit Commit Bot 2017-07-18 00:20:42 PDT
Comment on attachment 315732 [details]
PATCH with reviewed advices

Clearing flags on attachment: 315732

Committed r219606: <http://trac.webkit.org/changeset/219606>
Comment 14 WebKit Commit Bot 2017-07-18 00:20:44 PDT
All reviewed patches have been landed.  Closing bug.
Comment 15 Basuke Suzuki 2017-07-18 11:49:22 PDT
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=315732&action=review
> 
> Let's land this.

Thank you!


> > Source/WebCore/platform/network/curl/CurlJobManager.cpp:91
> > +        return withJob(ticket, [&callback](JobMap::iterator it) {
> 
> Here we just call the callback immediately, but if we needed to do anything
> interesting with the callback, we would want to move it into the lambda
> capture list, not just capturing it by reference.
> [callback = WTFMove(callback)]
> This can be done in a future patch if needed.


The purpose of that is to change configuration of CurlHandle safely, but nobody knows what will be required in the future. I make that point in my mind. Thanks.


> > Source/WebCore/platform/network/curl/CurlJobManager.h:60
> > +        : m_curl { curl }, m_job { WTFMove(job) } { }
> 
> These could be on different lines.

Okay.

> > Source/WebCore/platform/network/curl/CurlJobManager.h:64
> > +        other.m_curl = nullptr;
> 
> It's probably not necessary to null out a pointer of something that was
> moved into this constructor.

Not necessary, but it is safe to clear that because move semantics doesn't prevent accessing old instance:

auto job2 = WTFMove(job);
job.curlHandle(); // <-- still accessible 

At least VS2015 allow this. I think it is still safe and worth clearing out the value will make the future problem clear.

> > Source/WebCore/platform/network/curl/ResourceHandleCurl.cpp:712
> > +    int splitPos = header.find(":");
> 
> auto or size_t.
> Also please rename splitPos to splitPosition in a followup patch.

got it.