Bug 191649

Summary: [Curl] Implement synchronous data url handling for NetworkDataTask on Curl Port.
Product: WebKit Reporter: Basuke Suzuki <Basuke.Suzuki>
Component: PlatformAssignee: Nobody <webkit-unassigned>
Status: NEW ---    
Severity: Normal CC: achristensen, Basuke.Suzuki, chris.reid, don.olmstead, Hironori.Fujii, ross.kirsling, takashi.komori, youennf
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch none

Description Basuke Suzuki 2018-11-14 12:50:20 PST
I don't know the policy of WebKit for synchronous data: url handling, but it falls to NetworkDataTask and libcurl cannot handle the request so that it fails inside network process. Current plan is to block the request in NetworkDataTask.
Comment 1 Takashi Komori 2019-02-27 00:00:43 PST
Created attachment 363078 [details]
Patch
Comment 2 Takashi Komori 2019-02-27 00:06:34 PST
This patch enables to access to data url synchronously.
If implement it, we don't have to block the requests.
Comment 3 Basuke Suzuki 2019-02-27 13:50:38 PST
Comment on attachment 363078 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363078&action=review

This is informal review.

> Source/WebKit/ChangeLog:3
> +        [Curl] Prevent Data URL Scheme to be called from NetworkDataTask.

Please make the title matched with the bug. Sorry I've changed the bug title to match what actually this patch does.

> Source/WebKit/ChangeLog:8
> +        Implement sync access to Data URL for curl port.

"Currently synchronous data url access is not handled in NetworkDataTask and passed to curl layer. Curl doesn't handle data url so that it just fails. Catching the synchronous data request at the beginning of the NetworkDataTask and handle it synchronously."

> Source/WebKit/NetworkProcess/curl/NetworkDataTaskCurl.cpp:462
> +void NetworkDataTaskCurl::loadDataURL(const URL& url)

This method is just invoke DataURLDecode::decode with a big chunk of response handling lambda. How about moving out the lambda as an method such as respondToDataRequest() and move invocation of decode() to the constructor above? Callback chain makes following the flow of code harder. If it happens, it should be nice to be one place together to understand the overview of the structure.

ex)
    if (requestWithCredentials.url().protocolIsData()) {
        callOnMainThread([this, url = requestWithCredentials.url()] {
            DataURLDecoder::ScheduleContext scheduleContext;
            DataURLDecoder::decode(url, scheduleContext, [this, protectedThis = makeRef(*this), url](auto decodeResult) mutable {
                respondToDataRequest(url, decodeResult);
            });
        });
        return;
    }

> Source/WebKit/NetworkProcess/curl/NetworkDataTaskCurl.cpp:484
> +        dataResponse.setSource(ResourceResponse::Source::Network);

This is out of topic from this bug, but I'm not sure this is correct. I know this comes from other part of code, but it can be new Source type.
https://github.com/webkit/webkit/blob/master/Source/WebCore/loader/ResourceLoader.cpp#L276

> LayoutTests/platform/wincairo-wk1/TestExpectations:17
> +fast/encoding/char-decoding.html [ Skip ]

You can just skip fast/encoding as it used to be because you didn't do any change on WK Legacy.
fast/encoding [ Skip ]
Comment 4 Basuke Suzuki 2019-02-27 23:51:36 PST
Comment on attachment 363078 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363078&action=review

> Source/WebKit/NetworkProcess/curl/NetworkDataTaskCurl.cpp:53
> +        callOnMainThread([this, url = requestWithCredentials.url()] {

No need to protectThis here? It may be ref counted by caller, but it should be captured by this lambda to make sure `this` is alive at the execution.

>> Source/WebKit/NetworkProcess/curl/NetworkDataTaskCurl.cpp:462
>> +void NetworkDataTaskCurl::loadDataURL(const URL& url)
> 
> This method is just invoke DataURLDecode::decode with a big chunk of response handling lambda. How about moving out the lambda as an method such as respondToDataRequest() and move invocation of decode() to the constructor above? Callback chain makes following the flow of code harder. If it happens, it should be nice to be one place together to understand the overview of the structure.
> 
> ex)
>     if (requestWithCredentials.url().protocolIsData()) {
>         callOnMainThread([this, url = requestWithCredentials.url()] {
>             DataURLDecoder::ScheduleContext scheduleContext;
>             DataURLDecoder::decode(url, scheduleContext, [this, protectedThis = makeRef(*this), url](auto decodeResult) mutable {
>                 respondToDataRequest(url, decodeResult);
>             });
>         });
>         return;
>     }

Sorry I forget about MSVC lambda issue, that multiple level of lambda causes lose `this` context. It is okay as is.
Comment 5 Don Olmstead 2019-02-28 12:19:29 PST
Comment on attachment 363078 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363078&action=review

Fix Basuke's comments and I think we're good to go

> Source/WebKit/NetworkProcess/curl/NetworkDataTaskCurl.cpp:485
> +        m_client->didReceiveResponse(WTFMove(dataResponse), [this, protectedThis = WTFMove(protectedThis), dataSize, data = result.data.releaseNonNull()](WebCore::PolicyAction) mutable {

protectedThis.copyRef() is what we've been doing when hitting this.
Comment 6 Takashi Komori 2019-03-04 17:50:50 PST
Comment on attachment 363078 [details]
Patch

I found a defect in my patch.
NetworkDataTask and CurlRequest can't handle the request which redirects to data URL.

I'll try to find solution.