Bug 138835 - Move Blob loading out of ResourceHandle
Summary: Move Blob loading out of ResourceHandle
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: youenn fablet
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-18 08:00 PST by youenn fablet
Modified: 2017-10-14 16:54 PDT (History)
5 users (show)

See Also:


Attachments
WIP (47.46 KB, patch)
2014-11-19 01:50 PST, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (34.83 KB, patch)
2014-11-20 03:30 PST, youenn fablet
no flags Details | Formatted Diff | Diff
Rebasing (34.80 KB, patch)
2014-11-20 05:25 PST, youenn fablet
no flags Details | Formatted Diff | Diff
WIP: Moving BlobResourceHandle out of ResourceHandle (65.33 KB, patch)
2014-11-25 00:01 PST, youenn fablet
no flags Details | Formatted Diff | Diff
Introducing generic ResourceHandle replacement (51.45 KB, patch)
2014-12-02 05:59 PST, youenn fablet
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description youenn fablet 2014-11-18 08:00:21 PST
As discussed in bug 109199 and bug 137005, it would be appropriate to move blob loading outside of ResourceHandle.
Probably blob loading should always be done in the Web Process and not in the Networking Process.
Comment 1 youenn fablet 2014-11-18 08:08:43 PST
One approach is to subclass SubresourceLoader to specialize it for different resource types: blobs, data URLs...
That would mean moving ResourceHandle-specific ResourceLoader stuff in specialized classes for WK1 and WK2.
That may also mean merging ResourceLoader and SubresourceLoader in one abstract class.
Comment 2 youenn fablet 2014-11-19 01:50:06 PST
Created attachment 241851 [details]
WIP
Comment 3 Antti Koivisto 2014-11-19 02:23:12 PST
Comment on attachment 241851 [details]
WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=241851&action=review

> Source/WebCore/loader/blob/BlobResourceLoader.h:17
> +class BlobResourceLoader final : public SubresourceLoader {
> +public:

Instead of subclassing SubresourceLoader I think it would be better to just put blob functionality to SubresourceLoader itself. It is ok to have extra members for that case.

We probably want to eliminate the ResourceLoader class hierarchy and just smash SubresourceLoader and ResourceLoader together eventually. The only other subclass is the obscure NetscapePlugInStreamLoader.

> Source/WebCore/loader/blob/BlobResourceLoader.h:35
> +    // FIXME: Integrate implementation directly within BlobResourceLoader
> +    RefPtr<BlobResourceHandle> m_blobHandle;

Or maybe BlobResourceHandle just needs a better name. I think encapsulating blob specific functionality to a class makes sense, it just better to do it without subclassing.
Comment 4 Antti Koivisto 2014-11-19 02:27:32 PST
> Instead of subclassing SubresourceLoader I think it would be better to just
> put blob functionality to SubresourceLoader itself. It is ok to have extra
> members for that case.

... or really hooks for the blob functionality, along the lines of

SubresourceLoader::scheduleLoad() 
{
if (isBlob) {
   m_blobData->load();
   return;
}
...regular load
Comment 5 youenn fablet 2014-11-20 03:30:26 PST
Created attachment 241938 [details]
Patch
Comment 6 youenn fablet 2014-11-20 05:25:36 PST
Created attachment 241941 [details]
Rebasing
Comment 7 Alexey Proskuryakov 2014-11-20 11:18:52 PST
Comment on attachment 241941 [details]
Rebasing

View in context: https://bugs.webkit.org/attachment.cgi?id=241941&action=review

> Source/WebCore/ChangeLog:3
> +        Move Blob loading in Web Process.

I thought that what we were discussing was moving Blob URL resolution to Web Process, not actual blob loading.

I don't see how it makes sense to handle blobs in Web Process, give that multiple processes can use the same blob.
Comment 8 Alexey Proskuryakov 2014-11-21 15:50:10 PST
The original idea behind blobs is that they are truly large binary objects. They exist primarily to avoid the need to move data between networking process and web process.

I think that blobs somewhat outgrew this vision, yet one way or another, they are still very much a networking concept. They are needed to deal with either uploads or downloads. 

One thing that we do need to do is actually implement this vision for XMLHttpRequest. Currently, we send all loaded data to WebProcess, and immediately send it back to create a blob. That's just crazy.
Comment 9 youenn fablet 2014-11-21 16:52:25 PST
(In reply to comment #7)
> Comment on attachment 241941 [details]
> Rebasing
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=241941&action=review
> 
> > Source/WebCore/ChangeLog:3
> > +        Move Blob loading in Web Process.
> 
> I thought that what we were discussing was moving Blob URL resolution to Web
> Process, not actual blob loading.

Not sure about the exact meaning of the terms you are using.
The goal is to move to the Web Process the translation of a BlobData into a HTTP response. If we want to do that, the Web Process should have a way to get a BlobData from its URL. One way would be to extend BlobRegistryProxy like it is done for Blob size. 

> I don't see how it makes sense to handle blobs in Web Process, give that
> multiple processes can use the same blob.

Agreed.
I wonder though how often one blob created within a Web Process is actually used in another Web Process.
Comment 10 youenn fablet 2014-11-21 16:55:12 PST
(In reply to comment #8)
> The original idea behind blobs is that they are truly large binary objects.
> They exist primarily to avoid the need to move data between networking
> process and web process.
> 
> I think that blobs somewhat outgrew this vision, yet one way or another,
> they are still very much a networking concept. They are needed to deal with
> either uploads or downloads. 
> 
> One thing that we do need to do is actually implement this vision for
> XMLHttpRequest. Currently, we send all loaded data to WebProcess, and
> immediately send it back to create a blob. That's just crazy.

Definitely.
IIRC, Blink solution would be something like adding a loading parameter to store  the data as a blob directly within the Networking Process. Probably data is still transferred to the Web Process/XHR so that its state gets updated (progress events for instance).

A different approach is to store BlobData in Web Processes, the Networking Process managing a mapping of all blob URLs with the one Web Process that owns the corresponding BlobData. Not sure the added complexity is worth the effort.
Comment 11 youenn fablet 2014-11-25 00:01:12 PST
Created attachment 242187 [details]
WIP: Moving BlobResourceHandle out of ResourceHandle
Comment 12 youenn fablet 2014-11-25 01:02:07 PST
(In reply to comment #11)
> Created attachment 242187 [details]
> WIP: Moving BlobResourceHandle out of ResourceHandle

This patch introduces a new interface between ResourceLoader and ResourceHandle. It is basically an abstract version of ResourceHandle, tentatively called ResourceResolver.
I think we need a clean interface for resource resolution, ResourceHandle cannot be that interface obviously.

The plan would be the following:
- Introduce ResourceResolver interface and make ResourceHandle and BlobResourceHandle derive from it (WIP patch is about that)
- Add DataURL resolver using ResourceResolver interface

The next step would be to reconcile ResourceLoader loading w/o Networking process.
The steps would be:
- Split ResourceHandleClient in two: a generic ResourceResolverClient and a more specific ResourceHandleClient
- Make WebResourceLoader derive from ResourceResolver. Let it be stored as the ResourceLoader handle. WebResourceLoader would be made responsible of communicating with the networking process (defers loading, cancel... or on the other way didReceiveResponse...)
- Create a new  class (probably specific to WK1) that implements ResourceHandleClient and abstracts ResourceLoader from all ResourceHandleClient specific callbacks. Let it be stored as the ResourceLoader handle. This will allow ResourceLoader to implement ResourceResolverClient only
- Make other ResourceHandleClient implementing classes (PingLoader and ApplicationCacheGroup) derive from ResourceResolverClient only and use the WK1/WK2 ResourceResolver classes, not ResourceHandle directly.

A further step may consist in improving ResourceHandle directly, for instance:
- Changing names for ResourceHandle and ResourceHandleClient
- Splitting ResourceHandle in several platform-specific classes (cf, curl, soup), all implementing ResourceResolver or a subclass of it.
Comment 13 youenn fablet 2014-12-02 05:59:36 PST
Created attachment 242407 [details]
Introducing generic ResourceHandle replacement
Comment 14 youenn fablet 2014-12-02 07:38:28 PST
(In reply to comment #13)
> Created attachment 242407 [details]
> Introducing generic ResourceHandle replacement

This patch introduces a simple interface (ResourceResolver) from which BlobResourceHandle derives (and ResourceHandle should as well).
I did the exercise of migrating from ResourceHandle to ResourceResolver the different loaders (resource loader, network resource loader, ping and appcache loaders, not had time to port download stuff though).
This works smoothly. I can upload it somewhere if that helps the review process.

I am not cq? the patch as it probably fails compiling on Windows (missing project solution update).
Comment 15 youenn fablet 2014-12-05 07:40:48 PST
(In reply to comment #13)
> Created attachment 242407 [details]
> Introducing generic ResourceHandle replacement

I uploaded commits that would do the migration from ResourceHandle/ResourceHandleClient to ResourceResolver/ResourceResolverClient at: https://github.com/canonresearchfrance/webkit/commits/resourcehandle-to-resourceresolver-migration

The branch HEAD is compiling and seems to run well for GTK.
It is compiling in Mac, seems to run well for mac-wk2, mac-wk1 is crashing.
Debug build is untested.

The commits that would be missing are:
- Making WebResourceLoader (or equivalent) to implement ResourceResolver interface
- Moving Ping and Appcache to use networking process for WK2 (through WebResourceLoader or equivalent)
- Unifying ResourceLoader WK1 and WK2 implementation by having a ResourceResolver in both cases, all callbacks going to ResourceLoader directly for WK1 and WK2).
- Clean-up ResourceLoader from WK1 specific stuff (moving ResourceHandleClient callback implementation somewhere else)