Bug 45909 - BlobData should be copied for it to be used cross-thread in ThreadableBlobRegistry
Summary: BlobData should be copied for it to be used cross-thread in ThreadableBlobReg...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore JavaScript (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Jian Li
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-09-16 11:07 PDT by Jian Li
Modified: 2010-12-15 11:47 PST (History)
3 users (show)

See Also:


Attachments
Proposed Patch (2.36 KB, patch)
2010-09-16 11:24 PDT, Jian Li
levin: review+
jianli: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jian Li 2010-09-16 11:07:57 PDT
BlobData should be copied for it to be used cross-thread in ThreadableBlobRegistry. This is to pull part of the fix from bug 45576. Other part of the fix remain in that bug and will be relanded when the discussion is settled.
Comment 1 Jian Li 2010-09-16 11:24:30 PDT
Created attachment 67820 [details]
Proposed Patch
Comment 2 David Levin 2010-09-16 11:27:50 PDT
Comment on attachment 67820 [details]
Proposed Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=67820&action=prettypatch

> LayoutTests/ChangeLog:8
> +        Also fix a test issue. 

This is a bit generic. It would be nice to say what test issue is being fixed.
Comment 3 Michael Nordman 2010-09-17 14:24:46 PDT
ownership of the BlobData instance is being passed to the registry... why does it need to be copied?
Comment 4 Jian Li 2010-09-17 14:29:08 PDT
This is because BlobData contains String and KURL that need to be copied explicitly in order for it to be used in other thread.
Comment 5 Michael Nordman 2010-09-17 14:35:50 PDT
what about the underlying CString's in BlobDataItems... do copies of those strings need to be made?
Comment 6 Jian Li 2010-09-17 15:01:30 PDT
(In reply to comment #5)
> what about the underlying CString's in BlobDataItems... do copies of those strings need to be made?

This is fine. See comment at line 42 in BlobData.cpp.
Comment 7 Michael Nordman 2010-09-17 15:12:02 PDT
I see that line...
    data = item.data; // This is OK because the underlying storage is Vector<char>.
... and it appears that a copy is being made

My point is do we need to make a copy since the registry has been given of ownership. It looks like a copy will be made and the original promptly deleted, so why make the copy?

Sorry for harping on the copies, but I think we should expect largish amounts of data in these elements. I can't help but notice when I see a copy being made. I haven't counted the total number of memcpys involved to actually get one of these registered in chrome... that should be interesting.
Comment 8 David Levin 2010-09-17 15:23:43 PDT
(In reply to comment #7)
> I see that line...
>     data = item.data; // This is OK because the underlying storage is Vector<char>.
> ... and it appears that a copy is being made
> 
> My point is do we need to make a copy since the registry has been given of ownership. It looks like a copy will be made and the original promptly deleted, so why make the copy?
> 
> Sorry for harping on the copies, but I think we should expect largish amounts of data in these elements. I can't help but notice when I see a copy being made. I haven't counted the total number of memcpys involved to actually get one of these registered in chrome... that should be interesting.

fwiw, strings may be "copied" very quickly actually.

The key problem is that once you have a String (which is ref counted in a non-threadsafe manner) you don't know who else on the thread may have it also referenced.

It seems best that we get this code correct and then optimize.
Comment 9 David Levin 2010-09-17 15:24:39 PDT
(In reply to comment #8)
> It seems best that we get this code correct and then optimize.

I meant to say optimize as needed. :)
Comment 10 Michael Nordman 2010-09-17 15:37:30 PDT
Ok... I'll try to continue to point out the unneeded copies and continue to wonder how many years it will take for them to be removed ;)
Comment 11 David Levin 2010-09-17 16:03:14 PDT
michaeln pointed out that this line is incorrrect: 
  data = item.data; // This is OK because the underlying storage is Vector<char>.

Because data isn't a Vector<char> nor a wrapper around one. It is a CString which is a thin wrapper around a RefCounted object.
Comment 12 Jian Li 2010-09-17 16:06:19 PDT
(In reply to comment #11)
> michaeln pointed out that this line is incorrrect: 
>   data = item.data; // This is OK because the underlying storage is Vector<char>.
> 
> Because data isn't a Vector<char> nor a wrapper around one. It is a CString which is a thin wrapper around a RefCounted object.

Will fix. Thanks.
Comment 13 Michael Nordman 2010-09-17 17:00:44 PDT
(In reply to comment #7)
> I see that line...
>     data = item.data; // This is OK because the underlying storage is Vector<char>.
> ... and it appears that a copy is being made
> 
> My point is do we need to make a copy since the registry has been given of ownership. It looks like a copy will be made and the original promptly deleted, so why make the copy?
> 
> Sorry for harping on the copies, but I think we should expect largish amounts of data in these elements. I can't help but notice when I see a copy being made. I haven't counted the total number of memcpys involved to actually get one of these registered in chrome... that should be interesting.

Some observations from our chat...

We're using 'copy()' semantics here in order to pass ownership from one thread to another. Copy works for that, but is also overkill for that since we don't need to retain an instance for use on the original thread. A method like 'detachFromThread' or 'disconnectFromThread' might make a better fit than this usage of blobData.copy().

I wonder if we should even have a copy() method on BlobData? I would prefer a way to produce a deep copy were not so readily available to help avoid that from happening.

COM has semantics like prepareToSwitchThreads() and switchThreads() for objects with thread affinity, where the former happens on the originating thread and the later on the new 'owning' thread. Maybe something along those lines for BlobData objects?
Comment 14 Eric Seidel (no email) 2010-12-14 01:59:24 PST
What's the status on this patch?
Comment 15 Jian Li 2010-12-15 11:42:03 PST
Forgot to close.

Committed as http://trac.webkit.org/changeset/67646.
Comment 16 David Levin 2010-12-15 11:47:15 PST
(In reply to comment #15)
> Forgot to close.
> 
> Committed as http://trac.webkit.org/changeset/67646.

Is there a new bug for the semantics that Michael described to reduce copies?