Bug 115804

Summary: Avoid unnecessary data copies when loading subresources with DoNotBufferData option
Product: WebKit Reporter: Carlos Garcia Campos <cgarcia>
Component: Page LoadingAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, beidson, buildbot, commit-queue, darin, esprehn+autocc, japhet, pnormand, rniwa, webkit-ews
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 118169    
Bug Blocks: 73743    
Attachments:
Description Flags
Patch
buildbot: commit-queue-
Try to fix mac build
beidson: review-
Updated patch
buildbot: commit-queue-
Archive of layout-test-results from webkit-ews-09 for mac-mountainlion-wk2
none
New patch
webkit-ews: commit-queue-
Try to fix the build with CSS_SHADERS
buildbot: commit-queue-
Archive of layout-test-results from webkit-ews-05 for mac-mountainlion
none
Updated patch
buildbot: commit-queue-
Archive of layout-test-results from webkit-ews-01 for mac-mountainlion
none
Try to fix mac tests
none
Patch rebased to apply on current git master
darin: review-
Updated patch
darin: review+
Patch for landing none

Description Carlos Garcia Campos 2013-05-08 07:59:50 PDT
When DoNotBufferData option is used to load a resource, its data is always copied before sending it to the CachedResource. Most of the cached resources ignore the incremental data and wait until all data has been received to save the ResourceBuffer, that will be NULL anyway when DoNotBufferData is used. CachedRawResource notifies its clients about the incremental data, but it doesn't save the data when DoNotBufferData option is present. In those cases we are unnecessary copying the data.
Comment 1 Carlos Garcia Campos 2013-05-08 08:04:44 PDT
Created attachment 201075 [details]
Patch
Comment 2 Build Bot 2013-05-08 08:39:48 PDT
Comment on attachment 201075 [details]
Patch

Attachment 201075 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/434337
Comment 3 Build Bot 2013-05-08 08:42:43 PDT
Comment on attachment 201075 [details]
Patch

Attachment 201075 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.appspot.com/results/434336
Comment 4 Carlos Garcia Campos 2013-05-08 08:45:03 PDT
Created attachment 201079 [details]
Try to fix mac build
Comment 5 Brady Eidson 2013-05-08 10:45:36 PDT
Comment on attachment 201079 [details]
Try to fix mac build

View in context: https://bugs.webkit.org/attachment.cgi?id=201079&action=review

Good direction.

We plan to do a pretty significant overhaul of Cache/Loader land somewhat soon, so it'll be important to make very subtle things like this more difficult to screw up while refactoring.

That's the motivation for my review comments below.

> Source/WebCore/loader/cache/CachedImage.cpp:402
> +void CachedImage::data(const char* data, int length)
> +{
> +    CachedImage::data(ResourceBuffer::create(data, length), false);
> +}

Let's throw the ASSERT(m_options.dataBufferingPolicy == DoNotBufferData); here, too.

> Source/WebCore/loader/cache/CachedResource.h:105
> +    virtual void data(const char* /*data*/, int /*length*/) { }

Lets give this an implementation in CachedResource.cpp and have it ASSERT(m_options.dataBufferingPolicy == DoNotBufferData)

Assuming all tests continue passing without that ASSERT firing, I'd like us to take it a step further and name this method something different instead of just overloading ::data().  "unbufferedData()" is my first instinct, though I'm aware that's not perfect.
Comment 6 Carlos Garcia Campos 2013-05-08 10:58:59 PDT
Created attachment 201083 [details]
Updated patch

Patch updated according to review comments.
Comment 7 Carlos Garcia Campos 2013-05-08 11:02:12 PDT
(In reply to comment #5)
> (From update of attachment 201079 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=201079&action=review
> 
> Good direction.
> 
> We plan to do a pretty significant overhaul of Cache/Loader land somewhat soon, so it'll be important to make very subtle things like this more difficult to screw up while refactoring.
> 
> That's the motivation for my review comments below.
> 
> > Source/WebCore/loader/cache/CachedImage.cpp:402
> > +void CachedImage::data(const char* data, int length)
> > +{
> > +    CachedImage::data(ResourceBuffer::create(data, length), false);
> > +}
> 
> Let's throw the ASSERT(m_options.dataBufferingPolicy == DoNotBufferData); here, too.

Sure.

> > Source/WebCore/loader/cache/CachedResource.h:105
> > +    virtual void data(const char* /*data*/, int /*length*/) { }
> 
> Lets give this an implementation in CachedResource.cpp and have it ASSERT(m_options.dataBufferingPolicy == DoNotBufferData)
> 
> Assuming all tests continue passing without that ASSERT firing, I'd like us to take it a step further and name this method something different instead of just overloading ::data().  "unbufferedData()" is my first instinct, though I'm aware that's not perfect.

Yes, I also thought about the names, my initial idea was to rename the current one to buffer or resourceBuffer and this new as data, but I wanted to keep the patch as simple as possible and buffer doesn't sound like a great name anyway. For now, I left both as data so that they can be renamed when refactoring. 

Do apple EWS bots run the tests with asserts enabled?
Comment 8 Ryosuke Niwa 2013-05-08 11:03:57 PDT
Comment on attachment 201079 [details]
Try to fix mac build

View in context: https://bugs.webkit.org/attachment.cgi?id=201079&action=review

>>> Source/WebCore/loader/cache/CachedResource.h:105
>>> +    virtual void data(const char* /*data*/, int /*length*/) { }
>> 
>> Lets give this an implementation in CachedResource.cpp and have it ASSERT(m_options.dataBufferingPolicy == DoNotBufferData)
>> 
>> Assuming all tests continue passing without that ASSERT firing, I'd like us to take it a step further and name this method something different instead of just overloading ::data().  "unbufferedData()" is my first instinct, though I'm aware that's not perfect.
> 
> Yes, I also thought about the names, my initial idea was to rename the current one to buffer or resourceBuffer and this new as data, but I wanted to keep the patch as simple as possible and buffer doesn't sound like a great name anyway. For now, I left both as data so that they can be renamed when refactoring. 
> 
> Do apple EWS bots run the tests with asserts enabled?

No. You have to run tests manually on a debug build yourself as clearly outlined in http://www.webkit.org/coding/contributing.html.
Comment 9 Brady Eidson 2013-05-08 11:11:05 PDT
(In reply to comment #7)
> (In reply to comment #5)

> > > Source/WebCore/loader/cache/CachedResource.h:105
> > > +    virtual void data(const char* /*data*/, int /*length*/) { }

> > Assuming all tests continue passing without that ASSERT firing, I'd like us to take it a step further and name this method something different instead of just overloading ::data().  "unbufferedData()" is my first instinct, though I'm aware that's not perfect.
> 
> Yes, I also thought about the names, my initial idea was to rename the current one to buffer or resourceBuffer and this new as data, but I wanted to keep the patch as simple as possible and buffer doesn't sound like a great name anyway. For now, I left both as data so that they can be renamed when refactoring. 

Please name it something else now.

"data()" is already such a truly terrible name, and overloading truly terrible names is even more truly terrible.
Comment 10 Build Bot 2013-05-08 11:38:26 PDT
Comment on attachment 201083 [details]
Updated patch

Attachment 201083 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/353394

New failing tests:
http/tests/cache/cached-main-resource.html
Comment 11 Build Bot 2013-05-08 11:38:28 PDT
Created attachment 201088 [details]
Archive of layout-test-results from webkit-ews-09 for mac-mountainlion-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: webkit-ews-09  Port: mac-mountainlion-wk2  Platform: Mac OS X 10.8.2
Comment 12 Carlos Garcia Campos 2013-05-08 12:17:41 PDT
(In reply to comment #8)
> (From update of attachment 201079 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=201079&action=review
> 
> >>> Source/WebCore/loader/cache/CachedResource.h:105
> >>> +    virtual void data(const char* /*data*/, int /*length*/) { }
> >> 
> >> Lets give this an implementation in CachedResource.cpp and have it ASSERT(m_options.dataBufferingPolicy == DoNotBufferData)
> >> 
> >> Assuming all tests continue passing without that ASSERT firing, I'd like us to take it a step further and name this method something different instead of just overloading ::data().  "unbufferedData()" is my first instinct, though I'm aware that's not perfect.
> > 
> > Yes, I also thought about the names, my initial idea was to rename the current one to buffer or resourceBuffer and this new as data, but I wanted to keep the patch as simple as possible and buffer doesn't sound like a great name anyway. For now, I left both as data so that they can be renamed when refactoring. 
> > 
> > Do apple EWS bots run the tests with asserts enabled?
> 
> No. You have to run tests manually on a debug build yourself as clearly outlined in http://www.webkit.org/coding/contributing.html.

Sure, I know, problem is that I can't build debug builds in my laptop, I'll try to build with assertions enabled in any case. But I asked also because sometimes even if tests pass locally, it still fails in other platforms or some tests are skipped for my platform and not others, so if mac ews ran tests with assertions enabled, I could be extra sure.
Comment 13 Carlos Garcia Campos 2013-05-08 12:19:04 PDT
(In reply to comment #10)
> (From update of attachment 201083 [details])
> Attachment 201083 [details] did not pass mac-wk2-ews (mac-wk2):
> Output: http://webkit-queues.appspot.com/results/353394
> 
> New failing tests:
> http/tests/cache/cached-main-resource.html

This passes for me locally, I'll check the results attached.
Comment 14 Carlos Garcia Campos 2013-05-08 12:24:15 PDT
(In reply to comment #9)
> (In reply to comment #7)
> > (In reply to comment #5)
> 
> > > > Source/WebCore/loader/cache/CachedResource.h:105
> > > > +    virtual void data(const char* /*data*/, int /*length*/) { }
> 
> > > Assuming all tests continue passing without that ASSERT firing, I'd like us to take it a step further and name this method something different instead of just overloading ::data().  "unbufferedData()" is my first instinct, though I'm aware that's not perfect.
> > 
> > Yes, I also thought about the names, my initial idea was to rename the current one to buffer or resourceBuffer and this new as data, but I wanted to keep the patch as simple as possible and buffer doesn't sound like a great name anyway. For now, I left both as data so that they can be renamed when refactoring. 
> 
> Please name it something else now.
> 
> "data()" is already such a truly terrible name, and overloading truly terrible names is even more truly terrible.

Yeah, I agree, I'll do it, I thought you also preferred to keep it like this for now and rename it as part of the refactoring.
Comment 15 Carlos Garcia Campos 2013-05-09 11:03:53 PDT
Created attachment 201252 [details]
New patch

I've renamed the methods, actually I've split data into 3 addDataBuffer and addData to add data chunks to the resource when buffering/not buffering and finishLoading called when all data has been received. this way we get rid of the boolean parameter and cached resources only interested in the final data, only implement finishLoading. I've also noticed that sendDataToResource is confusing, so I've removed it to call resource methods directly. This way is more obvious when data needs to be copied and when to call addData, addDataBuffer or finishLoading.
Comment 16 Early Warning System Bot 2013-05-09 11:13:18 PDT
Comment on attachment 201252 [details]
New patch

Attachment 201252 [details] did not pass qt-ews (qt):
Output: http://webkit-queues.appspot.com/results/433074
Comment 17 Early Warning System Bot 2013-05-09 11:14:51 PDT
Comment on attachment 201252 [details]
New patch

Attachment 201252 [details] did not pass qt-wk2-ews (qt-wk2):
Output: http://webkit-queues.appspot.com/results/329153
Comment 18 Build Bot 2013-05-09 11:43:36 PDT
Comment on attachment 201252 [details]
New patch

Attachment 201252 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.appspot.com/results/279089
Comment 19 Build Bot 2013-05-09 11:53:17 PDT
Comment on attachment 201252 [details]
New patch

Attachment 201252 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/329175
Comment 20 Carlos Garcia Campos 2013-05-09 12:16:44 PDT
Created attachment 201262 [details]
Try to fix the build with CSS_SHADERS

Forgot to update CachedShader
Comment 21 Build Bot 2013-05-09 18:11:06 PDT
Comment on attachment 201262 [details]
Try to fix the build with CSS_SHADERS

Attachment 201262 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.appspot.com/results/398171

New failing tests:
platform/mac/fast/loader/file-url-mimetypes-3.html
platform/mac/fast/loader/file-url-mimetypes.html
Comment 22 Build Bot 2013-05-09 18:11:08 PDT
Created attachment 201307 [details]
Archive of layout-test-results from webkit-ews-05 for mac-mountainlion

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: webkit-ews-05  Port: mac-mountainlion  Platform: Mac OS X 10.8.2
Comment 23 Carlos Garcia Campos 2013-05-09 23:11:46 PDT
(In reply to comment #22)
> Created an attachment (id=201307) [details]
> Archive of layout-test-results from webkit-ews-05 for mac-mountainlion
> 
> The attached test failures were seen while running run-webkit-tests on the mac-ews.
> Bot: webkit-ews-05  Port: mac-mountainlion  Platform: Mac OS X 10.8.2

Looking at the results it seems the results are just in a different order, not sure why, this patch shouldn't affect the mime types nor the order the resources are loaded.
Comment 24 Carlos Garcia Campos 2013-05-13 09:34:20 PDT
Created attachment 201576 [details]
Updated patch

Added ASSERT also to CachedTextTrack::addDataBuffer() that I had forgotten.
Comment 25 Build Bot 2013-05-13 14:52:57 PDT
Comment on attachment 201576 [details]
Updated patch

Attachment 201576 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.appspot.com/results/467295

New failing tests:
platform/mac/fast/loader/file-url-mimetypes-3.html
platform/mac/fast/loader/file-url-mimetypes.html
Comment 26 Build Bot 2013-05-13 14:53:00 PDT
Created attachment 201637 [details]
Archive of layout-test-results from webkit-ews-01 for mac-mountainlion

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: webkit-ews-01  Port: mac-mountainlion  Platform: Mac OS X 10.8.2
Comment 27 Carlos Garcia Campos 2013-05-14 00:41:25 PDT
(In reply to comment #25)
> (From update of attachment 201576 [details])
> Attachment 201576 [details] did not pass mac-ews (mac):
> Output: http://webkit-queues.appspot.com/results/467295
> 
> New failing tests:
> platform/mac/fast/loader/file-url-mimetypes-3.html
> platform/mac/fast/loader/file-url-mimetypes.html

Any hint on why these are failing? The output is just in a different order, I've tried the tests locally and I get the correct order, so I can't reproduce it. I don't see how the patch can affect the order resources are loaded, MIME types are dumped in didReceiveResponse, and the patch only changes data and finish that should happen after the response is received.
Comment 28 Carlos Garcia Campos 2013-05-23 09:30:54 PDT
Created attachment 202722 [details]
Try to fix mac tests

Ok, I think I found the issue, with the previous patch, in case of error the onload event was also triggered, since the test connects a handler to both onload and onerror, resources could start on parallel.
Comment 29 Carlos Garcia Campos 2013-06-04 04:39:30 PDT
Created attachment 203681 [details]
Patch rebased to apply on current git master

Previous patch didn't apply anymore.
Comment 30 Carlos Garcia Campos 2013-06-05 11:17:22 PDT
Could someone review this patch please?, it's blocking other improvements in the gstreamer media backend.
Comment 31 Darin Adler 2013-06-05 18:14:08 PDT
Comment on attachment 203681 [details]
Patch rebased to apply on current git master

View in context: https://bugs.webkit.org/attachment.cgi?id=203681&action=review

review- because of some not-entirely-harmless PassRefPtr mistakes

> Source/WebCore/ChangeLog:20
> +        incremental data chunks when not beffering and

typo: beffering

> Source/WebCore/loader/SubresourceLoader.cpp:243
> +        if (resourceData())
> +            m_resource->addDataBuffer(resourceData());

The resourceData function is not inlined, so it’s not great to call it twice. I would write:

    if (ResourceBuffer* resourceData = this->resourceData().get())
        m_resource->addDataBuffer(resourceData);
    else

And also we need to fix resourceData() since it’s return a PassRefPtr even though it is not transferring ownership.

> Source/WebCore/loader/cache/CachedCSSStyleSheet.h:50
> +        virtual void finishLoading(PassRefPtr<ResourceBuffer> data);

I’m not sure this argument needs a name. The name data says almost nothing!

> Source/WebCore/loader/cache/CachedFont.h:51
> +    virtual void finishLoading(PassRefPtr<ResourceBuffer> data);

Can this virtual function be marked OVERRIDE, FINAL, or both? Can it be made private?

> Source/WebCore/loader/cache/CachedImage.cpp:397
> +    addDataBuffer(data, false);

This use of a boolean is no good. With every caller passing a constant, it’s not readable at the call site. We typically use enums in cases like this to make the code readable.

> Source/WebCore/loader/cache/CachedImage.cpp:403
> +    addDataBuffer(ResourceBuffer::create(data, length), false);

Ditto.

> Source/WebCore/loader/cache/CachedImage.cpp:408
> +    addDataBuffer(data, true);

Ditto.

> Source/WebCore/loader/cache/CachedImage.cpp:410
> +        CachedResource::finishLoading(data);

This is always going to pass a 0 to CachedResource::finishLoading, because data has already passed off its resource buffer when calling the addDataBuffer function. We know that doesn’t matter because CachedResource::finishLoading ignores the data passed to it, but it’s really strange to code this way. Maybe we can at least just pass 0 here instead of pretending to pass data.

> Source/WebCore/loader/cache/CachedImage.h:80
> +    virtual void addData(const char* data, int length);

I don’t think int is the right type for length. The type unsigned could be OK, or maybe it needs to be size_t.

> Source/WebCore/loader/cache/CachedImage.h:81
> +    virtual void addDataBuffer(PassRefPtr<ResourceBuffer>);
> +    virtual void addData(const char* data, int length);
> +    virtual void finishLoading(PassRefPtr<ResourceBuffer>);

Can these virtual functions be marked OVERRIDE, FINAL, or both? Can they be made private?

> Source/WebCore/loader/cache/CachedRawResource.cpp:74
> +    CachedResourceHandle<CachedRawResource> protect(this);
> +    ASSERT(m_options.dataBufferingPolicy == DoNotBufferData);
> +    notifyDataReceived(data, length);

The protector here seems like it’s at the wrong level. It could instead be inside notifyDataReceived, since that’s the only function called by this one.

> Source/WebCore/loader/cache/CachedRawResource.h:55
> -    virtual void data(PassRefPtr<ResourceBuffer> data, bool allDataReceived);
> +    virtual void addDataBuffer(PassRefPtr<ResourceBuffer>);
> +    virtual void addData(const char* data, int length);
> +    virtual void finishLoading(PassRefPtr<ResourceBuffer>);

Can these virtual functions be marked OVERRIDE, FINAL or both?

> Source/WebCore/loader/cache/CachedRawResource.h:67
> +    void notifyDataReceived(const char* data, int length);

I know we’ve used this kind of naming before, but it’s horribly confusing: “notify data received”? Notify who? Notify the data? I think notifyClientsDataWasReceived is one possible better name. Or callDataReceivedForAllClients? Maybe there’s another one that’s clear.

> Source/WebCore/loader/cache/CachedResource.cpp:375
> +void CachedResource::addData(const char* data, int length)
> +{
> +    UNUSED_PARAM(data);
> +    UNUSED_PARAM(length);
> +    ASSERT(m_options.dataBufferingPolicy == DoNotBufferData);
> +}

I think it’s better to omit the argument names here.

> Source/WebCore/loader/cache/CachedResource.h:104
> +    virtual void addDataBuffer(PassRefPtr<ResourceBuffer> data);

I don’t think the argument name here is helpful or needed.

> Source/WebCore/loader/cache/CachedResource.h:106
> +    virtual void finishLoading(PassRefPtr<ResourceBuffer> data);

I don’t think the argument name here is helpful or needed.

> Source/WebCore/loader/cache/CachedSVGDocument.h:43
> +    virtual void finishLoading(PassRefPtr<ResourceBuffer> data);

Can these virtual functions be marked OVERRIDE, FINAL or both? I don’t think the argument name here is helpful or needed.

> Source/WebCore/loader/cache/CachedScript.cpp:89
> +    CachedResource::finishLoading(data);

This is always going to pass a 0 to CachedResource::finishLoading, because data has already passed off its resource buffer assigning to m_data. We know that doesn’t matter because CachedResource::finishLoading ignores the data passed to it, but it’s really strange to code this way. We could pass m_data.

> Source/WebCore/loader/cache/CachedScript.h:45
> +        virtual void finishLoading(PassRefPtr<ResourceBuffer> data);

Can these virtual functions be marked OVERRIDE, FINAL or both?

> Source/WebCore/loader/cache/CachedShader.cpp:66
> +    m_data = data;
> +    CachedResource::finishLoading(data);

This is always going to pass a 0 to CachedResource::finishLoading, because data has already passed off its resource buffer assigning to m_data. We know that doesn’t matter because CachedResource::finishLoading ignores the data passed to it, but it’s really strange to code this way. We could pass m_data. Seems this bug already existed in the old version of this function.

> Source/WebCore/loader/cache/CachedTextTrack.cpp:65
> +    addDataBuffer(data);
> +    CachedResource::finishLoading(data);

This is always going to pass a 0 to CachedResource::finishLoading, because data has already passed off its resource buffer calling addDataBuffer. We know that doesn’t matter because CachedResource::finishLoading ignores the data passed to it, but it’s really strange to code this way. We could pass 0, or go out of our way to actually pass the data buffer.

> Source/WebCore/loader/cache/CachedTextTrack.h:42
> +    virtual void addDataBuffer(PassRefPtr<ResourceBuffer>);
> +    virtual void finishLoading(PassRefPtr<ResourceBuffer>);

Can these virtual functions be marked OVERRIDE, FINAL or both? Can they be made private?

> Source/WebCore/loader/cache/CachedXSLStyleSheet.h:48
> +        virtual void finishLoading(PassRefPtr<ResourceBuffer> data);

Can these virtual functions be marked OVERRIDE, FINAL or both? Can they be made private?
Comment 32 Carlos Garcia Campos 2013-06-06 05:59:27 PDT
(In reply to comment #31)
> (From update of attachment 203681 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=203681&action=review
> 
> review- because of some not-entirely-harmless PassRefPtr mistakes

Thanks for the review. Since some of the issues are not actually introduced by this patch, I've opened separate bugs to fix them.

> > Source/WebCore/ChangeLog:20
> > +        incremental data chunks when not beffering and
> 
> typo: beffering

Oops.

> > Source/WebCore/loader/SubresourceLoader.cpp:243
> > +        if (resourceData())
> > +            m_resource->addDataBuffer(resourceData());
> 
> The resourceData function is not inlined, so it’s not great to call it twice. I would write:
> 
>     if (ResourceBuffer* resourceData = this->resourceData().get())
>         m_resource->addDataBuffer(resourceData);
>     else
> 
> And also we need to fix resourceData() since it’s return a PassRefPtr even though it is not transferring ownership.

Yes, this was actually a refactoring of current code, see SubresourceLoader::sendDataToResource(). I also don't understand why resourceData returns a PassRefPtr. See bug #117288.

> > Source/WebCore/loader/cache/CachedCSSStyleSheet.h:50
> > +        virtual void finishLoading(PassRefPtr<ResourceBuffer> data);
> 
> I’m not sure this argument needs a name. The name data says almost nothing!

Agree.
 
> > Source/WebCore/loader/cache/CachedFont.h:51
> > +    virtual void finishLoading(PassRefPtr<ResourceBuffer> data);
> 
> Can this virtual function be marked OVERRIDE, FINAL, or both? Can it be made private?

Yes, actually all of them can be made private and marked as OVERRIDE, see bug #117289.

> > Source/WebCore/loader/cache/CachedImage.cpp:397
> > +    addDataBuffer(data, false);
> 
> This use of a boolean is no good. With every caller passing a constant, it’s not readable at the call site. We typically use enums in cases like this to make the code readable.

Will fix it.

> > Source/WebCore/loader/cache/CachedImage.cpp:410
> > +        CachedResource::finishLoading(data);
> 
> This is always going to pass a 0 to CachedResource::finishLoading, because data has already passed off its resource buffer when calling the addDataBuffer function. We know that doesn’t matter because CachedResource::finishLoading ignores the data passed to it, but it’s really strange to code this way. Maybe we can at least just pass 0 here instead of pretending to pass data.

Ah, right, PassRefPtr is leaked and assigned to a RefPtr. I think it's confusing in this case where we chain up to the parent. I've changed data() to receive a raw pointer too in bug #117288, I think it makes the code less confusing and still allows the cached resource to take a reference.
 
> > Source/WebCore/loader/cache/CachedImage.h:80
> > +    virtual void addData(const char* data, int length);
> 
> I don’t think int is the right type for length. The type unsigned could be OK, or maybe it needs to be size_t.

I think this comes from ResourceHandle::didReceiveData() that also uses int for length. I agree it should be size_t. ResourceHandle::didSendData() uses unsigned long long though.

> > Source/WebCore/loader/cache/CachedRawResource.cpp:74
> > +    CachedResourceHandle<CachedRawResource> protect(this);
> > +    ASSERT(m_options.dataBufferingPolicy == DoNotBufferData);
> > +    notifyDataReceived(data, length);
> 
> The protector here seems like it’s at the wrong level. It could instead be inside notifyDataReceived, since that’s the only function called by this one.

Ok.

> > Source/WebCore/loader/cache/CachedRawResource.h:67
> > +    void notifyDataReceived(const char* data, int length);
> 
> I know we’ve used this kind of naming before, but it’s horribly confusing: “notify data received”? Notify who? Notify the data? I think notifyClientsDataWasReceived is one possible better name. Or callDataReceivedForAllClients? Maybe there’s another one that’s clear.

Fair enough, I'll rename it.
 
> > Source/WebCore/loader/cache/CachedResource.cpp:375
> > +void CachedResource::addData(const char* data, int length)
> > +{
> > +    UNUSED_PARAM(data);
> > +    UNUSED_PARAM(length);
> > +    ASSERT(m_options.dataBufferingPolicy == DoNotBufferData);
> > +}
> 
> I think it’s better to omit the argument names here.

Right, I don't know why I used UNUSED_PARAM in this case.
Comment 33 Darin Adler 2013-06-06 09:33:04 PDT
Comment on attachment 203681 [details]
Patch rebased to apply on current git master

View in context: https://bugs.webkit.org/attachment.cgi?id=203681&action=review

>>> Source/WebCore/loader/cache/CachedImage.h:80
>>> +    virtual void addData(const char* data, int length);
>> 
>> I don’t think int is the right type for length. The type unsigned could be OK, or maybe it needs to be size_t.
> 
> I think this comes from ResourceHandle::didReceiveData() that also uses int for length. I agree it should be size_t. ResourceHandle::didSendData() uses unsigned long long though.

Here’s my take on this:

1) When we want to be able to hold the size of something, size_t is the type that can accommodate anything we could have in memory, so it’s always big enough.

2) As an optimization to our data structures, many should use unsigned instead of size_t, since on 64-bit systems this can make our objects much smaller, and there are many things that simply can’t go past 2^32. I know that Andreas Kling and Antti are considering changes like this, including possibly changing Vector to use unsigned instead of size_t.

3) When we are dealing with the size of a file on disk or something that can be dealt with without mapping or loading the whole thing into memory all at once, we want to use a type like "unsigned long long" that can represent sizes larger than 2^32 even on 32-bit systems. I think that’s why ResourceHandle::didSendData uses this type, because there are code paths where it can send data that does not need to be resident in memory all at the same time, possibly including very large files.
Comment 34 Carlos Garcia Campos 2013-06-06 09:38:12 PDT
(In reply to comment #33)
> (From update of attachment 203681 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=203681&action=review
> 
> >>> Source/WebCore/loader/cache/CachedImage.h:80
> >>> +    virtual void addData(const char* data, int length);
> >> 
> >> I don’t think int is the right type for length. The type unsigned could be OK, or maybe it needs to be size_t.
> > 
> > I think this comes from ResourceHandle::didReceiveData() that also uses int for length. I agree it should be size_t. ResourceHandle::didSendData() uses unsigned long long though.
> 
> Here’s my take on this:
> 
> 1) When we want to be able to hold the size of something, size_t is the type that can accommodate anything we could have in memory, so it’s always big enough.
> 
> 2) As an optimization to our data structures, many should use unsigned instead of size_t, since on 64-bit systems this can make our objects much smaller, and there are many things that simply can’t go past 2^32. I know that Andreas Kling and Antti are considering changes like this, including possibly changing Vector to use unsigned instead of size_t.
> 
> 3) When we are dealing with the size of a file on disk or something that can be dealt with without mapping or loading the whole thing into memory all at once, we want to use a type like "unsigned long long" that can represent sizes larger than 2^32 even on 32-bit systems. I think that’s why ResourceHandle::didSendData uses this type, because there are code paths where it can send data that does not need to be resident in memory all at the same time, possibly including very large files.

This piece would be useful to be somewhere in the wiki. Actually I think this comes from ResourceBuffer(const char*, int) that comes from SharedBuffer(const char* data, int size) that indeed has a FIXME:

// FIXME: Use unsigned consistently, and check for invalid casts when calling into SharedBuffer from other code.                                                                          
    if (size < 0)
        CRASH();
Comment 35 Darin Adler 2013-06-06 09:39:50 PDT
(In reply to comment #34)
> This piece would be useful to be somewhere in the wiki.

Sure thing. Please put it up there somewhere and point me to it. I’ll happily help edit it.
Comment 36 Carlos Garcia Campos 2013-06-06 10:15:54 PDT
(In reply to comment #35)
> (In reply to comment #34)
> > This piece would be useful to be somewhere in the wiki.
> 
> Sure thing. Please put it up there somewhere and point me to it. I’ll happily help edit it.

Something like this? http://trac.webkit.org/wiki/TypesForSize
Comment 37 Carlos Garcia Campos 2013-06-07 11:42:58 PDT
Created attachment 204059 [details]
Updated patch

Patch rebased and updated to address review comments. In the end I got rid of the allDataReceived boolean parameter in CachedImage by refactoring the code a bit, since now we have different methods for incremental data chunks and finishLoading when all data has been received (and considering the multipart content case), I think it makes the code a bit easier to read.
Comment 38 Carlos Garcia Campos 2013-06-13 04:02:55 PDT
ping reviewers again :-)
Comment 39 Darin Adler 2013-06-13 10:22:22 PDT
Comment on attachment 204059 [details]
Updated patch

View in context: https://bugs.webkit.org/attachment.cgi?id=204059&action=review

Looks good.

> Source/WebCore/loader/SubresourceLoader.cpp:208
> +        // The loader delivers the data in a multipart section all at once, then sends eof.

Not sure this comment adds much. The comment two lines down seems to say the same thing. I know you moved this comment here another function, but I think we can just drop it.

> Source/WebCore/loader/cache/CachedImage.cpp:347
> +bool CachedImage::isValidDecodedImageSize() const

I think the name here should be “<cached image> has valid decoded image size” not “<cached image> is valid decoded image size”. So hasValidDecodedImageSize.

But I am not sure I understand the use of the word “valid” here to mean “smaller than the maximum”. I find this function quite confusing and a better name might help, and a why comment is needed.

The old code referred to this as “we have received all the data or the size is known”, but even that did not make the maximumDecodedImageSize feature clear.

I think this is actually “is ready to draw” or something like that.

> Source/WebCore/loader/cache/CachedImage.cpp:386
> +    // Go ahead and tell our observers to try to draw if the size is known.
> +    // Each chunk from the network causes observers to repaint, which will
> +    // force that chunk to decode.
> +    if (!isValidDecodedImageSize()) {
> +        error(errorOccurred() ? status() : DecodeError);
> +        if (inCache())
> +            memoryCache()->remove(this);
> +        return;
>      }

This comment is confusing. It says “tell observers to draw if the size is known”, but the code says “don’t tell observers to draw unless the size is known”. We should fix the comment since we reversed the code.

> Source/WebCore/loader/cache/CachedImage.cpp:405
> +    RefPtr<ResourceBuffer> resourceData = ResourceBuffer::create(data, length);
> +    addIncrementalDataBuffer(resourceData.get());

We don’t need a local variable here: It would read better as a one-liner.

> Source/WebCore/loader/cache/CachedImage.cpp:414
> +    if (!m_image && data) {
> +        // When loading multipart content all data is delivered in finishLoading.
> +        createImage();
>      }

This is not a good comment to have here. Even if all data wasn’t delivered in finishLoading, the first data could be delivered here for some other reason. So this comment is a bit of trivia that does not correctly answer the question “why do we have this code?” We would not want to remove this if we removed support for multipart content, for example, or if we changed how multipart content loading works.

> Source/WebCore/loader/cache/CachedImage.h:68
> +    virtual void addDataBuffer(ResourceBuffer*) OVERRIDE;
> +    virtual void finishLoading(ResourceBuffer*) OVERRIDE;

Can these be private instead of public?

> Source/WebCore/loader/cache/CachedRawResource.cpp:52
> +    incrementalDataLength = data->size() - previousDataLength;

Do we have a guarantee that this will not overflow 32 bits? The old code used size_t for this, not unsigned.

> Source/WebCore/loader/cache/CachedRawResource.cpp:92
> +        unsigned incrementalDataLength;
> +        const char* incrementalData = calculateIncrementalDataChunk(data, incrementalDataLength);
> +        if (data)
> +            setEncodedSize(data->size());
> +        notifyClientsDataWasReceived(incrementalData, incrementalDataLength);

A little strange to have the setEncodedSize call in the middle of the logic to calculate the incremental data chunk. I guess the old code was like that, but is it important to do things in this order?

> Source/WebCore/loader/cache/CachedRawResource.h:53
> -    virtual void data(ResourceBuffer*, bool allDataReceived) OVERRIDE;
> +    virtual void addDataBuffer(ResourceBuffer*);
> +    virtual void addData(const char* data, unsigned length);
> +    virtual void finishLoading(ResourceBuffer*);

These should all be marked OVERRIDE.
Comment 40 Carlos Garcia Campos 2013-06-14 01:51:16 PDT
Created attachment 204684 [details]
Patch for landing
Comment 41 Carlos Garcia Campos 2013-06-14 02:08:09 PDT
(In reply to comment #39)
> (From update of attachment 204059 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=204059&action=review
> 
> Looks good.

Thanks!

> > Source/WebCore/loader/SubresourceLoader.cpp:208
> > +        // The loader delivers the data in a multipart section all at once, then sends eof.
> 
> Not sure this comment adds much. The comment two lines down seems to say the same thing. I know you moved this comment here another function, but I think we can just drop it.

Removed.

> > Source/WebCore/loader/cache/CachedImage.cpp:347
> > +bool CachedImage::isValidDecodedImageSize() const
> 
> I think the name here should be “<cached image> has valid decoded image size” not “<cached image> is valid decoded image size”. So hasValidDecodedImageSize.
> 
> But I am not sure I understand the use of the word “valid” here to mean “smaller than the maximum”. I find this function quite confusing and a better name might help, and a why comment is needed.

The use of valid is mainly because we are using this function to decide whether the image can be drawn or if we should raise an error. 

> The old code referred to this as “we have received all the data or the size is known”, but even that did not make the maximumDecodedImageSize feature clear.
> 
> I think this is actually “is ready to draw” or something like that.

Is ready to draw sounds also confusing to me, since it sounds like we are failing just because the image is not yet ready to be drawn. So I used canBeDrawn in the end, and added a comment to explain why we are failing when the image can't be drawn.

> > Source/WebCore/loader/cache/CachedImage.cpp:386
> > +    // Go ahead and tell our observers to try to draw if the size is known.
> > +    // Each chunk from the network causes observers to repaint, which will
> > +    // force that chunk to decode.
> > +    if (!isValidDecodedImageSize()) {
> > +        error(errorOccurred() ? status() : DecodeError);
> > +        if (inCache())
> > +            memoryCache()->remove(this);
> > +        return;
> >      }
> 
> This comment is confusing. It says “tell observers to draw if the size is known”, but the code says “don’t tell observers to draw unless the size is known”. We should fix the comment since we reversed the code.

What is confusing is the place where the comment is, right before the if that checks whether the image can be drawn or not, that now has nothing to do with if the size is available or not. So, I moved the comment and removed the part about the size available, since there's an early return when the size it's not available.

> > Source/WebCore/loader/cache/CachedImage.cpp:405
> > +    RefPtr<ResourceBuffer> resourceData = ResourceBuffer::create(data, length);
> > +    addIncrementalDataBuffer(resourceData.get());
> 
> We don’t need a local variable here: It would read better as a one-liner.

Ok.

> > Source/WebCore/loader/cache/CachedImage.cpp:414
> > +    if (!m_image && data) {
> > +        // When loading multipart content all data is delivered in finishLoading.
> > +        createImage();
> >      }
> 
> This is not a good comment to have here. Even if all data wasn’t delivered in finishLoading, the first data could be delivered here for some other reason. So this comment is a bit of trivia that does not correctly answer the question “why do we have this code?” We would not want to remove this if we removed support for multipart content, for example, or if we changed how multipart content loading works.

Ok, removed it.

> > Source/WebCore/loader/cache/CachedImage.h:68
> > +    virtual void addDataBuffer(ResourceBuffer*) OVERRIDE;
> > +    virtual void finishLoading(ResourceBuffer*) OVERRIDE;
> 
> Can these be private instead of public?

No, they are used by ImageDocument, so I left them public as we discussed in the other bug, see:

https://bugs.webkit.org/show_bug.cgi?id=117289#c2

> > Source/WebCore/loader/cache/CachedRawResource.cpp:52
> > +    incrementalDataLength = data->size() - previousDataLength;
> 
> Do we have a guarantee that this will not overflow 32 bits? The old code used size_t for this, not unsigned.

ResourceBuffer::size() returns unsigned, and CachedResource::encodedSize() also returns unsigned, so previousDataLength so actually be unsigned too instead of size_t. That makes clearer that unsigned is appropriate for incrementalDataLength.

> > Source/WebCore/loader/cache/CachedRawResource.cpp:92
> > +        unsigned incrementalDataLength;
> > +        const char* incrementalData = calculateIncrementalDataChunk(data, incrementalDataLength);
> > +        if (data)
> > +            setEncodedSize(data->size());
> > +        notifyClientsDataWasReceived(incrementalData, incrementalDataLength);
> 
> A little strange to have the setEncodedSize call in the middle of the logic to calculate the incremental data chunk. I guess the old code was like that, but is it important to do things in this order?

calculateIncrementalDataChunk() uses current encoded size to calculate the incremental data length, so we need to set the new one after calculateIncrementalDataChunk. I left it also before notifying the clients to make sure that new encoded size is updated in case clients need to query that value.

> > Source/WebCore/loader/cache/CachedRawResource.h:53
> > -    virtual void data(ResourceBuffer*, bool allDataReceived) OVERRIDE;
> > +    virtual void addDataBuffer(ResourceBuffer*);
> > +    virtual void addData(const char* data, unsigned length);
> > +    virtual void finishLoading(ResourceBuffer*);
> 
> These should all be marked OVERRIDE.

Right!
Comment 42 Carlos Garcia Campos 2013-06-14 03:59:41 PDT
Committed r151586: <http://trac.webkit.org/changeset/151586>
Comment 43 Alexey Proskuryakov 2013-06-27 22:00:44 PDT
This change broke multipart/x-mixed-replace images (currently tracked internally as <rdar://problem/14242032>, we'll make a bug in Bugzilla shortly).
Comment 44 Alexey Proskuryakov 2013-06-27 22:12:17 PDT
Filed bug 118169.