Summary: | Avoid unnecessary data copies when loading subresources with DoNotBufferData option | ||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Carlos Garcia Campos <cgarcia> | ||||||||||||||||||||||||||||
Component: | Page Loading | Assignee: | Nobody <webkit-unassigned> | ||||||||||||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||||||||||||
Severity: | Normal | CC: | ap, beidson, buildbot, commit-queue, darin, esprehn+autocc, japhet, pnormand, rniwa, webkit-ews | ||||||||||||||||||||||||||||
Priority: | P2 | ||||||||||||||||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||||||||||||
Bug Depends on: | 118169 | ||||||||||||||||||||||||||||||
Bug Blocks: | 73743 | ||||||||||||||||||||||||||||||
Attachments: |
|
Description
Carlos Garcia Campos
2013-05-08 07:59:50 PDT
Created attachment 201075 [details]
Patch
Comment on attachment 201075 [details] Patch Attachment 201075 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/434337 Comment on attachment 201075 [details] Patch Attachment 201075 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/434336 Created attachment 201079 [details]
Try to fix mac build
Comment on attachment 201079 [details] Try to fix mac build View in context: https://bugs.webkit.org/attachment.cgi?id=201079&action=review Good direction. We plan to do a pretty significant overhaul of Cache/Loader land somewhat soon, so it'll be important to make very subtle things like this more difficult to screw up while refactoring. That's the motivation for my review comments below. > Source/WebCore/loader/cache/CachedImage.cpp:402 > +void CachedImage::data(const char* data, int length) > +{ > + CachedImage::data(ResourceBuffer::create(data, length), false); > +} Let's throw the ASSERT(m_options.dataBufferingPolicy == DoNotBufferData); here, too. > Source/WebCore/loader/cache/CachedResource.h:105 > + virtual void data(const char* /*data*/, int /*length*/) { } Lets give this an implementation in CachedResource.cpp and have it ASSERT(m_options.dataBufferingPolicy == DoNotBufferData) Assuming all tests continue passing without that ASSERT firing, I'd like us to take it a step further and name this method something different instead of just overloading ::data(). "unbufferedData()" is my first instinct, though I'm aware that's not perfect. Created attachment 201083 [details]
Updated patch
Patch updated according to review comments.
(In reply to comment #5) > (From update of attachment 201079 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=201079&action=review > > Good direction. > > We plan to do a pretty significant overhaul of Cache/Loader land somewhat soon, so it'll be important to make very subtle things like this more difficult to screw up while refactoring. > > That's the motivation for my review comments below. > > > Source/WebCore/loader/cache/CachedImage.cpp:402 > > +void CachedImage::data(const char* data, int length) > > +{ > > + CachedImage::data(ResourceBuffer::create(data, length), false); > > +} > > Let's throw the ASSERT(m_options.dataBufferingPolicy == DoNotBufferData); here, too. Sure. > > Source/WebCore/loader/cache/CachedResource.h:105 > > + virtual void data(const char* /*data*/, int /*length*/) { } > > Lets give this an implementation in CachedResource.cpp and have it ASSERT(m_options.dataBufferingPolicy == DoNotBufferData) > > Assuming all tests continue passing without that ASSERT firing, I'd like us to take it a step further and name this method something different instead of just overloading ::data(). "unbufferedData()" is my first instinct, though I'm aware that's not perfect. Yes, I also thought about the names, my initial idea was to rename the current one to buffer or resourceBuffer and this new as data, but I wanted to keep the patch as simple as possible and buffer doesn't sound like a great name anyway. For now, I left both as data so that they can be renamed when refactoring. Do apple EWS bots run the tests with asserts enabled? Comment on attachment 201079 [details] Try to fix mac build View in context: https://bugs.webkit.org/attachment.cgi?id=201079&action=review >>> Source/WebCore/loader/cache/CachedResource.h:105 >>> + virtual void data(const char* /*data*/, int /*length*/) { } >> >> Lets give this an implementation in CachedResource.cpp and have it ASSERT(m_options.dataBufferingPolicy == DoNotBufferData) >> >> Assuming all tests continue passing without that ASSERT firing, I'd like us to take it a step further and name this method something different instead of just overloading ::data(). "unbufferedData()" is my first instinct, though I'm aware that's not perfect. > > Yes, I also thought about the names, my initial idea was to rename the current one to buffer or resourceBuffer and this new as data, but I wanted to keep the patch as simple as possible and buffer doesn't sound like a great name anyway. For now, I left both as data so that they can be renamed when refactoring. > > Do apple EWS bots run the tests with asserts enabled? No. You have to run tests manually on a debug build yourself as clearly outlined in http://www.webkit.org/coding/contributing.html. (In reply to comment #7) > (In reply to comment #5) > > > Source/WebCore/loader/cache/CachedResource.h:105 > > > + virtual void data(const char* /*data*/, int /*length*/) { } > > Assuming all tests continue passing without that ASSERT firing, I'd like us to take it a step further and name this method something different instead of just overloading ::data(). "unbufferedData()" is my first instinct, though I'm aware that's not perfect. > > Yes, I also thought about the names, my initial idea was to rename the current one to buffer or resourceBuffer and this new as data, but I wanted to keep the patch as simple as possible and buffer doesn't sound like a great name anyway. For now, I left both as data so that they can be renamed when refactoring. Please name it something else now. "data()" is already such a truly terrible name, and overloading truly terrible names is even more truly terrible. Comment on attachment 201083 [details] Updated patch Attachment 201083 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/353394 New failing tests: http/tests/cache/cached-main-resource.html Created attachment 201088 [details]
Archive of layout-test-results from webkit-ews-09 for mac-mountainlion-wk2
The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: webkit-ews-09 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.2
(In reply to comment #8) > (From update of attachment 201079 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=201079&action=review > > >>> Source/WebCore/loader/cache/CachedResource.h:105 > >>> + virtual void data(const char* /*data*/, int /*length*/) { } > >> > >> Lets give this an implementation in CachedResource.cpp and have it ASSERT(m_options.dataBufferingPolicy == DoNotBufferData) > >> > >> Assuming all tests continue passing without that ASSERT firing, I'd like us to take it a step further and name this method something different instead of just overloading ::data(). "unbufferedData()" is my first instinct, though I'm aware that's not perfect. > > > > Yes, I also thought about the names, my initial idea was to rename the current one to buffer or resourceBuffer and this new as data, but I wanted to keep the patch as simple as possible and buffer doesn't sound like a great name anyway. For now, I left both as data so that they can be renamed when refactoring. > > > > Do apple EWS bots run the tests with asserts enabled? > > No. You have to run tests manually on a debug build yourself as clearly outlined in http://www.webkit.org/coding/contributing.html. Sure, I know, problem is that I can't build debug builds in my laptop, I'll try to build with assertions enabled in any case. But I asked also because sometimes even if tests pass locally, it still fails in other platforms or some tests are skipped for my platform and not others, so if mac ews ran tests with assertions enabled, I could be extra sure. (In reply to comment #10) > (From update of attachment 201083 [details]) > Attachment 201083 [details] did not pass mac-wk2-ews (mac-wk2): > Output: http://webkit-queues.appspot.com/results/353394 > > New failing tests: > http/tests/cache/cached-main-resource.html This passes for me locally, I'll check the results attached. (In reply to comment #9) > (In reply to comment #7) > > (In reply to comment #5) > > > > > Source/WebCore/loader/cache/CachedResource.h:105 > > > > + virtual void data(const char* /*data*/, int /*length*/) { } > > > > Assuming all tests continue passing without that ASSERT firing, I'd like us to take it a step further and name this method something different instead of just overloading ::data(). "unbufferedData()" is my first instinct, though I'm aware that's not perfect. > > > > Yes, I also thought about the names, my initial idea was to rename the current one to buffer or resourceBuffer and this new as data, but I wanted to keep the patch as simple as possible and buffer doesn't sound like a great name anyway. For now, I left both as data so that they can be renamed when refactoring. > > Please name it something else now. > > "data()" is already such a truly terrible name, and overloading truly terrible names is even more truly terrible. Yeah, I agree, I'll do it, I thought you also preferred to keep it like this for now and rename it as part of the refactoring. Created attachment 201252 [details]
New patch
I've renamed the methods, actually I've split data into 3 addDataBuffer and addData to add data chunks to the resource when buffering/not buffering and finishLoading called when all data has been received. this way we get rid of the boolean parameter and cached resources only interested in the final data, only implement finishLoading. I've also noticed that sendDataToResource is confusing, so I've removed it to call resource methods directly. This way is more obvious when data needs to be copied and when to call addData, addDataBuffer or finishLoading.
Comment on attachment 201252 [details] New patch Attachment 201252 [details] did not pass qt-ews (qt): Output: http://webkit-queues.appspot.com/results/433074 Comment on attachment 201252 [details] New patch Attachment 201252 [details] did not pass qt-wk2-ews (qt-wk2): Output: http://webkit-queues.appspot.com/results/329153 Comment on attachment 201252 [details] New patch Attachment 201252 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/279089 Comment on attachment 201252 [details] New patch Attachment 201252 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/329175 Created attachment 201262 [details]
Try to fix the build with CSS_SHADERS
Forgot to update CachedShader
Comment on attachment 201262 [details] Try to fix the build with CSS_SHADERS Attachment 201262 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/398171 New failing tests: platform/mac/fast/loader/file-url-mimetypes-3.html platform/mac/fast/loader/file-url-mimetypes.html Created attachment 201307 [details]
Archive of layout-test-results from webkit-ews-05 for mac-mountainlion
The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: webkit-ews-05 Port: mac-mountainlion Platform: Mac OS X 10.8.2
(In reply to comment #22) > Created an attachment (id=201307) [details] > Archive of layout-test-results from webkit-ews-05 for mac-mountainlion > > The attached test failures were seen while running run-webkit-tests on the mac-ews. > Bot: webkit-ews-05 Port: mac-mountainlion Platform: Mac OS X 10.8.2 Looking at the results it seems the results are just in a different order, not sure why, this patch shouldn't affect the mime types nor the order the resources are loaded. Created attachment 201576 [details]
Updated patch
Added ASSERT also to CachedTextTrack::addDataBuffer() that I had forgotten.
Comment on attachment 201576 [details] Updated patch Attachment 201576 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/467295 New failing tests: platform/mac/fast/loader/file-url-mimetypes-3.html platform/mac/fast/loader/file-url-mimetypes.html Created attachment 201637 [details]
Archive of layout-test-results from webkit-ews-01 for mac-mountainlion
The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: webkit-ews-01 Port: mac-mountainlion Platform: Mac OS X 10.8.2
(In reply to comment #25) > (From update of attachment 201576 [details]) > Attachment 201576 [details] did not pass mac-ews (mac): > Output: http://webkit-queues.appspot.com/results/467295 > > New failing tests: > platform/mac/fast/loader/file-url-mimetypes-3.html > platform/mac/fast/loader/file-url-mimetypes.html Any hint on why these are failing? The output is just in a different order, I've tried the tests locally and I get the correct order, so I can't reproduce it. I don't see how the patch can affect the order resources are loaded, MIME types are dumped in didReceiveResponse, and the patch only changes data and finish that should happen after the response is received. Created attachment 202722 [details]
Try to fix mac tests
Ok, I think I found the issue, with the previous patch, in case of error the onload event was also triggered, since the test connects a handler to both onload and onerror, resources could start on parallel.
Created attachment 203681 [details]
Patch rebased to apply on current git master
Previous patch didn't apply anymore.
Could someone review this patch please?, it's blocking other improvements in the gstreamer media backend. Comment on attachment 203681 [details] Patch rebased to apply on current git master View in context: https://bugs.webkit.org/attachment.cgi?id=203681&action=review review- because of some not-entirely-harmless PassRefPtr mistakes > Source/WebCore/ChangeLog:20 > + incremental data chunks when not beffering and typo: beffering > Source/WebCore/loader/SubresourceLoader.cpp:243 > + if (resourceData()) > + m_resource->addDataBuffer(resourceData()); The resourceData function is not inlined, so it’s not great to call it twice. I would write: if (ResourceBuffer* resourceData = this->resourceData().get()) m_resource->addDataBuffer(resourceData); else And also we need to fix resourceData() since it’s return a PassRefPtr even though it is not transferring ownership. > Source/WebCore/loader/cache/CachedCSSStyleSheet.h:50 > + virtual void finishLoading(PassRefPtr<ResourceBuffer> data); I’m not sure this argument needs a name. The name data says almost nothing! > Source/WebCore/loader/cache/CachedFont.h:51 > + virtual void finishLoading(PassRefPtr<ResourceBuffer> data); Can this virtual function be marked OVERRIDE, FINAL, or both? Can it be made private? > Source/WebCore/loader/cache/CachedImage.cpp:397 > + addDataBuffer(data, false); This use of a boolean is no good. With every caller passing a constant, it’s not readable at the call site. We typically use enums in cases like this to make the code readable. > Source/WebCore/loader/cache/CachedImage.cpp:403 > + addDataBuffer(ResourceBuffer::create(data, length), false); Ditto. > Source/WebCore/loader/cache/CachedImage.cpp:408 > + addDataBuffer(data, true); Ditto. > Source/WebCore/loader/cache/CachedImage.cpp:410 > + CachedResource::finishLoading(data); This is always going to pass a 0 to CachedResource::finishLoading, because data has already passed off its resource buffer when calling the addDataBuffer function. We know that doesn’t matter because CachedResource::finishLoading ignores the data passed to it, but it’s really strange to code this way. Maybe we can at least just pass 0 here instead of pretending to pass data. > Source/WebCore/loader/cache/CachedImage.h:80 > + virtual void addData(const char* data, int length); I don’t think int is the right type for length. The type unsigned could be OK, or maybe it needs to be size_t. > Source/WebCore/loader/cache/CachedImage.h:81 > + virtual void addDataBuffer(PassRefPtr<ResourceBuffer>); > + virtual void addData(const char* data, int length); > + virtual void finishLoading(PassRefPtr<ResourceBuffer>); Can these virtual functions be marked OVERRIDE, FINAL, or both? Can they be made private? > Source/WebCore/loader/cache/CachedRawResource.cpp:74 > + CachedResourceHandle<CachedRawResource> protect(this); > + ASSERT(m_options.dataBufferingPolicy == DoNotBufferData); > + notifyDataReceived(data, length); The protector here seems like it’s at the wrong level. It could instead be inside notifyDataReceived, since that’s the only function called by this one. > Source/WebCore/loader/cache/CachedRawResource.h:55 > - virtual void data(PassRefPtr<ResourceBuffer> data, bool allDataReceived); > + virtual void addDataBuffer(PassRefPtr<ResourceBuffer>); > + virtual void addData(const char* data, int length); > + virtual void finishLoading(PassRefPtr<ResourceBuffer>); Can these virtual functions be marked OVERRIDE, FINAL or both? > Source/WebCore/loader/cache/CachedRawResource.h:67 > + void notifyDataReceived(const char* data, int length); I know we’ve used this kind of naming before, but it’s horribly confusing: “notify data received”? Notify who? Notify the data? I think notifyClientsDataWasReceived is one possible better name. Or callDataReceivedForAllClients? Maybe there’s another one that’s clear. > Source/WebCore/loader/cache/CachedResource.cpp:375 > +void CachedResource::addData(const char* data, int length) > +{ > + UNUSED_PARAM(data); > + UNUSED_PARAM(length); > + ASSERT(m_options.dataBufferingPolicy == DoNotBufferData); > +} I think it’s better to omit the argument names here. > Source/WebCore/loader/cache/CachedResource.h:104 > + virtual void addDataBuffer(PassRefPtr<ResourceBuffer> data); I don’t think the argument name here is helpful or needed. > Source/WebCore/loader/cache/CachedResource.h:106 > + virtual void finishLoading(PassRefPtr<ResourceBuffer> data); I don’t think the argument name here is helpful or needed. > Source/WebCore/loader/cache/CachedSVGDocument.h:43 > + virtual void finishLoading(PassRefPtr<ResourceBuffer> data); Can these virtual functions be marked OVERRIDE, FINAL or both? I don’t think the argument name here is helpful or needed. > Source/WebCore/loader/cache/CachedScript.cpp:89 > + CachedResource::finishLoading(data); This is always going to pass a 0 to CachedResource::finishLoading, because data has already passed off its resource buffer assigning to m_data. We know that doesn’t matter because CachedResource::finishLoading ignores the data passed to it, but it’s really strange to code this way. We could pass m_data. > Source/WebCore/loader/cache/CachedScript.h:45 > + virtual void finishLoading(PassRefPtr<ResourceBuffer> data); Can these virtual functions be marked OVERRIDE, FINAL or both? > Source/WebCore/loader/cache/CachedShader.cpp:66 > + m_data = data; > + CachedResource::finishLoading(data); This is always going to pass a 0 to CachedResource::finishLoading, because data has already passed off its resource buffer assigning to m_data. We know that doesn’t matter because CachedResource::finishLoading ignores the data passed to it, but it’s really strange to code this way. We could pass m_data. Seems this bug already existed in the old version of this function. > Source/WebCore/loader/cache/CachedTextTrack.cpp:65 > + addDataBuffer(data); > + CachedResource::finishLoading(data); This is always going to pass a 0 to CachedResource::finishLoading, because data has already passed off its resource buffer calling addDataBuffer. We know that doesn’t matter because CachedResource::finishLoading ignores the data passed to it, but it’s really strange to code this way. We could pass 0, or go out of our way to actually pass the data buffer. > Source/WebCore/loader/cache/CachedTextTrack.h:42 > + virtual void addDataBuffer(PassRefPtr<ResourceBuffer>); > + virtual void finishLoading(PassRefPtr<ResourceBuffer>); Can these virtual functions be marked OVERRIDE, FINAL or both? Can they be made private? > Source/WebCore/loader/cache/CachedXSLStyleSheet.h:48 > + virtual void finishLoading(PassRefPtr<ResourceBuffer> data); Can these virtual functions be marked OVERRIDE, FINAL or both? Can they be made private? (In reply to comment #31) > (From update of attachment 203681 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=203681&action=review > > review- because of some not-entirely-harmless PassRefPtr mistakes Thanks for the review. Since some of the issues are not actually introduced by this patch, I've opened separate bugs to fix them. > > Source/WebCore/ChangeLog:20 > > + incremental data chunks when not beffering and > > typo: beffering Oops. > > Source/WebCore/loader/SubresourceLoader.cpp:243 > > + if (resourceData()) > > + m_resource->addDataBuffer(resourceData()); > > The resourceData function is not inlined, so it’s not great to call it twice. I would write: > > if (ResourceBuffer* resourceData = this->resourceData().get()) > m_resource->addDataBuffer(resourceData); > else > > And also we need to fix resourceData() since it’s return a PassRefPtr even though it is not transferring ownership. Yes, this was actually a refactoring of current code, see SubresourceLoader::sendDataToResource(). I also don't understand why resourceData returns a PassRefPtr. See bug #117288. > > Source/WebCore/loader/cache/CachedCSSStyleSheet.h:50 > > + virtual void finishLoading(PassRefPtr<ResourceBuffer> data); > > I’m not sure this argument needs a name. The name data says almost nothing! Agree. > > Source/WebCore/loader/cache/CachedFont.h:51 > > + virtual void finishLoading(PassRefPtr<ResourceBuffer> data); > > Can this virtual function be marked OVERRIDE, FINAL, or both? Can it be made private? Yes, actually all of them can be made private and marked as OVERRIDE, see bug #117289. > > Source/WebCore/loader/cache/CachedImage.cpp:397 > > + addDataBuffer(data, false); > > This use of a boolean is no good. With every caller passing a constant, it’s not readable at the call site. We typically use enums in cases like this to make the code readable. Will fix it. > > Source/WebCore/loader/cache/CachedImage.cpp:410 > > + CachedResource::finishLoading(data); > > This is always going to pass a 0 to CachedResource::finishLoading, because data has already passed off its resource buffer when calling the addDataBuffer function. We know that doesn’t matter because CachedResource::finishLoading ignores the data passed to it, but it’s really strange to code this way. Maybe we can at least just pass 0 here instead of pretending to pass data. Ah, right, PassRefPtr is leaked and assigned to a RefPtr. I think it's confusing in this case where we chain up to the parent. I've changed data() to receive a raw pointer too in bug #117288, I think it makes the code less confusing and still allows the cached resource to take a reference. > > Source/WebCore/loader/cache/CachedImage.h:80 > > + virtual void addData(const char* data, int length); > > I don’t think int is the right type for length. The type unsigned could be OK, or maybe it needs to be size_t. I think this comes from ResourceHandle::didReceiveData() that also uses int for length. I agree it should be size_t. ResourceHandle::didSendData() uses unsigned long long though. > > Source/WebCore/loader/cache/CachedRawResource.cpp:74 > > + CachedResourceHandle<CachedRawResource> protect(this); > > + ASSERT(m_options.dataBufferingPolicy == DoNotBufferData); > > + notifyDataReceived(data, length); > > The protector here seems like it’s at the wrong level. It could instead be inside notifyDataReceived, since that’s the only function called by this one. Ok. > > Source/WebCore/loader/cache/CachedRawResource.h:67 > > + void notifyDataReceived(const char* data, int length); > > I know we’ve used this kind of naming before, but it’s horribly confusing: “notify data received”? Notify who? Notify the data? I think notifyClientsDataWasReceived is one possible better name. Or callDataReceivedForAllClients? Maybe there’s another one that’s clear. Fair enough, I'll rename it. > > Source/WebCore/loader/cache/CachedResource.cpp:375 > > +void CachedResource::addData(const char* data, int length) > > +{ > > + UNUSED_PARAM(data); > > + UNUSED_PARAM(length); > > + ASSERT(m_options.dataBufferingPolicy == DoNotBufferData); > > +} > > I think it’s better to omit the argument names here. Right, I don't know why I used UNUSED_PARAM in this case. Comment on attachment 203681 [details] Patch rebased to apply on current git master View in context: https://bugs.webkit.org/attachment.cgi?id=203681&action=review >>> Source/WebCore/loader/cache/CachedImage.h:80 >>> + virtual void addData(const char* data, int length); >> >> I don’t think int is the right type for length. The type unsigned could be OK, or maybe it needs to be size_t. > > I think this comes from ResourceHandle::didReceiveData() that also uses int for length. I agree it should be size_t. ResourceHandle::didSendData() uses unsigned long long though. Here’s my take on this: 1) When we want to be able to hold the size of something, size_t is the type that can accommodate anything we could have in memory, so it’s always big enough. 2) As an optimization to our data structures, many should use unsigned instead of size_t, since on 64-bit systems this can make our objects much smaller, and there are many things that simply can’t go past 2^32. I know that Andreas Kling and Antti are considering changes like this, including possibly changing Vector to use unsigned instead of size_t. 3) When we are dealing with the size of a file on disk or something that can be dealt with without mapping or loading the whole thing into memory all at once, we want to use a type like "unsigned long long" that can represent sizes larger than 2^32 even on 32-bit systems. I think that’s why ResourceHandle::didSendData uses this type, because there are code paths where it can send data that does not need to be resident in memory all at the same time, possibly including very large files. (In reply to comment #33) > (From update of attachment 203681 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=203681&action=review > > >>> Source/WebCore/loader/cache/CachedImage.h:80 > >>> + virtual void addData(const char* data, int length); > >> > >> I don’t think int is the right type for length. The type unsigned could be OK, or maybe it needs to be size_t. > > > > I think this comes from ResourceHandle::didReceiveData() that also uses int for length. I agree it should be size_t. ResourceHandle::didSendData() uses unsigned long long though. > > Here’s my take on this: > > 1) When we want to be able to hold the size of something, size_t is the type that can accommodate anything we could have in memory, so it’s always big enough. > > 2) As an optimization to our data structures, many should use unsigned instead of size_t, since on 64-bit systems this can make our objects much smaller, and there are many things that simply can’t go past 2^32. I know that Andreas Kling and Antti are considering changes like this, including possibly changing Vector to use unsigned instead of size_t. > > 3) When we are dealing with the size of a file on disk or something that can be dealt with without mapping or loading the whole thing into memory all at once, we want to use a type like "unsigned long long" that can represent sizes larger than 2^32 even on 32-bit systems. I think that’s why ResourceHandle::didSendData uses this type, because there are code paths where it can send data that does not need to be resident in memory all at the same time, possibly including very large files. This piece would be useful to be somewhere in the wiki. Actually I think this comes from ResourceBuffer(const char*, int) that comes from SharedBuffer(const char* data, int size) that indeed has a FIXME: // FIXME: Use unsigned consistently, and check for invalid casts when calling into SharedBuffer from other code. if (size < 0) CRASH(); (In reply to comment #34) > This piece would be useful to be somewhere in the wiki. Sure thing. Please put it up there somewhere and point me to it. I’ll happily help edit it. (In reply to comment #35) > (In reply to comment #34) > > This piece would be useful to be somewhere in the wiki. > > Sure thing. Please put it up there somewhere and point me to it. I’ll happily help edit it. Something like this? http://trac.webkit.org/wiki/TypesForSize Created attachment 204059 [details]
Updated patch
Patch rebased and updated to address review comments. In the end I got rid of the allDataReceived boolean parameter in CachedImage by refactoring the code a bit, since now we have different methods for incremental data chunks and finishLoading when all data has been received (and considering the multipart content case), I think it makes the code a bit easier to read.
ping reviewers again :-) Comment on attachment 204059 [details] Updated patch View in context: https://bugs.webkit.org/attachment.cgi?id=204059&action=review Looks good. > Source/WebCore/loader/SubresourceLoader.cpp:208 > + // The loader delivers the data in a multipart section all at once, then sends eof. Not sure this comment adds much. The comment two lines down seems to say the same thing. I know you moved this comment here another function, but I think we can just drop it. > Source/WebCore/loader/cache/CachedImage.cpp:347 > +bool CachedImage::isValidDecodedImageSize() const I think the name here should be “<cached image> has valid decoded image size” not “<cached image> is valid decoded image size”. So hasValidDecodedImageSize. But I am not sure I understand the use of the word “valid” here to mean “smaller than the maximum”. I find this function quite confusing and a better name might help, and a why comment is needed. The old code referred to this as “we have received all the data or the size is known”, but even that did not make the maximumDecodedImageSize feature clear. I think this is actually “is ready to draw” or something like that. > Source/WebCore/loader/cache/CachedImage.cpp:386 > + // Go ahead and tell our observers to try to draw if the size is known. > + // Each chunk from the network causes observers to repaint, which will > + // force that chunk to decode. > + if (!isValidDecodedImageSize()) { > + error(errorOccurred() ? status() : DecodeError); > + if (inCache()) > + memoryCache()->remove(this); > + return; > } This comment is confusing. It says “tell observers to draw if the size is known”, but the code says “don’t tell observers to draw unless the size is known”. We should fix the comment since we reversed the code. > Source/WebCore/loader/cache/CachedImage.cpp:405 > + RefPtr<ResourceBuffer> resourceData = ResourceBuffer::create(data, length); > + addIncrementalDataBuffer(resourceData.get()); We don’t need a local variable here: It would read better as a one-liner. > Source/WebCore/loader/cache/CachedImage.cpp:414 > + if (!m_image && data) { > + // When loading multipart content all data is delivered in finishLoading. > + createImage(); > } This is not a good comment to have here. Even if all data wasn’t delivered in finishLoading, the first data could be delivered here for some other reason. So this comment is a bit of trivia that does not correctly answer the question “why do we have this code?” We would not want to remove this if we removed support for multipart content, for example, or if we changed how multipart content loading works. > Source/WebCore/loader/cache/CachedImage.h:68 > + virtual void addDataBuffer(ResourceBuffer*) OVERRIDE; > + virtual void finishLoading(ResourceBuffer*) OVERRIDE; Can these be private instead of public? > Source/WebCore/loader/cache/CachedRawResource.cpp:52 > + incrementalDataLength = data->size() - previousDataLength; Do we have a guarantee that this will not overflow 32 bits? The old code used size_t for this, not unsigned. > Source/WebCore/loader/cache/CachedRawResource.cpp:92 > + unsigned incrementalDataLength; > + const char* incrementalData = calculateIncrementalDataChunk(data, incrementalDataLength); > + if (data) > + setEncodedSize(data->size()); > + notifyClientsDataWasReceived(incrementalData, incrementalDataLength); A little strange to have the setEncodedSize call in the middle of the logic to calculate the incremental data chunk. I guess the old code was like that, but is it important to do things in this order? > Source/WebCore/loader/cache/CachedRawResource.h:53 > - virtual void data(ResourceBuffer*, bool allDataReceived) OVERRIDE; > + virtual void addDataBuffer(ResourceBuffer*); > + virtual void addData(const char* data, unsigned length); > + virtual void finishLoading(ResourceBuffer*); These should all be marked OVERRIDE. Created attachment 204684 [details]
Patch for landing
(In reply to comment #39) > (From update of attachment 204059 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=204059&action=review > > Looks good. Thanks! > > Source/WebCore/loader/SubresourceLoader.cpp:208 > > + // The loader delivers the data in a multipart section all at once, then sends eof. > > Not sure this comment adds much. The comment two lines down seems to say the same thing. I know you moved this comment here another function, but I think we can just drop it. Removed. > > Source/WebCore/loader/cache/CachedImage.cpp:347 > > +bool CachedImage::isValidDecodedImageSize() const > > I think the name here should be “<cached image> has valid decoded image size” not “<cached image> is valid decoded image size”. So hasValidDecodedImageSize. > > But I am not sure I understand the use of the word “valid” here to mean “smaller than the maximum”. I find this function quite confusing and a better name might help, and a why comment is needed. The use of valid is mainly because we are using this function to decide whether the image can be drawn or if we should raise an error. > The old code referred to this as “we have received all the data or the size is known”, but even that did not make the maximumDecodedImageSize feature clear. > > I think this is actually “is ready to draw” or something like that. Is ready to draw sounds also confusing to me, since it sounds like we are failing just because the image is not yet ready to be drawn. So I used canBeDrawn in the end, and added a comment to explain why we are failing when the image can't be drawn. > > Source/WebCore/loader/cache/CachedImage.cpp:386 > > + // Go ahead and tell our observers to try to draw if the size is known. > > + // Each chunk from the network causes observers to repaint, which will > > + // force that chunk to decode. > > + if (!isValidDecodedImageSize()) { > > + error(errorOccurred() ? status() : DecodeError); > > + if (inCache()) > > + memoryCache()->remove(this); > > + return; > > } > > This comment is confusing. It says “tell observers to draw if the size is known”, but the code says “don’t tell observers to draw unless the size is known”. We should fix the comment since we reversed the code. What is confusing is the place where the comment is, right before the if that checks whether the image can be drawn or not, that now has nothing to do with if the size is available or not. So, I moved the comment and removed the part about the size available, since there's an early return when the size it's not available. > > Source/WebCore/loader/cache/CachedImage.cpp:405 > > + RefPtr<ResourceBuffer> resourceData = ResourceBuffer::create(data, length); > > + addIncrementalDataBuffer(resourceData.get()); > > We don’t need a local variable here: It would read better as a one-liner. Ok. > > Source/WebCore/loader/cache/CachedImage.cpp:414 > > + if (!m_image && data) { > > + // When loading multipart content all data is delivered in finishLoading. > > + createImage(); > > } > > This is not a good comment to have here. Even if all data wasn’t delivered in finishLoading, the first data could be delivered here for some other reason. So this comment is a bit of trivia that does not correctly answer the question “why do we have this code?” We would not want to remove this if we removed support for multipart content, for example, or if we changed how multipart content loading works. Ok, removed it. > > Source/WebCore/loader/cache/CachedImage.h:68 > > + virtual void addDataBuffer(ResourceBuffer*) OVERRIDE; > > + virtual void finishLoading(ResourceBuffer*) OVERRIDE; > > Can these be private instead of public? No, they are used by ImageDocument, so I left them public as we discussed in the other bug, see: https://bugs.webkit.org/show_bug.cgi?id=117289#c2 > > Source/WebCore/loader/cache/CachedRawResource.cpp:52 > > + incrementalDataLength = data->size() - previousDataLength; > > Do we have a guarantee that this will not overflow 32 bits? The old code used size_t for this, not unsigned. ResourceBuffer::size() returns unsigned, and CachedResource::encodedSize() also returns unsigned, so previousDataLength so actually be unsigned too instead of size_t. That makes clearer that unsigned is appropriate for incrementalDataLength. > > Source/WebCore/loader/cache/CachedRawResource.cpp:92 > > + unsigned incrementalDataLength; > > + const char* incrementalData = calculateIncrementalDataChunk(data, incrementalDataLength); > > + if (data) > > + setEncodedSize(data->size()); > > + notifyClientsDataWasReceived(incrementalData, incrementalDataLength); > > A little strange to have the setEncodedSize call in the middle of the logic to calculate the incremental data chunk. I guess the old code was like that, but is it important to do things in this order? calculateIncrementalDataChunk() uses current encoded size to calculate the incremental data length, so we need to set the new one after calculateIncrementalDataChunk. I left it also before notifying the clients to make sure that new encoded size is updated in case clients need to query that value. > > Source/WebCore/loader/cache/CachedRawResource.h:53 > > - virtual void data(ResourceBuffer*, bool allDataReceived) OVERRIDE; > > + virtual void addDataBuffer(ResourceBuffer*); > > + virtual void addData(const char* data, unsigned length); > > + virtual void finishLoading(ResourceBuffer*); > > These should all be marked OVERRIDE. Right! Committed r151586: <http://trac.webkit.org/changeset/151586> This change broke multipart/x-mixed-replace images (currently tracked internally as <rdar://problem/14242032>, we'll make a bug in Bugzilla shortly). Filed bug 118169. |