Bug 175890

Summary: Image data should be coalesced if it comes in small chunks before updating the ImageSource
Product: WebKit
Reporter: Said Abou-Hallawa <sabouhallawa>
Component: Images
Assignee: Said Abou-Hallawa <sabouhallawa>
Status: RESOLVED FIXED
Severity: Normal
Priority: P2
Keywords: InRadar
CC: buildbot, cdumez, commit-queue, darin, dbates, dino, esprehn+autocc, gyuyoung.kim, japhet, simon.fraser, thorton, webkit-bug-importer, youennf
Version: WebKit Nightly Build
Hardware: Unspecified
OS: Unspecified
See Also: https://bugs.webkit.org/show_bug.cgi?id=178020
Attachments: Patch (5 revisions, flags: none)

Description Said Abou-Hallawa 2017-08-23 10:40:23 PDT
1. Open https://www.nasa.gov/sites/default/files/thumbnails/image/pia21778.png

Results: On a fast machine with a fast network, the page finishes loading after seven seconds. On a slow network, the page can take five minutes to finish loading.

NOTE: The size of this image is 45MB. When loading it, the server sends it in small chunks, each less than 1MB. The networking process receives all the data within a normal duration; for example, on a fast network the whole image data is received in less than a second. The problem is having to repeat the following steps for each chunk of data:

-- The new data chunk is added to a SharedBuffer.
-- CachedImage sets its BitmapImage with the new data, which asks the SharedBuffer to create a CFData. Creating a CFData requires combining all the segments of the SharedBuffer into one single buffer.
-- BitmapImage sets the new data to its ImageSource which passes it to its ImageDecoder.
-- CachedImage notifies its clients that a new data chunk has arrived. The image renderer, which is one of the CachedImage's clients, will repaint itself. This will cause the image to be drawn, which will request the ImageDecoder to decode the image frame.

It turns out running these steps is extremely expensive when they have to be repeated many times because a large image receives its data in small chunks. The solution is to coalesce the updates based on time, i.e. to update the ImageSource only after a certain period of time has elapsed. This period increases as long as the image has not finished loading.

This strategy has been working on iOS; see ImageSource::dataChanged(). We need to enable it for all platforms. In the current iOS solution, ImageSource coalesces its updates of the ImageDecoder with the new data, but CachedImage still causes the image renderer to repaint itself even though the Image itself has not changed. We need to fix this for all platforms, including iOS.
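
In outline, the coalescing works like this (a minimal sketch; shouldCoalesceDataUpdate, m_updateCount, and m_lastUpdateTime are illustrative names, not taken from any patch):

// Minimal sketch of time-based coalescing; member names are illustrative only.
bool shouldCoalesceDataUpdate() const
{
    // Back off: allow longer and longer gaps between ImageSource updates
    // while the image is still loading.
    static const Seconds backoffIntervals[] = { 0_s, 1_s, 3_s, 6_s, 15_s };
    size_t index = std::min<size_t>(m_updateCount, WTF_ARRAY_LENGTH(backoffIntervals) - 1);
    return MonotonicTime::now() - m_lastUpdateTime < backoffIntervals[index];
}

The first chunk always goes through (the first interval is 0_s); subsequent chunks skip the update path until the current interval has elapsed, after which the index advances to the next, longer interval.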
Comment 1 Said Abou-Hallawa 2017-08-23 10:52:14 PDT
Created attachment 318886 [details]
Patch
Comment 2 Said Abou-Hallawa 2017-08-23 10:53:21 PDT
<rdar://problem/33963003>
Comment 3 Tim Horton 2017-08-23 16:32:01 PDT
Comment on attachment 318886 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=318886&action=review

> Source/WebCore/loader/cache/CachedImage.cpp:450
> +    ASSERT(m_coalesceDataCount < coalesceDataSeconds.get().size());

This assert seems super unnecessary especially now that it's a Vector instead of a C array.

> Source/WebCore/loader/cache/CachedImage.cpp:455
> +    m_coalesceDataCount = std::min<unsigned>(m_coalesceDataCount + 1, coalesceDataSeconds.get().size() - 1);

Isn't this just a silly way of writing:

if (m_coalesceDataCount < coalesceDataSeconds.get().size() - 1)
    m_coalesceDataCount++;

Also, is that the best name? Since we cap it based on the number of items in the durations array, it's more of an index into that array than a count of times we coalesced.

> Source/WebCore/loader/cache/CachedImage.cpp:468
> +    if (shouldCoalesceDataBuffer())
> +        return;

I'm confused. How does this not result in the data getting dropped on the floor?
Comment 4 Said Abou-Hallawa 2017-08-24 12:48:23 PDT
Created attachment 319013 [details]
Patch
Comment 5 Said Abou-Hallawa 2017-08-24 14:24:53 PDT
Comment on attachment 318886 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=318886&action=review

>> Source/WebCore/loader/cache/CachedImage.cpp:450
>> +    ASSERT(m_coalesceDataCount < coalesceDataSeconds.get().size());
> 
> This assert seems super unnecessary especially now that it's a Vector instead of a C array.

I am not sure I agree with that. What is the difference between accessing a Vector and a C array with an out-of-bounds index? I wanted to ensure that the index does not change outside the scope of this function such that it becomes larger than the last index in the Vector. But anyway, I removed it.

>> Source/WebCore/loader/cache/CachedImage.cpp:455
>> +    m_coalesceDataCount = std::min<unsigned>(m_coalesceDataCount + 1, coalesceDataSeconds.get().size() - 1);
> 
> Isn't this just a silly way of writing:
> 
> if (m_coalesceDataCount < coalesceDataSeconds.get().size() - 1)
>     m_coalesceDataCount++;
> 
> Also, is that the best name? Since we cap it based on the number of items in the durations array, it's more of an index into that array than a count of times we coalesced.

The std::min() is replaced by the if-statement as suggested. The Vector's name is changed to coalesceDataDurations, and the index's name is changed to m_currentCoalesceDataDuration.

>> Source/WebCore/loader/cache/CachedImage.cpp:468
>> +        return;
> 
> I'm confused. How does this not result in the data getting dropped on the floor?

It does not, because the CachedRawResource passes its SharedBuffer to the CachedImage, which just takes a ref of it (makeRef()). The CachedRawResource keeps accumulating all the incoming data in the same SharedBuffer. It is guaranteed that the SharedBuffer will be set in the ImageSource at least once, in CachedImage::finishLoading().

My purpose with this change was to ensure the data is consistent between the CachedImage and the ImageSource. I thought that as long as the ImageSource is not updated with the new data, it is better not to update the CachedImage with the new data either. But after realizing that the SharedBuffer of the CachedRawResource is already referenced by the CachedImage, I think it is better to keep the old behavior, which can be done by moving the check if (shouldCoalesceDataBuffer()) to addIncrementalDataBuffer().
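
To make that concrete, a sketch of the deferral flow described above (simplified; the actual patch's code and signatures differ):

// Sketch only, simplified from CachedImage. Taking a ref of the shared buffer
// keeps CachedImage's data consistent even while ImageSource updates are
// deferred, because CachedRawResource accumulates all received bytes in this
// same buffer.
void CachedImage::updateBuffer(SharedBuffer& data)
{
    m_data = &data;

    if (shouldCoalesceDataBuffer())
        return; // Deferred; finishLoading() guarantees one final update with the full buffer.

    // Have the image update its data from its internal buffer.
    m_image->setData(&data, false /* allDataReceived */);
}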
Comment 6 Simon Fraser (smfr) 2017-09-01 12:38:26 PDT
Comment on attachment 319013 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=319013&action=review

> Source/WebCore/loader/cache/CachedImage.cpp:449
> +    static const auto coalesceDataDurations = makeNeverDestroyed(Vector<double> { 0, 1, 3, 6, 15 });

Vector<Seconds>.

makeNeverDestroyed is overkill, but I guess you might need it if static Seconds[] doesn't work.

> Source/WebCore/loader/cache/CachedImage.cpp:456
> +    m_coalesceDataTime = time;

A function called shouldCoalesceDataBuffer() should not change state, I think. m_coalesceDataTime should be changed when you actually do the coalescing.

> Source/WebCore/loader/cache/CachedImage.h:163
> +    size_t m_currentCoalesceDataDuration { 0 };

This isn't a duration, right? So the name is misleading. It's an index into some array.
Comment 7 Darin Adler 2017-09-03 17:09:09 PDT
Comment on attachment 319013 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=319013&action=review

>> Source/WebCore/loader/cache/CachedImage.cpp:449
>> +    static const auto coalesceDataDurations = makeNeverDestroyed(Vector<double> { 0, 1, 3, 6, 15 });
> 
> Vector<Seconds>.
> 
> makeNeverDestroyed is overkill, but I guess you might need it if static Seconds[] doesn't work.

It’s not overkill because this is a Vector and it won’t compile otherwise. If this was an array, and it should be an array instead of a vector, then we would not need makeNeverDestroyed.

    static const double coalesceDataDurations[] = { 0, 1, 3, 6, 15 };
Comment 8 Said Abou-Hallawa 2017-09-05 15:24:34 PDT
Created attachment 319942 [details]
Patch
Comment 9 Tim Horton 2017-09-29 12:40:36 PDT
Comment on attachment 319942 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=319942&action=review

> Source/WebCore/loader/cache/CachedImage.cpp:450
> +    static const double dataBufferAddingIntervals[] = { 0, 1, 3, 6, 15 };

I think this one should make it clearer that it's about backoff. Maybe incrementalLoadingBackoffIntervals?

Maybe propagate that naming through all of the function/member/variable names? Right now it's kind of mysterious.

e.g.


bool CachedImage::shouldDeferIncrementalLoad() const
{
    static const double incrementalLoadingBackoffIntervals[] = { 0, 1, 3, 6, 15 };
    unsigned interval = std::min<unsigned>(m_incrementalLoadCount, 4);

    // The first time through, the chunk time will be 0 and the image will get an update.
    return (MonotonicTime::now() - m_lastIncrementalLoadingUpdateTime).seconds() < incrementalLoadingBackoffIntervals[interval];
}

void CachedImage::didIncrementalLoad()
{
    m_lastIncrementalLoadingUpdateTime = MonotonicTime::now();
    ASSERT(m_incrementalLoadCount < std::numeric_limits<unsigned>::max());
    ++m_incrementalLoadCount;
}


I'm not totally sold on the names tho. Maybe someone else has an idea.
Comment 10 Said Abou-Hallawa 2017-09-29 19:06:08 PDT
Created attachment 322264 [details]
Patch
Comment 11 youenn fablet 2017-10-02 14:21:45 PDT
Would it be useful to have that kind of behavior for other types of resources? Or other related behavior like processing only when receiving the last byte?
Instead of doing it in the WebProcess, could the NetworkProcess be told to adapt its pace of sending data chunks on a per-load basis?
Comment 12 Said Abou-Hallawa 2017-10-02 15:26:00 PDT
(In reply to youenn fablet from comment #11)
> Would it be useful to have that kind of behavior for other types of
> resources?

There is no problem in receiving a resource's bytes in small chunks. The problem with images is updating the decoder with small chunks of data. This is costly because the ImageDecoder has to re-decode the new data. Also, repainting and drawing the RenderImage every time image data is received is expensive.

> Or other related behavior like processing only when receiving the
> last byte?

No, we can't do that. We would like to show partial images instead of showing nothing till the whole image data is received.

> Instead of doing it in the WebProcess, could NetworkProcess be told to adapt
> its pace of sending data chunks on a per load basis?

I don't know if this solution is feasible or not. But I think dealing with this issue has to be done based on the resource type. Each resource can decide whether it needs to process any incoming data immediately or whether it can coalesce the processing of the incoming data. So this decision has to be in the WebProcess.
Comment 13 youenn fablet 2017-10-03 01:23:13 PDT
(In reply to Said Abou-Hallawa from comment #12)
> (In reply to youenn fablet from comment #11)
> > Would it be useful to have that kind of behavior for other types of
> > resources?
> 
> There is no problem in receiving the resource's bytes in small chunks. The
> problem with the image is updating the decoder with small chunks of data.
> This is costly because the ImageDecoder has to re-decode the new data. Also
> repainting and drawing the RenderImage every time an image data is received
> is expensive. 

The current patch seems to fix the issue for CachedImage.
If a similar optimization can be envisioned for other types of resources, it may be better to do it at a lower layer, with the CachedResource layer being responsible for deciding what kind of data delivery it expects.

> 
> Or other related behavior like processing only when receiving the
> > last byte?
> 
> No, we can't do that. We would like to show partial images instead of
> showing nothing till the whole image data is received.

Sure, for images; my question is more about other types of resources.

> > Instead of doing it in the WebProcess, could NetworkProcess be told to adapt
> > its pace of sending data chunks on a per load basis?
> 
> I don't know if this solution is feasible or not. But I think dealing with
> this issue has to be done based on the resource types. Each resource can
> decide whether it needs to process any incoming data immediately or it can
> coalesce the number of processing the incoming data. So this decision has to
> be in WebProcess.

If this issue spans several types of resources, it might be simpler to have CachedResource pass a parameter expressing what it is expecting.

For instance, ResourceLoader already does buffering according to its dataBufferingPolicy option. Maybe extending this option would be a good way to see whether that approach would work for your use case.

If so, I would then suggest to move the buffering to NetworkProcess by passing an additional parameter to NetworkConnectionToWebProcess::ScheduleResourceLoad.
That would reduce the number of IPC calls.
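
For context, the policy youenn refers to, plus a hypothetical value sketching the extension (the declaration is paraphrased from ResourceLoaderOptions.h; the third enumerator does not exist):

// Paraphrased from WebCore/loader/ResourceLoaderOptions.h, with one hypothetical addition.
enum DataBufferingPolicy {
    BufferData,
    DoNotBufferData,
    BufferDataWithBackoff // Hypothetical: coalesce didReceiveData() calls per a back-off schedule.
};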
Comment 14 Said Abou-Hallawa 2017-10-03 17:44:02 PDT
After looking at the code of NetworkConnectionToWebProcess, I realized that the network process implements linear back-off buffering for the image resource. WebLoaderStrategy::scheduleLoad() sets the maximumBufferingTime member of the NetworkResourceLoadParameters to 500_ms if the resource type is CachedResource::ImageResource. NetworkConnectionToWebProcess creates a timer with 500_ms, and when it fires, it sends the received data to the WebProcess.

This patch implements exponential back-off buffering. The timer interval keeps increasing until it reaches the last interval, and then it uses that interval until the image finishes loading.

I agree that having two buffering mechanisms for the image resource is bad. I could change NetworkResourceLoadParameters::maximumBufferingTime to be a Vector<Seconds>. And I could change the NetworkResourceLoader to create its timers with the intervals from the Vector<Seconds>, one after the other, until the last one.
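
Sketched out, that would look roughly like this (hypothetical code; neither the Vector-valued parameter nor m_bufferingTimeIndex exists today):

// Hypothetical: generalize the single maximumBufferingTime into a back-off schedule.
struct NetworkResourceLoadParameters {
    // ...
    Vector<Seconds> maximumBufferingTimes; // e.g. { 0_s, 1_s, 3_s, 6_s, 15_s }
};

// Each time the one-shot buffering timer is armed, advance to the next (longer)
// interval, staying on the last one once it is reached.
void NetworkResourceLoader::startBufferingTimerIfNeeded()
{
    if (m_bufferingTimer.isActive())
        return;
    m_bufferingTimer.startOneShot(m_parameters.maximumBufferingTimes[m_bufferingTimeIndex]);
    m_bufferingTimeIndex = std::min(m_bufferingTimeIndex + 1, m_parameters.maximumBufferingTimes.size() - 1);
}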

But we will have two problems:

1. This solution does not work for WK1. A similar buffering mechanism is needed for WK1, at least for iOS, since the exponential buffering was implemented for iOS. We may have two solutions for this:
	a) WebCoreResourceHandleAsDelegate.didReceiveData() needs to implement the same buffering that NetworkResourceLoader implements.
	b) Keep the same buffering mechanism in ImageSource::dataChanged() but enable it for WK1. And this is really ugly because we will have the buffering for the image at two different levels for WK1 and WK2.

2. This solution is mainly needed for the ImageDocument in WK2. The bug is not noticeable when the image is included in an <img> element. For example, open these images in Safari and notice how bad the performance is:

http://www.914world.com/bbs2/uploads_offsite/s6.postimg.org-1623-1505909118.1.gif
https://www.nasa.gov/sites/default/files/thumbnails/image/pia21778.png

Now open the same images as sub-resources, i.e. create HTML files with the following <img> elements:

<img src = "http://www.914world.com/bbs2/uploads_offsite/s6.postimg.org-1623-1505909118.1.gif">
and
<img src = "https://www.nasa.gov/sites/default/files/thumbnails/image/pia21778.png">

The reason for the difference is that static Seconds maximumBufferingTime(CachedResource*) returns 500_ms for CachedResource::ImageResource but 0_s for CachedResource::MainResource. When the image is the main resource, which is the ImageDocument case, no buffering is done. But when the image is a sub-resource, the web process is updated with the image data only every 500_ms. And this buffering is enough, at least for non-iOS platforms.
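
For reference, the dispatch in question (a condensed sketch; unrelated resource types elided):

// Condensed sketch of WebLoaderStrategy.cpp's buffering-time dispatch.
static Seconds maximumBufferingTime(CachedResource* resource)
{
    if (!resource)
        return 0_s;

    switch (resource->type()) {
    case CachedResource::ImageResource:
        return 500_ms; // Subresource images: the NetworkProcess coalesces data sends.
    case CachedResource::MainResource:
        return 0_s;    // An ImageDocument's image loads as the main resource: no buffering.
    default:           // Other resource types elided from this sketch.
        return 0_s;
    }
}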

I do not think we can know for sure that the main resource is an image before we start loading and parsing it. In fact, we create the ImageDocument and the ImageDocumentParser after receiving the first chunk of data.

So the buffering is not implemented for WK1 or the ImageDocument. And both are currently not handled by the NetworkProcess. And I think implementing the single-image buffering on the network side for WK1 and the ImageDocument would be expensive.

Having said that, I think having a single image-buffering mechanism for all platforms is not in the scope of this bug.
Comment 15 youenn fablet 2017-10-03 17:50:57 PDT
It is late for me so I may have other feedback tomorrow.

Regarding WK1, I agree that the NetworkProcess idea is not fully portable to WK1.
At this point, not optimizing WK1 might be ok if we think this is the right solution for WK2.

I would check how this idea would fly with the various ways images can be loaded (main document, sub-resources) and whether doing it in the NetworkProcess will or will not solve potential problems for other resources.
Comment 16 Said Abou-Hallawa 2017-10-03 18:20:11 PDT
(In reply to youenn fablet from comment #15)
> It is late for me so I may have other feedback tomorrow.
> 
> Regarding WK1, I agree that the NetworkProcess idea is not fully portable to
> WK1.
> At this point, not optimizing WK1 might be ok if we think this is the right
> solution for WK2.

I disagree, because the image buffering is currently implemented for WK1 on iOS in the function ImageSource::dataChanged(), and we should not remove it unless we have a better way of doing it.

> 
> I would check how this idea would fly with the various way images can be
> loaded (main document, sub resources) and whether doing it in NetworkProcess
> will or not solve potential problems for other resources.
Comment 17 Said Abou-Hallawa 2017-10-06 11:51:22 PDT
I filed https://bugs.webkit.org/show_bug.cgi?id=178020 to address youenn's comments in a separate patch.

The attached patch improves the performance on WK1 for all images, and on WK2 for the ImageDocument, by enabling the existing back-off mechanism on all platforms. And I think we are not making anything worse. So this patch is ready for review.
Comment 18 Simon Fraser (smfr) 2017-10-09 10:04:42 PDT
Comment on attachment 322264 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=322264&action=review

> Source/WebCore/loader/cache/CachedImage.cpp:411
> +void CachedImage::doUpdateBuffer(SharedBuffer& data)

This name is awkward. Maybe updateBufferInternal?

> Source/WebCore/loader/cache/CachedImage.cpp:421
> +    if (shouldDeferUpdateImageData())
> +        return;
> +    
>      // Have the image update its data from its internal buffer.
>      // It will not do anything now, but will delay decoding until

The early return after shouldDeferUpdateImageData() and then the comment which says "It will not do anything now" are confusing. If updateImageData doesn't do anything, why did we early return?
Comment 19 Said Abou-Hallawa 2017-10-09 14:26:35 PDT
Created attachment 323217 [details]
Patch
Comment 20 WebKit Commit Bot 2017-10-09 17:07:51 PDT
Comment on attachment 323217 [details]
Patch

Clearing flags on attachment: 323217

Committed r223091: <http://trac.webkit.org/changeset/223091>
Comment 21 WebKit Commit Bot 2017-10-09 17:07:53 PDT
All reviewed patches have been landed.  Closing bug.