Bug 181711 - Limit the number of images that are asynchronously decoded at a time to half the number of cores
Summary: Limit the number of images that are asynchronously decoded at a time to half ...
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Images (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Said Abou-Hallawa
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2018-01-16 17:54 PST by Said Abou-Hallawa
Modified: 2022-09-25 04:23 PDT (History)
15 users (show)

See Also:


Attachments
Patch (6.94 KB, patch)
2018-01-16 18:25 PST, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (9.06 KB, patch)
2018-01-18 17:34 PST, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (18.39 KB, patch)
2018-01-19 12:01 PST, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews123 for ios-simulator-wk2 (2.20 MB, application/zip)
2018-01-19 13:39 PST, EWS Watchlist
no flags Details
Patch (63.20 KB, patch)
2018-01-21 18:37 PST, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (61.95 KB, patch)
2018-01-21 19:49 PST, Said Abou-Hallawa
ggaren: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Said Abou-Hallawa 2018-01-16 17:54:19 PST
Asynchronous image decoding can accelerate the image rendering and allow the main thread to process other activities. But if the number of images that are asynchronously decoded increased, such that the number of decoding thread exceeds the number of cores on the system, the overall performance of the system will be affected. Even the rendering in WebKit will be choppy.

What we need do is we need to use the system cores instead of leaving them idle. But at the same time, do not be too greedy and don't use all the available cores and don't consume too much power for decoding large or animated images.

I think a good strategy for this purpose is to allow asynchronous image decoding to take up to half the available cores.
Comment 1 Said Abou-Hallawa 2018-01-16 18:25:13 PST
Created attachment 331446 [details]
Patch
Comment 2 Said Abou-Hallawa 2018-01-16 18:26:35 PST
<rdar://problem/35017173>
Comment 3 Said Abou-Hallawa 2018-01-18 17:34:16 PST
Created attachment 331678 [details]
Patch
Comment 4 Saam Barati 2018-01-18 17:50:07 PST
Comment on attachment 331678 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review

> Source/WebCore/platform/graphics/ImageSource.cpp:317
> +    if (!DecodingQueue::canCreateDecodingQueue())
> +        return false;

This looks racy. Not sure if it matters. But, I think the race is:

count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue.
Comment 5 Tim Horton 2018-01-18 17:53:07 PST
Comment on attachment 331678 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review

>> Source/WebCore/platform/graphics/ImageSource.cpp:317
>> +        return false;
> 
> This looks racy. Not sure if it matters. But, I think the race is:
> 
> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue.

I think Saam is right. Presumably the fix is to do the check and increment together under a lock?
Comment 6 Simon Fraser (smfr) 2018-01-18 18:09:38 PST
Comment on attachment 331678 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review

>>> Source/WebCore/platform/graphics/ImageSource.cpp:317
>>> +        return false;
>> 
>> This looks racy. Not sure if it matters. But, I think the race is:
>> 
>> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue.
> 
> I think Saam is right. Presumably the fix is to do the check and increment together under a lock?

So if we can't make a queue we bail and decode on the main thread? Why not enqueue on some existing queue?
Comment 7 Tim Horton 2018-01-18 18:30:40 PST
Comment on attachment 331678 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review

>>>> Source/WebCore/platform/graphics/ImageSource.cpp:317
>>>> +        return false;
>>> 
>>> This looks racy. Not sure if it matters. But, I think the race is:
>>> 
>>> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue.
>> 
>> I think Saam is right. Presumably the fix is to do the check and increment together under a lock?
> 
> So if we can't make a queue we bail and decode on the main thread? Why not enqueue on some existing queue?

Actually, I think the increment and check are both on the main thread?

Also, smfr’s point seems sensible.
Comment 8 Saam Barati 2018-01-18 18:34:33 PST
(In reply to Tim Horton from comment #7)
> Comment on attachment 331678 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=331678&action=review
> 
> >>>> Source/WebCore/platform/graphics/ImageSource.cpp:317
> >>>> +        return false;
> >>> 
> >>> This looks racy. Not sure if it matters. But, I think the race is:
> >>> 
> >>> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue.
> >> 
> >> I think Saam is right. Presumably the fix is to do the check and increment together under a lock?
> > 
> > So if we can't make a queue we bail and decode on the main thread? Why not enqueue on some existing queue?
> 
> Actually, I think the increment and check are both on the main thread?
If so, I think we should remove the atomic then. I was just assuming it was off the main thread because of the atomic.

> 
> Also, smfr’s point seems sensible.
Comment 9 Simon Fraser (smfr) 2018-01-18 18:49:23 PST
Comment on attachment 331678 [details]
Patch

r- given the feedback.
Comment 10 Said Abou-Hallawa 2018-01-19 12:01:20 PST
Created attachment 331765 [details]
Patch
Comment 11 Said Abou-Hallawa 2018-01-19 12:17:40 PST
Comment on attachment 331678 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review

>>>>>> Source/WebCore/platform/graphics/ImageSource.cpp:317
>>>>>> +        return false;
>>>>> 
>>>>> This looks racy. Not sure if it matters. But, I think the race is:
>>>>> 
>>>>> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue.
>>>> 
>>>> I think Saam is right. Presumably the fix is to do the check and increment together under a lock?
>>> 
>>> So if we can't make a queue we bail and decode on the main thread? Why not enqueue on some existing queue?
>> 
>> Actually, I think the increment and check are both on the main thread?
>> 
>> Also, smfr’s point seems sensible.
> 
> If so, I think we should remove the atomic then. I was just assuming it was off the main thread because of the atomic.

Re: This looks racy. Not sure if it matters. But, I think the race is:
    count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue.

No. This is not racy. The WorkQueues are created only from the main thread. Most of the time they are also destroyed on the main thread. The only case where a WorkQueue is destroyed on the decoding thread is when the ImageSource is deleted while the decoding is still running. In this case the only reference to the WorkQueue is kept by protectedDecodingQueue; see ImageSource::startAsyncDecodingQueue().

The condition above is okay. It is not ultimate but it will always be safe. The number of active WorkQueues can only decreased by the decoding threads. But no two threads will be increasing the number of active WorkQueues at the same time.
Comment 12 Said Abou-Hallawa 2018-01-19 12:38:47 PST
Comment on attachment 331678 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review

>>>>>>> Source/WebCore/platform/graphics/ImageSource.cpp:317
>>>>>>> +        return false;
>>>>>> 
>>>>>> This looks racy. Not sure if it matters. But, I think the race is:
>>>>>> 
>>>>>> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue.
>>>>> 
>>>>> I think Saam is right. Presumably the fix is to do the check and increment together under a lock?
>>>> 
>>>> So if we can't make a queue we bail and decode on the main thread? Why not enqueue on some existing queue?
>>> 
>>> Actually, I think the increment and check are both on the main thread?
>>> 
>>> Also, smfr’s point seems sensible.
>> 
>> If so, I think we should remove the atomic then. I was just assuming it was off the main thread because of the atomic.
> 
> Re: This looks racy. Not sure if it matters. But, I think the race is:
>     count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue.
> 
> No. This is not racy. The WorkQueues are created only from the main thread. Most of the time they are also destroyed on the main thread. The only case where a WorkQueue is destroyed on the decoding thread is when the ImageSource is deleted while the decoding is still running. In this case the only reference to the WorkQueue is kept by protectedDecodingQueue; see ImageSource::startAsyncDecodingQueue().
> 
> The condition above is okay. It is not ultimate but it will always be safe. The number of active WorkQueues can only decreased by the decoding threads. But no two threads will be increasing the number of active WorkQueues at the same time.

Re: So if we can't make a queue we bail and decode on the main thread?

Actually the image frame will not be decoded on the main thread. It will be decoded on a separate thread by the system when the drawing is initiated.

Re: Why not enqueue on some existing queue?

We mix sync and async image in the same page: small images and large images, non first tile paint and first tile paint. But we never tried before to block decoding/displaying an image because another one is being decoded. I think the goal of the async image decoding is to utilize the system resources wisely. It should not stress the system such that it affects the overall system performance. If the system resources are being affected by the async image decoding, we should fallback to the sync path. And I think this patch its trying to fix that. I think also delaying drawing too many expensive items should be in a higher level than the image decoding.
Comment 13 Said Abou-Hallawa 2018-01-19 12:41:03 PST
Comment on attachment 331678 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review

>>>>>>>> Source/WebCore/platform/graphics/ImageSource.cpp:317
>>>>>>>> +        return false;
>>>>>>> 
>>>>>>> This looks racy. Not sure if it matters. But, I think the race is:
>>>>>>> 
>>>>>>> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue.
>>>>>> 
>>>>>> I think Saam is right. Presumably the fix is to do the check and increment together under a lock?
>>>>> 
>>>>> So if we can't make a queue we bail and decode on the main thread? Why not enqueue on some existing queue?
>>>> 
>>>> Actually, I think the increment and check are both on the main thread?
>>>> 
>>>> Also, smfr’s point seems sensible.
>>> 
>>> If so, I think we should remove the atomic then. I was just assuming it was off the main thread because of the atomic.
>> 
>> Re: This looks racy. Not sure if it matters. But, I think the race is:
>>     count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue.
>> 
>> No. This is not racy. The WorkQueues are created only from the main thread. Most of the time they are also destroyed on the main thread. The only case where a WorkQueue is destroyed on the decoding thread is when the ImageSource is deleted while the decoding is still running. In this case the only reference to the WorkQueue is kept by protectedDecodingQueue; see ImageSource::startAsyncDecodingQueue().
>> 
>> The condition above is okay. It is not ultimate but it will always be safe. The number of active WorkQueues can only decreased by the decoding threads. But no two threads will be increasing the number of active WorkQueues at the same time.
> 
> Re: So if we can't make a queue we bail and decode on the main thread?
> 
> Actually the image frame will not be decoded on the main thread. It will be decoded on a separate thread by the system when the drawing is initiated.
> 
> Re: Why not enqueue on some existing queue?
> 
> We mix sync and async image in the same page: small images and large images, non first tile paint and first tile paint. But we never tried before to block decoding/displaying an image because another one is being decoded. I think the goal of the async image decoding is to utilize the system resources wisely. It should not stress the system such that it affects the overall system performance. If the system resources are being affected by the async image decoding, we should fallback to the sync path. And I think this patch its trying to fix that. I think also delaying drawing too many expensive items should be in a higher level than the image decoding.

Re: If so, I think we should remove the atomic then. I was just assuming it was off the main thread because of the atomic.

The increment can happen only from the main thread. But the decrement can happen from the main thread or from the decoding thread. This depends on which gets destroyed first: ImageSource or the decoding WorkQueue. So the increment and the decrement have to be synchronized.
Comment 14 Said Abou-Hallawa 2018-01-19 12:51:28 PST
(In reply to Said Abou-Hallawa from comment #10)
> Created attachment 331765 [details]
> Patch

This is a more involving patch. Yes it creates the WorkQueue and increments the count in one place. But it requires more changes since the drawing depends only on our heuristics. The <img "decoding=async"> attribute is only a web developer's recommendation which can be ignored. No events are firing when decoding an image finishes. But HTMLImageElement::decode() method has to force async image decoding and a promise resolving. Also with the drawing code has to fall back to the sync code path.
Comment 15 Simon Fraser (smfr) 2018-01-19 12:55:45 PST
Does this patch still fall back to sync decoding if we have too many images decoding right now? If so, I think that's still wrong; I think we should have a pool of queues and post decode requests onto those queues.
Comment 16 EWS Watchlist 2018-01-19 13:39:57 PST
Comment on attachment 331765 [details]
Patch

Attachment 331765 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/6139329

New failing tests:
http/tests/misc/resource-timing-resolution.html
Comment 17 EWS Watchlist 2018-01-19 13:39:59 PST
Created attachment 331783 [details]
Archive of layout-test-results from ews123 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews123  Port: ios-simulator-wk2  Platform: Mac OS X 10.12.6
Comment 18 Said Abou-Hallawa 2018-01-21 18:37:42 PST
Created attachment 331891 [details]
Patch
Comment 19 Said Abou-Hallawa 2018-01-21 19:49:18 PST
Created attachment 331892 [details]
Patch
Comment 20 Said Abou-Hallawa 2018-01-22 18:58:59 PST
<rdar://problem/36757735>
Comment 21 Geoffrey Garen 2018-01-23 17:07:37 PST
Comment on attachment 331892 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=331892&action=review

> Source/WebCore/ChangeLog:12
> +        We need to control the number of active threads which are created for image
> +        decoding. This has been an issue for power consumption and overall system
> +        degradation. This can be serious once the number of decoding thread exceeds
> +        the number of system available cores.

I don't think we have evidence that image decoding ever exceeds the number of available cores. Do we? Can you supply that evidence?

It would be surprising if image decoding exceeded the number of available cores, since WorkQueue uses libdispatch, whose primary job is to avoid creating a number of threads that exceeds the number of available cores. Also, if it were our goal to avoid using too many cores for image decoding, we would need to coordinate that limitation across web processes and non-web CPU activity. This patch doesn't do so, but again, luckily, libdispatch does.

Note that libdispatch offers a priority called "user initiated" which is lower than "user interactive". If we want to guarantee that image decoding never beats the main thread, we can test deprioritizing it to "user interactive".

I think I have to say r- because we can't accept an architectural change to improve performance without a test case that demonstrates that performance improved -- especially when we have reason to believe that the architecture change duplicates existing functionality in a potentially worse way.
Comment 22 Adrien Destugues 2022-09-25 04:23:28 PDT
> I don't think we have evidence that image decoding ever exceeds the number of available cores. Do we? Can you supply that evidence?

In HaikuWebKit I end up with hundreds of ImageDecoder threads.

> It would be surprising if image decoding exceeded the number of available cores, since WorkQueue uses libdispatch, whose primary job is to avoid creating a number of threads that exceeds the number of available cores.

That's the case on Apple builds, but not for other platforms which use WorkQueueGeneric, there one thread is created for each ImageSource. Maybe the solution is to rewrite WorkQueueGeneric to do its own thread pooling, then?