Limit the number of images that are asynchronously decoded at a time to half the number of cores
https://bugs.webkit.org/show_bug.cgi?id=181711
Summary Limit the number of images that are asynchronously decoded at a time to half ...
Said Abou-Hallawa
Reported 2018-01-16 17:54:19 PST
Asynchronous image decoding can accelerate the image rendering and allow the main thread to process other activities. But if the number of images that are asynchronously decoded increased, such that the number of decoding thread exceeds the number of cores on the system, the overall performance of the system will be affected. Even the rendering in WebKit will be choppy. What we need do is we need to use the system cores instead of leaving them idle. But at the same time, do not be too greedy and don't use all the available cores and don't consume too much power for decoding large or animated images. I think a good strategy for this purpose is to allow asynchronous image decoding to take up to half the available cores.
Attachments
Patch (6.94 KB, patch)
2018-01-16 18:25 PST, Said Abou-Hallawa
no flags
Patch (9.06 KB, patch)
2018-01-18 17:34 PST, Said Abou-Hallawa
no flags
Patch (18.39 KB, patch)
2018-01-19 12:01 PST, Said Abou-Hallawa
no flags
Archive of layout-test-results from ews123 for ios-simulator-wk2 (2.20 MB, application/zip)
2018-01-19 13:39 PST, EWS Watchlist
no flags
Patch (63.20 KB, patch)
2018-01-21 18:37 PST, Said Abou-Hallawa
no flags
Patch (61.95 KB, patch)
2018-01-21 19:49 PST, Said Abou-Hallawa
ggaren: review-
Said Abou-Hallawa
Comment 1 2018-01-16 18:25:13 PST
Said Abou-Hallawa
Comment 2 2018-01-16 18:26:35 PST
Said Abou-Hallawa
Comment 3 2018-01-18 17:34:16 PST
Saam Barati
Comment 4 2018-01-18 17:50:07 PST
Comment on attachment 331678 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review > Source/WebCore/platform/graphics/ImageSource.cpp:317 > + if (!DecodingQueue::canCreateDecodingQueue()) > + return false; This looks racy. Not sure if it matters. But, I think the race is: count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue.
Tim Horton
Comment 5 2018-01-18 17:53:07 PST
Comment on attachment 331678 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review >> Source/WebCore/platform/graphics/ImageSource.cpp:317 >> + return false; > > This looks racy. Not sure if it matters. But, I think the race is: > > count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue. I think Saam is right. Presumably the fix is to do the check and increment together under a lock?
Simon Fraser (smfr)
Comment 6 2018-01-18 18:09:38 PST
Comment on attachment 331678 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review >>> Source/WebCore/platform/graphics/ImageSource.cpp:317 >>> + return false; >> >> This looks racy. Not sure if it matters. But, I think the race is: >> >> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue. > > I think Saam is right. Presumably the fix is to do the check and increment together under a lock? So if we can't make a queue we bail and decode on the main thread? Why not enqueue on some existing queue?
Tim Horton
Comment 7 2018-01-18 18:30:40 PST
Comment on attachment 331678 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review >>>> Source/WebCore/platform/graphics/ImageSource.cpp:317 >>>> + return false; >>> >>> This looks racy. Not sure if it matters. But, I think the race is: >>> >>> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue. >> >> I think Saam is right. Presumably the fix is to do the check and increment together under a lock? > > So if we can't make a queue we bail and decode on the main thread? Why not enqueue on some existing queue? Actually, I think the increment and check are both on the main thread? Also, smfr’s point seems sensible.
Saam Barati
Comment 8 2018-01-18 18:34:33 PST
(In reply to Tim Horton from comment #7) > Comment on attachment 331678 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=331678&action=review > > >>>> Source/WebCore/platform/graphics/ImageSource.cpp:317 > >>>> + return false; > >>> > >>> This looks racy. Not sure if it matters. But, I think the race is: > >>> > >>> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue. > >> > >> I think Saam is right. Presumably the fix is to do the check and increment together under a lock? > > > > So if we can't make a queue we bail and decode on the main thread? Why not enqueue on some existing queue? > > Actually, I think the increment and check are both on the main thread? If so, I think we should remove the atomic then. I was just assuming it was off the main thread because of the atomic. > > Also, smfr’s point seems sensible.
Simon Fraser (smfr)
Comment 9 2018-01-18 18:49:23 PST
Comment on attachment 331678 [details] Patch r- given the feedback.
Said Abou-Hallawa
Comment 10 2018-01-19 12:01:20 PST
Said Abou-Hallawa
Comment 11 2018-01-19 12:17:40 PST
Comment on attachment 331678 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review >>>>>> Source/WebCore/platform/graphics/ImageSource.cpp:317 >>>>>> + return false; >>>>> >>>>> This looks racy. Not sure if it matters. But, I think the race is: >>>>> >>>>> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue. >>>> >>>> I think Saam is right. Presumably the fix is to do the check and increment together under a lock? >>> >>> So if we can't make a queue we bail and decode on the main thread? Why not enqueue on some existing queue? >> >> Actually, I think the increment and check are both on the main thread? >> >> Also, smfr’s point seems sensible. > > If so, I think we should remove the atomic then. I was just assuming it was off the main thread because of the atomic. Re: This looks racy. Not sure if it matters. But, I think the race is: count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue. No. This is not racy. The WorkQueues are created only from the main thread. Most of the time they are also destroyed on the main thread. The only case where a WorkQueue is destroyed on the decoding thread is when the ImageSource is deleted while the decoding is still running. In this case the only reference to the WorkQueue is kept by protectedDecodingQueue; see ImageSource::startAsyncDecodingQueue(). The condition above is okay. It is not ultimate but it will always be safe. The number of active WorkQueues can only decreased by the decoding threads. But no two threads will be increasing the number of active WorkQueues at the same time.
Said Abou-Hallawa
Comment 12 2018-01-19 12:38:47 PST
Comment on attachment 331678 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review >>>>>>> Source/WebCore/platform/graphics/ImageSource.cpp:317 >>>>>>> + return false; >>>>>> >>>>>> This looks racy. Not sure if it matters. But, I think the race is: >>>>>> >>>>>> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue. >>>>> >>>>> I think Saam is right. Presumably the fix is to do the check and increment together under a lock? >>>> >>>> So if we can't make a queue we bail and decode on the main thread? Why not enqueue on some existing queue? >>> >>> Actually, I think the increment and check are both on the main thread? >>> >>> Also, smfr’s point seems sensible. >> >> If so, I think we should remove the atomic then. I was just assuming it was off the main thread because of the atomic. > > Re: This looks racy. Not sure if it matters. But, I think the race is: > count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue. > > No. This is not racy. The WorkQueues are created only from the main thread. Most of the time they are also destroyed on the main thread. The only case where a WorkQueue is destroyed on the decoding thread is when the ImageSource is deleted while the decoding is still running. In this case the only reference to the WorkQueue is kept by protectedDecodingQueue; see ImageSource::startAsyncDecodingQueue(). > > The condition above is okay. It is not ultimate but it will always be safe. The number of active WorkQueues can only decreased by the decoding threads. But no two threads will be increasing the number of active WorkQueues at the same time. Re: So if we can't make a queue we bail and decode on the main thread? Actually the image frame will not be decoded on the main thread. It will be decoded on a separate thread by the system when the drawing is initiated. Re: Why not enqueue on some existing queue? We mix sync and async image in the same page: small images and large images, non first tile paint and first tile paint. But we never tried before to block decoding/displaying an image because another one is being decoded. I think the goal of the async image decoding is to utilize the system resources wisely. It should not stress the system such that it affects the overall system performance. If the system resources are being affected by the async image decoding, we should fallback to the sync path. And I think this patch its trying to fix that. I think also delaying drawing too many expensive items should be in a higher level than the image decoding.
Said Abou-Hallawa
Comment 13 2018-01-19 12:41:03 PST
Comment on attachment 331678 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=331678&action=review >>>>>>>> Source/WebCore/platform/graphics/ImageSource.cpp:317 >>>>>>>> + return false; >>>>>>> >>>>>>> This looks racy. Not sure if it matters. But, I think the race is: >>>>>>> >>>>>>> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue. >>>>>> >>>>>> I think Saam is right. Presumably the fix is to do the check and increment together under a lock? >>>>> >>>>> So if we can't make a queue we bail and decode on the main thread? Why not enqueue on some existing queue? >>>> >>>> Actually, I think the increment and check are both on the main thread? >>>> >>>> Also, smfr’s point seems sensible. >>> >>> If so, I think we should remove the atomic then. I was just assuming it was off the main thread because of the atomic. >> >> Re: This looks racy. Not sure if it matters. But, I think the race is: >> count() is 1 away from returning false from this function. Two threads then ask this question at the same time, then both threads end up creating a queue. >> >> No. This is not racy. The WorkQueues are created only from the main thread. Most of the time they are also destroyed on the main thread. The only case where a WorkQueue is destroyed on the decoding thread is when the ImageSource is deleted while the decoding is still running. In this case the only reference to the WorkQueue is kept by protectedDecodingQueue; see ImageSource::startAsyncDecodingQueue(). >> >> The condition above is okay. It is not ultimate but it will always be safe. The number of active WorkQueues can only decreased by the decoding threads. But no two threads will be increasing the number of active WorkQueues at the same time. > > Re: So if we can't make a queue we bail and decode on the main thread? > > Actually the image frame will not be decoded on the main thread. It will be decoded on a separate thread by the system when the drawing is initiated. > > Re: Why not enqueue on some existing queue? > > We mix sync and async image in the same page: small images and large images, non first tile paint and first tile paint. But we never tried before to block decoding/displaying an image because another one is being decoded. I think the goal of the async image decoding is to utilize the system resources wisely. It should not stress the system such that it affects the overall system performance. If the system resources are being affected by the async image decoding, we should fallback to the sync path. And I think this patch its trying to fix that. I think also delaying drawing too many expensive items should be in a higher level than the image decoding. Re: If so, I think we should remove the atomic then. I was just assuming it was off the main thread because of the atomic. The increment can happen only from the main thread. But the decrement can happen from the main thread or from the decoding thread. This depends on which gets destroyed first: ImageSource or the decoding WorkQueue. So the increment and the decrement have to be synchronized.
Said Abou-Hallawa
Comment 14 2018-01-19 12:51:28 PST
(In reply to Said Abou-Hallawa from comment #10) > Created attachment 331765 [details] > Patch This is a more involving patch. Yes it creates the WorkQueue and increments the count in one place. But it requires more changes since the drawing depends only on our heuristics. The <img "decoding=async"> attribute is only a web developer's recommendation which can be ignored. No events are firing when decoding an image finishes. But HTMLImageElement::decode() method has to force async image decoding and a promise resolving. Also with the drawing code has to fall back to the sync code path.
Simon Fraser (smfr)
Comment 15 2018-01-19 12:55:45 PST
Does this patch still fall back to sync decoding if we have too many images decoding right now? If so, I think that's still wrong; I think we should have a pool of queues and post decode requests onto those queues.
EWS Watchlist
Comment 16 2018-01-19 13:39:57 PST
Comment on attachment 331765 [details] Patch Attachment 331765 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/6139329 New failing tests: http/tests/misc/resource-timing-resolution.html
EWS Watchlist
Comment 17 2018-01-19 13:39:59 PST
Created attachment 331783 [details] Archive of layout-test-results from ews123 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews123 Port: ios-simulator-wk2 Platform: Mac OS X 10.12.6
Said Abou-Hallawa
Comment 18 2018-01-21 18:37:42 PST
Said Abou-Hallawa
Comment 19 2018-01-21 19:49:18 PST
Said Abou-Hallawa
Comment 20 2018-01-22 18:58:59 PST
Geoffrey Garen
Comment 21 2018-01-23 17:07:37 PST
Comment on attachment 331892 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=331892&action=review > Source/WebCore/ChangeLog:12 > + We need to control the number of active threads which are created for image > + decoding. This has been an issue for power consumption and overall system > + degradation. This can be serious once the number of decoding thread exceeds > + the number of system available cores. I don't think we have evidence that image decoding ever exceeds the number of available cores. Do we? Can you supply that evidence? It would be surprising if image decoding exceeded the number of available cores, since WorkQueue uses libdispatch, whose primary job is to avoid creating a number of threads that exceeds the number of available cores. Also, if it were our goal to avoid using too many cores for image decoding, we would need to coordinate that limitation across web processes and non-web CPU activity. This patch doesn't do so, but again, luckily, libdispatch does. Note that libdispatch offers a priority called "user initiated" which is lower than "user interactive". If we want to guarantee that image decoding never beats the main thread, we can test deprioritizing it to "user interactive". I think I have to say r- because we can't accept an architectural change to improve performance without a test case that demonstrates that performance improved -- especially when we have reason to believe that the architecture change duplicates existing functionality in a potentially worse way.
Adrien Destugues
Comment 22 2022-09-25 04:23:28 PDT
> I don't think we have evidence that image decoding ever exceeds the number of available cores. Do we? Can you supply that evidence? In HaikuWebKit I end up with hundreds of ImageDecoder threads. > It would be surprising if image decoding exceeded the number of available cores, since WorkQueue uses libdispatch, whose primary job is to avoid creating a number of threads that exceeds the number of available cores. That's the case on Apple builds, but not for other platforms which use WorkQueueGeneric, there one thread is created for each ImageSource. Maybe the solution is to rewrite WorkQueueGeneric to do its own thread pooling, then?
Note You need to log in before you can comment on or make changes to this bug.