Summary: [GTK] Webpages completely slow down when CSS blur filter is in use
Product: WebKit
Component: WebKitGTK
Version: WebKit Nightly Build
Hardware: PC
OS: Linux
Status: NEW
Severity: Normal
Priority: P3
Keywords: Gtk
Reporter: tri.voxel
Assignee: Nobody <webkit-unassigned>
CC: adam.gamble, bugs-noreply, cgarcia, Hironori.Fujii, kdwkleung, magomez, mcatanzaro, nekohayo, zdobersek
Bug Blocks: 245783
Attachments: Test case (attachment 462701)
Description
tri.voxel 2022-05-24 04:23:00 PDT
I should also note that after the element has been loaded for a very long time, it may begin to run smoothly again. However, resizing the element or the window will make the page slow down again.

Pretty sure somebody else reported this on a different bug, though I can't immediately find it. At least you're not alone...

Carlos Garcia Campos (comment #3): It seems the blur filter done by the CPU is slow. We might try to ensure it's always done by the GPU, but unfortunately that doesn't seem to work properly. Mac uses the Accelerate library, which I guess makes it a lot faster. I'll submit a test case.

Carlos Garcia Campos (comment #4): Created attachment 462701 [details]
Test case
There are two files in the tarball:
- test-cpu.html: it has several images with a blur filter. It's slow; it's easier to reproduce by resizing the window than by just scrolling.
- test-gpu: it's the same, but adding a translateZ transform to force the images to be accelerated. Performance is much better, but there are two issues:
  + Rendering is not the same as the software version; it looks pixelated.
  + Images disappear when scrolling and aren't rendered anymore, even if the window is resized.
The pixelated issue is tracked by bug #231653.

I've been looking for a while at how the blur filter works with OpenGL, but I need to switch to another task, so I'll do a brain dump here of the things I found out, to catch up when I have more time.

The blur filter is applied as a weighted average of the values of the pixels around the target pixel. The weights are calculated from a Gaussian distribution, so a weight gets smaller as its pixel gets further from the target pixel. The filter radius specifies how many pixels to the left/right/above/below are taken into account for the calculation. Instead of accessing all the pixel values needed for each target pixel, this is commonly implemented as a 2-pass rendering, where the first pass only takes the horizontal pixels into account and the second pass only the vertical ones. The combination of the two passes produces the same result with fewer pixel accesses, which improves performance. This is what the TextureMapper implementation does.

Then, regarding the specific implementation details:

- TextureMapperGL is the one creating the Gaussian values and passing them to the TextureMapperShaderProgram in prepareFilterProgram.

- In theory, we should be sampling as many pixels in each direction as the radius, but we're not doing so. We are always sampling 10 pixels in each direction (defined in GaussianKernelHalfWidth, which is 11, but it includes the target pixel, so it really means 10). This means that we sample 10 pixels, the original pixel, and then another 10 pixels, and this happens in both passes, horizontally and vertically.

- If the radius is somewhat close to those 10 pixels, the result is going to be quite similar to the cairo implementation. But as the radius gets bigger and bigger, the result diverges and the pixelated effect starts to show. This happens because, as the radius grows, the 10 pixels that we sample get more and more separated from each other, so we end up sampling pixels that are not close to the original pixel, and whose contents are unrelated to it.

Example to show the effect: a 20x20 image, getting the result for pixel (10,10) by sampling 3 pixels (one to the left and one to the right, instead of the ten that we have in the real implementation). The Gaussian weights are (0.25, 0.5, 0.25). Talking only about the horizontal pass, but the vertical one has the same problem:

- If the radius is 1, we're sampling pixels (9,10), (10,10) and (11,10) with the assigned weights, which is perfect, as we're averaging the values of nearby pixels.

- If the radius grows to 5, then we're sampling pixels (5,10), (10,10) and (15,10). But those pixels are not close to the target pixel, so their content is not related to it, and they will corrupt the result. Keep in mind that in the normal case the second biggest weight is used for the pixels right beside the target one, while here it's applied to a pixel far from it.

The test attached by Carlos uses a radius of 60px. This means we're sampling 10 pixels to each side, one every (60 pixels / 10 samples) 6 pixels, which shows the strange pixelated effect. As the radius gets reduced, the result gets closer and closer to the cairo one, because the sampled pixels get closer and closer to the target.

There's also a detail in the implementation that I think is a bug: the definition of GaussianKernelStep as 0.2. This constant defines the advance of each sample, and the sample positions are calculated as i * step * radius (i = 1..10). If we're using 10 samples in each direction it should be 0.1; with 0.2 we're actually sampling up to double the radius, which makes the result even worse (see the sketch below).
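To make the sampling math concrete, here is a minimal standalone sketch (my own illustration, not WebKit code; the constant name and values are the ones quoted in this comment) that prints where the 10 taps per side land for the 60px radius used in the attached test, with the buggy 0.2 step and the proposed 0.1 step:

```cpp
#include <cstdio>
#include <initializer_list>

// Name and value borrowed from the TextureMapper code as described above:
// 11 kernel entries, i.e. the target pixel plus 10 samples per side.
static const int GaussianKernelHalfWidth = 11;

// Sample position of tap i (i = 1..10), as described: i * step * radius.
static float samplePosition(int i, float step, float radius)
{
    return i * step * radius;
}

int main()
{
    const float radius = 60; // the radius used by the attached test case

    for (float step : { 0.2f, 0.1f }) { // buggy GaussianKernelStep vs proposed fix
        std::printf("step = %.1f:\n", step);
        for (int i = 1; i < GaussianKernelHalfWidth; ++i)
            std::printf("  tap %2d at %5.1f px from the target pixel\n",
                        i, samplePosition(i, step, radius));
        // With step 0.2 the taps land every 12px and the last one at 120px,
        // double the radius; with step 0.1 they land every 6px, the last at 60px.
    }
    return 0;
}
```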
So, after all this mess, what's the proper fix?

- First would be changing that GaussianKernelStep from 0.2 to 0.1. That keeps the result correct for bigger radii.

- Eventually the radius will get big enough to cause the buggy rendering anyway. There are several options here (see the sketch after this list for the last one):

  * Increase the number of samples to match the radius. This would always produce the perfect result, but the computational cost grows a lot with the radius.

  * Increase the number of samples from the current 10 to something bigger, like 30. This would keep the result fine for bigger radius values, but there would still be a point where the result gets buggy.

  * Some implementations use a trick here. Taking advantage of the interpolation done by OpenGL when copying textures to different sizes, they reduce the size of the image to filter, and the radius, until the radius matches the number of pixels being sampled. So if the image is 100x100 and we want to use a radius of 40, that's the same as sampling a 50x50 image with a radius of 20, and the same as a 25x25 image with a radius of 10, which we can do with our current 10 samples! I think this is the way to go, even though it requires some not-so-obvious changes to TextureMapperLayer to perform the downscaling.
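A minimal sketch of the downscaling trick from the last bullet, using a hypothetical planBlur() helper (this is not TextureMapperLayer code, just an illustration of the idea): halve the image dimensions and the radius together until the radius fits the 10 taps per side we already sample, and let the GPU's interpolation handle the scaling.

```cpp
#include <cstdio>

// Hypothetical plan for the proposed downscaling approach.
struct BlurPlan {
    int width;
    int height;
    float radius;
    int downscalePasses; // how many times to halve the image before blurring
};

// maxSampledRadius matches the 10 samples per side discussed above.
static BlurPlan planBlur(int width, int height, float radius, float maxSampledRadius = 10)
{
    BlurPlan plan { width, height, radius, 0 };
    // Halving the image and the radius together keeps the visual result roughly
    // the same, as in the 100x100/40 -> 50x50/20 -> 25x25/10 example above.
    while (plan.radius > maxSampledRadius && plan.width > 1 && plan.height > 1) {
        plan.width /= 2;
        plan.height /= 2;
        plan.radius /= 2;
        plan.downscalePasses++;
    }
    return plan;
}

int main()
{
    BlurPlan plan = planBlur(100, 100, 40);
    std::printf("blur at %dx%d with radius %.0f after %d downscale passes\n",
                plan.width, plan.height, plan.radius, plan.downscalePasses);
    // Prints: blur at 25x25 with radius 10 after 2 downscale passes
    return 0;
}
```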
> - In theory, we should be sampling as many pixels in each direction as the
> radius, but we're not doing so. We are always sampling 10 pixels in each
> direction (defined in GaussianKernelHalfWidth, which is 11, but it includes
> the target pixel, so it really means 10). This means that we sample 10
> pixels, the original pixel, and then another 10 pixels, and this happens in
> both passes, horizontally and vertically.
I mean 10 pixels to each side of the target pixel. So 10 to the left and 10 to the right.
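For reference, here is a small CPU sketch of the whole two-pass scheme with that fixed 10-taps-per-side kernel, operating on a grayscale float image. This is my own illustration of the approach described in this thread, not the TextureMapper implementation; in particular, sigma = radius / 3 is an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

static const int kHalfTaps = 10; // 10 samples per side, plus the target pixel

// Gaussian weights for taps 0..10; sigma = radius / 3 is an assumption.
static std::vector<float> gaussianWeights(float radius)
{
    const float sigma = radius / 3.0f;
    std::vector<float> weights(kHalfTaps + 1);
    float sum = 0;
    for (int i = 0; i <= kHalfTaps; ++i) {
        float offset = i * 0.1f * radius; // tap position with the fixed 0.1 step
        weights[i] = std::exp(-(offset * offset) / (2 * sigma * sigma));
        sum += (i == 0) ? weights[i] : 2 * weights[i]; // side taps are used twice
    }
    for (float& w : weights)
        w /= sum; // normalize so the weights sum to 1
    return weights;
}

// One 1D pass; (dx, dy) = (1, 0) for the horizontal pass, (0, 1) for the vertical one.
static std::vector<float> blurPass(const std::vector<float>& src, int w, int h,
                                   float radius, int dx, int dy)
{
    std::vector<float> weights = gaussianWeights(radius);
    std::vector<float> dst(src.size());
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float acc = weights[0] * src[y * w + x];
            for (int i = 1; i <= kHalfTaps; ++i) {
                int offset = static_cast<int>(std::lround(i * 0.1f * radius));
                // Sample one tap on each side, clamping to the image edges.
                int xa = std::clamp(x + dx * offset, 0, w - 1);
                int ya = std::clamp(y + dy * offset, 0, h - 1);
                int xb = std::clamp(x - dx * offset, 0, w - 1);
                int yb = std::clamp(y - dy * offset, 0, h - 1);
                acc += weights[i] * (src[ya * w + xa] + src[yb * w + xb]);
            }
            dst[y * w + x] = acc;
        }
    }
    return dst;
}

static std::vector<float> blur(const std::vector<float>& src, int w, int h, float radius)
{
    // Horizontal pass followed by vertical pass, as in the TextureMapper approach.
    return blurPass(blurPass(src, w, h, radius, 1, 0), w, h, radius, 0, 1);
}

int main()
{
    // Smoke test mirroring the 20x20 example above: one bright pixel at (10,10).
    const int w = 20, h = 20;
    std::vector<float> image(w * h, 0.0f);
    image[10 * w + 10] = 1.0f;
    std::vector<float> blurred = blur(image, w, h, 5.0f);
    std::printf("center value after blur: %f\n", blurred[10 * w + 10]);
    return 0;
}
```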
Pixelated issue seems to be solved by a PR. Would it be possible to use the GPU for blur now?

Yes, bug #231653 is fixed. But you have to add the CSS property will-change: transform or transform: translateZ(0) to an element for it to be composited with the GPU.

(In reply to Carlos Garcia Campos from comment #3)
> It seems the blur filter done by the CPU is slow. We might try to ensure
> it's always done by the GPU, but unfortunately that doesn't seem to work
> properly. Mac uses the Accelerate library, which I guess makes it a lot
> faster. I'll submit a test case.

I was referring to this comment. It would be great if the GPU could handle all blurs, so as to take the load off the CPU entirely and greatly improve performance.

I see a number of related fixes in bug #261022, bug #261101, bug #261187, and bug #261102. These are all present in 2.41.92. Please check and see if the problem is resolved.

(In reply to Carlos Garcia Campos from comment #4)
> Created attachment 462701 [details]
> Test case
>
> There are two files in the tarball:
>
> - test-cpu.html: it has several images with a blur filter. It's slow; it's
> easier to reproduce by resizing the window than by just scrolling.
>
> - test-gpu: it's the same, but adding a translateZ transform to force the
> images to be accelerated. Performance is much better, but there are two
> issues:
> + Rendering is not the same as the software version; it looks pixelated.
> + Images disappear when scrolling and aren't rendered anymore, even if
> the window is resized.

I tried these test files with 2.41.92 (but not 2.41.91, so I don't know how bad it was before). I didn't notice any performance problems with either test. But with test-gpu, I noticed that rendering breaks after I scroll down and then scroll back up: the web content becomes all white. Should I report a new bug for this?

This should be fixed with Skia.