Bug 231163 - REGRESSION (iOS 15): Canvas Rendering feature leads to strange resize behaviour
Summary: REGRESSION (iOS 15): Canvas Rendering feature leads to strange resize behaviour
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Canvas (show other bugs)
Version: Safari 15
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Cameron McCormack (:heycam)
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-10-04 03:14 PDT by timocov
Modified: 2022-01-16 14:11 PST (History)
10 users (show)

See Also:


Attachments
repro (1.42 MB, image/gif)
2021-10-04 03:14 PDT, timocov
no flags Details
repro video (13.29 MB, video/mp4)
2021-10-04 03:16 PDT, timocov
no flags Details
test (652 bytes, text/html)
2021-12-14 22:13 PST, Cameron McCormack (:heycam)
no flags Details
test case frames in Web Inspector with checkbox unchecked (88.14 KB, image/png)
2021-12-14 22:22 PST, Cameron McCormack (:heycam)
no flags Details
test 2 (604 bytes, text/html)
2021-12-16 21:35 PST, Cameron McCormack (:heycam)
no flags Details
WIP patch (10.95 KB, patch)
2022-01-06 22:54 PST, Cameron McCormack (:heycam)
no flags Details | Formatted Diff | Diff
WIP v2 (11.37 KB, patch)
2022-01-11 22:31 PST, Cameron McCormack (:heycam)
no flags Details | Formatted Diff | Diff
Patch (14.81 KB, patch)
2022-01-13 13:58 PST, Cameron McCormack (:heycam)
no flags Details | Formatted Diff | Diff
Patch (12.55 KB, patch)
2022-01-13 16:44 PST, Cameron McCormack (:heycam)
no flags Details | Formatted Diff | Diff
Patch (12.55 KB, patch)
2022-01-16 13:19 PST, Cameron McCormack (:heycam)
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description timocov 2021-10-04 03:14:21 PDT
Created attachment 440045 [details]
repro

The strange behaviour we found with enabled "GPU Process: Canvas Rendering" experimental feature (see attached gif). If you disable this feature everything is fine.

To reproduce it you can do the following (see attached video):

1. Go to https://charting-library.tradingview.com/
2. Click on Indicators button in the header
3. Select "Accumulation/Distribution" there
4. Close the dialog and try to resize panes by dragging a handle between panes

=> When you resize a pane the top right canvas is cleared (the first price axis) for some reason, the right bottom canvas is fine even it is resized as well (the second price axis). If you disable "GPU Process: Canvas Rendering" feature then everything is fine with this.
Comment 1 timocov 2021-10-04 03:16:14 PDT
Created attachment 440046 [details]
repro video
Comment 2 Radar WebKit Bug Importer 2021-10-04 17:26:05 PDT
<rdar://problem/83863292>
Comment 3 Cameron McCormack (:heycam) 2021-12-14 22:13:04 PST
Thanks for filing the bug report!

When the resizer is dragged, the touch event handler changes the size of the canvases (assigning to their width and height properties, although the width of each canvas remains the same), and then calls requestAnimationFrame to schedule a repaint of the canvases. The touch event handler can be called multiple times before the rAF callback.

Attaching a test that I think reproduces the same problem.  The test uses setInterval(..., 1) in place of the touch events.  In the setInterval handler, it changes the size of the canvas and then calls requestAnimationFrame to schedule a paint, which just does a single fillRect.  The checkbox controls whether we also redundantly set the width of the canvas.  That seems to be key to causing the flickering problem.

I haven't been able to reproduce the problem on macOS, only iOS.
Comment 4 Cameron McCormack (:heycam) 2021-12-14 22:13:31 PST
Created attachment 447202 [details]
test
Comment 5 Cameron McCormack (:heycam) 2021-12-14 22:21:45 PST
Another odd thing is that with that test, with the checkbox unchecked, the flickering does not happen, but if I record the canvas operations using Web Inspector, then I do see 50% of the frames being blank.

Blank frames are kind of expected since the HTML spec says that when you assign to width/height, the backing store is resized (if needed) and its contents are cleared.
Comment 6 Cameron McCormack (:heycam) 2021-12-14 22:22:17 PST
Created attachment 447203 [details]
test case frames in Web Inspector with checkbox unchecked
Comment 7 Cameron McCormack (:heycam) 2021-12-16 21:35:21 PST
Created attachment 447424 [details]
test 2

Slightly revised test that doesn't rely on the "redundant assigning of canvas dimension causing the canvas to be cleared" behavior.
Comment 8 Cameron McCormack (:heycam) 2021-12-16 21:44:20 PST
So with that revised test:

* The page has a canvas.
* The canvas contents is an IOSurface-backed RemoteImageBuffer.
* The canvas element is composited, so has an IOSurface-backed layer.
* The page schedules 1 ms timer that:
    1. fills the canvas with a red rect
    2. changes canvas.height, which:
         * creates a new ImageBuffer for the canvas contents
         * flushes the old ImageBuffer since it gets derefed and destroyed
    3. calls rAF with a callback that fills the canvas with a green rect
* When updateRendering is called:
    * the current canvas contents ImageBuffer is flushed
    * the canvas contents ImageBuffer's IOSurface is copied into the canvas element layer ImageBuffer's IOSurface

And the result on the display is that sometimes the canvas appears red and sometimes green.
Comment 9 Cameron McCormack (:heycam) 2022-01-05 15:05:19 PST
After adding a bunch of logging, I think the issue is the following.

After the requestAnimationFrame callback is invoked, under RemoteLayerTreeDrawingArea::updateRendering, we draw the canvas ImageBuffer into its layer's ImageBuffer, and then asynchronously schedule a task to flush the layer's ImageBuffer and send the layer transaction to the UI process.  Between the end of updateRendering and this asynchronous task running, the setInterval callback can be invoked, which does some further drawing on the canvas ImageBuffer, and that further drawing then gets captured as part of the layer transaction.

Crucially, even though the drawing of the canvas ImageBuffer on to the layer ImageBuffer is done by creating a CGImage from the canvas ImageBuffer's IOSurface, and such an operation does capture the current state of the pixels in the IOSurface, the actual drawing of the canvas onto the layer is done later (because accelerated drawing is done off a queue that needs to be flushed) and CoreAnimation will actually use the IOSurface that the CGImage was originally created from, and not the captured pixels.

Because the scheduling of the layer transaction flushing / sending is not deterministic, the final IOSurface contents that are drawn may or may not capture the drawing that happens after the requestAnimationFrame (in the tests I've attached, that's the red rect; in the URL in comment 0, that's the canvas clearing operation that happens when the canvas size is assigned to but has the same value.
Comment 10 Cameron McCormack (:heycam) 2022-01-05 15:58:56 PST
Two solutions come to mind:

1. Flush the layers and commit the layer transaction synchronously at the end of updateRendering.
2. Keep it asynchronous, but cause it to happen immediately if any canvas drawing is done (maybe checking whether it's a canvas that was drawn to a layer ImageBuffer).
Comment 11 Cameron McCormack (:heycam) 2022-01-06 22:54:02 PST
Created attachment 448564 [details]
WIP patch
Comment 12 Cameron McCormack (:heycam) 2022-01-06 22:55:25 PST
This patch takes a different approach, which is to block the Web process when a canvas drawing method is called before the layer backing stores have been flushed.
Comment 13 Cameron McCormack (:heycam) 2022-01-11 22:31:06 PST
Created attachment 448907 [details]
WIP v2

And this is yet another approach, causing the layer surface to be copied if the source canvas is drawn to before the layer transaction is flushed.
Comment 14 Cameron McCormack (:heycam) 2022-01-13 13:58:49 PST
Created attachment 449104 [details]
Patch
Comment 15 Simon Fraser (smfr) 2022-01-13 14:23:01 PST
Comment on attachment 449104 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=449104&action=review

> Source/WebCore/html/HTMLCanvasElement.cpp:605
> +static bool imageDrawingRequiresCopyOnWrite(Page* page, GraphicsContext& context, ImageBuffer& imageBuffer)

const ImageBuffer&

I wonder if we can come up with a term that isn't "CopyOnWrite" to describe what has to happen. The fact that it's a COW is a detail. Maybe something involving "guard".

> Source/WebCore/page/Page.h:935
> +    enum class LayerTreeTransactionState {
> +        Flushed,
> +        Building,
> +        AwaitingFlush
> +    };

Not really a fan of pushing WebKit2 implementation detail down here to Page, but not sure how else to do it other than ChromeClient hooks.

You only really care about the AwaitingFlush state.

> Source/WebCore/platform/graphics/cg/ImageBufferIOSurfaceBackend.cpp:250
> +    flushContext();

Is the flushContext needed?
Comment 16 Cameron McCormack (:heycam) 2022-01-13 14:42:45 PST
(In reply to Simon Fraser (smfr) from comment #15)
> > Source/WebCore/page/Page.h:935
> > +    enum class LayerTreeTransactionState {
> > +        Flushed,
> > +        Building,
> > +        AwaitingFlush
> > +    };
> 
> Not really a fan of pushing WebKit2 implementation detail down here to Page,
> but not sure how else to do it other than ChromeClient hooks.
> 
> You only really care about the AwaitingFlush state.

I wanted to check Building because HTMLCanvasElement::paint could be called in some situation other than layer transaction building, but maybe those cases are not common enough to care about.

> > Source/WebCore/platform/graphics/cg/ImageBufferIOSurfaceBackend.cpp:250
> > +    flushContext();
> 
> Is the flushContext needed?

Yes, since the COW thing CGImage does will only happen once the no-op drawing command is actually processed.
Comment 17 Cameron McCormack (:heycam) 2022-01-13 15:18:31 PST
(In reply to Simon Fraser (smfr) from comment #15)
> I wonder if we can come up with a term that isn't "CopyOnWrite" to describe
> what has to happen. The fact that it's a COW is a detail. Maybe something
> involving "guard".

I will rename this function and the member variable to "MustGuardAgainstPendingLayerTransaction".  And I'll rename the ImageBufferBackend function to "ensureNativeImagesHaveCopiedBackingStore", as that matches the wording of the copyNativeImage call that is used to make the CGImage in the first place.
Comment 18 Cameron McCormack (:heycam) 2022-01-13 16:44:25 PST
Created attachment 449125 [details]
Patch
Comment 19 EWS 2022-01-16 13:10:55 PST
ChangeLog entry in Source/WebKit/ChangeLog contains OOPS!.
Comment 20 Cameron McCormack (:heycam) 2022-01-16 13:19:07 PST
Created attachment 449286 [details]
Patch
Comment 21 EWS 2022-01-16 14:11:09 PST
Committed r288076 (246096@main): <https://commits.webkit.org/246096@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 449286 [details].