Bug 239506 - Fix accelerated compositing for null ScrollingCoordinator and overflow scroll in LayerAncestorClippingStack.
Summary: Fix accelerated compositing for null ScrollingCoordinator and overflow scroll...
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Compositing (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Jigen Zhou
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2022-04-19 10:20 PDT by Jigen Zhou
Modified: 2022-05-02 16:33 PDT (History)
8 users (show)

See Also:


Attachments
Patch (2.46 KB, patch)
2022-04-19 20:09 PDT, Jigen Zhou
no flags Details | Formatted Diff | Diff
Patch (3.26 KB, patch)
2022-04-26 21:08 PDT, Jigen Zhou
no flags Details | Formatted Diff | Diff
Patch (4.55 KB, patch)
2022-04-26 21:27 PDT, Jigen Zhou
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (6.15 KB, patch)
2022-04-28 18:20 PDT, Jigen Zhou
simon.fraser: review-
simon.fraser: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jigen Zhou 2022-04-19 10:20:52 PDT
With null ScrollingCoordinator and overflow scroll, a crash can occur in LayerAncestorClippingStack::updateWithClipData in the following code block:

        if (existingEntry.clipData.isOverflowScroll && !clipDataEntry.isOverflowScroll) {
            ASSERT(scrollingCoordinator);
            scrollingCoordinator->unparentChildrenAndDestroyNode(existingEntry.overflowScrollProxyNodeID);
            existingEntry.overflowScrollProxyNodeID = 0;
        }

(https://github.com/WebKit/WebKit/blob/main/Source/WebCore/rendering/LayerAncestorClippingStack.cpp#L137)
Comment 1 Jigen Zhou 2022-04-19 20:09:43 PDT
Created attachment 457951 [details]
Patch
Comment 2 Simon Fraser (smfr) 2022-04-19 20:47:54 PDT
Do you have a testcase that hits this crash?
Comment 3 Jigen Zhou 2022-04-20 08:46:05 PDT
(In reply to Simon Fraser (smfr) from comment #2)
> Do you have a testcase that hits this crash?

Yes, we have a crash when running MiniBrowser built for Playstation with AC for youtube.com. Sometimes, just scrolling down and entering a clip, WebProcess can crash in the above code block.
Comment 4 Simon Fraser (smfr) 2022-04-20 09:12:07 PDT
Can you create a reduction and include a test with the patch?
Comment 5 Radar WebKit Bug Importer 2022-04-26 10:21:12 PDT
<rdar://problem/92342843>
Comment 6 Jigen Zhou 2022-04-26 21:08:50 PDT
Created attachment 458416 [details]
Patch
Comment 7 Jigen Zhou 2022-04-26 21:15:56 PDT
(In reply to Simon Fraser (smfr) from comment #4)
> Can you create a reduction and include a test with the patch?

Hi Simon, I just updated the patch with a reduction test that can reproduce the crash for Playstation build. This is the first time I am adding a layout test, please let me know I missed anything. Thanks.
Comment 8 Jigen Zhou 2022-04-26 21:27:43 PDT
Created attachment 458418 [details]
Patch

Including the new test file.
Comment 9 Jigen Zhou 2022-04-28 18:20:57 PDT
Created attachment 458561 [details]
Patch

Updated the test.
Comment 10 Simon Fraser (smfr) 2022-05-02 15:40:53 PDT
Comment on attachment 458561 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=458561&action=review

> LayoutTests/compositing/overflow/overflow-scroll-hidden-swap-crash.html:64
> +                }, 100);
> +              }, 100);
> +            }, 100);
> +          }, 100);
> +        }, 100);
> +      }, 100);

This test will take 600ms to run, and seems to be trying to catch non-deterministic behavior. It needs to run quicker (imagine if all 40,000 tests took > 500ms).

> Source/WebCore/rendering/LayerAncestorClippingStack.cpp:69
> +//
> +// If scrollingCoordinator is null, overflowScrollProxyNodeID should be 0 for all existing entries
> +// of the clipping stack. The non-zero for overflowScrollProxyNodeID can only possibily be set in
> +// RenderLayerCompositor::updateScrollingNodeForScrollingProxyRole, but it won't be called for null
> +// scrollingCoordinator.
> +//

Weird for this comment to be here and not next to the relevant code.

> Source/WebCore/rendering/LayerAncestorClippingStack.cpp:147
> +        if (existingEntry.overflowScrollProxyNodeID && existingEntry.clipData.isOverflowScroll && !clipDataEntry.isOverflowScroll) {

If existingEntry.clipData.isOverflowScroll is set, I would expect a non-0 existingEntry.overflowScrollProxyNodeID. When does this happen?
Comment 11 Simon Fraser (smfr) 2022-05-02 15:43:55 PDT
This test does not trigger a crash on macOS without the rest of the patch, so I suspect something specific to Playstation here.
Comment 12 Jigen Zhou 2022-05-02 16:28:03 PDT
(In reply to Simon Fraser (smfr) from comment #10)
> Comment on attachment 458561 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=458561&action=review
> 
> > LayoutTests/compositing/overflow/overflow-scroll-hidden-swap-crash.html:64
> > +                }, 100);
> > +              }, 100);
> > +            }, 100);
> > +          }, 100);
> > +        }, 100);
> > +      }, 100);
> 
> This test will take 600ms to run, and seems to be trying to catch
> non-deterministic behavior. It needs to run quicker (imagine if all 40,000
> tests took > 500ms).

The multiple levels of setTimeout in this test is meant to catch a crash with higher possibilities due to the special non-deterministic nature. I tested with various shorter durations, but it seems that the current one captures the crash better.

> 
> > Source/WebCore/rendering/LayerAncestorClippingStack.cpp:69
> > +//
> > +// If scrollingCoordinator is null, overflowScrollProxyNodeID should be 0 for all existing entries
> > +// of the clipping stack. The non-zero for overflowScrollProxyNodeID can only possibily be set in
> > +// RenderLayerCompositor::updateScrollingNodeForScrollingProxyRole, but it won't be called for null
> > +// scrollingCoordinator.
> > +//
> 
> Weird for this comment to be here and not next to the relevant code.
>
 
When I read the code for LayerAncestorClippingStack, it took me a while to understand why for non-zero overflowScrollProxyNodeID, scrollingCoordinator can be assumed to be valid in multiple functions in the file. If comment like this was somewhere in the code, it would have helped me a little bit as a new developer in this part. 

> > Source/WebCore/rendering/LayerAncestorClippingStack.cpp:147
> > +        if (existingEntry.overflowScrollProxyNodeID && existingEntry.clipData.isOverflowScroll && !clipDataEntry.isOverflowScroll) {
> 
> If existingEntry.clipData.isOverflowScroll is set, I would expect a non-0
> existingEntry.overflowScrollProxyNodeID. When does this happen?

Apparently this is not the case in the crash we have experienced. We have null scrollingCoordinator, overflowScrollProxyNodeID is always zero (as the above comment shows). The above test was to simulate the scenario when swapping overflow between scroll and hidden.  Then we have isOverflowScroll set for existingEntry, isOverflowScroll is not set for clipDataEntry.
Comment 13 Jigen Zhou 2022-05-02 16:33:05 PDT
(In reply to Simon Fraser (smfr) from comment #11)
> This test does not trigger a crash on macOS without the rest of the patch,
> so I suspect something specific to Playstation here.

In our case, scrollingCoordinator is always null. Otherwise, there is some randomness to capture the crash. But just analyzing the code itself, it is pretty obvious to me there is a crash scenario there.