Bug 223885

Summary: RemoteRenderingBackend should clean up CG resources after last Canvas is destroyed in Gpu process
Product: WebKit Reporter: Kimmo Kinnunen <kkinnunen>
Component: CanvasAssignee: Kimmo Kinnunen <kkinnunen>
Status: RESOLVED FIXED    
Severity: Normal CC: annulen, dino, eric.carlson, ews-watchlist, glenn, gyuyoung.kim, jer.noble, jonlee, philipj, ryuan.choi, sergio, simon.fraser, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Local Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch none

Description Kimmo Kinnunen 2021-03-29 10:55:43 PDT
RemoteRenderingBackend should clean up CG resources after last Canvas is destroyed in Gpu process
Comment 1 Kimmo Kinnunen 2021-03-29 10:56:21 PDT
<rdar://problem/75637356>
Comment 2 Kimmo Kinnunen 2021-03-29 11:01:51 PDT
Created attachment 424549 [details]
Patch
Comment 3 Simon Fraser (smfr) 2021-03-29 11:08:15 PDT
Comment on attachment 424549 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=424549&action=review

> Source/WebCore/platform/graphics/cg/GraphicsUtilitiesCG.h:32
> +WEBCORE_EXPORT void releaseUnusedCoreGraphicsResources();

These are not "core graphics" resources per se. I would call them "rendering resources" or something.

> Source/WebCore/platform/graphics/cg/GraphicsUtilitiesCG.mm:47
> +    auto devices = adoptNS(MTLCopyAllDevices());

Do we know that this isn't the first time MTLCopyAllDevices() is called, which would possibly trigger allocation of more resources?

> Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.cpp:70
> +static constexpr Seconds releaseAllResourcesIfUnusedTimeout = 0.2_s;

That seems really short.

> Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.cpp:111
> +        --remoteRenderingBackendCount;

You should ASSERT(remoteRenderingBackendCount > 0) before this.
Comment 4 Radar WebKit Bug Importer 2021-03-29 11:18:41 PDT
<rdar://problem/75963350>
Comment 5 Dean Jackson 2021-03-29 11:21:07 PDT
Comment on attachment 424549 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=424549&action=review

>> Source/WebCore/platform/graphics/cg/GraphicsUtilitiesCG.mm:47
>> +    auto devices = adoptNS(MTLCopyAllDevices());
> 
> Do we know that this isn't the first time MTLCopyAllDevices() is called, which would possibly trigger allocation of more resources?

I think we call this on process start in order to pre-warm.

>> Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.cpp:70
>> +static constexpr Seconds releaseAllResourcesIfUnusedTimeout = 0.2_s;
> 
> That seems really short.

Yeah. Isn't there a chance that this could happen on a page navigation, if the incoming page takes a while to load and initialize WebGL? Maybe 10s or so?
Comment 6 Kimmo Kinnunen 2021-03-29 11:21:50 PDT
(In reply to Simon Fraser (smfr) from comment #3)
> > Source/WebCore/platform/graphics/cg/GraphicsUtilitiesCG.mm:47
> > +    auto devices = adoptNS(MTLCopyAllDevices());
> 
> Do we know that this isn't the first time MTLCopyAllDevices() is called,
> which would possibly trigger allocation of more resources?

We only release the resources if we have had a canvas. Currently we don't track if the canvas drew, but it's probably fairly accurate estimate of whether the function has already been called (by CG)..


> > Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.cpp:70
> > +static constexpr Seconds releaseAllResourcesIfUnusedTimeout = 0.2_s;
> 
> That seems really short.

It's what WebGL uses. How long do our page switches take?
Comment 7 Simon Fraser (smfr) 2021-03-29 11:23:58 PDT
Comment on attachment 424549 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=424549&action=review

>>> Source/WebCore/platform/graphics/cg/GraphicsUtilitiesCG.mm:47
>>> +    auto devices = adoptNS(MTLCopyAllDevices());
>> 
>> Do we know that this isn't the first time MTLCopyAllDevices() is called, which would possibly trigger allocation of more resources?
> 
> I think we call this on process start in order to pre-warm.

Only in the web process, I think?
Comment 8 Simon Fraser (smfr) 2021-03-29 11:44:40 PDT
Comment on attachment 424549 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=424549&action=review

> Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.cpp:114
> +        if (!remoteRenderingBackendCount)
> +            RunLoop::main().dispatchAfter(releaseAllResourcesIfUnusedTimeout, releaseAllResourcesIfUnused);
> +    });

Maybe use HysteresisActivity
Comment 9 Kimmo Kinnunen 2021-03-29 11:50:51 PDT
Created attachment 424558 [details]
Patch
Comment 10 Kimmo Kinnunen 2021-03-30 01:28:56 PDT
>Maybe use HysteresisActivity

What benefits would this bring?

Would cause all sorts of threading issues to be worked around.
The implementation of delayed dispatch seems subset of HysterisisActivity, e.g. it has strictly smaller footprint.
Comment 11 Kimmo Kinnunen 2021-04-01 05:40:25 PDT
Created attachment 424887 [details]
Patch
Comment 12 Simon Fraser (smfr) 2021-04-01 08:45:46 PDT
Comment on attachment 424887 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=424887&action=review

> Source/WebKit/GPUProcess/graphics/ScopedRenderingResourcesRequest.h:60
> +    static void releaseRenderingResources()

I would avoid use of the term "release" here because it sounds like this function itself frees the resources. I would use incrementRequestCount and decrementRequestCount which each modify s_requests, for symmetry.

> Source/WebKit/GPUProcess/graphics/ScopedRenderingResourcesRequestCocoa.mm:70
> +}

This function will also send the background notification soon.

> Source/WebKit/GPUProcess/graphics/ScopedWebGLRenderingResourcesRequest.cpp:43
> +void ScopedWebGLRenderingResourcesRequest::scheduleFreeWebGLRenderingResources()
> +{
> +
> +}
> +
> +void ScopedWebGLRenderingResourcesRequest::freeWebGLRenderingResources()
> +{
> +
> +}

Remove the empty lines.

> Source/WebKit/GPUProcess/graphics/ScopedWebGLRenderingResourcesRequest.h:34
> +class ScopedWebGLRenderingResourcesRequest {

It's crazy to repeat all this code again. There should be a base class that does the atomic counting stuff (I'm surprised we don't have one already).

> Source/WebKit/GPUProcess/graphics/ScopedWebGLRenderingResourcesRequest.h:80
> +    ScopedRenderingResourcesRequest m_renderingResourcesRequest;

It's slightly odd that this class holds a ScopedRenderingResourcesRequest but then duplicates its functionality. Maybe the WebGL code should just hold a ScopedRenderingResourcesRequest too?
Comment 13 Kimmo Kinnunen 2021-04-01 11:42:47 PDT
Created attachment 424918 [details]
Patch
Comment 14 Kimmo Kinnunen 2021-04-01 12:00:51 PDT
(In reply to Simon Fraser (smfr) from comment #12)
> Comment on attachment 424887 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=424887&action=review
> 
> > Source/WebKit/GPUProcess/graphics/ScopedRenderingResourcesRequest.h:60
> > +    static void releaseRenderingResources()
> 
> I would avoid use of the term "release" here because it sounds like this
> function itself frees the resources. I would use incrementRequestCount and
> decrementRequestCount which each modify s_requests, for symmetry.

How about relating it with other scoped apparatus like unique_ptr and their (public) reset()?
I'm hesitant to add incrementRequestCount() { ++s_requests } that is only called once, seems confusing to me.


 
> > Source/WebKit/GPUProcess/graphics/ScopedWebGLRenderingResourcesRequest.h:34
> > +class ScopedWebGLRenderingResourcesRequest {
> 
> It's crazy to repeat all this code again. There should be a base class that
> does the atomic counting stuff (I'm surprised we don't have one already).

Well, I thought about it but as far as I can see using curiously recurring template pattern you probably write the same lines with much more complex result.
With non-template, you'd use vfuncs and still write roughly the same lines.

Extracting to static instance "static MyCountingRequest s_requests" reduces to nothing more than "using MyCountingRequest = std::atomic<unsigned>" with ++ and -- reduced. And you still write the same lines.

It is literally 5 lines of code, with couple of ++/--, compare and a function call.
I know this can be abstracted in C++, I just don't think it can be *usefully* abstracted in C++. All the approaches that I simulated in my head would produce just pointless structure without benefit to the reader.

> > Source/WebKit/GPUProcess/graphics/ScopedWebGLRenderingResourcesRequest.h:80
> > +    ScopedRenderingResourcesRequest m_renderingResourcesRequest;
> 
> It's slightly odd that this class holds a ScopedRenderingResourcesRequest
> but then duplicates its functionality. Maybe the WebGL code should just hold
> a ScopedRenderingResourcesRequest too?

What's odd in that? Like refcounted class holding a unique_ptr?

But what would your proposal concretely solve? There's no complexity in holding it in WebGL? There's a concrete intent: you cannot purge the rendering resources while intending to not purge webgl resources. Wouldn't that be a _cause_ of duplication. There's no reduced duplication by moving it out, only increased possibility bugs and duplication. Starting from the fact that it would be suboptimal to decouple then, since then the reader would ask "what's the point of providing the user the possibility of shooting themselves to foot by purging the metal while holding an intent to use the metal through webgl". I mean, even if there was no bug?
Comment 15 Simon Fraser (smfr) 2021-04-01 12:01:24 PDT
Comment on attachment 424918 [details]
Patch

r+ to get this in but I think it could be a bit cleaner.
Comment 16 EWS 2021-04-01 21:42:21 PDT
Committed r275403: <https://commits.webkit.org/r275403>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 424918 [details].