Bug 228665

Summary: [GPU Process] REGRESSION: iCloud Photos Web app may crash WebProcess once the GPUProcess is relaunched
Product: WebKit Reporter: Said Abou-Hallawa <sabouhallawa>
Component: CanvasAssignee: Said Abou-Hallawa <sabouhallawa>
Status: RESOLVED FIXED    
Severity: Normal CC: dino, mmaxfield, sam, simon.fraser, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=228969
Attachments:
Description Flags
Patch
none
Patch none

Description Said Abou-Hallawa 2021-07-30 15:50:28 PDT
If the WebProcess has sent a RemoteRenderingBackend.ReleaseRemoteResource message but the GPUProcess was terminated an has just been relaunched. The new relaunched GPUProcess will process this message for a resource which it has not recached it yet. In this scenario, the WebProcess will be mistakenly terminated.
Comment 1 Said Abou-Hallawa 2021-07-30 16:41:30 PDT
<rdar://80857877>
Comment 2 Said Abou-Hallawa 2021-07-30 16:42:54 PDT
Created attachment 434673 [details]
Patch
Comment 3 Sam Weinig 2021-08-01 10:51:09 PDT
(In reply to Said Abou-Hallawa from comment #0)
> If the WebProcess has sent a RemoteRenderingBackend.ReleaseRemoteResource
> message but the GPUProcess was terminated an has just been relaunched. The
> new relaunched GPUProcess will process this message for a resource which it
> has not recached it yet. In this scenario, the WebProcess will be mistakenly
> terminated.

Why not just make the WebProcess not send the message in this case?
Comment 4 Said Abou-Hallawa 2021-08-02 12:56:04 PDT
I put some logging and I tried it on iOS device by opening iCloud/Photos/Albums/Live Photos and I kept moving between scrolling the album and swiping the between the images after clicking on one of them and here is what I found:

com.apple.WebKit.GPU: (JavaScriptCore) shallawa: this = 0x106d5c030, in RemoteResourceCache::RemoteResourceCache()
...
com.apple.WebKit.WebContent: (JavaScriptCore) shallawa: this = 0x10109c4d0, in RemoteRenderingBackendProxy::cacheNativeImage() renderingResourceIdentifier = 1070
com.apple.WebKit.GPU: (JavaScriptCore) shallawa: this = 0x106d5c030, in RemoteResourceCache::cacheNativeImage() renderingResourceIdentifier = 1070
...
com.apple.WebKit.GPU: (JavaScriptCore) shallawa: this = 0x104944030, in RemoteResourceCache::RemoteResourceCache()
...
com.apple.WebKit.WebContent: (JavaScriptCore) shallawa: this = 0x10109c4d0, in RemoteRenderingBackendProxy::releaseRemoteResource() renderingResourceIdentifier = 1070
...
com.apple.WebKit.GPU: (JavaScriptCore) shallawa: this = 0x104944030, in RemoteResourceCache::releaseRemoteResource() resource was not found in m_resourceUseCounters renderingResourceIdentifier = 1070

And this is the explanation of these messages:

(message 1) (GPUProcess) Create a RemoteResourceCache (0x106d5c030)
(message 2) (WebProcess) Cache the NativeImage (1070) in RemoteRenderingBackendProxy (0x10109c4d0)
(message 3) (GPUProcess) Cache the NativeImage (1070) in RemoteResourceCache (0x106d5c030)
(message 4) (GPUProcess) Create a new RemoteResourceCache (0x104944030)
(message 5) (WebProcess) Release the NativeImage (1070) in RemoteRenderingBackendProxy (0x10109c4d0)
(message 6) (GPUProcess) The NativeImage (1070) could not be found in RemoteResourceCache (0x104944030)

It is clear from (message 1) and (message 4) the RemoteResourceCache (0x106d5c030) has crashed/terminated and a new one was created 0x104944030). (message 2) and (message 6) show that the NativeImage was cached in a RemoteResourceCache (0x106d5c030) but was requested to be released from another RemoteResourceCache (0x104944030). And this is why RemoteResourceCache::releaseRemoteResource() returns false in this case and terminates the WebProcess.
Comment 5 Sam Weinig 2021-08-02 14:10:28 PDT
Seems like message 5 is the issue then? 

When the GPU Process crashes, the resources all need to know they are no longer being used and therefore don't need to message for release.
Comment 6 Said Abou-Hallawa 2021-08-03 20:27:06 PDT
Created attachment 434881 [details]
Patch
Comment 7 Said Abou-Hallawa 2021-08-04 10:35:41 PDT
I think the failures on mac-AS bot are unrelated.
Comment 8 EWS 2021-08-04 10:43:23 PDT
Committed r280639 (240252@main): <https://commits.webkit.org/240252@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 434881 [details].