Bug 239107 - REGRESSION(r282117): RemoteRenderingBackend::willDestroyImageBuffer() can crash if the RemoteRenderingBackend has already been destroyed
Summary: REGRESSION(r282117): RemoteRenderingBackend::willDestroyImageBuffer() can cra...
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Process Model (show other bugs)
Version: Safari Technology Preview
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Said Abou-Hallawa
Keywords: InRadar
Depends on:
Reported: 2022-04-11 22:20 PDT by Simon Fraser (smfr)
Modified: 2022-05-01 22:47 PDT (History)
5 users (show)

See Also:

Patch (3.93 KB, patch)
2022-04-12 17:46 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (11.41 KB, patch)
2022-04-13 02:06 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (14.91 KB, patch)
2022-04-13 11:58 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (15.48 KB, patch)
2022-04-13 18:13 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (15.86 KB, patch)
2022-04-15 12:20 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (17.77 KB, patch)
2022-04-17 17:54 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (12.92 KB, patch)
2022-04-29 18:10 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (21.73 KB, patch)
2022-04-29 22:53 PDT, Said Abou-Hallawa
darin: review+
Details | Formatted Diff | Diff
Patch (21.71 KB, patch)
2022-05-01 19:55 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Simon Fraser (smfr) 2022-04-11 22:20:25 PDT
I'm seeing occasional crashes when running fast/scrolling/ios/autoscroll-input-when-very-zoomed.html in the iOS simulator (via Xcode).

It looks like the RemoteRenderingBackend has been destroyed by the time a RemoteImageBuffer is destroyed on the main thread:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x000000016933868d WebCore`bool std::__1::__cxx_atomic_compare_exchange_weak<unsigned char>(__a=0x6e6f697461657243, __expected="", __value='\x01', __success=acquire, __failure=acquire) at atomic:1050:12
    frame #1: 0x0000000169338452 WebCore`std::__1::__atomic_base<unsigned char, false>::compare_exchange_weak(this=0x6e6f697461657243, __e=0x00007ff7b239e83f, __d='\x01', __m=acquire) at atomic:1681:17
    frame #2: 0x00000001693383fa WebCore`WTF::Atomic<unsigned char>::compareExchangeWeak(this=0x6e6f697461657243, expected='\0', desired='\x01', order=acquire) at Atomics.h:89:22
    frame #3: 0x00000001693383b1 WebCore`WTF::LockAlgorithm<unsigned char, (unsigned char)1, (unsigned char)2, WTF::EmptyLockHooks<unsigned char> >::lockFastAssumingZero(lock=0x6e6f697461657243) at LockAlgorithm.h:53:21
    frame #4: 0x0000000169338359 WebCore`WTF::Lock::lock(this=0x6e6f697461657243) at Lock.h:65:13
    frame #5: 0x0000000169338324 WebCore`WTF::Locker<WTF::Lock>::Locker(this=0x00007ff7b239e910, lock=0x6e6f697461657243) at Lock.h:158:16
    frame #6: 0x00000001693378dd WebCore`WTF::Locker<WTF::Lock>::Locker(this=0x00007ff7b239e910, lock=0x6e6f697461657243) at Lock.h:157:5
    frame #7: 0x000000016d97dcd1 WebCore`WebCore::IOSurfacePool::addSurface(this=0x6e6f697461657243, surface=WebCore::IOSurface @ 0x00007f9cecc09e00) at IOSurfacePool.cpp:180:12
    frame #8: 0x000000016b494848 WebCore`WebCore::IOSurface::moveToPool(surface=WebCore::IOSurface @ 0x00007f9cecc09e00, pool=0x6e6f697461657243) at IOSurface.mm:102:15
    frame #9: 0x000000016d9a0269 WebCore`WebCore::ImageBufferIOSurfaceBackend::releaseBufferToPool(this=0x00007f9cebf04530, pool=0x6e6f697461657243) at ImageBufferIOSurfaceBackend.cpp:267:5
    frame #10: 0x0000000128112272 WebKit`WebCore::ConcreteImageBuffer<WebKit::ImageBufferShareableMappedIOSurfaceBackend>::releaseBufferToPool(this=0x00007f9cebf05270, pool=0x6e6f697461657243) at ConcreteImageBuffer.h:333:22
    frame #11: 0x00000001280fdef2 WebKit`WebKit::RemoteRenderingBackend::willDestroyImageBuffer(this=0x00007f9cec808e90, imageBuffer=0x00007f9cebf05270) at RemoteRenderingBackend.cpp:175:21
    frame #12: 0x0000000128113708 WebKit`WebKit::RemoteImageBuffer<WebKit::ImageBufferShareableMappedIOSurfaceBackend>::~RemoteImageBuffer(this=0x00007f9cebf05270) at RemoteImageBuffer.h:76:34
    frame #13: 0x0000000128111c95 WebKit`WebKit::RemoteImageBuffer<WebKit::ImageBufferShareableMappedIOSurfaceBackend>::~RemoteImageBuffer(this=0x00007f9cebf05270) at RemoteImageBuffer.h:70:5
    frame #14: 0x0000000128111cb9 WebKit`WebKit::RemoteImageBuffer<WebKit::ImageBufferShareableMappedIOSurfaceBackend>::~RemoteImageBuffer(this=0x00007f9cebf05270) at RemoteImageBuffer.h:70:5
    frame #15: 0x0000000127d7580f WebKit`WTF::ThreadSafeRefCounted<WebCore::ImageBuffer, (WTF::DestructionThread)1>::deref(this=0x00007f9cfbfb2408) const::'lambda'()::operator()() const at ThreadSafeRefCounted.h:117:13
    frame #16: 0x0000000127d757b9 WebKit`WTF::Detail::CallableWrapper<WTF::ThreadSafeRefCounted<WebCore::ImageBuffer, (WTF::DestructionThread)1>::deref() const::'lambda'(), void>::call(this=0x00007f9cfbfb2400) at Function.h:53:39
    frame #17: 0x0000000144647ce2 JavaScriptCore`WTF::Function<void ()>::operator(this=0x00007ff7b239ead0)() const at Function.h:82:35
    frame #18: 0x00000001446d1bc2 JavaScriptCore`WTF::RunLoop::performWork(this=0x00007f9cfbfb58b0) at RunLoop.cpp:133:9
    frame #19: 0x00000001446d54ee JavaScriptCore`WTF::RunLoop::performWork(context=0x00007f9cfbfb58b0) at RunLoopCF.cpp:46:37
    frame #20: 0x000000010e3b5833 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
    frame #21: 0x000000010e3b572b CoreFoundation`__CFRunLoopDoSource0 + 180
    frame #22: 0x000000010e3b4bf8 CoreFoundation`__CFRunLoopDoSources0 + 242
    frame #23: 0x000000010e3af2f4 CoreFoundation`__CFRunLoopRun + 871
    frame #24: 0x000000010e3aea90 CoreFoundation`CFRunLoopRunSpecific + 562
    frame #25: 0x000000010ecfee31 Foundation`-[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 213
    frame #26: 0x000000010ecff04f Foundation`-[NSRunLoop(NSRunLoop) run] + 76
    frame #27: 0x000000010f979feb libxpc.dylib`_xpc_objc_main + 440
    frame #28: 0x000000010f97bfd4 libxpc.dylib`xpc_main + 122
    frame #29: 0x000000012819aeda WebKit`WebKit::XPCServiceMain((null)=1, (null)=0x00007ff7b239fcd8) at XPCServiceMain.mm:221:5
    frame #30: 0x000000012a14e99b WebKit`WKXPCServiceMain(argc=1, argv=0x00007ff7b239fcd8) at WKMain.mm:35:12
    frame #31: 0x000000010db5ed32 com.apple.WebKit.GPU.Development`main(argc=1, argv=0x00007ff7b239fcd8) at AuxiliaryProcessMain.cpp:30:12
    frame #32: 0x000000010dd74f21 dyld_sim`start_sim + 10
    frame #33: 0x000000011299b50e dyld`start + 462
Comment 1 Simon Fraser (smfr) 2022-04-11 22:25:36 PDT
I confirmed with logging that the RemoteRenderingBackend can get destroyed under GPUConnectionToWebProcess::didClose() before the callOnMainThread() of the ThreadSafeRefcounted image buffers have run.
Comment 2 Radar WebKit Bug Importer 2022-04-11 22:25:51 PDT
Comment 3 Said Abou-Hallawa 2022-04-12 17:46:28 PDT
Created attachment 457492 [details]
Comment 4 Simon Fraser (smfr) 2022-04-12 20:22:53 PDT
Comment on attachment 457492 [details]

Some other possible approaches here: RemoteImageBuffer could have a RefPtr<RemoteRenderingBackend>. Or, RemoteImageBuffers could have their destruction managed by RemoteRenderingBackend (e.g. put into a set which is emptied on the main thread).
Comment 5 Said Abou-Hallawa 2022-04-13 02:06:27 PDT
Created attachment 457517 [details]
Comment 6 Said Abou-Hallawa 2022-04-13 11:58:03 PDT
Created attachment 457552 [details]
Comment 7 Said Abou-Hallawa 2022-04-13 12:04:26 PDT
(In reply to Simon Fraser (smfr) from comment #4)
> Comment on attachment 457492 [details]
> Patch
> Some other possible approaches here: RemoteImageBuffer could have a
> RefPtr<RemoteRenderingBackend>. Or, RemoteImageBuffers could have their
> destruction managed by RemoteRenderingBackend (e.g. put into a set which is
> emptied on the main thread).

I could not make RemoteImageBuffer hold a WeakPtr<RemoteRenderingBackend> because RemoteImageBuffer is created on the work queue thread while RemoteRenderingBackend is created on the main thread.

And I think I can't make RemoteImageBuffer hold RefPtr<RemoteRenderingBackend> because this will lead to a cyclic reference and hence a memory leak:

RemoteImageBuffer -> RefPtr<RemoteRenderingBackend> -> RemoteResourceCache -> QualifiedResourceHeap -> Ref<WebCore::ImageBuffer>.

So I took the second approach but I made RemoteResourceCache manages the destruction of its RemoteImageBuffers.
Comment 8 Said Abou-Hallawa 2022-04-13 17:37:04 PDT
This patch causes this crash. It happens because we call RenderingBackend::willDestroyImageBuffer() before the destructor of RemoteImageBuffer is called.

Thread 0 Crashed::   Dispatch queue: com.apple.main-thread
0   WebCore                       	       0x11449f2fa        WebCore::IOSurface::ensureGraphicsContext()
1   WebCore                       	       0x115403a60        WebCore::ImageBufferIOSurfaceBackend::context() const
2   WebKit                        	       0x1076a1d34        WebKit::RemoteImageBuffer<WebKit::ImageBufferShareableMappedIOSurfaceBackend>::~RemoteImageBuffer()
3   WebKit                        	       0x1076a1632        WebKit::RemoteImageBuffer<WebKit::ImageBufferShareableMappedIOSurfaceBackend>::~RemoteImageBuffer()
4   JavaScriptCore                	       0x10cbcaa9c        WTF::RunLoop::performWork()
5   JavaScriptCore                	       0x10cbcb380        WTF::RunLoop::performWork(void*)
6   CoreFoundation                	       0x7fff20369e24     __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 16 (/Library/Caches/com.apple.xbs/Sources/CoreFoundation_Sim/Foundation-1854/CoreFoundation/RunLoop.subproj/CFRunLoop.c:1972)
7   CoreFoundation                	       0x7fff20369d1c     __CFRunLoopDoSource0 + 178 (/Library/Caches/com.apple.xbs/Sources/CoreFoundation_Sim/Foundation-1854/CoreFoundation/RunLoop.subproj/CFRunLoop.c:2016)
8   CoreFoundation                	       0x7fff203691f2     __CFRunLoopDoSources0 + 242 (/Library/Caches/com.apple.xbs/Sources/CoreFoundation_Sim/Foundation-1854/CoreFoundation/RunLoop.subproj/CFRunLoop.c:2053)
9   CoreFoundation                	       0x7fff20363950     __CFRunLoopRun + 874 (/Library/Caches/com.apple.xbs/Sources/CoreFoundation_Sim/Foundation-1854/CoreFoundation/RunLoop.subproj/CFRunLoop.c:2951)
10  CoreFoundation                	       0x7fff20363102     CFRunLoopRunSpecific + 566 (/Library/Caches/com.apple.xbs/Sources/CoreFoundation_Sim/Foundation-1854/CoreFoundation/RunLoop.subproj/CFRunLoop.c:3268)
11  Foundation                    	       0x7fff2081941c     -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212 (/Library/Caches/com.apple.xbs/Sources/Foundation_Sim/Foundation-1854/Foundation/Soil.subproj/NSRunLoop.m:373)
12  Foundation                    	       0x7fff20819638     -[NSRunLoop(NSRunLoop) run] + 76 (/Library/Caches/com.apple.xbs/Sources/Foundation_Sim/Foundation-1854/Foundation/Soil.subproj/NSRunLoop.m:398)
13  libxpc.dylib                  	       0x7fff2006705e     _xpc_objc_main + 438 (/Library/Caches/com.apple.xbs/Sources/libxpc_Sim/libxpc-2235.12.1/src/main.m:246)
14  libxpc.dylib                  	       0x7fff20069050     xpc_main + 122 (/Library/Caches/com.apple.xbs/Sources/libxpc_Sim/libxpc-2235.12.1/src/init.c:1187)
15  WebKit                        	       0x1076b94f0        WebKit::XPCServiceMain(int, char const**)
16  dyld_sim                      	       0x104e5fe1e        start_sim + 10 (/Library/Caches/com.apple.xbs/Sources/dyld_Sim/dyld-932.4/dyld/dyldMain.cpp:904)
17                                	       0x1
18                                	       0x1
Comment 9 Said Abou-Hallawa 2022-04-13 18:13:39 PDT
Created attachment 457582 [details]
Comment 10 Said Abou-Hallawa 2022-04-15 12:20:52 PDT
Created attachment 457713 [details]
Comment 11 Simon Fraser (smfr) 2022-04-15 14:04:30 PDT
Comment on attachment 457713 [details]

View in context: https://bugs.webkit.org/attachment.cgi?id=457713&action=review

> Source/WebKit/GPUProcess/graphics/RemoteResourceCache.cpp:97
> +    // Unwind the context's state stack before destruction, since calls to restore may not have
> +    // been flushed yet, or the web process may have terminated.
> +    auto& context = imageBuffer.context();
> +    while (context.stackSize())
> +        context.restore();

It's weird for this code, which uses the ImageBuffer's GraphicsContext, to live here. This should be moved into an ImageBuffer function.
Comment 12 Said Abou-Hallawa 2022-04-17 17:54:50 PDT
Created attachment 457786 [details]
Comment 13 Kimmo Kinnunen 2022-04-18 11:43:41 PDT
I wonder why can we not just store the pool to the ImageBufferBackendIOSurface that uses the pool, and return the iosurface in the destructor?

These approaches seem to tie future development options fairly strongly.

For example, it would be good if the WebGL in the future could use the ImageBuffers from RRB as-is, even though RRB stops using them. With RRB returning the surfaces, it is not possible.
Comment 14 Said Abou-Hallawa 2022-04-29 16:24:56 PDT
The RemoteRenderingBackend::m_ioSurfacePool is a std::unique_ptr.

I think we can not make it a WeakPtr<IOSurfacePool> because RemoteImageBuffer is created on the work queue thread while m_ioSurfacePool is created on the main thread.

But I think I can make it RefPtr<IOSurfacePool>. There should not a cyclic reference in this case.
Comment 15 Said Abou-Hallawa 2022-04-29 18:10:21 PDT
Created attachment 458621 [details]
Comment 16 Said Abou-Hallawa 2022-04-29 22:53:37 PDT
Created attachment 458631 [details]
Comment 17 Darin Adler 2022-04-30 21:26:45 PDT
Comment on attachment 458631 [details]

View in context: https://bugs.webkit.org/attachment.cgi?id=458631&action=review

I can’t exactly follow whether this changes any whether cases join the pool or not—I understand the new timing, but hard to follow the logic in the old code to see which would and would not end up releasing to the pool—but I’m guessing it’s not changed here.

> Source/WebCore/platform/graphics/cg/IOSurfacePool.cpp:66
> +    return adoptRef(*new IOSurfacePool());

Not changed: Normally we’d omit the () after IOSurfacePool here.

> Source/WebCore/platform/graphics/cg/IOSurfacePool.h:38
> +#include <wtf/RefCounted.h>

I think we’d want to include ThreadSafeRefCounted.h, since that’s what we’re using, not RefCounted. But apparently it’s already included by something else, so maybe we don’t need to add an include at all?
Comment 18 Said Abou-Hallawa 2022-05-01 19:55:43 PDT
Created attachment 458658 [details]
Comment 19 Said Abou-Hallawa 2022-05-01 20:30:41 PDT
Comment on attachment 458631 [details]

This patch changes the timing of returning the IOSurface to the pool. In the old code, we used to return it in two separate places:

1. RemoteImageBuffer was returning the IOSurface of its backend in the RemoteImageBuffer destructor via the RemoteRenderingBackend only if its rendering purpose is RenderingPurpose::LayerBacking.

2. RemoteLayerBackingStore was returning the IOSurface of its Buffer in RemoteLayerBackingStore::Buffer::discard() before setting the imageBuffer to nullptr.

With this patch if an ImageBuffer is needed from the IOSurfacePool, the IOSurfacePool will be passed to the constructor and will be held till the destructor releases the IOSurface to the pool.
Comment 20 EWS 2022-05-01 22:47:29 PDT
Committed r293659 (250163@main): <https://commits.webkit.org/250163@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 458658 [details].