Created attachment 456139 [details] Crash log iOS layout tests show a crash with this backtrace: Thread 0 Crashed:: Dispatch queue: com.apple.main-thread 0 com.apple.JavaScriptCore 0x136c83ace WTFCrash 1 com.apple.WebKit 0x11903c278 WTFCrashWithInfo(int, char const*, char const*, int) 2 com.apple.WebKit 0x119c90970 WebKit::QualifiedResourceHeap::checkInvariants() const 3 com.apple.WebKit 0x119c996f6 WebKit::QualifiedResourceHeap::hasImageBuffer() const 4 com.apple.WebKit 0x119c83f76 WebKit::RemoteResourceCache::hasActiveDrawables() const 5 com.apple.WebKit 0x119c71346 WebKit::RemoteRenderingBackend::allowsExitUnderMemoryPressure() const 6 com.apple.WebKit 0x119c70dc0 WebKit::GPUConnectionToWebProcess::allowsExitUnderMemoryPressure() const 7 com.apple.WebKit 0x119c76640 WebKit::GPUProcess::canExitUnderMemoryPressure() const 8 com.apple.WebKit 0x119c756fc WebKit::GPUProcess::tryExitIfUnused() 9 com.apple.WebKit 0x119cb10fc decltype(*(std::__1::forward<WebKit::GPUProcess*&>(fp0)).*fp()) std::__1::__invoke<void (WebKit::GPUProcess::*&)(), WebKit::GPUProcess*&, void>(void (WebKit::GPUProcess::*&)(), WebKit::GPUProcess*&) 10 com.apple.WebKit 0x119cb107a std::__1::__bind_return<void (WebKit::GPUProcess::*)(), std::__1::tuple<WebKit::GPUProcess*>, std::__1::tuple<>, __is_valid_bind_return<void (WebKit::GPUProcess::*)(), std::__1::tuple<WebKit::GPUProcess*>, std::__1::tuple<> >::value>::type std::__1::__apply_functor<void (WebKit::GPUProcess::*)(), std::__1::tuple<WebKit::GPUProcess*>, 0ul, std::__1::tuple<> >(void (WebKit::GPUProcess::*&)(), std::__1::tuple<WebKit::GPUProcess*>&, std::__1::__tuple_indices<0ul>, std::__1::tuple<>&&) 11 com.apple.WebKit 0x119cb102e std::__1::__bind_return<void (WebKit::GPUProcess::*)(), std::__1::tuple<WebKit::GPUProcess*>, std::__1::tuple<>, __is_valid_bind_return<void (WebKit::GPUProcess::*)(), std::__1::tuple<WebKit::GPUProcess*>, std::__1::tuple<> >::value>::type std::__1::__bind<void (WebKit::GPUProcess::*&)(), WebKit::GPUProcess*>::operator()<>() 12 com.apple.WebKit 0x119cb0fc6 WTF::Detail::CallableWrapper<std::__1::__bind<void (WebKit::GPUProcess::*&)(), WebKit::GPUProcess*>, void>::call() 13 com.apple.WebKit 0x11909b0c0 WTF::Function<void ()>::operator()() const 14 com.apple.WebKit 0x1195f2876 WebCore::Timer::fired() 15 com.apple.WebCore 0x1637dd2f4 WebCore::ThreadTimers::sharedTimerFiredInternal() 16 com.apple.WebCore 0x1637e772e WebCore::ThreadTimers::setSharedTimer(WebCore::SharedTimer*)::$_0::operator()() const 17 com.apple.WebCore 0x1637e76c6 WTF::Detail::CallableWrapper<WebCore::ThreadTimers::setSharedTimer(WebCore::SharedTimer*)::$_0, void>::call() 18 com.apple.WebCore 0x15f0206f0 WTF::Function<void ()>::operator()() const 19 com.apple.WebCore 0x16377521e WebCore::MainThreadSharedTimer::fired() https://build.webkit.org/results/Apple-iOS-15-Simulator-Debug-WK2-Tests/r292101%20(2082)/results.html
<rdar://problem/91052033>
Looks like a case where we access the QualifiedResourceHeap on the main thread; it should only be accessed from the thread that RemoteRenderingBackend lives on.
Created attachment 457131 [details] Patch
Comment on attachment 457131 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=457131&action=review > Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.cpp:533 > + m_workQueue->dispatchSync([&] { We already tried that and it caused hangs: https://trac.webkit.org/changeset/282693/webkit Has the code changed in ways that it is now OK?
(In reply to Chris Dumez from comment #4) > Comment on attachment 457131 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=457131&action=review > > > Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.cpp:533 > > + m_workQueue->dispatchSync([&] { > > We already tried that and it caused hangs: > https://trac.webkit.org/changeset/282693/webkit > > Has the code changed in ways that it is now OK? I don't remember all the details but I think the issue had to do with the fact that the rendering backend thread was stuck on an IPC Semaphore (because that's how we were passing display lists to the GPU process, at least at the time, not sure if this changed). As a result, the background thread would not process the dispatch_sync().
Comment on attachment 457131 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=457131&action=review > Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.cpp:534 > + hasActiveDrawables = m_remoteResourceCache.hasActiveDrawables(); Yeah, this is not really sound in many ways. I think a better approach would be that: 1) RemoteResourceCache acquires ScopedRenderingResourcesRequest every time it starts having active drawables. 2) RemoteRenderingBackend::allowsExitUnderMemoryPressure() is removed altogether 3) GPUConnectionToWebProcess::allowsExitUnderMemoryPressure() checks for new function ScopedRenderingResourcesRequest::hasActiveUsers() or similar Additionally ScopedRenderingResourcesRequest could be renamed to something else, like ScopedGPUProcessActivity or something. ScopedRenderingResourcesRequest could also be modified to not purge anything but just maintain s_requests
Comment on attachment 457131 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=457131&action=review > Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.cpp:535 > + }); A better thread safe implementation of this would be to have the dispatched function capture a normal binarysemaphore, see the implementation at WorkQueue::dispatchSync. (for your interest, it probably isn't good to do this sync call)
Comment on attachment 457131 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=457131&action=review >> Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.cpp:535 >> + }); > > A better thread safe implementation of this would be to have the dispatched function capture a normal binarysemaphore, see the implementation at WorkQueue::dispatchSync. > (for your interest, it probably isn't good to do this sync call) It seems this sync call might lead to deadlock. For instance RemoteDisplayListRecorder::paintFrameForMedia(), which runs on the "RemoteRenderingBackend work queue", makes a media call using callOnMainRunLoopAndWait(). So if RemoteRenderingBackend::allowsExitUnderMemoryPressure() blocks the execution on the main run loop till the "RemoteRenderingBackend work queue" executes the call to RemoteResourceCache::hasActiveDrawables() and there is a PaintFrameForMedia message which was dispatched to the StreamConnectionWorkQueue but has not been processed yet, a deadlock will happen. The main run loop and the "RemoteRenderingBackend work queue" will both be blocked; each is waiting for the other to finish.
Created attachment 457303 [details] Patch
Although this patch may fix the assertion, I am not convinced RemoteRenderingBackend::allowsExitUnderMemoryPressure() has the right implementation. This function returns a value which is not guaranteed to stay the same even while it is used by the caller GPUConnectionToWebProcess::allowsExitUnderMemoryPressure(). The "RemoteRenderingBackend work queue" can change the state of RemoteRenderingBackend::m_remoteResourceCache while GPUConnectionToWebProcess::allowsExitUnderMemoryPressure() makes a decision based on out-dated state.
(In reply to Said Abou-Hallawa from comment #10) > Although this patch may fix the assertion, I am not convinced > RemoteRenderingBackend::allowsExitUnderMemoryPressure() has the right > implementation. This function returns a value which is not guaranteed to > stay the same even while it is used by the caller > GPUConnectionToWebProcess::allowsExitUnderMemoryPressure(). > > The "RemoteRenderingBackend work queue" can change the state of > RemoteRenderingBackend::m_remoteResourceCache while > GPUConnectionToWebProcess::allowsExitUnderMemoryPressure() makes a decision > based on out-dated state. allowsExitUnderMemoryPressure() is known to be racy, that is fine. It is always racy anyway due to IPC. Worse case scenario, we'll exit the GPUProcess and the WebProcess will get notified it exited and it is needs it, it will relaunch it. Note that eager exit on memory pressure is no longer useful once we enable DOM rendering in the GPUProcess I think (since the GPUProcess will always be needed for DOM rendering). As a result, we should probably consider disabling this idle exit logic altogether when DOM rendering in GPUProcess is enabled.
API test failure looks related.
Created attachment 457526 [details] Alternative
Comment on attachment 457526 [details] Alternative View in context: https://bugs.webkit.org/attachment.cgi?id=457526&action=review > Source/WebKit/ChangeLog:18 > + the count is now counted by RemoteMediaPlayerProxy creation, destruction. I think this solution is a good alternative except it does not consider the native image an ActiveDrawable resource as the current code does. RemoteRenderingBackend calls updateRenderingResourceRequest() from RemoteRenderingBackend::createImageBufferWithQualifiedIdentifier() and RemoteRenderingBackend::releaseRemoteResourceWithQualifiedIdentifier(). It does not call it from RemoteRenderingBackend::cacheNativeImageWithQualifiedIdentifier(). Do we need to call it from cacheNativeImageWithQualifiedIdentifier()? Do we even need to consider the NativeImage an ActiveDrawable resource? Do we need to consider the Font also as an ActiveDrawable resource? What makes a resource be ActiveDrawable? RemoteRenderingBackend::allowsExitUnderMemoryPressure() was added in r276148. This is the original version of it: bool RemoteRenderingBackend::allowsExitUnderMemoryPressure() const { return m_remoteResourceCache.imageBuffers().isEmpty() && m_remoteResourceCache.nativeImages().isEmpty(); }
In fact this bug is a regression of r288847. This revision removes r282754 which was fixing exactly the same bug.
Created attachment 457594 [details] Another alternative
(In reply to Said Abou-Hallawa from comment #14) > I think this solution is a good alternative except it does not consider the > native image an ActiveDrawable resource as the current code does. > RemoteRenderingBackend calls updateRenderingResourceRequest() from > RemoteRenderingBackend::createImageBufferWithQualifiedIdentifier() and > RemoteRenderingBackend::releaseRemoteResourceWithQualifiedIdentifier(). It > does not call it from > RemoteRenderingBackend::cacheNativeImageWithQualifiedIdentifier(). Right, but that's just a bug about updateRenderingResourceRequest().
Created attachment 457664 [details] Patch
I talked to Simon and we agreed on counting the ImageBuffers only when allowing or disallowing the GPUP to exit under the memory pressure. Counting the NativeImages should not affect this decision because they are remotely cached only when they are drawn on the RemoteImageBuffers. The life cycle of the NativeImage in GPUP is different from the life cycle of the ImageBuffer. The NativeImages in GPUP are released only when the counterpart NativeImages in WebProcess are released. So allowing the GPUP process to exit when no ImageBuffer is cached but there are NativeImages cached should be fine.
Comment on attachment 457664 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=457664&action=review > Source/WebKit/ChangeLog:19 > + Remove RemoteMediaPlayerManagerProxy::allowsExitUnderMemoryPressure(), > + the count is now counted by RemoteMediaPlayerProxy creation, destruction. I don't understand this part. Where does RemoteMediaPlayerManagerProxy use ScopedRenderingResourcesRequest?
Created attachment 457727 [details] Patch
(In reply to Said Abou-Hallawa from comment #21) > Created attachment 457727 [details] > Patch In this patch I added a nicer approach which Simon suggested. RemoteImageBuffer will have ScopedRenderingResourcesRequest. So we are going to count the active ImageBuffers and not the RemoteRenderingBackends which have active ImageBuffers as we used to do before.
Committed r292961 (249726@main): <https://commits.webkit.org/249726@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 457727 [details].