WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
63189
Fix latch deadlock when GPU process crashes or context is lost
https://bugs.webkit.org/show_bug.cgi?id=63189
Summary
Fix latch deadlock when GPU process crashes or context is lost
John Bates
Reported
2011-06-22 14:56:28 PDT
Fix latch deadlock when GPU process crashes or context is lost
Attachments
Patch
(14.92 KB, patch)
2011-06-22 15:15 PDT
,
John Bates
no flags
Details
Formatted Diff
Diff
Patch
(16.02 KB, patch)
2011-06-23 15:59 PDT
,
John Bates
no flags
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
John Bates
Comment 1
2011-06-22 15:15:21 PDT
Created
attachment 98248
[details]
Patch
John Bates
Comment 2
2011-06-22 15:21:28 PDT
This patch solves the deadlock problem with latches when the GPU process crashes or in some cases when a context is lost because of errors. This does not recreate the failed child contexts, because that is already done by WebGLRenderingContext when the application supports it. This also does not reparent the child contexts to the new compositor context when the LayerRendererChromium is recreated. A separate patch needs to do that.
James Robinson
Comment 3
2011-06-22 15:40:45 PDT
Comment on
attachment 98248
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98248&action=review
> Source/WebCore/ChangeLog:9 > + > + Test: open particles WebGL demo in chrome, kill GPU process from Task Manager; observe no deadlock. > +
you need at least a little bit of explanation here of what's going on
> Source/WebCore/platform/graphics/chromium/LayerRendererChromium.cpp:-379 > - > - // After updateCompositorResources, set/wait latches for all child > - // contexts. This will prevent the compositor from using any of the child > - // parent textures while WebGL commands are executing from javascript *and* > - // while the final parent texture is being blit'd. copyTexImage2D > - // uses the parent texture as a temporary resolve buffer, so that's why the > - // waitLatch is below, to block the compositor from using the parent texture > - // until the next WebGL SwapBuffers (or copyTextureToParentTexture for > - // Canvas2D). > - if (hardwareCompositing() && m_contextSupportsLatch) { > - m_childContextsWereCopied = true; > - // For each child context: > - // glSetLatch(Offscreen->Compositor); > - // glWaitLatch(Compositor->Offscreen); > - ChildContextMap::iterator i = m_childContexts.begin(); > - for (; i != m_childContexts.end(); ++i) { > - Extensions3DChromium* ext = static_cast<Extensions3DChromium*>(i->first->getExtensions()); > - GC3Duint childToParentLatchId, parentToChildLatchId; > - ext->getParentToChildLatchCHROMIUM(&parentToChildLatchId); > - ext->getChildToParentLatchCHROMIUM(&childToParentLatchId); > - ext->setLatchCHROMIUM(childToParentLatchId); > - ext->waitLatchCHROMIUM(parentToChildLatchId); > - } > - }
why's this going away?
John Bates
Comment 4
2011-06-22 17:51:08 PDT
Comment on
attachment 98248
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98248&action=review
>> Source/WebCore/ChangeLog:9 >> + > > you need at least a little bit of explanation here of what's going on
Will do.
>> Source/WebCore/platform/graphics/chromium/LayerRendererChromium.cpp:-379 >> - } > > why's this going away?
It turned out to be premature to try and plan ahead for the threaded compositor with lots of nasty comments. For now, it's much safer to do all latching in one place (above in updateAndDrawLayers). It's also best to do the corresponding set before the wait, so consolidating the latching code above allows us to do that. When the threaded compositor comes online, we need to redo the latching code.
John Bates
Comment 5
2011-06-22 18:11:01 PDT
Comment on
attachment 98248
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98248&action=review
>>> Source/WebCore/ChangeLog:9 >>> + >> >> you need at least a little bit of explanation here of what's going on > > Will do.
This is what I'm adding: The main bug fix is to only set/wait latches if the child context has no errors. Additionally, the LayerChromium classes needed to be modified to not continue drawing when their corresponding contexts have errors. Otherwise, they would draw with invalid texture ids.
Kenneth Russell
Comment 6
2011-06-22 18:45:00 PDT
Comment on
attachment 98248
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98248&action=review
The logic changes look fine but there's an important documentation update that needs to be incorporated into this patch.
> Source/WebCore/platform/graphics/chromium/Canvas2DLayerChromium.cpp:68 > + && (context->getExtensions()->getGraphicsResetStatusARB() == GraphicsContext3D::NO_ERROR));
I made a mistake when originally adding getGraphicsResetStatusARB to Extensions3D.h. In the real OpenGL API, this function only returns the error state once; subsequent calls generally return NO_ERROR. In WebKit, this entry point's state needs to be sticky; it needs to continue to return the error state essentially forever, because context recovery is implemented by throwing away the GraphicsContext3D instance and creating a new one. Please update the comments in Extensions3D.h indicating that any error returned needs to be persistent for the lifetime of the GraphicsContext3D object, unlike the behavior of the actual GL_ARB_robustness extension.
John Bates
Comment 7
2011-06-23 15:59:38 PDT
Created
attachment 98423
[details]
Patch
John Bates
Comment 8
2011-06-23 16:00:53 PDT
Comment on
attachment 98248
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98248&action=review
>> Source/WebCore/platform/graphics/chromium/Canvas2DLayerChromium.cpp:68 >> + && (context->getExtensions()->getGraphicsResetStatusARB() == GraphicsContext3D::NO_ERROR)); > > I made a mistake when originally adding getGraphicsResetStatusARB to Extensions3D.h. In the real OpenGL API, this function only returns the error state once; subsequent calls generally return NO_ERROR. In WebKit, this entry point's state needs to be sticky; it needs to continue to return the error state essentially forever, because context recovery is implemented by throwing away the GraphicsContext3D instance and creating a new one. Please update the comments in Extensions3D.h indicating that any error returned needs to be persistent for the lifetime of the GraphicsContext3D object, unlike the behavior of the actual GL_ARB_robustness extension.
Done.
James Robinson
Comment 9
2011-06-23 16:01:44 PDT
Comment on
attachment 98423
[details]
Patch Seems good
Kenneth Russell
Comment 10
2011-06-23 17:01:00 PDT
Comment on
attachment 98423
[details]
Patch As we've discussed offline, I have some remaining concerns about the asynchronous nature of the loss of contexts and whether a synchronous check before issuing the latch command will really be sufficient. However this seems to be a step forward. cq=me
WebKit Review Bot
Comment 11
2011-06-23 17:12:04 PDT
Comment on
attachment 98423
[details]
Patch Clearing flags on attachment: 98423 Committed
r89635
: <
http://trac.webkit.org/changeset/89635
>
WebKit Review Bot
Comment 12
2011-06-23 17:12:08 PDT
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug