Bug 36194

Summary: WebGL layout test failures on bots after 33416
Product: WebKit Reporter: Kenneth Russell <kbr>
Component: WebGLAssignee: Zhenyao Mo <zmo>
Status: RESOLVED INVALID    
Severity: Major CC: cmarrin, eric, oliver, simon.fraser, webkit.review.bot
Priority: P1    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
patch
none
fixed style issue
none
revised patch : responding to Simon Fraser's review none

Description Kenneth Russell 2010-03-16 14:36:53 PDT
After the introduction of https://bugs.webkit.org/show_bug.cgi?id=33416 , several WebGL layout tests are failing on some of the bots. Typical failures look like:

fast/canvas/webgl/context-attributes-alpha-depth-stencil-antialias.html
fast/canvas/webgl/tex-image-and-sub-image-2d-with-image-data.html
fast/canvas/webgl/tex-sub-image-2d.html
fast/canvas/webgl/triangle.html
fast/canvas/webgl/viewport-unchanged-upon-resize.html

For details, see for example http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r56075%20(11503)/results.html . (On the console at http://build.webkit.org/console , go to r56074, click the red bubble, and under "uploaded results" click "view results".)

Generally it looks like any test reading back the frame buffer is failing because incorrect values are coming back. The best guess at this point is that the OpenGL implementation on these bots is buggy and breaks when multisampled framebuffer objects are used.

We need to be able to detect when to enable or disable multisampling, and modify the layout tests so they pass regardless of whether the implementation was able to satisfy the user's antialiasing request.
Comment 1 Kenneth Russell 2010-03-16 16:33:44 PDT
Temporarily disabling these tests on the build bots under https://bugs.webkit.org/show_bug.cgi?id=36200 to get the commit queue going again.
Comment 2 Simon Fraser (smfr) 2010-03-16 17:07:31 PDT
Be sure to file a bug at bugreporter.apple.com on the underlying OpenGL issue.

Why wasn't this seen before committing?
Comment 3 Kenneth Russell 2010-03-16 17:10:58 PDT
(In reply to comment #2)
> Be sure to file a bug at bugreporter.apple.com on the underlying OpenGL issue.

We will once we figure out what the root cause appears to be. Right now it points to EXT_framebuffer_multisample / EXT_framebuffer_blit. We also don't know the graphics hardware in the build bots.

> Why wasn't this seen before committing?

The tests pass on our local (NVIDIA) hardware and drivers. I sidestepped the commit queue due to other failing tests which were blocking it, which it looks like was a mistake.
Comment 4 Kenneth Russell 2010-03-16 17:33:09 PDT
Because the Snow Leopard bots were already failing tests, I neglected to notice that basically all of the WebGL layout tests are failing on that bot after this change. In particular, all of the following fail or crash, in addition to the other tests mentioned:

fast/canvas/webgl/bug-32364.html
fast/canvas/webgl/bug-32692.html
fast/canvas/webgl/bug-32888.html
fast/canvas/webgl/drawArraysOutOfBounds.html
fast/canvas/webgl/drawElementssOutOfBounds.html
fast/canvas/webgl/error-reporting.html
fast/canvas/webgl/getActiveTest.html
fast/canvas/webgl/gl-object-get-calls.html
fast/canvas/webgl/incorrect-context-object-behaviour.html
fast/canvas/webgl/index-validation-copies-indices.html
fast/canvas/webgl/index-validation.html
fast/canvas/webgl/invalidPassedParams.html
fast/canvas/webgl/null-object-behaviour.html
fast/canvas/webgl/tex-image-and-sub-image-2d-with-image.html
fast/canvas/webgl/texImage2DImageDataTest.html
fast/canvas/webgl/uniform-location.html

Eric Seidel pointed out that the Snow Leopard bot doesn't block the commit queue, so I'll not disable these for the moment, but we need to make sure that the fix (which will likely be to disable multisampling in some situations) addresses these failures.
Comment 5 Zhenyao Mo 2010-03-16 21:37:41 PDT
Created attachment 50870 [details]
patch
Comment 6 WebKit Review Bot 2010-03-16 21:40:03 PDT
Attachment 50870 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
WebCore/platform/graphics/mac/GraphicsContext3DMac.cpp:209:  Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side.  [whitespace/operators] [4]
WebCore/platform/graphics/mac/GraphicsContext3DMac.cpp:210:  Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side.  [whitespace/operators] [4]
WebKit/chromium/src/GraphicsContext3D.cpp:636:  Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side.  [whitespace/operators] [4]
WebKit/chromium/src/GraphicsContext3D.cpp:637:  Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side.  [whitespace/operators] [4]
Total errors found: 4 in 11 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 7 Zhenyao Mo 2010-03-16 21:44:04 PDT
Created attachment 50871 [details]
fixed style issue
Comment 8 Simon Fraser (smfr) 2010-03-17 08:04:43 PDT
Comment on attachment 50871 [details]
fixed style issue

> Index: WebCore/platform/graphics/GraphicsContext3D.h
> ===================================================================
> --- WebCore/platform/graphics/GraphicsContext3D.h	(revision 56097)
> +++ WebCore/platform/graphics/GraphicsContext3D.h	(working copy)
> @@ -713,6 +713,10 @@ namespace WebCore {
>                            AlphaOp* neededAlphaOp,
>                            unsigned int* format);
>  
> +#if PLATFORM(MAC)
> +        void validateAttributes();
> +#endif

Would this be better as a virtual method with an empty implementation here, so that
you can use it to fix up idiosyncrasies on other platforms too?


> +void GraphicsContext3D::validateAttributes()
> +{
> +    // Take into account the user's requested context creation attributes, in
> +    // particular stencil and antialias, and determine which could and could
> +    // not be honored based on the capabilities of the OpenGL implementation.
> +
> +    const char* vendor = reinterpret_cast<const char*>(::glGetString(GL_VENDOR));
> +    const char* extensions = reinterpret_cast<const char*>(::glGetString(GL_EXTENSIONS));

You only use the vendor string if m_attrs.antialias is true, so get it inside that clause.

I'll let someone more familiar with WebGL do the final review.
Comment 9 Zhenyao Mo 2010-03-17 10:53:00 PDT
Created attachment 50924 [details]
revised patch : responding to Simon Fraser's review

About the GraphicsContext3D::validateAttributes(), my feeling is that we should leave PLATFORM(MAC) there.  The reason is that for PLATFORM(CHROMIUM/QT), this functionality is implemented in GraphicsContex3DInternal::validateAttributes().  Either use GraphicsContext3D or GraphicsContext3DInternal implementation, but not both.  The same situation is with all the FBO variables: they are both in GraphicsContext3D PLATFORM(MAC) and GraphicsContext3DInternal PLATFORM(CHROMIUM/QT).  Adding the PLATFORM(MAC) is a reminder that those functions and variables are duplicated and could be combined into one single implementation.

The vendor quest is moved into the if block as suggested.
Comment 10 Kenneth Russell 2010-03-17 11:08:10 PDT
After more thought and discussion with Mo, we are going to roll back the original patch for https://bugs.webkit.org/show_bug.cgi?id=33416 and reapply it later via the commit queue. A large refactoring of the affected code is imminent, and it will be much easier to iterate on any heuristics needed to work around driver bugs if all of the changes are made after the refactoring has taken place.

Cancelling the review request. We will probably close this bug after the rollback.
Comment 11 Kenneth Russell 2010-03-17 13:02:07 PDT
Mo discovered that fast/canvas/webgl/index-validation.html is flaky after 33416 as well. Intermittent failures of this test on some of the bots (Leopard Intel Release (Tests) for example) confirm this. This must also be triggered by a driver bug since the failure is in the initial FBO setup and this code path is taken by all of the other tests.
Comment 12 Zhenyao Mo 2010-03-24 10:39:24 PDT
This bug is caused by a patch for https://bugs.webkit.org/show_bug.cgi?id=33416.  We canceled the original patch and dealt with the issue in revised patch in #33416.  Close this one.