Bug 32845 - [Chromium] WebGL crashes intermittently on Linux
Summary: [Chromium] WebGL crashes intermittently on Linux
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebGL (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC Linux
: P2 Normal
Assignee: Kenneth Russell
URL:
Keywords:
Depends on:
Blocks: 32890
  Show dependency treegraph
 
Reported: 2009-12-21 18:43 PST by Kenneth Russell
Modified: 2009-12-29 15:57 PST (History)
12 users (show)

See Also:


Attachments
Patch (11.33 KB, patch)
2009-12-21 19:37 PST, Kenneth Russell
no flags Details | Formatted Diff | Diff
Revised patch (15.26 KB, patch)
2009-12-22 14:19 PST, Kenneth Russell
no flags Details | Formatted Diff | Diff
Revised patch (15.25 KB, patch)
2009-12-22 14:30 PST, Kenneth Russell
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Kenneth Russell 2009-12-21 18:43:34 PST
Testing of the WebGL demos at http://khronos.org/webgl/wiki/Demo_Repository with Chromium on Linux results in intermittent crashes and displays of the "Aw, snap!" page.
Comment 1 Kenneth Russell 2009-12-21 19:37:04 PST
Created attachment 45362 [details]
Patch

The dlclose'ing of libGL.so.1 and dlopen'ing of it each time a GraphicsContext3D was created was occasionally causing it to be re-mapped at a different base address. Since GLEW is not re-initialized every time, primarily for performance reasons, its cached function pointers were pointing to garbage. Stopped closing and re-opening libGL.so.1 each time; now it is loaded lazily, when the first 3D context is created. Also reused the X display connection since the GLX routines' correctness might hinge upon it not resulting in a change of GL implementation.
Comment 2 WebKit Review Bot 2009-12-21 19:41:36 PST
style-queue ran check-webkit-style on attachment 45362 [details] without any errors.
Comment 3 Eric Seidel (no email) 2009-12-21 21:39:51 PST
Commentary from one of the linux folks would make this review easier. :)
Comment 4 Adam Langley 2009-12-22 10:56:24 PST
LGTM. I didn't know that WebGL was even supposed to work on Linux, nor how it's getting a connection to the X server.

So the patch looks fine, but if it actually runs then there's something terribly wrong with our sandbox.
Comment 5 Kenneth Russell 2009-12-22 11:14:41 PST
(In reply to comment #4)
> LGTM. I didn't know that WebGL was even supposed to work on Linux, nor how it's
> getting a connection to the X server.
> 
> So the patch looks fine, but if it actually runs then there's something
> terribly wrong with our sandbox.

Thanks for your review.

WebGL does work in Chromium on Linux and its other supported platforms (see http://khronos.org/webgl/wiki/Getting_a_WebGL_Implementation), but yes, it currently requires disabling the sandbox which is very undesirable. We're working to remove this requirement.

Can you r+ and cq+ the patch?
Comment 6 Evan Martin 2009-12-22 12:55:04 PST
Did you consider making a class out of all the statics and then having a static instance of that class?  (It might also make the diff smaller.)
Comment 7 Adam Langley 2009-12-22 13:01:16 PST
(Note: I am not a WebKit reviewer, so I cannot r+ your patch.)
Comment 8 Kenneth Russell 2009-12-22 13:07:47 PST
(In reply to comment #6)
> Did you consider making a class out of all the statics and then having a static
> instance of that class?  (It might also make the diff smaller.)

I think that would end up being more complicated.

Note that all of the code in this file is essentially temporary until we move the 3D calls into Chromium's upcoming GPU process, so I didn't want to do extensive refactorings.
Comment 9 Kenneth Russell 2009-12-22 13:09:00 PST
(In reply to comment #7)
> (Note: I am not a WebKit reviewer, so I cannot r+ your patch.)

No problem. How can I get this committed? A co-worker needs to do work in the same portions of this file, so I would really like to get it checked in now so we don't collide.
Comment 10 Eric Seidel (no email) 2009-12-22 13:09:51 PST
I'll take another look.
Comment 11 Eric Seidel (no email) 2009-12-22 13:20:09 PST
Looking again, I agree with Evan, this would probably all be cleaner held behind one static pointer instead of adding a whole bunch of static pointers (with unclear lifetimes) to the main class.  Putting them all in one place leaves them as m_ vars and makes their lifetime clear.  You could even make them a struct instead if that would be cleaner.  Then the calls become: s_libGL->glXChooseFBConfig()

Can you explain more about how this is temporary?  If we're about to delete this code, do we need to fix it?
Comment 12 Kenneth Russell 2009-12-22 13:40:53 PST
(In reply to comment #11)
> Looking again, I agree with Evan, this would probably all be cleaner held
> behind one static pointer instead of adding a whole bunch of static pointers
> (with unclear lifetimes) to the main class.  Putting them all in one place
> leaves them as m_ vars and makes their lifetime clear.  You could even make
> them a struct instead if that would be cleaner.  Then the calls become:
> s_libGL->glXChooseFBConfig()

OK, I'll make these changes and upload a new patch.

> Can you explain more about how this is temporary?  If we're about to delete
> this code, do we need to fix it?

We anticipate moving these calls out of the renderer process next quarter. In the interim we must fix the issue as the crashes are very noticeable and poor behavior.
Comment 13 Kenneth Russell 2009-12-22 14:19:00 PST
Created attachment 45403 [details]
Revised patch

Factored new statics into a separate GLConnection class, of which there is a static instance.
Comment 14 WebKit Review Bot 2009-12-22 14:23:17 PST
Attachment 45403 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
WebKit/chromium/src/GraphicsContext3D.cpp:201:  This { should be at the end of the previous line  [whitespace/braces] [4]
WebKit/chromium/src/GraphicsContext3D.cpp:212:  render_type is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebKit/chromium/src/GraphicsContext3D.cpp:212:  share_list is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
Total errors found: 3
Comment 15 Kenneth Russell 2009-12-22 14:30:50 PST
Created attachment 45405 [details]
Revised patch

Fixed errors reported by style-queue.
Comment 16 Eric Seidel (no email) 2009-12-22 14:35:08 PST
Comment on attachment 45405 [details]
Revised patch

You beat me!  Now my comments are probably mostly obsolete:

A few naming issues.  Some of which the stylebot caught.

We generally try to encouage full names like "display" instead of "dpy"

It's unclear to me if s_initializedGLEW is still needed or if a s_gl check would be enough now?

I think some folks use sizeof(libnames[0]) in this case:
 319         for (int i = 0; i < sizeof(libNames) / sizeof(const char*); i++)
Sadly we don't have a template function for that, ojan recently filed a bug about htat.

I think we should fix the style-bot errors, but otherwise this is ready to land.  I can't remember if you're a committer yet or not.
Comment 17 Kenneth Russell 2009-12-22 14:39:21 PST
(In reply to comment #16)
> (From update of attachment 45405 [details])
> You beat me!  Now my comments are probably mostly obsolete:
> 
> A few naming issues.  Some of which the stylebot caught.
> 
> We generally try to encouage full names like "display" instead of "dpy"

I just copied these names out of the GLX headers.

> It's unclear to me if s_initializedGLEW is still needed or if a s_gl check
> would be enough now?

s_initializedGLEW is still needed on all platforms.

> I think some folks use sizeof(libnames[0]) in this case:
>  319         for (int i = 0; i < sizeof(libNames) / sizeof(const char*); i++)
> Sadly we don't have a template function for that, ojan recently filed a bug
> about htat.

Thanks for the suggestion. I'll certainly keep this in mind and if you want me to change it here now I will.

> I think we should fix the style-bot errors, but otherwise this is ready to
> land.  I can't remember if you're a committer yet or not.

Style fixes done as you've seen. I'm not a committer yet.
Comment 18 Eric Seidel (no email) 2009-12-22 14:41:00 PST
Comment on attachment 45405 [details]
Revised patch

Looks plenty good enough for now.
Comment 19 WebKit Commit Bot 2009-12-22 14:48:18 PST
Comment on attachment 45405 [details]
Revised patch

Rejecting patch 45405 from commit-queue.

Failed to run "['WebKitTools/Scripts/run-webkit-tests', '--no-launch-safari', '--exit-after-n-failures=1', '--quiet']" exit_code: 1
Last 500 characters of output:
outTests
Testing 11820 test cases.
http/tests/loading/bad-server-subframe.html -> timed out
Sampling process 7092 for 10 seconds with 10 milliseconds of run time between samples
Sampling completed, processing symbols...
Sample analysis of process 7092 written to file /Users/eseidel/Library/Logs/DumpRenderTree/HangReport.txt

Exiting early after 1 failures. 8746 tests run.
210.59s total testing time

8745 test cases (99%) succeeded
1 test case (<1%) timed out
4 test cases (<1%) had stderr output

Full output: http://webkit-commit-queue.appspot.com/results/138507
Comment 20 Eric Seidel (no email) 2009-12-22 14:52:32 PST
Comment on attachment 45405 [details]
Revised patch

Your patch was another victim of bug 30392.
Comment 21 WebKit Commit Bot 2009-12-22 15:00:27 PST
Comment on attachment 45405 [details]
Revised patch

Clearing flags on attachment: 45405

Committed r52504: <http://trac.webkit.org/changeset/52504>
Comment 22 WebKit Commit Bot 2009-12-22 15:00:37 PST
All reviewed patches have been landed.  Closing bug.
Comment 23 Evan Martin 2009-12-23 04:22:50 PST
BTW, I'm surprised you need the libNames list; dlopen knows to search the library path if you give it a plain filename without slashes (which will also make this code work for people who have libGL in /usr/local/lib or whatever)
Comment 24 Evan Martin 2009-12-29 15:57:57 PST
For those who care, I filed
  https://bugs.webkit.org/show_bug.cgi?id=33032
about the dlopen thing