Summary: | [GTK] WebCore::TransformationMatrix::multiply segfaults when loading last.fm | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Joanmarie Diggs <jdiggs> | ||||||||
Component: | Layout and Rendering | Assignee: | Zan Dobersek <zan> | ||||||||
Status: | RESOLVED WONTFIX | ||||||||||
Severity: | Normal | CC: | arjunak234, benjamin, bugzilla, cgarcia, mcatanzaro, rdieter, zan | ||||||||
Priority: | P2 | ||||||||||
Version: | 528+ (Nightly build) | ||||||||||
Hardware: | Unspecified | ||||||||||
OS: | Unspecified | ||||||||||
See Also: |
https://bugzilla.redhat.com/show_bug.cgi?id=1200217 https://bugzilla.redhat.com/show_bug.cgi?id=1208989 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65690 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65715 |
||||||||||
Attachments: |
|
Description
Joanmarie Diggs
2015-03-05 03:07:08 PST
Turns out it is simple to get this crash. Load this: <html> <body> <div style="-webkit-transform: translate3d(0, 0, 0);">foo</div> </body> </html> Reliably crashes for me. Both machines are F22, MiniBrowser with current master, not using the WebKitGtk jhbuild deps at the moment. Created attachment 247950 [details]
Patch
Comment on attachment 247950 [details]
Patch
Hrm. Why is LayerTreeHostGtk::compositeLayersToContext being called before the native surface is set? That seems really suspicious.
(In reply to comment #3) > Comment on attachment 247950 [details] > Patch > > Hrm. Why is LayerTreeHostGtk::compositeLayersToContext being called before > the native surface is set? That seems really suspicious. Because the native handler ID is now set when the web view is realized, since r177075. Comment on attachment 247950 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=247950&action=review > Source/WebKit2/WebProcess/WebPage/gtk/LayerTreeHostGtk.cpp:-130 > - // The creation of the TextureMapper needs an active OpenGL context. > - if (!makeContextCurrent()) > - return; I think failing to call makeContextCurrent here is definitely a problem. If the OpenGL context has changed the commands will be issued to the incorrect context. (In reply to comment #5) > Comment on attachment 247950 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=247950&action=review > > > Source/WebKit2/WebProcess/WebPage/gtk/LayerTreeHostGtk.cpp:-130 > > - // The creation of the TextureMapper needs an active OpenGL context. > > - if (!makeContextCurrent()) > > - return; > > I think failing to call makeContextCurrent here is definitely a problem. If > the OpenGL context has changed the commands will be issued to the incorrect > context. I assumed by the comment that it was there only to create the TextureMapper. Comment on attachment 247950 [details] Patch I'm not sure this patch fixes this issue, because the bts in the debug bot are a bit different, so I've opened a new bug report. See https://bugs.webkit.org/show_bug.cgi?id=142345 Created attachment 247960 [details]
backtrace from debug build
Hope this helps.
FWIW, I just downgraded a system to F21 (current stable) and I cannot reproduce this crash on the downgraded machine. I think this crash happens 100% of the time visiting youtube.com. (In reply to comment #10) > I think this crash happens 100% of the time visiting youtube.com. but only in Fedora unstable, right? (In reply to comment #11) > but only in Fedora unstable, right? Joanie says yes, and that's what I was using when I hit the crash on youtube... so probably, yes. :( (In reply to comment #11) > (In reply to comment #10) > > I think this crash happens 100% of the time visiting youtube.com. > > but only in Fedora unstable, right? Yeah, I don't see it in F21. And presumably it's only in the 64-bit F22 -- haven't tried any other 64-bit unstable distro releases -- as my quick-and-dirty hackaround so that I could work on bug 142309 was this: --- a/Source/WebCore/platform/graphics/transforms/TransformationMatrix.cpp +++ b/Source/WebCore/platform/graphics/transforms/TransformationMatrix.cpp @@ -1241,7 +1241,7 @@ TransformationMatrix& TransformationMatrix::multiply(const TransformationMatrix& } #undef MATRIX_MULTIPLY_ONE_LINE -#elif defined(TRANSFORMATION_MATRIX_USE_X86_64_SSE2) +#elif defined(TRANSFORMATION_MATRIX_USE_X86_64_SSE2) && !PLATFORM(GTK) // x86_64 has 16 XMM registers which is enough to do the multiplication fully in registers. __m128d matrixBlockA = _mm_load_pd(&(m_matrix[0][0])); __m128d matrixBlockC = _mm_load_pd(&(m_matrix[1][0])); (And yes, with that in place youtube.com works just fine on F22.) OK thanks for the workaround, we will "gladly" carry that as long as necessary to make youtube work. As an experiment, on my F21/stable machine, I upgraded gcc to the F22 packages leaving everything else alone. I now see this crash on that machine. From yum history: Packages Altered: Updated cpp-4.9.2-6.fc21.x86_64 @updates Update 5.0.0-0.17.fc22.x86_64 @fedora/22 Updated gcc-4.9.2-6.fc21.x86_64 @updates Update 5.0.0-0.17.fc22.x86_64 @fedora/22 Updated gcc-c++-4.9.2-6.fc21.x86_64 @updates Update 5.0.0-0.17.fc22.x86_64 @fedora/22 That's all it took. HTH..... Well, I have no clue what's wrong, or whether it's our code or GCC that is broken, but this is by far our biggest problem right now ("I think this crash happens 100% of the time visiting youtube.com"), so at least let's commit a workaround ASAP. Any objections to conditionally compiling the x86_64 matrix multiplication fastpath only when using GCC < 5.0? It will obviously make things slower, but that beats a crash. Hi Ben, CCing you because this is your code (from r138866), so fates permitting, maybe you have some idea what might be wrong. None of us understand it. :) The best backtrace we have is https://bugzilla.redhat.com/attachment.cgi?id=999725 which isn't great but does show the assembler where the crash occurs. Does the crash occur in debug/non-optimized builds? Given that the problem appears to be due to changes in GCC, it'd be nice to report this to the project's Bugzilla or maybe find the corresponding bug, but I don't wish gathering all the necessary information upon anyone ... (In reply to comment #17) > Does the crash occur in debug/non-optimized builds? See comment #8. Um, the backtrace in comment #8 is better :p The reason behind the crash is that Matrix4 is not aligned to 16 bytes anymore, despite the (aligned (16)) attribute, which apparently GCC 5 doesn't respect anymore. The alignment of 16 bytes is mandatory for data processed with SSE instructions. Created attachment 250129 [details]
WIP
One possible solution, using the alignas(16) specifier.
Nice find. The significant disadvantage of your patch is that now you have to remember to use alignas each time you declare a matrix. Since I don't see any mention of this in the GCC 5 release notes, I presume the change is unintentional. I've filed a GCC bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1208989 Should be fixed in GCC; will reopen if we run into problems with the updated version. |