Summary: | PlatformDisplay crash when calling egl_terminate | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Michael Catanzaro <mcatanzaro> | ||||||
Component: | WebKitGTK | Assignee: | Nobody <webkit-unassigned> | ||||||
Status: | RESOLVED FIXED | ||||||||
Severity: | Normal | CC: | bugs-noreply, cgarcia, hur.ims, mcatanzaro, zan | ||||||
Priority: | P2 | ||||||||
Version: | Other | ||||||||
Hardware: | PC | ||||||||
OS: | Linux | ||||||||
See Also: |
https://bugs.webkit.org/show_bug.cgi?id=145832 https://bugs.webkit.org/show_bug.cgi?id=174789 https://bugs.webkit.org/show_bug.cgi?id=176490 https://bugs.webkit.org/show_bug.cgi?id=255678 |
||||||||
Bug Depends on: | |||||||||
Bug Blocks: | 81456 | ||||||||
Attachments: |
|
Description
Michael Catanzaro
2016-05-21 09:59:16 PDT
Most likely we need to add some PlatformDisplay::Destroy function to be called explicitly from... somewhere... that's not an exit handler, and switch to using NeverDestroyed. (In reply to comment #1) > Most likely we need to add some PlatformDisplay::Destroy function to be > called explicitly from... somewhere... that's not an exit handler, and > switch to using NeverDestroyed. To properly fix this we need to figure out why it's crashing. Workarounds will just hide the actual issue. This is my guess looking at mesa egl code. _eglGlobal registers two at exist callbacks one to finish the display and another one to unload drivers, the one to finish the display happens first. When our destructor is called the _eglFiniDisplay callback has already been called, so we have a valid pointer for an already finished display. Then eglTerminate tries to find the display in the global display list, but fails and for some reason it crashes when trying to return an error. The error doesn't really depend on the display, it uses thread specific data I think. But _eglGetCurrentThread should never return null, so I don't understand why it's crashing. If we're committed to destructing the PlatformDisplay object, then using atexit() manually might work better than a static std::unique_ptr<> object. The PlatformDisplayWayland constructor would do the EGL initialization, setting up exit-time callbacks in Mesa. We should register the destruction of this object with atexit() afterwards, since the callbacks passed to atexit() are called in reverse order. This showcases the problem: https://www.ibm.com/developerworks/community/blogs/5894415f-be62-4bc0-81c5-3956e82276f3/entry/be_careful_when_using_the_atexit_routine_with_static_objects?lang=en Created attachment 280228 [details]
Patch
Could someone try this to confirm it works?
Comment on attachment 280228 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=280228&action=review I confirm it fixes at least the crash running test-ephy-bookmarks. Awesome, thanks. > Source/WebCore/platform/graphics/PlatformDisplay.cpp:171 > + std::atexit([] { PlatformDisplay::sharedDisplay().terminateEGLDisplay(); }); This is harmless, but not really great, since it means terminateEGLDisplay will be redundantly called on the sharedDisplay once for each time initializeEGLDisplay is called on ANY PlatformDisplay object. Perhaps it would it make sense to check if this == PlatformDisplay::sharedDisplay before registering the atexit handler here. Alternatively, and probably better, move this to inside the lambda in PlatformDisplay::sharedDisplay, where it will be called at most once. Or, perhaps the best option is to call terminateEGLDisplay on the current PlatformDisplay (i.e. |this| rather than the sharedDisplay) and remove the call to terminateEGLDisplay from the destructor. Comment on attachment 280228 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=280228&action=review >> Source/WebCore/platform/graphics/PlatformDisplay.cpp:171 >> + std::atexit([] { PlatformDisplay::sharedDisplay().terminateEGLDisplay(); }); > > This is harmless, but not really great, since it means terminateEGLDisplay will be redundantly called on the sharedDisplay once for each time initializeEGLDisplay is called on ANY PlatformDisplay object. Perhaps it would it make sense to check if this == PlatformDisplay::sharedDisplay before registering the atexit handler here. Alternatively, and probably better, move this to inside the lambda in PlatformDisplay::sharedDisplay, where it will be called at most once. Or, perhaps the best option is to call terminateEGLDisplay on the current PlatformDisplay (i.e. |this| rather than the sharedDisplay) and remove the call to terminateEGLDisplay from the destructor. Only PlatformDisplay::sharedDisplay() should ever live during the program lifetime, so this method would only be called only once as well. Asserting !m_eglDisplayInitialized at the beginning here would be helpful in that regard. (In reply to comment #7) > Only PlatformDisplay::sharedDisplay() should ever live during the program > lifetime, so this method would only be called only once as well. Asserting > !m_eglDisplayInitialized at the beginning here would be helpful in that > regard. Are you saying that this class is intended to be a singleton? If so, then it should use our normal singleton naming convention (PlatformDisplay::singleton rather than PlatformDisplay::sharedDisplay), and should have nothing in its destructor (as it would never make sense to call egl_terminate there if it's always scheduled by atexit), and terminateEGLDisplay should be called on the this pointer rather than sharedDisplay for readability. If it's not supposed to be a singleton, then the assert you suggested would be inappropriate. The constructor is private (well, protected) so yes, it's indeed a singleton, but we really want to close displays, when needed, and that's why it's not NeverDestroyed. I agree that if we terminate in the atexit, it doesn't make sense to terminate again the destructor, since the atexit should always happen first. We can't capture this on a lambda for a pointer function. Created attachment 280238 [details]
Updated patch
Comment on attachment 280238 [details] Updated patch View in context: https://bugs.webkit.org/attachment.cgi?id=280238&action=review OK. > Source/WebCore/platform/graphics/PlatformDisplay.cpp:169 > + std::atexit([] { PlatformDisplay::sharedDisplay().terminateEGLDisplay(); }); Since this is not a static function, I think it make more sense to not use PlatformDisplay::sharedDisplay() here. You say you cannot capture |this| here, but I don't understand why not; did you try writing it this way: std::atexit([this] { this->terminateEGLDisplay(); }); Comment on attachment 280238 [details] Updated patch View in context: https://bugs.webkit.org/attachment.cgi?id=280238&action=review >> Source/WebCore/platform/graphics/PlatformDisplay.cpp:169 >> + std::atexit([] { PlatformDisplay::sharedDisplay().terminateEGLDisplay(); }); > > Since this is not a static function, I think it make more sense to not use PlatformDisplay::sharedDisplay() here. You say you cannot capture |this| here, but I don't understand why not; did you try writing it this way: > > std::atexit([this] { this->terminateEGLDisplay(); }); std::atexit() has to be passed a function pointer to a static function, since it only stores the pointer, not the whole function object. Only capture-less lambdas can be passed as such arguments, since they have the conversion operator that converts to a function pointer. http://en.cppreference.com/w/cpp/language/lambda -- see ClosureType::operator ret(*)(params)() In such cases compiler simply generates a bare struct with the call operator and this conversion operator, since it has no state to manage because nothing is captured. You'd basically be able to call the struct's call operator as if it was any other static non-method function. If anything is captured, then lambda implies an actual object of that helper struct type, generating all the relevant constructors and destructors, plus you have to address the lifetime management (limited to scope or storable to std::function<> for delayed use). Committed r201595: <http://trac.webkit.org/changeset/201595> Looks like this has somehow returned in bug #174789. |