Bug 128175

Summary: REGRESSION(r162947): Document::topDocument() returns an incorrect reference for cached Documents
Product: WebKit Reporter: Zan Dobersek <zan>
Component: AccessibilityAssignee: Zan Dobersek <zan>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, beidson, cfleizach, cgarcia, commit-queue, esprehn+autocc, kangil.han, kling, koivisto, mario, mpakulavelrutka, t.tom, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Linux   
Attachments:
Description Flags
Patch
none
Backtrace to ASSERT(&topDocument == this || !m_axObjectCache)
none
Patch koivisto: review+

Description Zan Dobersek 2014-02-04 05:00:20 PST
[GTK] CachedFrame::destroy() should clear the Document's AXObjectCache before destroying CachedFrame children
Comment 1 Zan Dobersek 2014-02-04 05:05:11 PST
Created attachment 223099 [details]
Patch
Comment 2 Mario Sanchez Prada 2014-02-04 06:04:43 PST
Comment on attachment 223099 [details]
Patch

I'm almost ok with this patch, but I just realized that this might be a problem if CachedFrame is not the main frame, since clearAXObjectCache() is meant to be called only over the top document (as returned by Document::topDocument().

Thus, could you just update this patch to make that call conditioned to this->isMainFrame() (or to m_document == &m_document->topDocument()) and check whether that fixes the crash equally well? If that's the case, I'll be happy to r+ the patch
Comment 3 Zan Dobersek 2014-02-04 06:30:08 PST
(In reply to comment #2)
> (From update of attachment 223099 [details])
> I'm almost ok with this patch, but I just realized that this might be a problem if CachedFrame is not the main frame, since clearAXObjectCache() is meant to be called only over the top document (as returned by Document::topDocument().
> 
> Thus, could you just update this patch to make that call conditioned to this->isMainFrame() (or to m_document == &m_document->topDocument()) and check whether that fixes the crash equally well? If that's the case, I'll be happy to r+ the patch

I understand what you're aiming at, after all there's an ASSERT for just that condition in Document::clearAXObjectCache().

Calling that method only for the top document doesn't fix the crashes, though. This outlines the bigger problem of non-top documents having their own AXObjectCaches created.
Comment 4 Mario Sanchez Prada 2014-02-04 06:44:15 PST
(In reply to comment #3)
> [...]
> Calling that method only for the top document doesn't fix the crashes, though. 
> This outlines the bigger problem of non-top documents having their own 
> AXObjectCaches created.

Yes, that seems to be another (and bigger) issue. However, I assume that was written that way for some reason, so before letting this in I'd rather have Chris's opinion on this (adding him to CC).
Comment 5 Zan Dobersek 2014-02-04 06:50:47 PST
Comment on attachment 223099 [details]
Patch

Removing the review flag since this patch is wrong.
Comment 6 Zan Dobersek 2014-02-04 07:16:02 PST
This regressed with r162947 which changed the way the top document is determined.
http://trac.webkit.org/changeset/162947
Comment 7 chris fleizach 2014-02-04 09:51:16 PST
(In reply to comment #3)
> (In reply to comment #2)
> > (From update of attachment 223099 [details] [details])
> > I'm almost ok with this patch, but I just realized that this might be a problem if CachedFrame is not the main frame, since clearAXObjectCache() is meant to be called only over the top document (as returned by Document::topDocument().
> > 
> > Thus, could you just update this patch to make that call conditioned to this->isMainFrame() (or to m_document == &m_document->topDocument()) and check whether that fixes the crash equally well? If that's the case, I'll be happy to r+ the patch
> 
> I understand what you're aiming at, after all there's an ASSERT for just that condition in Document::clearAXObjectCache().
> 
> Calling that method only for the top document doesn't fix the crashes, though. This outlines the bigger problem of non-top documents having their own AXObjectCaches created.

As far as I know, non top documents should not have their own ax object cache. 
Why don't existing Frame destruction methods take care of this? Is it because topDocument returns a different value than before?

Can you post a backtrace for one of these crashes?

I'm worried that this might throw out all objects if a subframe is removed
Comment 8 Zan Dobersek 2014-02-04 13:57:50 PST
Created attachment 223158 [details]
Backtrace to ASSERT(&topDocument == this || !m_axObjectCache)

This issue is being expressed in debug builds with an assertion in Document::axObjectCache().
http://trac.webkit.org/browser/trunk/Source/WebCore/dom/Document.cpp#L2183

This is happening due to the cached Document still being connected to the Frame object in which the currently-used Document resides. So when the cached Document's axObjectCache() is retrieved, the top Document is determined to be the current one.

The backtrace shows another, different problem -- the AXObjectCache::notificationPostTimerFired is called after the Document to which AXObjectCache belongs is cached. Is that acceptable? Should the method return early if the document is in cache?

This is all nicely exhibited by the LayoutTests/fast/history/timed-refresh-in-cached-frame.html layout test. When the test is done, the 'PASSED' URL is loaded, producing a new document. The notification timer is fired after that for the AXObjectCache that belongs to the now-cached Document that presented timed-refresh-in-cached-frame.html, and the topDocument() for the latter Document returns the current Document of the main frame -- the Document presenting the 'PASSED' content.
http://trac.webkit.org/browser/trunk/LayoutTests/fast/history/timed-refresh-in-cached-frame.html
Comment 9 Zan Dobersek 2014-02-05 08:09:39 PST
The issue here is that the timed-refresh-in-cached-frame.html-containing Document gets cached in FrameLoader::commitProvisionalLoad()[1].
If it weren't cached, in Frame::setDocument()[2] prepareForDestruction() would be called on that Document object, which would disconnect it from the frame and subsequently the topDocument() method would return a reference to that object itself.

For instance, adjusting WebKitTestRunner to use the document viewer cache model bypasses the problem as the page cache capacity is dropped to 0 and no caching is done.[3]

CC-ing Antti who authored r162947.

[1] http://trac.webkit.org/browser/trunk/Source/WebCore/loader/FrameLoader.cpp#L1774
[2] http://trac.webkit.org/browser/trunk/Source/WebCore/page/Frame.cpp#L291
[3] http://trac.webkit.org/browser/trunk/Source/WebCore/history/PageCache.cpp#L375
Comment 10 chris fleizach 2014-02-05 09:06:02 PST
(In reply to comment #8)
> Created an attachment (id=223158) [details]
> Backtrace to ASSERT(&topDocument == this || !m_axObjectCache)
> 
> This issue is being expressed in debug builds with an assertion in Document::axObjectCache().
> http://trac.webkit.org/browser/trunk/Source/WebCore/dom/Document.cpp#L2183
> 
> This is happening due to the cached Document still being connected to the Frame object in which the currently-used Document resides. So when the cached Document's axObjectCache() is retrieved, the top Document is determined to be the current one.
> 
> The backtrace shows another, different problem -- the AXObjectCache::notificationPostTimerFired is called after the Document to which AXObjectCache belongs is cached. Is that acceptable? Should the method return early if the document is in cache?
> 
> This is all nicely exhibited by the LayoutTests/fast/history/timed-refresh-in-cached-frame.html layout test. When the test is done, the 'PASSED' URL is loaded, producing a new document. The notification timer is fired after that for the AXObjectCache that belongs to the now-cached Document that presented timed-refresh-in-cached-frame.html, and the topDocument() for the latter Document returns the current Document of the main frame -- the Document presenting the 'PASSED' content.
> http://trac.webkit.org/browser/trunk/LayoutTests/fast/history/timed-refresh-in-cached-frame.html

So when a Document is cached, are all the render objects and nodes destroyed? Or are they kept around. Not that familiar with Document caching yet, so I'm not fully sure of the implications associated with accessibility

Regarding notifications, we should probably return early if the document has moved to the cache. It seems that at that point it is no longer visible or usable to the user, so sending further notifications would not be useful
Comment 11 Radar WebKit Bug Importer 2014-02-07 09:21:38 PST
<rdar://problem/16012340>
Comment 12 Zan Dobersek 2014-02-10 08:47:31 PST
I can confirm that reversing Document::topDocument() back to traversing up the chain of owner elements (i.e. reversing the changes made in r162947) completely fixes the problem.

Adding an early return in AXObjectCache::notificationPostTimerFired in case of document residing in the page cache only delays the problem until an assertion in the RenderObject destructor due to the m_hasAXObject flag equaling to true, indicating that AccessibilityRenderObjects are not being detached.
http://trac.webkit.org/browser/trunk/Source/WebCore/rendering/RenderObject.cpp#L122
Comment 13 chris fleizach 2014-02-10 08:49:07 PST
(In reply to comment #12)
> I can confirm that reversing Document::topDocument() back to traversing up the chain of owner elements (i.e. reversing the changes made in r162947) completely fixes the problem.
> 
> Adding an early return in AXObjectCache::notificationPostTimerFired in case of document residing in the page cache only delays the problem until an assertion in the RenderObject destructor due to the m_hasAXObject flag equaling to true, indicating that AccessibilityRenderObjects are not being detached.
> http://trac.webkit.org/browser/trunk/Source/WebCore/rendering/RenderObject.cpp#L122

Maybe we should have an AX topDocument() method that does what the old behavior does
Comment 14 Zan Dobersek 2014-02-12 03:53:06 PST
(In reply to comment #13)
> (In reply to comment #12)
> > I can confirm that reversing Document::topDocument() back to traversing up the chain of owner elements (i.e. reversing the changes made in r162947) completely fixes the problem.
> > 
> > Adding an early return in AXObjectCache::notificationPostTimerFired in case of document residing in the page cache only delays the problem until an assertion in the RenderObject destructor due to the m_hasAXObject flag equaling to true, indicating that AccessibilityRenderObjects are not being detached.
> > http://trac.webkit.org/browser/trunk/Source/WebCore/rendering/RenderObject.cpp#L122
> 
> Maybe we should have an AX topDocument() method that does what the old behavior does

I tried this approach, but it doesn't help.
Comment 15 Zan Dobersek 2014-02-12 04:32:04 PST
This only seems to be an issue on GTK and EFL WK2 ports (but not WK1) -- odd.
Comment 16 Alexey Proskuryakov 2014-02-12 11:00:36 PST
> So when a Document is cached, are all the render objects and nodes destroyed? 

There were some changes in this area, so I think that non-rendering code should be prepared for either behavior.
Comment 17 Andreas Kling 2014-02-12 11:06:45 PST
(In reply to comment #16)
> > So when a Document is cached, are all the render objects and nodes destroyed? 
> 
> There were some changes in this area, so I think that non-rendering code should be prepared for either behavior.

At this time, Documents in Page Cache will retain their entire render trees, though it's expected that you don't interact with the renderers
Comment 18 Zan Dobersek 2014-02-12 12:46:04 PST
Falling back to traversing the ownership chain when the Document is in page cache or is having its render tree destroyed fixes all problems for the GTK port, and very likely the EFL port as well.

The first case led to assertions in Document::axObjectCache() (attachment #223158 [details]) and in the second case the AXObjectCache objects were not being destroyed, leaving the AccessibilityRenderObjects attached and leading to assertions in RenderObject destructors (mentioned in comment #12 and detailed in bug #128259).
Comment 19 Zan Dobersek 2014-02-12 13:36:46 PST
Created attachment 224001 [details]
Patch
Comment 20 chris fleizach 2014-02-14 11:35:00 PST
(In reply to comment #19)
> Created an attachment (id=224001) [details]
> Patch

This is also causing problems on Mac (like https://bugs.webkit.org/show_bug.cgi?id=128738)

Applying this patch seems to fix it
Comment 21 Zan Dobersek 2014-02-16 01:37:47 PST
*** Bug 128738 has been marked as a duplicate of this bug. ***
Comment 22 Zan Dobersek 2014-02-16 01:38:35 PST
*** Bug 128259 has been marked as a duplicate of this bug. ***
Comment 23 Zan Dobersek 2014-02-18 12:13:18 PST
Given that this now affects all ports, could someone please review the patch?

The only concern I'm having is whether the fast path is chosen only for the appropriate cases of state, current and future. Is this protected well enough or should Document::topDocument() behavior change back to traversing the owner elements?
Comment 24 Antti Koivisto 2014-02-21 06:01:31 PST
Comment on attachment 224001 [details]
Patch

This looks like papering over the problem. At least in Mac case (bug 128738) the underlying issue seems to be calling topDocument in the middle of Document destruction. Calling into half-deleted objects is generally bad so the question should be how can we avoid that.
Comment 25 Zan Dobersek 2014-02-21 13:50:43 PST
(In reply to comment #24)
> (From update of attachment 224001 [details])
> This looks like papering over the problem. At least in Mac case (bug 128738) the underlying issue seems to be calling topDocument in the middle of Document destruction. Calling into half-deleted objects is generally bad so the question should be how can we avoid that.

This problem, in this specific case, only became a real-world issue after Document::topDocument() was changed in a way that makes clearing the AX cache at the top of the frame tree prone to crashes.

Can we at least roll back the Document::topDocument() behavior to pre-r162947 to avoid the still-present crashes?
Comment 26 Antti Koivisto 2014-02-26 09:25:05 PST
Comment on attachment 224001 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=224001&action=review

Ok, lets land this while we figure out a real fix.

> Source/WebCore/dom/Document.cpp:4251
> +    if (!m_inPageCache && !m_renderTreeBeingDestroyed) {

Please add a FIXME here indicating that this is a workaround for AX teardown problem.
Comment 27 Zan Dobersek 2014-02-26 09:49:31 PST
Committed r164718: <http://trac.webkit.org/changeset/164718>