Bug 20935 - refptr alone is insufficient to ensure references to objects stay around long enough
Summary: refptr alone is insufficient to ensure references to objects stay around long...
Status: RESOLVED INVALID
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 20403
  Show dependency treegraph
 
Reported: 2008-09-19 04:12 PDT by Luke Kenneth Casson Leighton
Modified: 2008-09-21 00:39 PDT (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Luke Kenneth Casson Leighton 2008-09-19 04:12:43 PDT
this is a rather complex design flaw in webkit - please bear with me while
i endeavour to explain it correctly.

the core of the issue is the assumption that "refptr" is sufficient alone
to ensure that objects remain "in scope", when there are interdependencies between, for example, Document and Frame.

here's some pseudo-code which illustrates what's really required to solve this issue - but please bear in mind that this is a significant manual "hack":

WebKitWebFrame * webkit_web_frame_init()
{
    Frame *frame = createFrame(); /* refptr on frame goes up */
    Document *doc = frame->document();
    /* but document _controls_ frame, so refptr on doc must go up too */
    doc->addRef();
}

WebKitWebFrame * webkit_web_frame_finalize()
{
   Document *doc = m_frame->document();
   doc->decRef();
}

if you don't do this, then when Document's refcount goes to zero (as is the case in e.g. bug #20403) the destructor gets called... but... there's... still a Frame which is using that Document!  but... you didn't _tell_ it that, because you're only keeping a pointer to the Frame in WebKitWebFrame, not the Document...

right now, bug #20403 is indicative of _part_ of the problem - but the
actual problem may go a lot deeper, requiring quite a thorough code review,
where this one instance - bug #20403 - _may_ just be the start.

any object which can be "bound" to - by a JS binding, ObjectiveC
binding or Glib binding, increasing the refcount by one on that
object - e.g. a Document - should also have the refcount increased
on any other objects which COULD possibly be "bound" to by other
bindings.

so, if Document* contains a Frame*, and a Page*, and anything else:
i see FrameView* as well, m_StyleSheets - anything.  if someone gets hold of
one of those objects, it AUTOMATICALLY means that the PARENT object as well
(in this case the Document*) MUST have its refcount increased as
well.

at the moment, that's not happening.

without looking too closely, i suspect that there will be other instances
where this problem occurs.

exactly how this should be solved isn't clear: the problem seems to be
suggesting that RefPtrs should be made a virtual base class, with
addRef and decRef being overloaded on a per-class basis!!

e.g.:

class DocumentRef (RefPtr):
{
     addRef()
     {
        RefPtr::addRef();
        if (m_frame) m_frame->addRef();
        .... stylesheets, everything.
     }
     decRef()
     {
         RefPtr::decRef();
         if (m_frame) m_frame->decRef();
         .... stylesheets etc.
     }
}

with what happens when m_frame is NULL at one point but then gets _added_...
dang :)

that would mean... that would mean that... urgggh.  when m_frame is added,
you would need to _copy_ the refcount of the container document, and when
(if) it's ever removed (set to NULL), subtract it.

just taking a look at Document: _thank goodness_ there's no "setFrame()"
function!!!

but there _is_ a "clearFramePointer()" function, which means that the
following will need to be done, to maintain refcount integrity:

Document::clearFramePointer()
{
   if (!m_frame)
      return;
   m_frame->setRefPtrCount(m_frame->getRefPtrCount() - this->getRefPtrCount());
   m_frame = NULL;
}

to which i can honestly honestly say: "urghh" :)

if you're _incredibly_ incredibly lucky, Document and Frame _may_ be
the only two webkit objects which suffer from this problem - it _may_
be the case that this is a unique one-off problem, it _may_ be a special-
case because of how and where Document::detach() gets called,
resulting in m_frame getting set to NULL _without_ ... oh look!

comments in here:

"WebKit/wx/WebKitSupport/FrameLoaderClientWx.cpp"
FrameLoaderClientWx::createFrame

    FIXME: Temporarily disabling code for loading subframes. While most 
    (i)frames load and are destroyed properly, the iframe created by
    google.com in its new homepage does not get destroyed when 
    document()->detach() is called, as other (i)frames do. It is destroyed on 
    app shutdown, but until that point, this 'in limbo' frame will do things
    like steal keyboard focus and crash when clicked on. (On some platforms,
    it is actually a visible object, even though it's not in a valid state.)

... does that sound familiar?


oh look!

in "WebCore/page/Frame.cpp" - Frame::setView(FrameView* view):

        // FIXME: We don't call willRemove here. Why is that OK?
        d->m_doc->detach();

... does this ring a bell as well?

so the link between Frame's refcount and Document's refcount being
inconsistent seems to have resulted in all sorts of FIXMEs that
people are left wondering why there are problems.
Comment 1 Mark Rowe (bdash) 2008-09-20 22:07:15 PDT
The lifetime and ownership of Document and Frame are not as clear as we would like, but their approach to refcounting is not incorrect.  In particular, your suggestion that their ref and deref members should increment/decrement the refcounts of their members does not mesh at all with how refcounting works.

If an object intends wants to keep alive an object that it has a reference to (eg, keep a strong reference), it will use a RefPtr to keep track of that member.  If it does not wish to keep it alive, or knows the object to which it refers *must* outlive it, it will use a bare pointer (eg, a weak reference).

There are clearly some ownership and lifetime issues in the GTK code that are leading to the issues mentioned in bug 20403, but RefCounted being insufficient isn't the problem.