When a GObject DOM object is created we wrap the core DOM object (say, WebCore::Node) with a native GObject, add a manual C++ ref to the core object, and place the matching deref in the finalize method (note to self, this should probably be done in dispose). So far, this seems sane. Problem is, before returning the object we do the following: we place the object in a hash map, indexed by the core object, and if a wrapper for this core object is requested again we'll just return the corresponding GObject as-is. This entry in the hash map is, also, cleared on finalize. Since we do not add extra references in this process, you can get into a situation where you have two pointers to a GObject DOM binding that are actually the same object, with only one total reference; if you do g_object_unref on objects you don't need, tragedy will ensue when you try to use the second pointer (even if it's only to also unref it). The hash map concept seems very reasonable to me in general, so I think what we should do is to always add a new reference when we return an object, even if we are only getting it from the map. This way the rule "always unref DOM objects when you are done with them" will work. The only remaining problem would be the huge pain in the ass that this kind of API is for C programmers, of course...
(In reply to comment #0) > The hash map concept seems very reasonable to me in general, so I think what we should do is to always add a new reference when we return an object, even if we are only getting it from the map. This way the rule "always unref DOM objects when you are done with them" will work. the description of the problem is somewhat ambiguous, so here are some possible answers: if the intent is to create multiple gobjects which point to the same underlying webkit DOM object, this will be a complete disaster from a programming perspective by the time it gets up to high-level language bindings. many high-level applications rely on being able to do tests to see if a node is equal (isSameNode). removing such functionality can result in total project failure as, when using high-level languages, there is no other non-intrusive method for correctly identifying/comparing DOM objects. intrusive methods include adding unique IDs to DOM Nodes, using the Node "id" field which then of course interferes with other uses (such as from javascript being executed in parallel with high-level language DOM manipulation). surely the problem is solved by simply increasing the refcount when returning the corresponding GObject as-is? this seems to be the implied course of action recommended, by xan, and, if so, it is a good one.
Any updates on this issue? I know we've discussed it multiple times since the last comment, but I can't recall if there was any resolution.
(In reply to comment #2) > Any updates on this issue? I know we've discussed it multiple times since the last comment, but I can't recall if there was any resolution. We are basically stuck in finding a reasonable place to collect all the GObject wrappers created for the DOM objects in a page.
(In reply to comment #3) > (In reply to comment #2) > > Any updates on this issue? I know we've discussed it multiple times since the last comment, but I can't recall if there was any resolution. > > We are basically stuck in finding a reasonable place to collect all the GObject wrappers created for the DOM objects in a page. This of course assumes we want some sort of automatic collection, as opposed to the manual approach suggested in the first comment (which is still trivial to do but still makes the API cumbersome).
Warning: ugly solution follows. One approach would be to cache the DOM objects and then clean the cache on a timer. This relieves the user from having to worry about memory management (unless they wanted the objects to stick around) and it doesn't break the current API.
Created attachment 74295 [details] domcache.diff
Comment on attachment 74295 [details] domcache.diff View in context: https://bugs.webkit.org/attachment.cgi?id=74295&action=review Looks good, though I have a few nagging concerns (some of which I've outlined below). Another issue is what happens when a developer uses a DOM object to change the frame location. Isn't there the possibility that it will unref the DOM wrapper out from under them? > WebCore/bindings/gobject/DOMObjectCache.cpp:72 > + // Unrefing the objects removes them from the cache in their Should be "Unreffing" here. > WebCore/bindings/gobject/DOMObjectCache.cpp:77 > + DOMObjectCacheData* data = static_cast<DOMObjectCacheData*>(iter->second); I don't think this static_cast is necessary. > WebCore/bindings/gobject/DOMObjectCache.cpp:78 > + if (data && (!view || data->view == view)) Would ASSERT(data) be more appropriate, or are there cases where the data can, in fact, be NULL? > WebCore/bindings/gobject/DOMObjectCache.cpp:101 > + return g_object_ref(data->object); Won't this lead to leaks, since the cache only does one unref when cleared and DOMObjectCache::get can be called multiple times before that? > WebCore/bindings/gobject/DOMObjectCache.cpp:119 > + put(static_cast<void*>(objectHandle), wrapper); I don't think the cast is necessary to get void*. > WebCore/bindings/gobject/DOMObjectCache.h:22 > +#include <glib-object.h> Using void* instead of gpointer would prevent the need for this include, which would be a win for compile time. > WebKit/gtk/WebCoreSupport/FrameLoaderClientGtk.cpp:703 > - // this is only interesting once we provide an external API for the DOM > + if (!ready) > + DOMObjectCache::clearByView(getViewFromFrame(m_frame)); Doesn't this mean that when an iframe reloads, all cached DOM objects will be removed for the entire view? I think it would be better to track per-frame.
Created attachment 74436 [details] domcache.diff This should address the previous review.
Created attachment 74460 [details] domcache.diff
Created attachment 74462 [details] domcache.diff Make the refCount check more explicit.
Comment on attachment 74462 [details] domcache.diff View in context: https://bugs.webkit.org/attachment.cgi?id=74462&action=review > WebCore/bindings/gobject/DOMObjectCache.cpp:32 > + gint refCount; Kind of a nit, but I think I'd prefer this to be named something like timesReturned, since it isn't technically the reference count of the GObject. Also, this could be size_t. > WebCore/bindings/gobject/DOMObjectCache.cpp:82 > + Vector<DOMObjectCacheData*>::iterator last = toUnref.end(); > + for (Vector<DOMObjectCacheData*>::iterator it = toUnref.begin(); it != last; ++it) { > + DOMObjectCacheData* data = *it; > + while (data->refCount-- > 0) > + g_object_unref(data->object); > + } There's some chance here that the object might die before you finish unreffing it, right? Say the user has unreffed only some of the return values. :/ Since there doesn't appear to be a way to peak at the reference count, maybe you should install a weak reference here and stop unreffing when you detect death. Another approach would be to keep information about dead objects in the cache and clear them at some later point. Wow, this is hairy.
(In reply to comment #11) > Kind of a nit, but I think I'd prefer this to be named something like timesReturned, since it isn't technically the reference count of the GObject. Also, this could be size_t. Why size_t? > > > WebCore/bindings/gobject/DOMObjectCache.cpp:82 > > + Vector<DOMObjectCacheData*>::iterator last = toUnref.end(); > > + for (Vector<DOMObjectCacheData*>::iterator it = toUnref.begin(); it != last; ++it) { > > + DOMObjectCacheData* data = *it; > > + while (data->refCount-- > 0) > > + g_object_unref(data->object); > > + } > > There's some chance here that the object might die before you finish unreffing it, right? Say the user has unreffed only some of the return values. :/ Since there doesn't appear to be a way to peak at the reference count, maybe you should install a weak reference here and stop unreffing when you detect death. Another approach would be to keep information about dead objects in the cache and clear them at some later point. Wow, this is hairy. I think adding a weak ref at the beginning of the process and stopping if it's triggered makes sense.
Created attachment 74490 [details] domcache.diff Added the weak_ref trick and renamed refCount to timesReturned. I changed timesReturned to a 'guint' since I'm unsure why it should be a size_t.
Comment on attachment 74490 [details] domcache.diff View in context: https://bugs.webkit.org/attachment.cgi?id=74490&action=review Very nice! This was tricky. Only change I think you should make (other than the two ultra nits below) is to use Frame* instead of void* for all things that deal with frames. > WebCore/bindings/gobject/DOMObjectCache.cpp:103 > + ASSERT(data->timesReturned == 0); I think this has to be ASSERT(!data->timesReturned); to make the style gods happy. > WebKit/gtk/webkit/webkithittestresult.cpp:81 > + WebKitHitTestResultPrivate* priv = WEBKIT_HIT_TEST_RESULT(object)->priv; > + > + g_object_unref(priv->innerNode); > + > + G_OBJECT_CLASS(webkit_hit_test_result_parent_class)->dispose(object); > +} I think this should be two lines to preserve vertical space: g_object_unref(WEBKIT_HIT_TEST_RESULT(object)->priv->innerNode); G_OBJECT_CLASS(webkit_hit_test_result_parent_class)->dispose(object);
Comment on attachment 74490 [details] domcache.diff Landed as r72492 with some minor adjustments.
Closing.