Bug 8850

Summary: cocoa bindings - crash when DOM nodes are deleted while NSTreeController+NSTableView are bound
Product: WebKit Reporter: James G. Speth <speth>
Component: WebKit APIAssignee: Nobody <webkit-unassigned>
Status: RESOLVED WONTFIX    
Severity: Normal CC: andersca, darin, emacemac7
Priority: P2    
Version: 420+   
Hardware: Mac   
OS: OS X 10.4   
Attachments:
Description Flags
test html and nib files
none
patch for review
darin: review-
unfinished patch only addressing the event listening
none
patch for review
darin: review-
test case for this bug and minor change to DumpRenderTree to support it
darin: review-
patch implementing comment #11 darin: review-

Description James G. Speth 2006-05-11 09:04:10 PDT
I have a fix for a crash when the DOM structure is manipulated while using Cocoa bindings.  If there is an NSTreeController bound to the DOM and an NSTableView bound to the controller, it will crash if the nodes currently selected in the table view are removed from the DOM.

The NSTreeController needs to be made aware of the DOM changes through the appropriate KVO mechanisms.  The patch I'm adding does this by listening for DOM mutation events, and using them to trigger KVO change notifications for the childNodes attribute of the deleted node's parent.

The test I'm attaching is a manual one using the pyobjc test plugin I've submitted in bug 8843.  That system is still pretty rough, but does work to illustrate this crash (and fix).
Comment 1 James G. Speth 2006-05-11 10:48:32 PDT
Created attachment 8242 [details]
test html and nib files

see bug 8843 for the test tool to run this
Comment 2 James G. Speth 2006-05-11 11:52:33 PDT
Created attachment 8247 [details]
patch for review
Comment 3 Timothy Hatcher 2006-05-11 20:17:32 PDT
This patch looks good. However I think we only want to have the WebView listening for DOM mutation events if it is being observed. Is this possible? If not we will need to make sure this is not have a performance impact before landing.
Comment 4 Darin Adler 2006-05-11 21:40:28 PDT
Comment on attachment 8247 [details]
patch for review

The mutableCopyWithZone: thing confuses me. These are not copyable objects. Why is NSTreeController calling mutableCopy on an object that doesn't implement the NSMutableCopying protocol?

Are you sure that indexOfObjectIdenticalTo: is the only other array method we need? The reason I ask is that if we instead subclassed from NSArray instead of NSObject, we'd get *all* the methods from the NSExtendedArray category for free. And I think it will be binary compatible since there are no fields in NSArray. I think that's a better solution.

Ironically, this then means that we'll be implementing the NSCopying and NSMutableCopying protocols. But I believe NSArray will provide the copy implementations which will actually create new arrays. So subclassing from NSArray should fix both of these problems.

I don't see any code to remove the event listener. I suspect that by adding the WebView itself as a listener to the DOMDocument, we could create a situation where the entire WebView leaks. Did you test that?

We absolutely cannot have a DOM mutation event handler that's installed 100% of the time. That will definitely cause a performance regression because it will disable an optimization that we do in the very common case when there are no event listeners listening to DOM mutation events.
Comment 5 James G. Speth 2006-05-12 08:08:24 PDT
Created attachment 8266 [details]
unfinished patch only addressing the event listening

take a look at this and let me know if it's more like what you had in mind, as far as adding/removing the event listener.  i wanted to get feedback on this approach before finishing it.

i'm still looking into the other questions, and trying to figure out why exactly NSTreeController is making the calls it does.  maybe someone could just walk over and ask them? :)
Comment 6 Geoffrey Garen 2006-06-11 11:19:16 PDT
"We absolutely cannot have a DOM mutation event handler that's installed 100% of
the time. That will definitely cause a performance regression because it will
disable an optimization that we do in the very common case when there are no
event listeners listening to DOM mutation events."

Didn't Adele's new textfield work (1) eliminate that optimization after (2) proving it had no performance effect?
Comment 7 Darin Adler 2006-06-24 21:57:22 PDT
Comment on attachment 8266 [details]
unfinished patch only addressing the event listening

A DOMSubtreeModified listener is probably not a good way to implement this -- currently our implementation of that slows down all browsing when it's attached.
Comment 8 James G. Speth 2006-06-25 05:40:12 PDT
i'll submit a patch later today that does it by hooking into Node:: notifyLocalNodeListsChildrenChanged in WebCore/dom/Node.cpp.  I believe there is no performance impact when done that way.
Comment 9 James G. Speth 2006-06-27 00:37:18 PDT
Created attachment 9057 [details]
patch for review
Comment 10 Darin Adler 2006-06-27 09:41:22 PDT
Comment on attachment 9057 [details]
patch for review

At the very least, this needs to be ifdef'd properly so that it will not break platforms other than OS X.

I don't see why willChangeValueForKey and didChangeValueForKey are virtual functions.

The declaration "const char *key" should be "const char* key".

I don't think the name of the KVC key should be in the platform-independent code.

This code:

+    DOMNode *cachedNode;
+    cachedNode = getDOMWrapper(this);
+    if (cachedNode)
+        [cachedNode willChangeValueForKey:[NSString stringWithUTF8String:key]];

should be formatted like this:

    DOMNode* cachedNode = getDOMWrapper(this);
    if (cachedNode)
        [cachedNode willChangeValueForKey:[NSString stringWithUTF8String:key]];

If this code is really going to be hot, then we need to avoid calling stringWithUTF8String every time. One way we could do that is to relegate the key part to Macintosh specific code and just use @"" syntax for it. At the very least we don't want to do the autoreleased form.

Calling getDOMWrapper every time a child is added or removed is likely to cause measurable performance impact. We're going to need to figure out what to do about that. I'd like to see this structured so that you don't do everything twice.

It seems clear the willChangeValueForKey call needs to be done before the child list is changed, and the didChangeValueForKey needs to be done after. They should not be done one after another.

The part of this not related to the change notifications looks good and about ready to land, but the change notification part needs work.
Comment 11 James G. Speth 2006-06-28 00:11:23 PDT
Thanks Darin.  I agree that the notification handling was the weakest part of this, but I didn't realize the performance impact was so large.  I have an idea about how to do this differently I want to run past you before coding it.  The basic idea is to flip things around and have the DOMNode request notifications if it's being observed.  Everything would be set up when the observer is added, and torn down when the observer is removed, so there is very little work that needs to take place when the change actually occurs.  Here's how it would go:

1.  DOMNode overrides the KVO add/remove methods, calling super for each, but watching for observers of childNodes, and keeping a count of them.

2.  When childNodes gets a new observer, the DOMNode calls down into Node (setChildNodesModifiedCallback or something) and passes it a function pointer and a void * context.  These are stored as properties of the Node.  DOMNode increments the observer count.

3.  In Node::childrenChanged, if the function pointer is non-null, call it with the context.

4.  The callback function could then simply map the context back to the DOMNode (possibly just by casting it), and issue the KVO notifications for the DOMNode's childNodes.

5.  When a childNode observer is removed, decrement the count.  If count goes to zero, the function pointer is cleared.

Does that sound better?
Comment 12 James G. Speth 2006-06-28 09:49:12 PDT
Created attachment 9080 [details]
test case for this bug and minor change to DumpRenderTree to support it
Comment 13 James G. Speth 2006-06-28 10:06:42 PDT
Created attachment 9081 [details]
patch implementing comment #11

this patch pretty much follows what I outlined above.  it seems to work fine, but i have not run any performance tests on it.

Also, i did not separate the KVO messages.  Normally, I'd agree that it is wrong to have them back to back like that, but childNodes is special.. Every call to childNodes produces a different DOMNodeList object, so strictly speaking, the value did change.  As far as KVO is concerned, the contents of the DOMNodeList are opaque.  The contents would be accessed by additional keys on a keyPath, which would be responsible for triggering their own change messages.  I think this adheres to the KVO contract.

In my mind it's somewhat like an accessor that increments the value each time... willChange and didChange could be called as often as we'd like, we'd just have to make a determination as to what makes sense.  Given that the DOMNodeList itself only receives notifications after the fact, this seems right to me. Plus, it works.

But if anyone has a better idea about where to put the willChange, let's move it.
Comment 14 Darin Adler 2006-06-29 08:00:47 PDT
Comment on attachment 9081 [details]
patch implementing comment #11

A DOMNodeList is a dynamic list. The list itself shouldn't be changing -- the contents of it should be changing. I suspect it's a bug that every call to childNodes produces a distinct DOMNodeList. Existing DOMNodeList objects are automatically updated to reflect the new child nodes.  

I'm not convinced about argument that will/did should be at the same time for childNodes. I don't think it's good that for KVO we return a distinct static array of child nodes every time. Wouldn't it be better if the KVO list was dynamic just the way a DOMNodeList is? I'd think it would be straightforward to wrap the DOMNodeList in something that is a dynamic NSArray rather than allocating a static NSArray.

So for KVO, I suspect the node should not be sending a notification at all. Instead, the list should be sending a notification indicating that its contents are changing.

As far as the changes to WebCore::Node are concerned, I don't think it's acceptable to add 12 bytes to every DOM node for this feature. We should instead use a single bit to identify that a node is observed and store the ref counts, callbacks, and contexts in a separate data structure (in a hash table, I would think).

The context pointer should just be void*, not const void*.
Comment 15 Darin Adler 2006-06-29 08:02:53 PDT
Comment on attachment 9080 [details]
test case for this bug and minor change to DumpRenderTree to support it

Test case looks great as does the enhancement of DumpRenderTree, but since you put these two together but separate from the bug fix I have to review- since we can't land these until the bug is fixed and if I marked it review+ it would sit in the "to be checked in" queue.

If you give a patch with just the DumpRenderTree enhancements I could review+ that.

Or you could put this test case along with the bug fix and when it's all ready to go we can land the whole thing atomically, and so we'd review atomically too.
Comment 16 Darin Adler 2006-06-29 08:05:34 PDT
I realize now that don't undersatnd why the crash happens. The Objective-C code might have pointers to DOM nodes that are no longer in the DOM tree, but this should *not* lead to a crash, since the DOM nodes are reference counted. JavaScript can also hold pointers to DOM nodes long after they are gone from the DOM tree, and everything works fine. So while having notification about changes to child nodes is great, I think the crash should be fixed differently, because it should not happen even without the notification. The child node change notification is a separate enhancement, good to have but should not be necessary to avoid crashing.
Comment 17 James G. Speth 2006-07-04 18:06:32 PDT
(In reply to comment #15)
> If you give a patch with just the DumpRenderTree enhancements I could review+
> that.

I separated the DumpRenderTree change.  It's in bug 9655 now.  Once that lands, I'll start including this bug's test case in the patch to fix it.
Comment 18 James G. Speth 2006-07-05 15:20:21 PDT
From what I can tell (BindingAdaptors and NSTreeController internals are still a black box outside of Infinite Loop), the crash occurs because the NSOutlineView does not retain the values that it displays.  It expects the data to be there until the model notifies it (through KVO messages to the controller) that the values are changing.  So it doesn't matter, in this case, that the nodes are refcounted.  If we are going to change the model behind the scenes, using non-KVO methods (i.e. the DOM manipulation API), then we're obligated to tell it when the value for the children key path is changing.  That's why this crash and the notification of childNodes changing are tightly related.

Also, addressing some of the issues raised in comment #14:

> I suspect it's a bug that every call to childNodes produces a distinct DOMNodeList.

I agree that it's probably a bug that calling childNodes generates a unique DOMNodeList each time.  Can someone confirm that this is not the desired behavior?  If so, it should go into a new bug, since it is not really related to this (except by my using it to make a weak argument).

> I'd think it would be straightforward to wrap the DOMNodeList in something that is a dynamic
> NSArray rather than allocating a static NSArray.

This is exactly what this patch does:  [DOMNode valueForKey:@"childNodes"] gets special-cased to return [[DOMNode childNodes] valueForKey:@"items"].  What this does (in conjunction with the DOMNodeList's countOfItems/objectInItemsAtIndex: accessors) is generate an NSKeyValueArray for the childNodes list.  The KVO machinery generates this for us automatically, and it works just like we'd want it to.  NSKeyValueArray is exactly what you describe.. an NSArray subclass that is a KVO compliant proxy for an array-like object, which dynamically generates calls to the original object (the DOMNodeList).

On further consideration, I don't like how this patch special-cases the KVC access of childNodes to return the NSKeyValueArray instead of a DOMNodeList.  I think childNodes should return a DOMNodeList and there should be a new key for accessing the array, perhaps "childNodesArray"?

Finally, on the topic of whether to send notifications for the DOMNodeList as a whole or individual notifications for its contents, I agree that it would be nice to support the finer grained KVO messages for to-many relationships, but it's unnecessary here.  By substituting a logging proxy in place of the childNodes value, I've seen that the tree controller does not observe any properties of the child nodes object.  the Node's children key path is observed, and when changed, the subtree is recomputed.  we could add the further notifications for completeness, but the controller isn't listening.
Comment 19 Anders Carlsson 2016-08-04 14:15:49 PDT
Pretty sure we removed the NSTreeController bindings some time ago.