Bug 9508 - Comparing a form element group to itself returns false (form element name getter returns a new NodeList each time)
Summary: Comparing a form element group to itself returns false (form element name get...
Status: RESOLVED CONFIGURATION CHANGED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore JavaScript (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-06-19 10:33 PDT by Troy Brandt
Modified: 2022-07-27 17:53 PDT (History)
11 users (show)

See Also:


Attachments
example document (1.11 KB, text/html)
2006-06-19 10:35 PDT, Troy Brandt
no flags Details
proposed fix (8.98 KB, patch)
2007-01-23 02:09 PST, Sanjay Madhav (chmmravatar)
darin: review-
Details | Formatted Diff | Diff
caching method (14.35 KB, patch)
2007-01-30 01:52 PST, Sanjay Madhav (chmmravatar)
darin: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Troy Brandt 2006-06-19 10:33:22 PDT
Using this example produces a false result for the button group (theRadio) and a true return for element zero.

Expected, both should return true which is what happens in Firefox and Opera.

The checked / unchecked state of the buttons is not examined.

<html>

<head>

<script type='text/javascript'>

function compareObjects() {
	var alertText;
	
	with (document.theForm) {
		
		alertText = "with (document.theForm) {"
		
		alertText += "\n\t(theRadio == theRadio) == "
		if (theRadio == theRadio) {
			alertText += "true";
		} else {
			alertText += "false";
		}
		
		alertText += "\n\t(theRadio[0] == theRadio[0]) == "
		
		if (theRadio[0] == theRadio[0]) {
			alertText += "true";
		} else {
			alertText += "false";
		}
		
		alertText += "\n}";
	}
	
	alert(alertText);
}

</script>

</head>

<body>

<p>Clicking the "Compare" button below calls a JavaScript function that compares the <i>theRadio</i> radio button group to itself and compares element zero of the button group to itself, then displays the results of those comparisons in an alert. The checked/unchecked state of the radio buttons is not examined.</p>

<form name='theForm'>

<input type='button' name='theButton' value='Compare' onclick='compareObjects()' />
<p><input type='radio' name='theRadio' value='0' /> theRadio[0]<br />
<input type='radio' name='theRadio' value='1' /> theRadio[1]</p>

</form>


</body>

</html>
Comment 1 Troy Brandt 2006-06-19 10:35:08 PDT
Created attachment 8922 [details]
example document
Comment 2 Sanjay Madhav (chmmravatar) 2007-01-22 09:43:14 PST
The root issue here is that when comparing objects, the equality check in comparison.cpp does a simple pointer comparison to see whether or not two objects are the same. However, when you access a form group where there's multiple elements with the same name, you get a temporary DOMNamedNodesCollection pointer. So the equality check is checking the addresses of two distinct temporary objects, which of course will always fail.

I have an idea for a relatively clean solution, will post the patch this evening.
Comment 3 Sanjay Madhav (chmmravatar) 2007-01-23 02:09:06 PST
Created attachment 12621 [details]
proposed fix

Weighing the options, the posted patch is probably the best way to solve the issue at hand given all the factors.

While it certainly would be possible to cache requests for form groups in the HTMLFormElement, the issue becomes that if we don't cache every single unique request for the life of the HTMLFormElement, then you potentially could have a comparison occur much later that fails when it shouldn't. However, having a cache which only removes elements once the HTMLFormElement gets destroyed would add memory overhead. The comments surrounding DOMNameNodesCollection seemed to imply that it would be too costly to have them persist for long periods of time.

We could change how form elements are stored internally, so the group lookup doesn't have to create a temporary object, but it would probably end up touching a great deal of code.

With the solution used in the patch, for 99% of the equality comparisons, there'll be no discernable speed effect, and it also gives some flexibility to quickly fix similar problems in other areas, if they pop up.

I tested this change with both the LayoutTests and javascriptcore tests, and no regressions are introduced by this patch.

Also included is a test case for future regression testing.
Comment 4 David Kilzer (:ddkilzer) 2007-01-23 04:09:59 PST
Comment on attachment 12621 [details]
proposed fix

A couple random comments on this patch.  (Note that I'm not a reviewer, so take these with the proverbial grain of salt! :)

- Instead of needsDeepEqual(), how about hasDeepEqual() or useDeepEqual()?

- Would it be possible to override operator==() instead?  It would be cleaner than adding two functions, but operator overloading is not obvious when looking at source.

- You might consider using the LayoutTests/fast/js/resources/js-test-pre.js, js-test-post.js and js-test-style.css files in your layout test so you don't have to reinvent the wheel.  Just use relative paths to include these resources.  See LayoutTests/fast/js/activation-proto.html for an example of how to use them.
Comment 5 Darin Adler 2007-01-23 08:03:19 PST
Comment on attachment 12621 [details]
proposed fix

There's no need for two virtual functions here. Once we're taking the overhead of a virtual function call there's no reason to not just call it isEqual and do the pointer comparison in the base JSObject implementation. I don't see the need for the word "deep" in the name.

Since both objects are known to be ObjectType already, we should be doing a cast to JSObject, not a call to getObject -- the getObject call is much more expensive and does not add any value. In fact, I think the "equal" function would read better if the end part was a switch statement.

I don't think the change to strictEqual is correct. That's for the "===" operation, which should almost certainly distinguish two objects with identical contents like these. So I believe that function doesn't need to change.

+    const DOMNamedNodesCollection* d = reinterpret_cast<const DOMNamedNodesCollection*>(rhs);    

This should be a static_cast, not a reinterpret_cast.

+    // For performance, we can assume that two collections which have the same size and start with the
+    // same nodes are equivalent, as opposed to iterating through the entire vectors.

No. That's wrong. We do need to compare the vectors. This code should just say:

    return rhs->isObject(&DOMNamedNodesCollection::info)
        && m_nodes == static_cast<const DOMNamedNodesCollection*>(rhs)->m_nodes;
Comment 6 Darin Adler 2007-01-23 08:08:05 PST
(In reply to comment #3)
> While it certainly would be possible to cache requests for form groups in the
> HTMLFormElement, the issue becomes that if we don't cache every single unique
> request for the life of the HTMLFormElement, then you potentially could have a
> comparison occur much later that fails when it shouldn't.

I don't agree with this comment. There's no problem creating a design where the collection is registered with the form; the form points to it and when the collection is destroyed the pointer in the form gets cleared.

We don't need to make any change to the collection lifetime at all to do this -- we just need to add a pointer to HTMLFormElement and add the code to the collection class to clear this pointer when the collection is deallocated. I think this change is simple and practical and probably a better choice than adding the ability to do equality comparison to the JavaScript binding.

By the way, if we do extend the JavaScript binding in this way, we'll also want it in the C API.
Comment 7 Darin Adler 2007-01-23 09:01:34 PST
(In reply to comment #6)
> (In reply to comment #3)
> > While it certainly would be possible to cache requests for form groups in the
> > HTMLFormElement, the issue becomes that if we don't cache every single unique
> > request for the life of the HTMLFormElement, then you potentially could have a
> > comparison occur much later that fails when it shouldn't.
> 
> I don't agree with this comment. There's no problem creating a design where the
> collection is registered with the form; the form points to it and when the
> collection is destroyed the pointer in the form gets cleared.
> 
> We don't need to make any change to the collection lifetime at all to do this
> -- we just need to add a pointer to HTMLFormElement and add the code to the
> collection class to clear this pointer when the collection is deallocated. I
> think this change is simple and practical and probably a better choice than
> adding the ability to do equality comparison to the JavaScript binding.

I studied this more, and I am not sure what I said is right.

It's JSHTMLCollection that would need to keep track of DOMNamedNodesCollection objects it creates, not HTMLFormElement, because it's the class that creates the collections. We'd have to add a map of property names to DOMNamedNodesCollection and store both a pointer to the JSHTMLCollection and the property name in the DOMNamedNodesCollection objects. Then when a DOMNamedNodesCollection was destroyed it would call back to the JSHTMLCollection so it could remove it from the map, and when a JSHTMLCollection was destroyed it woud disconnect itself from all the DOMNamedNodesCollection objects still in the map. If we did this, I'd suggest moving DOMNamedNodesCollection closer to JSHTMLCollection, probably renaming it, and making the two classes more-closely-associated to accomplish this. I think it's not all that difficult to make this work.

Specifics of this case aside, I think it's a reasonable to consider making equality comparison something that JSObject subclasses can customize. I am sad this came up *after* we defined the C API, because clearly we'd want to be able to do this from C too.
Comment 8 Sanjay Madhav (chmmravatar) 2007-01-23 09:44:09 PST
I think I was concerned that temporarily caching it could create cases where the comparsion would still fail, but I guess really that doesn't make sense, since the lifetime of the object will be however long it is still referenced.

Regarding strict equal, are you saying that "strict equal in this case should be false" or "strict equal already works, so we don't need to change it." Well currently, strict equal returns false in this case (for objects it's still doing a pointer comparison), but Firefox says it should be true. So I'm pretty sure strict equal does need to be fixed as well. Making the caching change will fix it at the same time as reguar equal.

Good point about getting rid of the second virtual call and having it just do a the pointer comparison first in an isEqual() function to save virtual call overhead, but I guess that's irrelevant anyways with the caching solution.

Anyways, I agree the caching solution is a much more elegant way to solve the problem. I was going to do it that way originally, but then I had the lifetime concerns. However it seems like that shouldn't be a problem. I'll rewrite the patch and post something either tonight or tomorrow night. Hey, well at least I already have a test case for it! :)

David: While you certainly could overload operator ==, since JSObjects are almost exclusively passed around as pointers, whenever you wanted to call it you'd have to write code like (*v1) == (*v2), which certainly isn't fun :).
Comment 9 Darin Adler 2007-01-26 11:04:51 PST
(In reply to comment #8)
> Regarding strict equal, are you saying that "strict equal in this case should
> be false" or "strict equal already works, so we don't need to change it." Well
> currently, strict equal returns false in this case (for objects it's still
> doing a pointer comparison), but Firefox says it should be true. So I'm pretty
> sure strict equal does need to be fixed as well. Making the caching change will
> fix it at the same time as reguar equal.

I was saying that === should be false.

If === is true in Firefox, that indicates they use the same object, not just an equal one. So that makes me think even more strongly that the best way to do this is by "caching" the object and passing the same one out.

> Good point about getting rid of the second virtual call and having it just do a
> the pointer comparison first in an isEqual() function to save virtual call
> overhead, but I guess that's irrelevant anyways with the caching solution.

It's relevant if we ever want to give object classes a way to define what equality means. And if we do that it will be for equal but not strictEqual. It's possible we'll never need that though.
Comment 10 Sanjay Madhav (chmmravatar) 2007-01-30 01:52:42 PST
Created attachment 12780 [details]
caching method

This patch implements the "caching" method as discussed above. When a new DOMNamedNodesCollection is created, it's stored in a hash map, and subsequent requests for it return a pointer to the same object, instead of creating a new temp object just for themselves.

Additionally, I moved DOMNamedNodesCollection closer to JSHTMLCollection, and out of kjs_dom.h. Since it was only ever used internally by JSHTMLCollection, it doesn't even need to be in a header that's included all over the place, but can rather be defined in kjs_html.cpp.

One thing that may be a little questionable is the way I mangle the identifier in order to ensure that multiple forms or pages with the same form group identifier don't give false positives. Basically, rather than just using the straight identifier, I take the identifier plus the address of the first node in the collection (since the nodes contained by the collection are never temp copies, but always point to the actual form elements). Then I use the identifier+address as the string key for the hash map. Because the JSHTMLCollection object itself is very short-lived (it immediately goes out of scope after returning what it needs to), this seemed like the least roundabout way to prevent the identifier clashes. But again, I'm definitely open to suggestion if someone has a better idea.

I've also updated the test case for this bug slightly from the previous patch.
Comment 11 Darin Adler 2007-01-30 09:47:25 PST
(In reply to comment #10)
> Basically, rather than just using the
> straight identifier, I take the identifier plus the address of the first node
> in the collection (since the nodes contained by the collection are never temp
> copies, but always point to the actual form elements). Then I use the
> identifier+address as the string key for the hash map.

I think that's not acceptable. I'll give you a specific suggestion for an alternative way to do it after looking over the patch.
Comment 12 Darin Adler 2007-01-30 13:52:05 PST
Comment on attachment 12780 [details]
caching method

Good work! It looks like you're on the right track.

If you're moving the code you should probably clean things up a little bit. Details follow.

The patch has a combination of 4-character and 2-character indentation. Better to do 4-character consistently.

+    DOMNamedNodesCollection(ExecState *exec, const Vector<RefPtr<Node> >& nodes, const AtomicString& name);

Just ExecState* here, no space before * and no need for the variable name exec. Also no need to give the Vector the name "nodes".

+    static JSValue *lengthGetter(ExecState* exec, JSObject *, const Identifier&, const PropertySlot& slot);

JSValue* and JSObject*. No need for the name "exec" or "slot".

+    Vector<RefPtr<WebCore::Node> > m_nodes;

No need for the WebCore:: prefix. Probably best to use a typedef for the node vector type.

+    AtomicString propertyName;

Should call this m_propertyName. I'm not sure why this is an AtomicString. The only role for this is to be a key for the hash table, and the hash table uses a StringImpl* rather than an AtomicStringImpl*. We really can go either way -- if the strings are unique each time, then AtomicString doesn't add too much value.

To avoid aliasing during the lifetime of the DOMNamedNodesCollection, we have to do something to keep its node alive. Thus I think it needs to hold a RefPtr to the node it was made from, not just the nodes in the collection. Without this, the %p technique we are currently using is not sufficient to prevent collisions.

+typedef HashMap<StringImpl*,DOMNamedNodesCollection*> NamedNodesCache;

Need a space after the comma.

+NamedNodesCache& getNamedNodesCache()

Need to mark this static so the function gets internal linkage.

+    // Mangle the identifier string with the node pointer to the first element.

This approach is ugly. If we really need a key that is a string plus a node pointer, we can do that. HashMap allows arbitrary keys and I can help you make a appropriate hash function. But a simpler thing that will also work is to use a hash of hashes -- a primary hash table that uses the node pointer as a key and the values are pointers to hash tables that use a string as a key.

+    char ptrBuffer[15];

Strange choice of length here. On 64-bit platforms, the hex form of a pointer will typically be 18 bytes long, so you'd need a 19-byte buffer, and on 32-bit it will typically be 10 bytes long and require an 11-byte buffer. So 19 or 20 would make more sense.

+        getNamedNodesCache().add(tempString.impl(), nodeCollection);

This line of code is incorrect. The tempString will go away, and its StringImpl* might be deallocated. What you need to use as the hash key is the pointer to the string that's actually stored inside the DOMNamedNodesCollection object. You'll probably need a member function named propertyName() to get at it.

+  DOMNamedNodesCollection *thisObj = static_cast<DOMNamedNodesCollection*>(slot.slotBase());

Should move the * next to the type.

Ideally we'd move DOMNamedNodesCollection to its own source file. We're trying to phase out these kjs_xxx files. Maybe even use an idl to generate its JavaScript wrapper; we could at least auto-generate length even though we'd still need custom code for getting the array index and by-name properties.

review- because of the need to ref the node and the incorrect call to add. I'd really like to see a version that does not use %p.
Comment 13 Sanjay Madhav (chmmravatar) 2007-01-31 23:50:17 PST
I started working on a new patch based on your suggestions. I understood most of what you said, but one thing I'm not completely clear on is that the DOMNamedNodesCollection should hold a refptr to the node which created it. How would I be able to access that in JSHTMLCollection::getNamedItems given that it can be called either via the nameGetter or from JSHTMLFormElement? In the latter case, the JSHTMLCollection created is just a temp object. 

And even then, I'm not quite sure why that would be needed to prevent aliasing. The individual nodes that make up the form element group won't be destroyed throughout the life of the document, correct? If that's so, and we're hashing the Node pointer to the first node that makes up the group and the identifier, then I don't see how it's a problem.

Actually, it just occurred to me if that is the case, then actually we don't need to hash with the Node pointer to the first node PLUS the identifier. The identifier actually doesn't matter at the point where you've already grabbed the named items from that identifier, and are hashing with the node pointer to the first node in the named element group. Or, if that didn't make sense, basically the patch that I submitted on the 30th didn't need to hash with a string at all. It could have hashed just the Node pointer to the first element, and that would be that.
Comment 14 Darin Adler 2007-01-31 23:56:18 PST
(In reply to comment #13)
> I started working on a new patch based on your suggestions. I understood most
> of what you said, but one thing I'm not completely clear on is that the
> DOMNamedNodesCollection should hold a refptr to the node which created it. How
> would I be able to access that in JSHTMLCollection::getNamedItems given that it
> can be called either via the nameGetter or from JSHTMLFormElement? In the
> latter case, the JSHTMLCollection created is just a temp object. 

I see what you mean. I think what I really meant is a RefPtr to the HTMLCollection, and use that as part of the key rather than the first node pointer.

> And even then, I'm not quite sure why that would be needed to prevent aliasing.
> The individual nodes that make up the form element group won't be destroyed
> throughout the life of the document, correct?

Someone could change the nodes around with JavaScript DOM operations. For example, the name attribute of an element could be changed or it could be removed from the document or moved into another form.

I think we need to test how the other browsers behave in cases like that. Does the collection automatically change to reflect changes to the document? Or what?

> It could have hashed just the Node pointer to the first element,
> and that would be that.

Maybe we can get away with just hashing the node pointer to the first element. I think it depends on the results of the testing I'm suggesting above.
Comment 15 Sanjay Madhav (chmmravatar) 2007-02-01 00:38:32 PST
I'll look into seeing what happens with other browsers in the case of renaming the form element group.

But I just realized that due to being able to add/remove members of the form element group, the caching solution is going to have to be a bit more complex than it was in the past solution. The problem is the DOMNamedNodesCollection just has a vector which contains the nodes that existed when the object was created. But suppose you have a DOMNamedNodesCollection cached and then add or remove another form element to that group. Well now whenever you try to access the form element group, you'll still get the DOMNamedNodesCollection that has a vector to the old state of the form group.

So...the HTMLCollection pointer, then the identifier string name should be what is used as a key. But then when an HTMLCollection is modified (either a node added or removed), it has to do a lookup to see whether any DOMNamedNodesCollections are cached for that identifier, and if so has to update that DOMNamedNodesCollection to have the correct new state of the form element group.
Comment 16 Darin Adler 2007-02-01 10:07:26 PST
Sure, sounds right.

But lets not make any assumptions about the behavior when things are changed until we see what the behavior of other browsers is. In all those cases, not just the name change case.
Comment 17 Alexey Proskuryakov 2010-06-11 15:24:15 PDT
This is almost the same as bug 33696, but needs to be fixed separately (we we want to fix this).
Comment 18 Alexey Proskuryakov 2010-06-11 15:24:40 PDT
_If_ we want to fix this
Comment 19 Ahmad Saleem 2022-07-27 13:48:13 PDT
I am unable to reproduce this bug in Safari 15.6 on macOS 12.5 and all browsers produce same output as below:

with (document.theForm) {
	(theRadio == theRadio) == true
	(theRadio[0] == theRadio[0]) == true
}

I think it might have been fixed along the way and can be marked as "RESOLVED CONFIGURATION CHANGED". Thanks!
Comment 20 Ryosuke Niwa 2022-07-27 14:24:35 PDT
Yup, we've fixed this.
Comment 21 Darin Adler 2022-07-27 17:53:24 PDT
I think it was fixed with bug 81854.