12988 – First element (in document order) is not returned when other duplicate ID-ed elements were created first

Bug 12988 - First element (in document order) is not returned when other duplicate ID-ed elements were created first

Summary: First element (in document order) is not returned when other duplicate ID-ed ...

Status:	RESOLVED FIXED

Alias:	None

Product:	WebKit
Classification:	Unclassified
Component:	DOM (show other bugs)
Version:	523.x (Safari 3)
Hardware:	Mac OS X 10.4

Importance:	P2 Normal
Assignee:	Darin Adler

URL:
Keywords:	HasReduction

Depends on:
Blocks:

Reported:	2007-03-06 14:08 PST by Alice Liu
Modified:	2007-10-17 20:38 PDT (History)
CC List:	3 users (show)

See Also:

Attachments
testcase (360 bytes, text/html) 2007-03-06 14:09 PST, Alice Liu	no flags	Details
first attempt (4.53 KB, patch) 2007-08-26 09:59 PDT, Rob Buis	darin: review-	Details \| Formatted Diff \| Diff
second attempt; tested with SVG test case as well as Rob's (8.58 KB, patch) 2007-09-01 22:00 PDT, Darin Adler	no flags	Details \| Formatted Diff \| Diff
More complete patch (4.43 KB, patch) 2007-09-02 02:04 PDT, Rob Buis	darin: review-	Details \| Formatted Diff \| Diff
improved patch; one less hash lookup in the "add" function than the previous (10.21 KB, patch) 2007-09-02 07:58 PDT, Darin Adler	no flags	Details \| Formatted Diff \| Diff
corrected version of previous patch (10.21 KB, patch) 2007-09-02 09:34 PDT, Darin Adler	no flags	Details \| Formatted Diff \| Diff
alternate patch: a fixed version of Rob's patch (6.29 KB, patch) 2007-09-04 05:27 PDT, Maciej Stachowiak	no flags	Details \| Formatted Diff \| Diff
newer version of Rob's patch, improved a bit more (7.41 KB, patch) 2007-09-04 09:12 PDT, Darin Adler	aroben: review-	Details \| Formatted Diff \| Diff
Show Obsolete (6) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Alice Liu 2007-03-06 14:08:51 PST

Filing this bug on behalf of Maciej...

It seems like we don't always return the first element in document order with a given id when there are duplicate ids and getElementById is used. I thought this worked, and I'm pretty sure we mean it to.

Comment 1 Alice Liu 2007-03-06 14:09:40 PST

Created attachment 13493 [details]
testcase

Comment 2 Alice Liu 2007-03-06 14:11:34 PST

Now speaking on my own behalf:

Not a regression. Shipping Safari 2.0.4 results are 0, the last element in document order but the first one created.  Same results on TOT WebKit rt now.  Firefox and Opera return 4, the first element in document order but the last one created.

Comment 3 Rob Buis 2007-08-26 09:59:54 PDT

Created attachment 16125 [details]
first attempt

I reworked the testcase into this patch. I think there is a performance penalty when having duplicate ids, but I think duplicate ids are rare in the first place.
Cheers,

Rob.

Comment 4 Darin Adler 2007-08-31 18:32:14 PDT

Comment on attachment 16125 [details]
first attempt

Good fix. r=me

Comment 5 Rob Buis 2007-09-01 01:15:34 PDT

Landed in r25341.

Comment 6 Darin Adler 2007-09-01 20:38:32 PDT

This change seems to have broken the layout test use-instanceRoot-modifications.svg -- for some reason the value returned from getElementById is no longer correct. We may need to roll the fix out.

Comment 7 Darin Adler 2007-09-01 20:45:15 PDT

Maciej showed me that Rob already has a cut at a patch to fix the regression: <http://paste.lisp.org/display/47065>.

Comment 8 Darin Adler 2007-09-01 20:46:36 PDT

Rob, I think that the comment in the patch needs a little improvement. It says something that doesn't really match what the code does. I understand why the code does what it does (we only can be sure the element is the first if there are no duplicates), but I'm not sure other readers would.

Comment 9 Darin Adler 2007-09-01 21:24:21 PDT

I also don't think the patch is correct. I'm taking a crack at this now.

Comment 10 Darin Adler 2007-09-01 21:50:53 PDT

Reopening this because the fix caused a regression. I have a new fix.

Comment 11 Darin Adler 2007-09-01 21:59:17 PDT

Comment on attachment 16125 [details]
first attempt

Turns out this was incorrect.

Comment 12 Darin Adler 2007-09-01 22:00:08 PDT

Created attachment 16178 [details]
second attempt; tested with SVG test case as well as Rob's

Comment 13 Maciej Stachowiak 2007-09-01 22:15:52 PDT

I think Darin's version of the patch will increase memory use in the common case where there aren't any duplicates, or at least most elements with IDs don't have duplicates.

Comment 14 Rob Buis 2007-09-02 02:04:02 PDT

Created attachment 16179 [details]
More complete patch

Firsr of all, apologies for missing the fact that two svg tests broke, I kind of assumed some svg tests were broken because of other things. Second, I'd like to get it fixes asap so we can move on :)
This patch is basically the lisppaste snippet plus Darin's comments taken into account. I worded half of it myself but also took one sentence from Darin's patch. Note that the new patch does not alter removeElementById, that was a mistake in my first patch pointed out by Maciej. I have a slight tendency towards my patch since it is close to the original code, but again the main thing IMHO is too resolve the problem soon :)
Cheers,

Rob.

Comment 15 Darin Adler 2007-09-02 07:51:37 PDT

Comment on attachment 16179 [details]
More complete patch

This patch no longer applies, because I rolled out your original fix. So you need a new patch that works without first applying your old patch (a combination of both).

But your patch also has what I consider a memory leak.

If you write a loop that adds two elements with an id, then calls getElementById on that id, then removes both elements, in a loop, with your patch the m_duplicateIds set will keep getting bigger and bigger, because the code leaks a duplicate id in that case. I think we don't want that behavior.

I think my approach is better even though the patch is bigger.

Comment 16 Darin Adler 2007-09-02 07:52:00 PDT

Comment on attachment 16178 [details]
second attempt; tested with SVG test case as well as Rob's

I've got a new slightly better version of this patch coming shortly.

Comment 17 Darin Adler 2007-09-02 07:58:02 PDT

Created attachment 16180 [details]
improved patch; one less hash lookup in the "add" function than the previous

Comment 18 Darin Adler 2007-09-02 08:01:30 PDT

(In reply to comment #15)
> (From update of attachment 16179 [details] [edit])
> This patch no longer applies, because I rolled out your original fix. So you
> need a new patch that works without first applying your old patch (a
> combination of both).

Oops, my mistake. The patch applies just fine.

But my other comment stands. We will leak duplicate ids in that case. My patch is better -- more efficient and doesn't have that leak.

Comment 19 Darin Adler 2007-09-02 08:05:16 PDT

Comment on attachment 16179 [details]
More complete patch

I'm not sure that the leak issue really justifies a review-. I set this to review- because I thought the patch didn't apply, which was my mistake.

But I still think my patch is a better fix!

Comment 20 Darin Adler 2007-09-02 09:34:09 PDT

Created attachment 16181 [details]
corrected version of previous patch

Comment 21 Maciej Stachowiak 2007-09-04 05:27:00 PDT

Created attachment 16193 [details]
alternate patch: a fixed version of Rob's patch

This patch would use about 8k more memory on a page with 1000 ids, none of which were duplicates. (On the other hand, the current approach would use 2k more if every other one was a duplicate). This isn't very much memory in the grand scheme of things I guess, but the duplicate case is supposed to be rare so I think it's ok to use a few extra hash lookups to save a bit of memory.

Thinking about it, I don't see how Rob's patch would leak a duplicate Id in the situation described. The two adds would leave the duplicate ID count at 1, with a cleared cache, the getElementById call would leave it alone but restore the cache, and the two removes would reduce it by 1. But I think a sequence of add/add/remove/remove would underflow the duplicate count, since it would be increased once but reduced twice.

Here's what I think is a fixed version of Rob's patch, in case it is of interest.

Comment 22 Darin Adler 2007-09-04 08:48:40 PDT

(In reply to comment #21)
> This patch would use about 8k more memory on a page with 1000 ids, none of
> which were duplicates.

I assume you're saying 8k because there'd be 1000 hash table entries, each that is 4 bytes larger for 4k, and then roughly 1000 more empty hash table entries, each that is 4 bytes larger for another 4k.

> (On the other hand, the current approach would use 2k more if every other one was a duplicate).

My calculations say that this would be the same 8k as above. A counted set hash table entry for each duplicate is 8 bytes per duplicate. So if we have 500 duplicates with 8 bytes each, that's 4k for the used hash table entries and 4k for the empty ones.

As you say, though, my scheme is less efficient for pages with lots of non-duplicated ids.

Comment 23 Darin Adler 2007-09-04 08:50:59 PDT

Comment on attachment 16193 [details]
alternate patch: a fixed version of Rob's patch

This looks good. But there's a way to write addElementById with fewer hash table lookups. I'll make a new version.

Comment 24 Darin Adler 2007-09-04 09:12:27 PDT

Created attachment 16197 [details]
newer version of Rob's patch, improved a bit more

Comment 25 Maciej Stachowiak 2007-09-29 19:34:35 PDT

please land on feature-branch

Comment 26 Adam Roben (:aroben) 2007-09-29 19:36:05 PDT

Comment on attachment 16197 [details]
newer version of Rob's patch, improved a bit more

Somehow, duplicate-ids-document-order.html got put in the patch twice, and duplicate-ids-document-order-expected.txt is missing.

+    // excluding the one referenced in m_elementsById, if any. This means it one less than the total count

Typo: it one -> it is one

+    div = document.createElement ('div');

There's an extra space before the open parenthesis.

r- until the test results are added, then I'll r+ for feature-branch.

Comment 27 Darin Adler 2007-10-17 19:56:04 PDT

(In reply to comment #26)
> (From update of attachment 16197 [details] [edit])
> Somehow, duplicate-ids-document-order.html got put in the patch twice

Strange, the patch applied just fine.

Comment 28 Darin Adler 2007-10-17 20:36:21 PDT

(In reply to comment #26)
> (From update of attachment 16197 [details] [edit])
> Somehow, duplicate-ids-document-order.html got put in the patch twice, and
> duplicate-ids-document-order-expected.txt is missing.

That's because duplicate-ids-document-order.html is being re-added to the tree and duplicate-ids-document-order-expected.txt is already checked in.

Comment 29 Darin Adler 2007-10-17 20:38:35 PDT

Committed revision 26732.