Bug 12988

Summary: First element (in document order) is not returned when other duplicate ID-ed elements were created first
Product: WebKit Reporter: Alice Liu <alice.barraclough>
Component: DOMAssignee: Darin Adler <darin>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, mjs, mrowe
Priority: P2 Keywords: HasReduction
Version: 523.x (Safari 3)   
Hardware: Mac   
OS: OS X 10.4   
Attachments:
Description Flags
testcase
none
first attempt
darin: review-
second attempt; tested with SVG test case as well as Rob's
none
More complete patch
darin: review-
improved patch; one less hash lookup in the "add" function than the previous
none
corrected version of previous patch
none
alternate patch: a fixed version of Rob's patch
none
newer version of Rob's patch, improved a bit more aroben: review-

Alice Liu
Reported 2007-03-06 14:08:51 PST
Filing this bug on behalf of Maciej... It seems like we don't always return the first element in document order with a given id when there are duplicate ids and getElementById is used. I thought this worked, and I'm pretty sure we mean it to.
Attachments
testcase (360 bytes, text/html)
2007-03-06 14:09 PST, Alice Liu
no flags
first attempt (4.53 KB, patch)
2007-08-26 09:59 PDT, Rob Buis
darin: review-
second attempt; tested with SVG test case as well as Rob's (8.58 KB, patch)
2007-09-01 22:00 PDT, Darin Adler
no flags
More complete patch (4.43 KB, patch)
2007-09-02 02:04 PDT, Rob Buis
darin: review-
improved patch; one less hash lookup in the "add" function than the previous (10.21 KB, patch)
2007-09-02 07:58 PDT, Darin Adler
no flags
corrected version of previous patch (10.21 KB, patch)
2007-09-02 09:34 PDT, Darin Adler
no flags
alternate patch: a fixed version of Rob's patch (6.29 KB, patch)
2007-09-04 05:27 PDT, Maciej Stachowiak
no flags
newer version of Rob's patch, improved a bit more (7.41 KB, patch)
2007-09-04 09:12 PDT, Darin Adler
aroben: review-
Alice Liu
Comment 1 2007-03-06 14:09:40 PST
Created attachment 13493 [details] testcase
Alice Liu
Comment 2 2007-03-06 14:11:34 PST
Now speaking on my own behalf: Not a regression. Shipping Safari 2.0.4 results are 0, the last element in document order but the first one created. Same results on TOT WebKit rt now. Firefox and Opera return 4, the first element in document order but the last one created.
Rob Buis
Comment 3 2007-08-26 09:59:54 PDT
Created attachment 16125 [details] first attempt I reworked the testcase into this patch. I think there is a performance penalty when having duplicate ids, but I think duplicate ids are rare in the first place. Cheers, Rob.
Darin Adler
Comment 4 2007-08-31 18:32:14 PDT
Comment on attachment 16125 [details] first attempt Good fix. r=me
Rob Buis
Comment 5 2007-09-01 01:15:34 PDT
Landed in r25341.
Darin Adler
Comment 6 2007-09-01 20:38:32 PDT
This change seems to have broken the layout test use-instanceRoot-modifications.svg -- for some reason the value returned from getElementById is no longer correct. We may need to roll the fix out.
Darin Adler
Comment 7 2007-09-01 20:45:15 PDT
Maciej showed me that Rob already has a cut at a patch to fix the regression: <http://paste.lisp.org/display/47065>.
Darin Adler
Comment 8 2007-09-01 20:46:36 PDT
Rob, I think that the comment in the patch needs a little improvement. It says something that doesn't really match what the code does. I understand why the code does what it does (we only can be sure the element is the first if there are no duplicates), but I'm not sure other readers would.
Darin Adler
Comment 9 2007-09-01 21:24:21 PDT
I also don't think the patch is correct. I'm taking a crack at this now.
Darin Adler
Comment 10 2007-09-01 21:50:53 PDT
Reopening this because the fix caused a regression. I have a new fix.
Darin Adler
Comment 11 2007-09-01 21:59:17 PDT
Comment on attachment 16125 [details] first attempt Turns out this was incorrect.
Darin Adler
Comment 12 2007-09-01 22:00:08 PDT
Created attachment 16178 [details] second attempt; tested with SVG test case as well as Rob's
Maciej Stachowiak
Comment 13 2007-09-01 22:15:52 PDT
I think Darin's version of the patch will increase memory use in the common case where there aren't any duplicates, or at least most elements with IDs don't have duplicates.
Rob Buis
Comment 14 2007-09-02 02:04:02 PDT
Created attachment 16179 [details] More complete patch Firsr of all, apologies for missing the fact that two svg tests broke, I kind of assumed some svg tests were broken because of other things. Second, I'd like to get it fixes asap so we can move on :) This patch is basically the lisppaste snippet plus Darin's comments taken into account. I worded half of it myself but also took one sentence from Darin's patch. Note that the new patch does not alter removeElementById, that was a mistake in my first patch pointed out by Maciej. I have a slight tendency towards my patch since it is close to the original code, but again the main thing IMHO is too resolve the problem soon :) Cheers, Rob.
Darin Adler
Comment 15 2007-09-02 07:51:37 PDT
Comment on attachment 16179 [details] More complete patch This patch no longer applies, because I rolled out your original fix. So you need a new patch that works without first applying your old patch (a combination of both). But your patch also has what I consider a memory leak. If you write a loop that adds two elements with an id, then calls getElementById on that id, then removes both elements, in a loop, with your patch the m_duplicateIds set will keep getting bigger and bigger, because the code leaks a duplicate id in that case. I think we don't want that behavior. I think my approach is better even though the patch is bigger.
Darin Adler
Comment 16 2007-09-02 07:52:00 PDT
Comment on attachment 16178 [details] second attempt; tested with SVG test case as well as Rob's I've got a new slightly better version of this patch coming shortly.
Darin Adler
Comment 17 2007-09-02 07:58:02 PDT
Created attachment 16180 [details] improved patch; one less hash lookup in the "add" function than the previous
Darin Adler
Comment 18 2007-09-02 08:01:30 PDT
(In reply to comment #15) > (From update of attachment 16179 [details] [edit]) > This patch no longer applies, because I rolled out your original fix. So you > need a new patch that works without first applying your old patch (a > combination of both). Oops, my mistake. The patch applies just fine. But my other comment stands. We will leak duplicate ids in that case. My patch is better -- more efficient and doesn't have that leak.
Darin Adler
Comment 19 2007-09-02 08:05:16 PDT
Comment on attachment 16179 [details] More complete patch I'm not sure that the leak issue really justifies a review-. I set this to review- because I thought the patch didn't apply, which was my mistake. But I still think my patch is a better fix!
Darin Adler
Comment 20 2007-09-02 09:34:09 PDT
Created attachment 16181 [details] corrected version of previous patch
Maciej Stachowiak
Comment 21 2007-09-04 05:27:00 PDT
Created attachment 16193 [details] alternate patch: a fixed version of Rob's patch This patch would use about 8k more memory on a page with 1000 ids, none of which were duplicates. (On the other hand, the current approach would use 2k more if every other one was a duplicate). This isn't very much memory in the grand scheme of things I guess, but the duplicate case is supposed to be rare so I think it's ok to use a few extra hash lookups to save a bit of memory. Thinking about it, I don't see how Rob's patch would leak a duplicate Id in the situation described. The two adds would leave the duplicate ID count at 1, with a cleared cache, the getElementById call would leave it alone but restore the cache, and the two removes would reduce it by 1. But I think a sequence of add/add/remove/remove would underflow the duplicate count, since it would be increased once but reduced twice. Here's what I think is a fixed version of Rob's patch, in case it is of interest.
Darin Adler
Comment 22 2007-09-04 08:48:40 PDT
(In reply to comment #21) > This patch would use about 8k more memory on a page with 1000 ids, none of > which were duplicates. I assume you're saying 8k because there'd be 1000 hash table entries, each that is 4 bytes larger for 4k, and then roughly 1000 more empty hash table entries, each that is 4 bytes larger for another 4k. > (On the other hand, the current approach would use 2k more if every other one was a duplicate). My calculations say that this would be the same 8k as above. A counted set hash table entry for each duplicate is 8 bytes per duplicate. So if we have 500 duplicates with 8 bytes each, that's 4k for the used hash table entries and 4k for the empty ones. As you say, though, my scheme is less efficient for pages with lots of non-duplicated ids.
Darin Adler
Comment 23 2007-09-04 08:50:59 PDT
Comment on attachment 16193 [details] alternate patch: a fixed version of Rob's patch This looks good. But there's a way to write addElementById with fewer hash table lookups. I'll make a new version.
Darin Adler
Comment 24 2007-09-04 09:12:27 PDT
Created attachment 16197 [details] newer version of Rob's patch, improved a bit more
Maciej Stachowiak
Comment 25 2007-09-29 19:34:35 PDT
please land on feature-branch
Adam Roben (:aroben)
Comment 26 2007-09-29 19:36:05 PDT
Comment on attachment 16197 [details] newer version of Rob's patch, improved a bit more Somehow, duplicate-ids-document-order.html got put in the patch twice, and duplicate-ids-document-order-expected.txt is missing. + // excluding the one referenced in m_elementsById, if any. This means it one less than the total count Typo: it one -> it is one + div = document.createElement ('div'); There's an extra space before the open parenthesis. r- until the test results are added, then I'll r+ for feature-branch.
Darin Adler
Comment 27 2007-10-17 19:56:04 PDT
(In reply to comment #26) > (From update of attachment 16197 [details] [edit]) > Somehow, duplicate-ids-document-order.html got put in the patch twice Strange, the patch applied just fine.
Darin Adler
Comment 28 2007-10-17 20:36:21 PDT
(In reply to comment #26) > (From update of attachment 16197 [details] [edit]) > Somehow, duplicate-ids-document-order.html got put in the patch twice, and > duplicate-ids-document-order-expected.txt is missing. That's because duplicate-ids-document-order.html is being re-added to the tree and duplicate-ids-document-order-expected.txt is already checked in.
Darin Adler
Comment 29 2007-10-17 20:38:35 PDT
Committed revision 26732.
Note You need to log in before you can comment on or make changes to this bug.