Bug 105713

Summary: getAttribute does not behave correctly for mixed-case attributes on HTML elements
Product: WebKit Reporter: Boris Zbarsky <bzbarsky>
Component: DOMAssignee: Arpita Bahuguna <arpitabahuguna>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, arpitabahuguna, cmarcelo, commit-queue, darin, esprehn+autocc, kling, ojan.autocc, ojan, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 111681    
Bug Blocks:    
Attachments:
Description Flags
Patch
none
Patch
none
Performance test results
none
Performance test results on Mac
none
Updated perf test results on mac none

Boris Zbarsky
Reported 2012-12-24 01:37:55 PST
Consider this testcase: var e = document.createElement("div"); e.setAttributeNS(null, "SOMETHING", "FAIL"); e.setAttributeNS(null, "something", "PASS"); alert(e.getAttribute("SOMETHING")); Per current spec at http://www.w3.org/TR/domcore/#dom-element-getattribute this should alert "PASS", which is what Gecko and Presto do. WebKit alerts "FAIL". Note that this causes WebKit to fail some of the tests at http://w3c-test.org/webapps/DOMCore/tests/submissions/Ms2ger/attributes.html
Attachments
Patch (5.65 KB, patch)
2013-03-04 04:13 PST, Arpita Bahuguna
no flags
Patch (6.12 KB, patch)
2013-03-08 01:23 PST, Arpita Bahuguna
no flags
Performance test results (37.77 KB, text/html)
2013-04-10 06:54 PDT, Arpita Bahuguna
no flags
Performance test results on Mac (58.64 KB, text/html)
2013-04-12 06:57 PDT, Arpita Bahuguna
no flags
Updated perf test results on mac (71.62 KB, text/html)
2013-04-16 06:23 PDT, Arpita Bahuguna
no flags
Boris Zbarsky
Comment 1 2012-12-24 01:39:23 PST
Er, with Presto it might depend on the version you test, apparently...
Alexey Proskuryakov
Comment 2 2013-01-02 10:19:46 PST
See also: bug 90341.
Arpita Bahuguna
Comment 3 2013-02-18 21:55:07 PST
For the given testcase, if we were to use setAttribute() instead of setAttributeNS(), there is no issue. The specification for setAttribute() clearly states that the "name" parameter of the API should be converted to ASCII lowercase (before use) whereas, the specification for setAttributeNS() has no such mandate. [http://www.w3.org/TR/domcore/#dom-element-setattribute] This is perhaps what causes the setAttributeNS() method to store case sensitive attributes thereby failing the given testcase. The question is whether WebKit too should follow Gecko and handle setAttributeNS() in a manner similar to setAttribute().
Boris Zbarsky
Comment 4 2013-02-19 05:13:44 PST
setAttributeNS always preserves the case of the attribute name; that's true in both WebKit and in Gecko. You can see this trivially by just enumerating the attributes on the given testcase. This bug is purely about the behavior of getAttribute. In WebKit the argument to getAttribute is compared case-insensitively to all attribute names; if that fails to find a match further processing is done. So in WebKit the "FAIL" value is found. But per spec, the argument to getAttribute should be ASCII-lower-cased before doing any comparisons, so the "PASS" value should be found.
Boris Zbarsky
Comment 5 2013-02-19 05:14:10 PST
> compared case-insensitively to all attribute names I meant "compared case-sensitively".
Arpita Bahuguna
Comment 6 2013-02-19 06:22:48 PST
(In reply to comment #4) > setAttributeNS always preserves the case of the attribute name; that's true in both WebKit and in Gecko. You can see this trivially by just enumerating the attributes on the given testcase. > > This bug is purely about the behavior of getAttribute. In WebKit the argument to getAttribute is compared case-insensitively to all attribute names; if that fails to find a match further processing is done. So in WebKit the "FAIL" value is found. > > But per spec, the argument to getAttribute should be ASCII-lower-cased before doing any comparisons, so the "PASS" value should be found. Thank-you for the explanation Boris. I understand the issue now. I will check on this further.
Arpita Bahuguna
Comment 7 2013-02-22 06:24:47 PST
As I see it, there are two issues with the usage of getAttribute() and getAttributeNode() APIs in WebKit: 1. As exemplified by the given testcase, getAttribute() (and getAttributeNode()) should convert its argument to lowercase before comparing against existing attributes. This is as per the specification as well. 2. Say we have an existing attribute, stored via setAttributeNS(), in uppercase. getAttribute() (and getAttributeNode()) API will however still return the existing attribute even while comparing using lowercase. This is what causes the testcases specified in http://w3c-test.org/webapps/DOMCore/tests/submissions/Ms2ger/attributes.html for getAttribute() to fail. I shall try and address issue 1. in this bug and open another bug for the second issue.
Arpita Bahuguna
Comment 8 2013-03-04 04:13:22 PST
Andreas Kling
Comment 9 2013-03-04 06:05:21 PST
Comment on attachment 191201 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=191201&action=review r=me with one adjustment: > Source/WebCore/dom/Element.h:944 > - if (name == attribute->localName()) > + if ((shouldIgnoreAttributeCase ? name.lower() : name) == attribute->localName()) You can avoid creating a temporary string from .lower() by using WTF::equalPossiblyIgnoringCase: if (equalPossiblyIgnoringCase(name, attribute->localName(), shouldIgnoreAttributeCase)
Boris Zbarsky
Comment 10 2013-03-04 06:46:07 PST
> You can avoid creating a temporary string from .lower() by using > WTF::equalPossiblyIgnoringCase: > if (equalPossiblyIgnoringCase(name, attribute->localName(), > shouldIgnoreAttributeCase) Unless I'm totally missing something, that will give the wrong behavior, and in particular would fail the testcase this bug is filed on. The whole point is that per spec "SOMETHING" passed in to getAttribute in HTML should NOT match the attribute named "SOMETHING".
Andreas Kling
Comment 11 2013-03-04 06:52:55 PST
(In reply to comment #10) > > You can avoid creating a temporary string from .lower() by using > > WTF::equalPossiblyIgnoringCase: > > if (equalPossiblyIgnoringCase(name, attribute->localName(), > > shouldIgnoreAttributeCase) > > Unless I'm totally missing something, that will give the wrong behavior, and in particular would fail the testcase this bug is filed on. The whole point is that per spec "SOMETHING" passed in to getAttribute in HTML should NOT match the attribute named "SOMETHING". Well, duh. That's what I get for trying to infer the purpose of the change instead of reading the bug. You are right of course. Please disregard my comment, the patch is good as-is.
WebKit Review Bot
Comment 12 2013-03-04 22:59:56 PST
Comment on attachment 191201 [details] Patch Clearing flags on attachment: 191201 Committed r144726: <http://trac.webkit.org/changeset/144726>
WebKit Review Bot
Comment 13 2013-03-04 23:00:00 PST
All reviewed patches have been landed. Closing bug.
Arpita Bahuguna
Comment 14 2013-03-05 04:30:15 PST
(In reply to comment #9) > (From update of attachment 191201 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=191201&action=review > > r=me with one adjustment: > > > Source/WebCore/dom/Element.h:944 > > - if (name == attribute->localName()) > > + if ((shouldIgnoreAttributeCase ? name.lower() : name) == attribute->localName()) > > You can avoid creating a temporary string from .lower() by using WTF::equalPossiblyIgnoringCase: > if (equalPossiblyIgnoringCase(name, attribute->localName(), shouldIgnoreAttributeCase) Thanks for the r+ Kling. Sorry, I left for the day after uploading the patch, but thanks to Boris for pointing out the issue which would be caused if equalPossiblyIgnoringCase() were to be used.
Darin Adler
Comment 15 2013-03-05 09:44:18 PST
Surely we can do better for performance here. For calls inside the engine it seems silly to be calling lower() on strings known to be lowercase at the call site. And when there are multiple attributes it seems the call to lower() should be outside the loop.
Ojan Vafai
Comment 18 2013-03-06 19:24:29 PST
Can we revert first and try to figure out a more performant way to do this after? Darin has some good suggestions to try. As a matter of policy, we don't leave performance regressions in the tree for more than a few hours (or maybe a day) because it gets very difficult to make sure the regression is fully addressed when there are other performance affecting changes getting committed.
Arpita Bahuguna
Comment 19 2013-03-06 21:28:24 PST
(In reply to comment #18) > Can we revert first and try to figure out a more performant way to do this after? Darin has some good suggestions to try. > > As a matter of policy, we don't leave performance regressions in the tree for more than a few hours (or maybe a day) because it gets very difficult to make sure the regression is fully addressed when there are other performance affecting changes getting committed. Hi Ojan, sure this can be reverted. Will take Darin's suggestions and verify performance regressions before submitting the new patch.
WebKit Review Bot
Comment 20 2013-03-06 21:39:50 PST
Re-opened since this is blocked by bug 111681
Ojan Vafai
Comment 21 2013-03-07 00:18:34 PST
(In reply to comment #19) > Hi Ojan, sure this can be reverted. Will take Darin's suggestions and verify performance regressions before submitting the new patch. Thanks! I know it's a pain, but reverting and reworking the patch is the only way we've found so far to consistently avoid leaving performance regressions in.
Arpita Bahuguna
Comment 22 2013-03-08 01:23:21 PST
Arpita Bahuguna
Comment 23 2013-03-08 01:38:57 PST
(In reply to comment #15) > Surely we can do better for performance here. For calls inside the engine it seems silly to be calling lower() on strings known to be lowercase at the call site. And when there are multiple attributes it seems the call to lower() should be outside the loop. Hi Darin, Kling, Have uploaded another patch, hopefully, with not as glaring a performance regression as before. :) Have moved the conversion (to lower) outside the loop. I understand what you mean by "seems silly to be calling lower() on strings known to be lowercase at the call site" but at the places where we are converting to lowercase before calling on getAttributeItemIndex() we also pass the second param as false which would guarantee that we don't make another unnecessary conversion call within getAttributeItemIndex(). Also, I could either do the conversion within getAttributeItemIndex() or else, do it before calling this method (which is being done in some but not all the cases). These are the cases where they explicitly pass the second param as false intending it not to go through the slow check (namespace sensitive check). Perhaps we can differentiate between these bools (slow check and case insensitive check), i.e. introduce another param, or maybe always carry out the lowercase conversion before calling on getAttributeItemIndex().
Andreas Kling
Comment 24 2013-04-04 02:53:17 PDT
(In reply to comment #23) > (In reply to comment #15) > > Surely we can do better for performance here. For calls inside the engine it seems silly to be calling lower() on strings known to be lowercase at the call site. And when there are multiple attributes it seems the call to lower() should be outside the loop. > > Hi Darin, Kling, > Have uploaded another patch, hopefully, with not as glaring a performance regression as before. :) > > Have moved the conversion (to lower) outside the loop. I understand what you mean by "seems silly to be calling lower() on strings known to be lowercase at the call site" but at the places where we are converting to lowercase before calling on getAttributeItemIndex() we also pass the second param as false which would guarantee that we don't make another unnecessary conversion call within getAttributeItemIndex(). > > Also, I could either do the conversion within getAttributeItemIndex() or else, do it before calling this method (which is being done in some but not all the cases). These are the cases where they explicitly pass the second param as false intending it not to go through the slow check (namespace sensitive check). > Perhaps we can differentiate between these bools (slow check and case insensitive check), i.e. introduce another param, or maybe always carry out the lowercase conversion before calling on getAttributeItemIndex(). Could you run the Dromaeo tests Ojan mentioned with run-perf-tests before and after this patch and report results? It'd be good to have numbers here.
Arpita Bahuguna
Comment 25 2013-04-10 06:54:06 PDT
Created attachment 197261 [details] Performance test results
Arpita Bahuguna
Comment 26 2013-04-10 07:03:37 PDT
(In reply to comment #24) > > Could you run the Dromaeo tests Ojan mentioned with run-perf-tests before and after this patch and report results? It'd be good to have numbers here. Hi Kling, apologize for the mega delay in posting the performance test results. Have attached the results for the Dromeo tests (on Qt (r147518)), for both before and after the patch.
Arpita Bahuguna
Comment 27 2013-04-12 06:57:14 PDT
Created attachment 197814 [details] Performance test results on Mac Performance test results on mac r148010
Arpita Bahuguna
Comment 28 2013-04-12 07:10:49 PDT
Hi Kling, Have posted another set of Dromeo perf test results (on Mac r148010). The first set is for a clean build, the second set of results are with the patch and the third set of results are for a build with a slightly tweaked patch. For the third set, instead of: const AtomicString caseAdjustedName = shouldIgnoreAttributeCase ? name.lower() : name; have modified it as: AtomicString caseAdjustedName = name; if (shouldIgnoreAttributeCase) caseAdjustedName = name.lower(); (avoiding else) Would appreciate your thoughts on the same.
Arpita Bahuguna
Comment 29 2013-04-12 07:43:17 PDT
Comment on attachment 197814 [details] Performance test results on Mac Incorrect first run, hence obsoleting.
Andreas Kling
Comment 30 2013-04-12 07:47:12 PDT
> (In reply to comment #24) > Have attached the results for the Dromeo tests (on Qt (r147518)), for both before and after the patch. So there's a ~2% regression on the attribute-related tests? Eep. Better than 27% though :) (In reply to comment #28) > For the third set, instead of: > const AtomicString caseAdjustedName = shouldIgnoreAttributeCase ? name.lower() : name; > have modified it as: > AtomicString caseAdjustedName = name; > if (shouldIgnoreAttributeCase) > caseAdjustedName = name.lower(); > (avoiding else) > > Would appreciate your thoughts on the same. Can we use a const reference here, and avoid the extra ref()/deref()?
Arpita Bahuguna
Comment 31 2013-04-16 06:23:30 PDT
Created attachment 198323 [details] Updated perf test results on mac
Arpita Bahuguna
Comment 32 2013-04-16 07:14:23 PDT
Hi Kling, Have posted another set of results, finally (and hopefully) the correct ones this time. :) The first set (r148010) corresponds to Dromaeo results for a clean build. The second set (r148010-ternary) corresponds to the following change: const AtomicString caseAdjustedName = shouldIgnoreAttributeCase ? name.lower() : name; The third set (r148010-withIf()) corresponds to the following change: AtomicString caseAdjustedName = name; if (shouldIgnoreAttributeCase) caseAdjustedName = name.lower(); [Considering branch prediction, this should perhaps be a little faster than the previous approach. (??)] And the last set (r148010-ternaryWithRef) is for: const AtomicString& caseAdjustedName = shouldIgnoreAttributeCase ? name.lower() : name; Surprisingly the last set (with the const ref) deteriorates the performance even more! The second set (using if() with def initialization) shows an improved performance for most of the cases save for three: /Dromaeo/jslib-attr-jquery /Dromaeo/jslib-attr-prototype /Dromaeo/jslib-modify-prototype for which there is a marked performance deterioration. Would appreciate your opinion for these results.
Andreas Kling
Comment 33 2013-04-17 07:38:44 PDT
Comment on attachment 192176 [details] Patch Okay, I think this will work as-is. I'm pretty sure that the variations in your test results are noise. We'll get more stable numbers when this gets tested on the perf bots. So r=me.
WebKit Commit Bot
Comment 34 2013-04-17 09:13:22 PDT
Comment on attachment 192176 [details] Patch Clearing flags on attachment: 192176 Committed r148614: <http://trac.webkit.org/changeset/148614>
WebKit Commit Bot
Comment 35 2013-04-17 09:13:25 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.