Bug 30926 - Optimizations to Element::getAttribute
Summary: Optimizations to Element::getAttribute
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Enhancement
Assignee: Jens Alfke
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-10-29 13:49 PDT by Jens Alfke
Modified: 2009-11-10 14:29 PST (History)
2 users (show)

See Also:


Attachments
patch (2.57 KB, patch)
2009-10-29 13:56 PDT, Jens Alfke
no flags Details | Formatted Diff | Diff
oops, correct patch (2.57 KB, patch)
2009-10-29 13:59 PDT, Jens Alfke
darin: review-
Details | Formatted Diff | Diff
patch 3 (4.30 KB, patch)
2009-10-29 15:36 PDT, Jens Alfke
darin: review+
Details | Formatted Diff | Diff
patch 4 with better optimization (5.00 KB, patch)
2009-10-30 14:19 PDT, Jens Alfke
darin: review+
Details | Formatted Diff | Diff
patch with "if(Attribute*..." restored (5.10 KB, patch)
2009-11-09 13:24 PST, Jens Alfke
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jens Alfke 2009-10-29 13:49:42 PDT
Driven by some Shark profiles, I've added some optimizations to Element::getAttribute(String). This is a hot call in some of the Dromaeo DOM benchmarks.

(1) When checking whether the attribute name is 'style', do a case-insensitive compare instead of creating a temporary lowercased string.
(2) Avoid this comparison entirely if m_isStyleAttributeValid is set
(3) In NamedNodeMap::getAttributeItem, avoid a bunch of unnecessary calls to QualifiedName::toString (which generates two temp strings) and redundant string comparisons.

getAttributeItem() could be optimized further to avoid having to generate temporary strings altogether, but that's farther than I want to go right now (and will only have an effect if there are attributes with prefixed names, which I think is rare in HTML.)
Comment 1 Jens Alfke 2009-10-29 13:56:06 PDT
Created attachment 42139 [details]
patch
Comment 2 Jens Alfke 2009-10-29 13:59:06 PDT
Created attachment 42140 [details]
oops, correct patch

Oops, I attached an obsolete patch file. This is the right one.
Comment 3 Darin Adler 2009-10-29 14:44:15 PDT
Comment on attachment 42140 [details]
oops, correct patch

Patches for review need change log entries in the patch. Please don't put them up for review without the change log entry since it makes extra work for reviewers.

Since this is a performance optimization, were you able to measure the speedup?

> Index: WebCore/dom/Element.cpp
> ===================================================================
> --- WebCore/dom/Element.cpp	(revision 50224)
> +++ WebCore/dom/Element.cpp	(working copy)
> @@ -488,9 +488,12 @@ static inline bool shouldIgnoreAttribute
>  
>  const AtomicString& Element::getAttribute(const String& name) const
>  {
> -    String localName = shouldIgnoreAttributeCase(this) ? name.lower() : name;
> -    if (localName == styleAttr.localName() && !m_isStyleAttributeValid)
> -        updateStyleAttribute();
> +    bool ignoreCase = shouldIgnoreAttributeCase(this);
> +    // Update the 'style' attribute if it's invalid and being requested:
> +    if (!m_isStyleAttributeValid)
> +        if (ignoreCase ? equalIgnoringCase(name, styleAttr.localName())
> +                       : (name == styleAttr.localName()))
> +            updateStyleAttribute();

Multiple line if statement body needs braces to match WebKit style guide.

I also think you should avoid the unusual formatting here -- we frown on lining code up like the "?" and ":" here since it's high maintenance (has to be reformatted if you rename things).

One way to avoid that would be to put the comparison into an inline function that takes a boolean:

    static bool equalPossiblyIgnoringCase(const String& a, const AtomicString& b, bool ignoreCase)
    {
        if (ignoreCase)
            return equalIgnoringCase(a, b);
        return a == b;
    }

Then you could change things back to a single line if,.

    if (!m_isStyleAttributeValue && equalPossiblyIgnoringCase(name, styleAttr.localName(), ignoreCase))
        updateStyleAttribute();

Maybe you like this, or maybe you will find some other way of eliminating the unusual formatting, or maybe you will want to start a thread on webkit-dev suggesting we do this style of formatting ;-)

>      if (namedAttrMap)
> -        if (Attribute* a = namedAttrMap->getAttributeItem(name, shouldIgnoreAttributeCase(this)))
> +        if (Attribute* a = namedAttrMap->getAttributeItem(name, ignoreCase))
>              return a->value();

Since you're touching this code, you should add the braces we would have used if writing this code anew, and give the local variable a name that's a word instead of a letter.

> +            // FIXME: Would be faster to do this test without generating a temporary string.

I think the FIXME needs to explain why you didn't do it. Maybe just wording it this way: "Would be faster if we could compare a QualifiedName to a string without generating a temporary string."
Comment 4 Jens Alfke 2009-10-29 15:36:28 PDT
Created attachment 42154 [details]
patch 3

Sorry about forgetting the changelog again; don't know why I keep doing that...

I took your advice to create an inline string-comparison helper, but added it to the String class since I needed it in both source files and it could be useful elsewhere.
Comment 5 Eric Seidel (no email) 2009-10-29 15:41:09 PDT
Comment on attachment 42154 [details]
patch 3

Does this make a specific performance test demonstrably faster?
Comment 6 Eric Seidel (no email) 2009-10-29 15:41:58 PDT
(I don't mean to sound like I'm doubting that it does.  Just interested in knowing more specifics.)
Comment 7 Darin Adler 2009-10-29 15:49:53 PDT
Comment on attachment 42154 [details]
patch 3

You didn’t answer my question from the original review: "Since this is a performance optimization, were you able to measure the speedup?"

> +    bool ignoreCase = shouldIgnoreAttributeCase(this);
> +    // Update the 'style' attribute if it's invalid and being requested:
> +    if (!m_isStyleAttributeValid && equalPossiblyIgnoringCase(name, styleAttr.localName(), ignoreCase))
>          updateStyleAttribute();

I would have put a blank line before the comment to make the paragraphing look right.

> -    if (namedAttrMap)
> -        if (Attribute* a = namedAttrMap->getAttributeItem(name, shouldIgnoreAttributeCase(this)))
> -            return a->value();
> +    if (namedAttrMap) {
> +        Attribute* attribute = namedAttrMap->getAttributeItem(name, ignoreCase);
> +        if (attribute)
> +            return attribute->value();
> +    }

You took the definition of the local variable out of the if statement. Why?

> +    return ignoreCase ? equalIgnoringCase(a,b) : (a == b);

Need a space after the comma.

Since the only comments here are a small number of trivial formatting things, I'll say r=me
Comment 8 Jens Alfke 2009-10-30 11:35:14 PDT
> Since this is a performance optimization, were you able to measure the speedup?

Good question. I had seen improvement on Dromaeo's DOM Attributes test yesterday, but hadn't written the numbers down. I tried to replicate it this morning, doing at least six runs either way, but the numbers were now coming up a few points _worse_ with the patch. WTF? I must have either gotten the before/after results mixed up yesterday, or hit some lucky jitter in the test timing that made it better...

After more Shark'ing I've found that the problem is that equalIgnoringCase is much, much slower than equal, and the old getAttributeItem code was (intentionally?) optimized for the case where the very first attribute in the list exactly matches the input name. In that case it does only a single == comparison and exits. That seems to be what's going on in that Dromaeo test, judging by the sample counts. In any other situation it would have been faster with the patch.

This gives me the idea of first scanning the list case-sensitively, and then only scanning insensitively if that fails. This will be faster in the presumably-typical case where the attribute is searched for with the same case in which it appears in the document (and that most searches are for attributes that exist.) I'll try that and see how it affects the scores.
Comment 9 Jens Alfke 2009-10-30 14:19:01 PDT
Created attachment 42232 [details]
patch 4 with better optimization

OK, here is an improved optimization for NamedAttrMap. It runs through the list once doing equality checks on the prefix-less attributes, and if that fails, if necessary it runs through again doing the full check.

In Chromium, I saw a 20% speedup in the getAttribute sub-test, and a 3% speedup in the DOM Attributes test it's part of. Other core DOM tests were not noticeably affected.
In Safari, there was only a 3% improvement in the getAttribute sub-test and no noticeable change in the tests.

From a bit of Shark'ing around, it looks like the difference is because Safari spends a lot of time in JSC string copying and GC during the getAttribute test, while Chromium doesn't. So in Safari, the improvement to Element::getAttribute(String) is lost in the noise. Not sure why the difference in behavior. That also explains why Chromium is already 50% faster than Safari on the DOM Attributes test. (It sucks on some of the other tests, though.)
Comment 10 Darin Adler 2009-10-30 14:40:10 PDT
The Core DOM test suite is not a performance test and not at all representative of real use on the web.

I’m not sure we should be using it to guide our decisions about performance optimizations.
Comment 11 Darin Adler 2009-10-30 14:40:31 PDT
Oh, sorry, my bad. The test in question is Dromaeo.
Comment 12 Darin Adler 2009-10-30 14:41:54 PDT
(In reply to comment #11)
> Oh, sorry, my bad. The test in question is Dromaeo.

And that *is* a performance test.
Comment 13 Jens Alfke 2009-11-06 16:43:22 PST
Patch #4 has been out for a week; any chance of a review soon? (I know that patch 3 got a review+, but I made some nontrivial changes after that and I think it needs to be looked at again.)
Comment 14 Darin Adler 2009-11-06 16:47:09 PST
Comment on attachment 42232 [details]
patch 4 with better optimization

> +        Attribute* attribute = namedAttrMap->getAttributeItem(name, ignoreCase);
> +        if (attribute)
> +            return attribute->value();

You took the local variable definition out of the if. Why?

r=me
Comment 15 Jens Alfke 2009-11-09 11:39:08 PST
> You took the local variable definition out of the if. Why?

Oops, I forgot to answer that the first time; sorry. The answer is that I think assignments inside conditional expressions are bad, both because they're hard to notice and because the same construct infamously arises due to a typo ("if (x=5) ..."). I'll use them sometimes in a while() test because they can make the code a lot simpler, but otherwise I think they're bad news, and the one here didn't seem necessary.
Comment 16 Darin Adler 2009-11-09 11:55:18 PST
(In reply to comment #15)
> Oops, I forgot to answer that the first time; sorry. The answer is that I think
> assignments inside conditional expressions are bad, both because they're hard
> to notice and because the same construct infamously arises due to a typo ("if
> (x=5) ..."). I'll use them sometimes in a while() test because they can make
> the code a lot simpler, but otherwise I think they're bad news, and the one
> here didn't seem necessary.

This code:

    if (x = 5)

gives a warning under gcc and so can't be checked in to our project. We are talking about this code:

    if (int max = maxLength())

which seems like a separate issue to me and not at all hard to notice. I'd appreciate it if you consider either following the prevailing style in WebKit or discussing with the others on webkit-dev if you want to change the style.

I understand you personally think these are bad.
Comment 17 Jens Alfke 2009-11-09 12:57:05 PST
I'm sorry, I didn't realize this was a contentious issue. There's nothing about it one way or the other in the WebKit style guide, and I don't remember having seen any other instances of this in WebKit code; so I thought it was just a bit of cruft that crept in, and I cleaned it up while I was already editing those same lines.

(In some earlier patches I was specifically told during the review to _not_ follow a prevailing idiom in that source file, and even to optionally clean up existing usage -- an example is the usage of PassRefPtrs as local variables in StringImpl.cpp.)

> which seems like a separate issue to me and not at all hard to notice.

Well, I personally found that block confusing; it took a bit of searching to locate where the variable 'max' was declared, since I couldn't find a line beginning "int max =...". I wasn't even aware that variables declared in the scope of an 'if' would be in context inside the if's body — I don't think I've ever seen that done before.

Anyway, I'll change it back and upload a new patch in a minute.
Comment 18 Jens Alfke 2009-11-09 13:24:20 PST
Created attachment 42795 [details]
patch with "if(Attribute*..." restored

Only change is to the previously discussed 'if' statement:

-        if (Attribute* a = namedAttrMap->getAttributeItem(name, shouldIgnoreAttributeCase(this)))
-            return a->value();
+        if (Attribute* attribute = namedAttrMap->getAttributeItem(name, ignoreCase))
+            return attribute->value();
Comment 19 Jens Alfke 2009-11-10 14:29:50 PST
Committed revision 50763.