WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
165368
[CSS Parser] Eliminate in-place lowercasing in the parser.
https://bugs.webkit.org/show_bug.cgi?id=165368
Summary
[CSS Parser] Eliminate in-place lowercasing in the parser.
Dave Hyatt
Reported
2016-12-04 11:00:37 PST
[CSS Parser] Eliminate in-place lowercasing in the parser.
Attachments
Patch
(19.27 KB, patch)
2016-12-04 11:11 PST
,
Dave Hyatt
no flags
Details
Formatted Diff
Diff
Patch
(19.13 KB, patch)
2016-12-04 11:13 PST
,
Dave Hyatt
no flags
Details
Formatted Diff
Diff
Patch
(20.67 KB, patch)
2016-12-04 11:42 PST
,
Dave Hyatt
no flags
Details
Formatted Diff
Diff
Patch
(21.78 KB, patch)
2016-12-04 12:26 PST
,
Dave Hyatt
darin
: review+
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Dave Hyatt
Comment 1
2016-12-04 11:11:27 PST
Created
attachment 296090
[details]
Patch
Dave Hyatt
Comment 2
2016-12-04 11:13:25 PST
Created
attachment 296091
[details]
Patch
Dave Hyatt
Comment 3
2016-12-04 11:42:23 PST
Created
attachment 296092
[details]
Patch
Dave Hyatt
Comment 4
2016-12-04 12:26:21 PST
Created
attachment 296097
[details]
Patch
Darin Adler
Comment 5
2016-12-04 14:18:36 PST
Comment on
attachment 296097
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=296097&action=review
Looks great!
> Source/WebCore/css/CSSSelector.cpp:89 > + AtomicString value(m_data.m_value); > + if (m_data.m_value) > + m_data.m_value->deref();
Instead of the three lines above, we should write this: AtomicString value { adoptRef(m_data.m_value) }; This avoids some reference count churn.
> Source/WebCore/css/CSSSelector.cpp:90 > + m_data.m_rareData = &RareData::create(value).leakRef();
Should use WTFMove on the AtomicString value to pass in to avoid a little reference count churn: m_data.m_rareData = &RareData::create(WTFMove(value)).leakRef();
> Source/WebCore/css/CSSSelector.cpp:826 > +CSSSelector::RareData::RareData(const AtomicString& value)
Should take AtomicString&& instead of AtomicString. Avoid one round of reference count churn.
> Source/WebCore/css/CSSSelector.h:245 > + void setValue(const AtomicString&, bool matchLowerCase = false);
I think this needs a different name because "match lower case" sounds like "should I match some lower case thing". I suggest the name shouldConvertToLowercaseForMatching. But maybe you can find a great name that is shorter.
> Source/WebCore/css/CSSSelector.h:353 > + static Ref<RareData> create(const AtomicString& value) { return adoptRef(*new RareData(value)); }
Should be AtomicString&& instead of const AtomicString& and then use WTFMove inside the call to the constructor.
> Source/WebCore/css/CSSSelector.h:374 > + RareData(const AtomicString& value);
Should be AtomicString&&. No need for the name "value". Should be marked explicit.
> Source/WebCore/css/CSSSelector.h:468 > +
Should not add the stray space here.
> Source/WebCore/css/CSSSelector.h:473 > + if (matchLowerCase && !m_hasRareData && !isASCIILowerCSSIdentifier(value)) > + createRareData();
Here’s how I would write this: AtomicString matchingValue = matchLowerCase ? value.convertToASCIILowercase() : value; if (!m_hasRareData && matchingValue != value) createRareData(); Here’s why: convertToASCIILowercase is already optimized for the case where there are no uppercase characters, and so it should be fast in that case, possibly even faster than isASCIILowerCSSIdentifier. And in the case where we already have rare data, this was something we were going to do below anyway. Then below it would be this: m_data.m_rareData->m_matchingValue = WTFMove(matchingValue); Doing that avoids adding reference count churn just because we had a temporary AtomicString.
> Source/WebCore/css/MediaQueryExp.cpp:296 > : m_mediaFeature(feature) > , m_isValid(false) > { > + if (!isASCIILowerCSSIdentifier(m_mediaFeature)) > + m_mediaFeature = m_mediaFeature.convertToASCIILowercase();
Calling isASCIILowerCSSIdentifier isn’t an effective optimization. AtomicString::convertToASCIILowercase already has a fast path for when nothing is uppercase. It should just be written like this: : m_mediaFeature(feature.convertToASCIILowercase()) No need for the if statement.
> Source/WebCore/css/parser/CSSParserIdioms.h:67 > +inline bool isASCIILowerCSSIdentifier(const AtomicString& value)
This function should just be named containsASCIIUppercase. We might also want to make version with a fast case for 8 bit; this version checks 8-bit every time through the loop. But more importantly, if you take my advice above, then you don’t need this function at all.
> Source/WebCore/css/parser/CSSPropertyParser.cpp:2127 > + if (context.isHTMLDocument && !isASCIILowerCSSIdentifier(attrName)) > + attrName = attrName.convertToASCIILowercase();
Again, the isASCIILowerCSSIdentifier check here is not an optimization. Just write this: if (context.isHTMLDocument) attrName = attrName.convertToASCIILowercase(); But it’s a tiny bit annoying that we did toAtomicString on a string that could have contained uppercase; there is a chance we will be destroying that AtomicString we made because we make the lowercase one instead. If we care to tighten that case we could probably add a CSSParserToken function that produces a possibly-lowercased AtomicString efficiently and never has to allocate two strings. But that’s probably overdoing it.
> Source/WebCore/css/parser/CSSSelectorParser.cpp:513 > + auto lowerValue = token.value().toAtomicString(); > + if (!isASCIILowerCSSIdentifier(lowerValue)) > + lowerValue.convertToASCIILowercase(); > + auto value = StringView(lowerValue);
No reason to use AtomicString here, and isASCIILowerCSSIdentifier is not an optimization. So this should be just this: auto lowercasedValue = token.value().toString().convertToASCIILowercase(); auto value = StringView { lowercasedValue }; Same thought as above about the extra temporary string here; the token class could be taught to make a string that is already lowercased rather than doing it after allocating a String. This will be slower than the old code that lowercased in place when uppercase characters are involved, because that code would never allocate two strings.
> Source/WebCore/css/parser/CSSSelectorParser.cpp:523 > + auto tokenString = value.toString(); > if (!tokenString.startsWithIgnoringASCIICase("host")) { > newValue = value.toString() + "("; > value = newValue;
This code allocates a lot of unnecessary strings. This version allocates no StringImpl when it doesn't start with "host" and allocates only one StringImpl when it does. The old code allocated 1 and 3 StringImpl in those cases! if (!value.startsWithIgnoringASCIICase(StringVIew { "host" })) { newValue = makeString(value, '('); value = newValue; } And if we have some reason to wait to optimize it even tighter, we could easily write a startsWithLettersIgnoringASCIICase function that takes a StringView and then we could have written: if (!startsLettersWithIgnoringASCIICase(value, "host")) { newValue = makeString(value, '('); value = newValue; } But that seems like maybe premature optimization whereas the other one above just seems like common sense.
Darin Adler
Comment 6
2016-12-04 14:20:23 PST
Comment on
attachment 296097
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=296097&action=review
>> Source/WebCore/css/CSSSelector.h:245 >> + void setValue(const AtomicString&, bool matchLowerCase = false); > > I think this needs a different name because "match lower case" sounds like "should I match some lower case thing". I suggest the name shouldConvertToLowercaseForMatching. But maybe you can find a great name that is shorter.
Maybe matchLowerCase is not so bad. Or maybe matchAsLowerCase.
Dave Hyatt
Comment 7
2016-12-04 14:58:58 PST
Note that the code involving the "host" check and all that is just temporary and can be removed once we have gotten rid of the old parser. I still made your suggested change to it, but it's living on borrowed time anyway.
Dave Hyatt
Comment 8
2016-12-04 16:36:11 PST
Fixed in
r209314
.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug