WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
138170
Add initial parsing functionality of :lang pseudo class in Selectors Level 4.
https://bugs.webkit.org/show_bug.cgi?id=138170
Summary
Add initial parsing functionality of :lang pseudo class in Selectors Level 4.
Dhi Aurrahman
Reported
2014-10-29 06:24:48 PDT
Extend the :lang() pseudo-class to accept comma-separated list arguments
Attachments
Patch
(11.72 KB, patch)
2014-10-29 06:26 PDT
,
Dhi Aurrahman
no flags
Details
Formatted Diff
Diff
Patch
(14.83 KB, patch)
2014-10-29 09:59 PDT
,
Dhi Aurrahman
no flags
Details
Formatted Diff
Diff
Patch
(14.28 KB, patch)
2014-10-29 23:06 PDT
,
Dhi Aurrahman
no flags
Details
Formatted Diff
Diff
Patch
(14.28 KB, patch)
2014-10-29 23:12 PDT
,
Dhi Aurrahman
no flags
Details
Formatted Diff
Diff
Patch
(19.28 KB, patch)
2014-10-30 00:59 PDT
,
Dhi Aurrahman
no flags
Details
Formatted Diff
Diff
Patch
(19.38 KB, patch)
2014-10-30 02:12 PDT
,
Dhi Aurrahman
no flags
Details
Formatted Diff
Diff
Patch
(19.38 KB, patch)
2014-10-30 16:03 PDT
,
Dhi Aurrahman
no flags
Details
Formatted Diff
Diff
Patch
(27.48 KB, patch)
2014-10-30 23:33 PDT
,
Dhi Aurrahman
no flags
Details
Formatted Diff
Diff
Patch
(28.81 KB, patch)
2014-10-31 01:31 PDT
,
Dhi Aurrahman
no flags
Details
Formatted Diff
Diff
Show Obsolete
(8)
View All
Add attachment
proposed patch, testcase, etc.
Dhi Aurrahman
Comment 1
2014-10-29 06:26:21 PDT
Created
attachment 240598
[details]
Patch
Dhi Aurrahman
Comment 2
2014-10-29 09:59:23 PDT
Created
attachment 240608
[details]
Patch
Benjamin Poulain
Comment 3
2014-10-29 18:19:32 PDT
Comment on
attachment 240608
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=240608&action=review
First round of review really quick. I did not look into the crash yet. Can you make an update from the comments above? I'll try it locally.
> Source/WebCore/ChangeLog:27 > + * css/SelectorChecker.cpp:
Let's do matching in a separate patch. There is enough complexity in parsing alone to justify a patch.
> Source/WebCore/css/CSSGrammar.y.in:1555 > +comma_separated_identifiers:
There is one more case I did not mention in my email: errors. Let's say we have ":lang(foo, bar, 25)" The grammar would work like this on "foo, bar, 25"): 1) Find that "foo" is an IDENT. 2) Get the "," with look ahead. 3) Resolve the IDENT to "comma_separated_identifiers". Create the new Vector<CSSParserString>. 4) (Resolve the maybe_space) 5) Find that bar is an IDENT. 6) Get the "," with look ahead. 7) Resolve "comma_separated_identifiers ',' maybe_space IDENT" to comma_separated_identifiers", append the new IDENT. 8) Find that "25" is not an IDENT 9) Fails to parse this rule. Now the problem is the Vector allocated in [3] was never freed by the parser. To solve that, you will need one more case handling the error and deleting the vector. You can look at the rule for "selector_list" as an example.
> Source/WebCore/css/CSSParserValues.cpp:255 > +void CSSParserSelector::adoptStringVector(Vector<CSSParserString>& argumentVector)
I think I would prefer setArgumentList() here instead of adoptStringVector(). The function does not really adopt the values or the vector, it creates a new vector based on the identifiers.
> Source/WebCore/css/CSSParserValues.cpp:256 > +{
At the function entry, it would be good to test for valid input: ASSERT_WITH_MESSAGE(!argumentVector.isEmpty(), "No CSS Selector accept an empty argument list.");
> Source/WebCore/css/CSSParserValues.cpp:257 > + auto argumentList = std::make_unique<Vector<AtomicString>>();
After creating the vector, you can initialize its memory: argumentList->reserveInitialCapacity(argumentVector->size()); That way the following loop does not need to grow the vector dynamically, the vector will already be of the right size.
> Source/WebCore/css/CSSParserValues.cpp:258 > + for (const auto& argument : argumentVector)
I would prefer the type CSSParserString instead of "auto" here to avoid any confusion between CSSParserString and AtomicString when reading the code.
> Source/WebCore/css/CSSParserValues.cpp:259 > + argumentList->append(static_cast<String>(argument));
Wouldn't this code work without the static_cast through implicit conversion? If not, the cast should be to AtomicString instead of String.
> Source/WebCore/css/CSSParserValues.h:203 > +#if ENABLE(CSS_SELECTORS_LEVEL4) > + void adoptStringVector(Vector<CSSParserString>& stringVector); > +#endif
Let's move this after the definition of adoptSelectorVector().
> Source/WebCore/css/CSSSelector.cpp:257 > +static void appendArgumentList(StringBuilder& str, const Vector<AtomicString>* argumentList)
Let's pass argumenList by reference instead of pointer.
> Source/WebCore/css/CSSSelector.cpp:260 > + if (!argumentList) > + return;
Let's remove this.
> Source/WebCore/css/CSSSelector.cpp:261 > + for (auto& argument : *argumentList) {
auto& -> const AtomicString& This is a personal preference, other reviewers would not necessary say the same. I prefer using "auto" only when the type is obvious from an operation on the same line.
> Source/WebCore/css/CSSSelector.cpp:262 > + if (argument != argumentList->last())
It is better to evaluate "argumentList->last()" out of the loop and keep the reference in a temporary variable. The reason is that the compiler will have to reevaluate this expression from memory every time. The implementation of last() is: T& last() { return at(size() - 1); } To do that, the code will have to read the size from memory, subtract 1 from the size, load the AtomicString from memory, and compare the string to argument. Why I don't think the compiler will optimize this out of the loop by itself? The reason is StringBuilder::appendLiteral() and StringBuilder::append() will eventually do one out-of-line function call. The compiler does not know what those function does to the memory. Because of that unknown, it cannot ensure that the input vector was not modified, and it cannot prove that argumentList->last() will return the same value. Thus it needs to evaluate that code every time.
> Source/WebCore/css/CSSSelector.cpp:413 > + appendArgumentList(str, cs->argumentList()); > + str.append(')');
Your grammar make it incorrect to have an empty list. Here you could do: ASSERT_WITH_MESSAGE(cs->argumentList() && !cs->argumentList().isEmpty(), "An empty :lang() is invalid and should never be generated by the parser.");
> Source/WebCore/css/CSSSelector.h:318 > AtomicString m_argument; // Used for :contains, :lang and :nth-*
Let's remove :lang from this comment.
> Source/WebCore/css/CSSSelector.h:320 > + std::unique_ptr<Vector<AtomicString>> m_argumentList; // Used for :lang
You can remove the comment, the code is clear enough IMHO.
> Source/WebCore/css/SelectorChecker.cpp:866 > + for (const auto& argument : *argumentList) { > + if (matchesLangPseudoClass(element, argument.impl()))
We will need to be more efficient than that, but we'll see that in the next patch.
> Source/WebCore/cssjit/SelectorCompiler.cpp:734 > + if (argumentList->isEmpty() || argumentList->size() > 1)
You could assert that argumentList->isEmpty() here instead of checking at runtime.
> Source/WebCore/cssjit/SelectorCompiler.cpp:737 > + const AtomicString& argument = argumentList->at(0);
at(0) = first()
> LayoutTests/ChangeLog:4 > + Extend the :lang() pseudo-class to accept comma-separated list arguments > +
https://bugs.webkit.org/show_bug.cgi?id=138170
The way you can test this without implementing SelectorChecker is through CSSOM. For example, you can see the tests added in: -
http://trac.webkit.org/changeset/173698
-
http://trac.webkit.org/changeset/174259
Dhi Aurrahman
Comment 4
2014-10-29 23:06:47 PDT
Created
attachment 240660
[details]
Patch
Dhi Aurrahman
Comment 5
2014-10-29 23:12:43 PDT
Created
attachment 240661
[details]
Patch
Benjamin Poulain
Comment 6
2014-10-30 00:14:32 PDT
Sorry, I was not clear. You don't need to implement the matching of multiple languages now but you need to keep the current functionality working. In SelectorChecker.cpp, you can do something like: #if ENABLE(CSS_SELECTORS_LEVEL4) const AtomicString& argument = selector->argumentList()->first(); #else const AtomicString& argument = selector->argument(); #endif And something similar for the compiler. Did you find what was causing the crash?
Benjamin Poulain
Comment 7
2014-10-30 00:17:37 PDT
Comment on
attachment 240661
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=240661&action=review
Quick look before going to bed:
> Source/WebCore/css/CSSSelector.cpp:263 > + str.append(',');
This should be ", " for CSSOM.
> LayoutTests/ChangeLog:12 > + * fast/css/css-selector-text-expected.txt: Updated > + * fast/css/css-selector-text.html: Updated
Let's also have tests make sure that: -empty :lang(). -invalid language cause parsing errors.
Dhi Aurrahman
Comment 8
2014-10-30 00:59:33 PDT
Created
attachment 240662
[details]
Patch
Dhi Aurrahman
Comment 9
2014-10-30 01:01:14 PDT
(In reply to
comment #6
)
> Sorry, I was not clear. > > You don't need to implement the matching of multiple languages now but you > need to keep the current functionality working. > > In SelectorChecker.cpp, you can do something like: > > #if ENABLE(CSS_SELECTORS_LEVEL4) > const AtomicString& argument = selector->argumentList()->first(); > #else > const AtomicString& argument = selector->argument(); > #endif
Yes, I'm sorry I forgot to put --no-review, I was just wanted to know which tests are affected.
> > And something similar for the compiler. > > Did you find what was causing the crash?
I cleanup the build and hey, it works.
Dhi Aurrahman
Comment 10
2014-10-30 02:12:43 PDT
Created
attachment 240663
[details]
Patch
Dhi Aurrahman
Comment 11
2014-10-30 16:03:22 PDT
Created
attachment 240711
[details]
Patch
Benjamin Poulain
Comment 12
2014-10-30 17:51:28 PDT
Comment on
attachment 240711
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=240711&action=review
The patch looks correct. cq- because I had some ideas for test coverage but everything looks good.
> Source/WebCore/css/CSSParserValues.cpp:263 > + ASSERT_WITH_MESSAGE(!stringVector.isEmpty(), "No CSS Selector accept an empty argument list.");
accept->accepts/takes
> LayoutTests/fast/css/css-selector-text.html:86 > +testSelectorRoundTrip(":lang(fr, fr-ca, fr-be)");
Can you please add a test case with: "de-DE, de-DE-1996, de-Latn-DE, de-Latf-DE, de-Latn-DE-1996, de-CH, it-CH, fr-CH, rm-CH"? They are the examples in the spec, it would be a good idea to have them in our parsing tests. Let's also have :lang(lang) to make sure the internal lang is recognized properly.
> LayoutTests/fast/css/css-selector-text.html:325 > +shouldBe("parseThenSerializeRule(':lang(a ) { }')", "':lang(a) { }'"); > +shouldBe("parseThenSerializeRule(':lang(a, b, c ) { }')", "':lang(a, b, c) { }'"); > +shouldBe("parseThenSerializeRule(':lang(a, b, c ) { }')", "':lang(a, b, c) { }'"); > +shouldBe("parseThenSerializeRule(':lang(a,b, c,d,e ) { }')", "':lang(a, b, c, d, e) { }'");
Looks like we need one ore two more cases of normalization: when there is one ore more spaces after the opening parenthesis. E.g.: :lang( a) :lang( a ) :lang( a , b , c )
> LayoutTests/fast/selectors/lang-invalid.html:28 > +description('Verify invalid :lang() selectors.');
Awesome.
> LayoutTests/fast/selectors/lang-invalid.html:30 > +shouldThrow('document.querySelectorAll(":lang()").length');
Let's add cases for various valid tokens and keyword: -:lang(:lang(en)) -:lang()) -:lang({}) -:lang({) -:lang(}}) -:lang([]) -:lang([) -:lang(]) -:lang(@media screen {}) ...that's all I can think of, feel free to add other cases that could be break parsing.
Dhi Aurrahman
Comment 13
2014-10-30 23:33:59 PDT
Created
attachment 240726
[details]
Patch
Benjamin Poulain
Comment 14
2014-10-30 23:39:38 PDT
Comment on
attachment 240726
[details]
Patch Lots of good ideas in the new tests. Let's r+ already. I'll wait for the bots to cq+.
Dhi Aurrahman
Comment 15
2014-10-31 01:31:40 PDT
Created
attachment 240729
[details]
Patch
Dhi Aurrahman
Comment 16
2014-10-31 01:38:11 PDT
(In reply to
comment #15
)
> Created
attachment 240729
[details]
> Patch
1. It removes the 'warning: unset value: $$' 2. Add some tests on multiple arguments, but some of it are invalid. e.g. shouldThrow("parseThenSerializeRule(':lang(12, b, c) { }')"); shouldThrow("parseThenSerializeRule(':lang(a, 12, c) { }')"); shouldThrow("parseThenSerializeRule(':lang(a, b, 12) { }')"); shouldThrow("parseThenSerializeRule(':lang(a, 12, 12) { }')"); shouldThrow("parseThenSerializeRule(':lang(12, b, 12) { }')");
Benjamin Poulain
Comment 17
2014-10-31 19:29:15 PDT
Comment on
attachment 240729
[details]
Patch Great job. Let's land it.
WebKit Commit Bot
Comment 18
2014-10-31 20:07:38 PDT
Comment on
attachment 240729
[details]
Patch Clearing flags on attachment: 240729 Committed
r175446
: <
http://trac.webkit.org/changeset/175446
>
WebKit Commit Bot
Comment 19
2014-10-31 20:07:43 PDT
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug