RESOLVED FIXED 138281
Add selector checker for :lang pseudo class in Selectors level 4
https://bugs.webkit.org/show_bug.cgi?id=138281
Summary Add selector checker for :lang pseudo class in Selectors level 4
Dhi Aurrahman
Reported 2014-11-01 11:24:42 PDT
Add selector checker for :lang pseudo class in Selectors level 4
Attachments
Patch (12.45 KB, patch)
2014-11-01 11:33 PDT, Dhi Aurrahman
no flags
Archive of layout-test-results from webkit-ews-09 for mac-mountainlion-wk2 (736.51 KB, application/zip)
2014-11-01 12:41 PDT, Build Bot
no flags
Archive of layout-test-results from webkit-ews-08 for mac-mountainlion (744.74 KB, application/zip)
2014-11-01 13:03 PDT, Build Bot
no flags
Patch (10.56 KB, patch)
2014-11-08 18:46 PST, Dhi Aurrahman
no flags
Patch (13.95 KB, patch)
2014-11-14 08:22 PST, Dhi Aurrahman
no flags
Patch (16.64 KB, patch)
2014-11-15 04:46 PST, Dhi Aurrahman
no flags
Patch (16.69 KB, patch)
2014-11-15 05:28 PST, Dhi Aurrahman
no flags
Patch (29.90 KB, patch)
2014-11-17 22:08 PST, Dhi Aurrahman
no flags
Patch (30.51 KB, patch)
2014-11-18 02:02 PST, Dhi Aurrahman
no flags
Patch (30.55 KB, patch)
2014-11-18 02:17 PST, Dhi Aurrahman
no flags
Patch (30.39 KB, patch)
2014-11-18 03:18 PST, Dhi Aurrahman
no flags
Patch (30.76 KB, patch)
2014-11-18 23:01 PST, Dhi Aurrahman
no flags
Dhi Aurrahman
Comment 1 2014-11-01 11:33:10 PDT
Dhi Aurrahman
Comment 2 2014-11-01 11:34:46 PDT
WIP
Build Bot
Comment 3 2014-11-01 12:41:31 PDT
Created attachment 240792 [details] Archive of layout-test-results from webkit-ews-09 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-09 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Build Bot
Comment 4 2014-11-01 13:03:52 PDT
Created attachment 240793 [details] Archive of layout-test-results from webkit-ews-08 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-08 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Dhi Aurrahman
Comment 5 2014-11-08 18:46:30 PST
Benjamin Poulain
Comment 6 2014-11-10 11:15:48 PST
Comment on attachment 241244 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=241244&action=review The spec says "case-insensitively within the ASCII range". I think equalIgnoringCase() does case insensitive comparison for any character range. I'll need to check that. We should make test for this in any case. For example: "È" and "è" should not match. We may have to do that in a separate patch though. > Source/WebCore/css/SelectorCheckerTestFunctions.h:145 > + if (value.startsWith(filter, false) && value.impl()->at(filter.length()) == '-') I am not certain this is still correct with the new definition of :lang(). For example, if you match: :lang(en-) with <div lang="en--">, should it match? > Source/WebCore/css/SelectorCheckerTestFunctions.h:156 > + if ((offset = value.find(subTag, offset, false)) == notFound) > + return false; > + offset += subTag.length(); I don't think this is correct. The subTag in value might not be surrounded by dashes. For example: :lang("fr-BE") on <div lang="fr-paddingBEpadding"> should not match. > LayoutTests/ChangeLog:11 > + * fast/selectors/lang-multiple-expected.txt: Added. > + * fast/selectors/lang-multiple.html: Added. I would also test the implementation with: -Various cases of dashes, try to break the implementation. -Case sensitivity outside the ASCII range.
Dhi Aurrahman
Comment 7 2014-11-10 16:47:15 PST
(In reply to comment #6) > Comment on attachment 241244 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=241244&action=review > > The spec says "case-insensitively within the ASCII range". I think > equalIgnoringCase() does case insensitive comparison for any character range. > I'll need to check that. We should make test for this in any case. For > example: "È" and "è" should not match. We may have to do that in a separate > patch though. > > > Source/WebCore/css/SelectorCheckerTestFunctions.h:145 > > + if (value.startsWith(filter, false) && value.impl()->at(filter.length()) == '-') > > I am not certain this is still correct with the new definition of :lang(). > > For example, if you match: :lang(en-) with <div lang="en--">, should it > match? > Should we invalidate any filters ends with dashes? > > Source/WebCore/css/SelectorCheckerTestFunctions.h:156 > > + if ((offset = value.find(subTag, offset, false)) == notFound) > > + return false; > > + offset += subTag.length(); > > I don't think this is correct. The subTag in value might not be surrounded > by dashes. > > For example: :lang("fr-BE") on <div lang="fr-paddingBEpadding"> should not > match. > > > LayoutTests/ChangeLog:11 > > + * fast/selectors/lang-multiple-expected.txt: Added. > > + * fast/selectors/lang-multiple.html: Added. > > I would also test the implementation with: > -Various cases of dashes, try to break the implementation. > -Case sensitivity outside the ASCII range.
Benjamin Poulain
Comment 8 2014-11-10 21:19:46 PST
(In reply to comment #7) > Should we invalidate any filters ends with dashes? I am not 100% sure. I have emailed that question to the standard working group: http://lists.w3.org/Archives/Public/www-style/2014Nov/0136.html
Dhi Aurrahman
Comment 9 2014-11-14 08:22:01 PST
Benjamin Poulain
Comment 10 2014-11-14 16:15:50 PST
Comment on attachment 241590 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=241590&action=review I know it is not marked for review but I made some quick comments anyway :) I like the new matchesLangPseudoClassWithCommaSeparatedArguments(). It is really easy to follow your code along the spec and verify each step. > Source/WebCore/css/SelectorCheckerTestFunctions.h:143 > + Vector<String> subVals; I prefer not abbreviating names: subValues Maybe it would be better to use the same names as the spec? -rangeSubtags -languageSubtags > Source/WebCore/css/SelectorCheckerTestFunctions.h:148 > + Extra blank line. > Source/WebCore/css/SelectorCheckerTestFunctions.h:154 > + if (subTags.size() > subVals.size()) Interesting, this is a good idea. > Source/WebCore/css/SelectorCheckerTestFunctions.h:160 > + size_t currentIndex = 1; lastMatchedLanguageTagIndex? > Source/WebCore/css/SelectorCheckerTestFunctions.h:163 > + for (size_t i = 1; i < subTags.size(); i++) { WebKit style: i++ -> ++i For nested loop, I prefer to give indices full names. This could be rangeSubtagIndex. > Source/WebCore/css/SelectorCheckerTestFunctions.h:164 > + bool notFound = true; I would reverse the boolean for clarity. bool matchedRangeSubtag = false; That makes conditions easier to read. Otherwise code like if (!notFound) gets tricky. > Source/WebCore/css/SelectorCheckerTestFunctions.h:166 > + for (size_t j = currentIndex; j < subVals.size(); j++) { WebKit style: j++ -> ++j "j" could be "languageSubtagIndex > Source/WebCore/css/SelectorCheckerTestFunctions.h:167 > + if (equalIgnoringCase(subTags.at(i), subVals.at(j)) && subVals.at(j) != "x") { I think the condition: subVals.at(j) != "x" is not enough. The specs says any singleton fails. > LayoutTests/fast/selectors/lang-multiple.html:29 > + <div lang="en"></div> > + <div lang="en-"></div> > + <div lang="en--"></div> Let's also cover the cases with "foo--bar". > LayoutTests/fast/selectors/lang-multiple.html:32 > + <div lang="fr-x"></div> > + <div lang="fr-xenomorph"></div> Let's have full extensions too: "fr-x-xenomorph". And other cases of singletons: "foo-a-bar" "foo-3-bar"
Dhi Aurrahman
Comment 11 2014-11-14 18:24:53 PST
Thanks Benjamin! I'll update the patch as soon as possible. I also created a playground (in .js) to confirm the filtering process http://codepen.io/anon/pen/wBvmzE hopefully could grab the ideas outlined in the spec.
Dhi Aurrahman
Comment 12 2014-11-14 18:29:23 PST
Dhi Aurrahman
Comment 13 2014-11-15 04:46:50 PST
Dhi Aurrahman
Comment 14 2014-11-15 05:28:44 PST
Benjamin Poulain
Comment 15 2014-11-17 15:01:32 PST
Comment on attachment 241663 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=241663&action=review Round1: only minor stuff really. The extended filtering is an important new part and it looks correct to me. The big thing will be using extended filtering when matching a single range. Extend the testing to those cases. > Source/WebCore/ChangeLog:14 > + Currently, the implementation of checking functionality of matchesLangPseudoClass and > + matchesLangPseudoClassWithCommaSeparatedArguments is similar. It should be merged. After the latest improvements, this is no longer true. The function matchesLangPseudoClassWithCommaSeparatedArguments() implements the Extended Filtering from RFC4647. The function matchesLangPseudoClass() only does the dumb matching for CSS Level 3. I think we should just rename matchesLangPseudoClass() to matchesLangPseudoClassDeprecated. The function matchesLangPseudoClassWithCommaSeparatedArguments() can then become matchesLangPseudoClass(). > Source/WebCore/css/SelectorChecker.cpp:868 > + if (selector->argumentList()->size() > 1) > + return matchesLangPseudoClassWithCommaSeparatedArguments(element, *selector->argumentList()); Given the improved algorithm, we should always use matchesLangPseudoClassWithCommaSeparatedArguments() even when size == 1. The code of matchesLangPseudoClassWithCommaSeparatedArguments() is more advanced due to extended filtering, it should become the new default for matching :lang(). > Source/WebCore/css/SelectorCheckerTestFunctions.h:141 > + It would be worth putting a link to the RFC in a comment here to explain what is implemented in the following code. > Source/WebCore/css/SelectorCheckerTestFunctions.h:145 > + String(language).split('-', true, languageSubtags); It is better to do language.string().split() and String(language).split(). The reason is that String(language) will construct a new String from the String reference of AtomicString. This in turn will cause ref() to be called to increase the ref-count. After that split is called, the destructor of String is called, which call deref(). On the other hand, language.string().split() does not cause a ref-deref of the original string. > Source/WebCore/css/SelectorCheckerTestFunctions.h:151 > + String(range).split('-', true, rangeSubtags); Ditto: range.string().split() > Source/WebCore/css/SelectorCheckerTestFunctions.h:159 > + size_t lastMatchedLanguageTagIndex = 1; lastMatchedLanguageTagIndex -> lastMatchedLanguageSubtagIndex for consistency. > Source/WebCore/css/SelectorCheckerTestFunctions.h:164 > + for (size_t languageSubtagIndex = lastMatchedLanguageTagIndex; languageSubtagIndex < languageSubtags.size(); ++languageSubtagIndex) { To clarify the code, maybe the loop should have its own function with a name describing its purpose? For example: bool findLanguageSubtagMatchingRange(const Vector<String>& languageSubtags, const String& range, size_t& position) This would let you use early return statement instead of break+boolean. > Source/WebCore/css/SelectorCheckerTestFunctions.h:165 > + String currentLanguageSubtag = languageSubtags.at(languageSubtagIndex); String currentLanguageSubtag = languageSubtags.at(languageSubtagIndex); Reference and bracket access: const String& currentLanguageSubtag = languageSubtage[i]; The bracket accessor is just a convention in WebKit. Using the reference avoids a ref-deref. It is okay to keep a reference here because we know that "languageSubtags" will outlive this scope. > Source/WebCore/css/SelectorCheckerTestFunctions.h:166 > + if (!currentLanguageSubtag.isEmpty() && currentLanguageSubtag.length() == 1) You should not need "currentLanguageSubtag.isEmpty()". The condition "currentLanguageSubtag.length() == 1" implies the language subtag cannot be empty. > Source/WebCore/css/SelectorCheckerTestFunctions.h:169 > + if (equalIgnoringCase(rangeSubtags.at(rangeSubtagIndex), currentLanguageSubtag)) { at() -> [] I would put rangeSubtags.at(rangeSubtagIndex) in a temporary outside the loop to avoid accessing the pointer for each language subtag. > Source/WebCore/cssjit/SelectorCompiler.cpp:732 > ASSERT(selector.argumentList() && !selector.argumentList()->isEmpty()); > + if (selector.argumentList()->size() > 1) > + return FunctionType::CannotCompile; > + Because of the extended matching, we should no longer matchesLangPseudoClass() even if there is a single range. For now we should just always return CannotCompile. > LayoutTests/ChangeLog:11 > + * fast/selectors/lang-multiple-expected.txt: Added. > + * fast/selectors/lang-multiple.html: Added. You have lots of interesting cases of extended filtering that are certainly not covered by any other tests. I think you should have a "lang-extended-filtering.html" with a copy of all the cases of extended filtering from lang-multiple.html
Dhi Aurrahman
Comment 16 2014-11-17 22:08:20 PST
Benjamin Poulain
Comment 17 2014-11-18 01:08:16 PST
Comment on attachment 241764 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=241764&action=review Quick review. I think you made a mistake when extracting "containslanguageSubtagMatchingRange()" ? > Source/WebCore/ChangeLog:20 > + * css/SelectorCheckerTestFunctions.h: Feel free to add your copyright in the header of this file if you want. > Source/WebCore/css/SelectorCheckerTestFunctions.h:127 > +inline bool containslanguageSubtagMatchingRange(const Vector<String>& languageSubtags, const String& range, const size_t& position) case: containsLangu... No need to pass the size_t by const reference, the size fits into a register. But shouldn't you pass position by reference and update it? (and name it lastMatchedLanguageSubtagIndex). > Source/WebCore/css/SelectorCheckerTestFunctions.h:154 > + // Implement basic and extended filterings of given language tags as specified in www.ietf.org/rfc/rfc4647.txt Missing period at the end of the sentence. > Source/WebCore/css/SelectorCheckerTestFunctions.h:179 > + lastMatchedLanguageSubtagIndex = rangeSubtagIndex + 1; This can't be right. This is mixing the language and range indices. There must be a way to cover this in tests. Maybe?: language: "foo-bar-bar-abc" range: "foo-abc-bar" "foo" matches. Then "abc" matches the fourth sub tag of the language. But the lastMatchedLanguageSubtagIndex is set to the third subtag. Next matching "bar" would match. I am just making this up, but there must be a way to test this case... > LayoutTests/fast/selectors/lang-extended-filtering.html:53 > + shouldBe('document.querySelectorAll(":lang(de-de)").length', '8'); > + shouldBe('document.querySelectorAll(":lang(de-DE)").length', '8'); > + shouldBe('document.querySelectorAll(":lang(DE-de)").length', '8'); > + shouldBe('document.querySelectorAll(":lang(dE-dE)").length', '8'); > + shouldBe('document.querySelectorAll(":lang(De-De)").length', '8'); > + shouldBe('document.querySelectorAll(":lang(DE-DE)").length', '8'); Nice. > LayoutTests/fast/selectors/lang-extended-filtering.html:71 > + debug('i = ' + i); > + shouldBeEqualToString('getComputedStyle(document.querySelectorAll(":lang(de-DE, DE-de)")[i]).color', 'rgb(1, 2, 3)'); Instead of using i as a global and using debug(), you can do this: shouldBeEqualToString('getComputedStyle(document.querySelectorAll(":lang(de-DE, DE-de)")[' + i + ']).color' That way i becomes a literal in the string.
Dhi Aurrahman
Comment 18 2014-11-18 02:02:11 PST
Dhi Aurrahman
Comment 19 2014-11-18 02:17:03 PST
Dhi Aurrahman
Comment 20 2014-11-18 03:18:22 PST
Benjamin Poulain
Comment 21 2014-11-18 20:39:01 PST
Did you add a test case for the lastMatchedLanguageSubtagIndex thingy?
Benjamin Poulain
Comment 22 2014-11-18 20:55:31 PST
Comment on attachment 241777 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=241777&action=review Everything looks good to me. A minor comment and a question about testing in the previous comment. > Source/WebCore/css/SelectorCheckerTestFunctions.h:186 > + if (!matchedRange) > + continue; > + return true; I would just inverse this for clarity. if (matchedRange) return true;
Dhi Aurrahman
Comment 23 2014-11-18 22:23:44 PST
(In reply to comment #21) > Did you add a test case for the lastMatchedLanguageSubtagIndex thingy? Yes, I have this <div lang="tic-tac-tac-toe"></div> And try to match with: 1. tic-tac-toe, it has to be matched 2. tic-toe-tac, it has to be failed. - `tic` will match then lastMatchedLanguageSubtagIndex=0, the next scanning process will start from 1 - `toe` will match then lastMatchedLanguageSubtagIndex=3, the next scanning process will start from 4. It causes no match for the next `tac` of the range. Is that enough? I'll think another case if it hasn't properly test the case.
Dhi Aurrahman
Comment 24 2014-11-18 23:01:11 PST
Benjamin Poulain
Comment 25 2014-11-19 00:18:53 PST
Comment on attachment 241847 [details] Patch Looks great! Let's land Once we have full support, I think we'll want to have some kind of test were we pass a list of ["language", "list of language range", expected=true/false]. That way we'll be able to write a huge list of inputs to test, maybe even every standardized language... ...that's for later though :)
WebKit Commit Bot
Comment 26 2014-11-19 00:57:05 PST
Comment on attachment 241847 [details] Patch Clearing flags on attachment: 241847 Committed r176313: <http://trac.webkit.org/changeset/176313>
WebKit Commit Bot
Comment 27 2014-11-19 00:57:10 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.