Extend :lang() with string arguments
Created attachment 243358 [details] Patch
Comment on attachment 243358 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=243358&action=review > LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:13 > +PASS parseThenSerializeRule(':lang("*-1997", "*-1998") { }') is ':lang(*-1997, *-1998) { }' Per David Baron's comment http://lists.w3.org/Archives/Public/www-style/2014Dec/0169.html it can be done correctly. > LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:14 > +PASS parseThenSerializeRule(':lang(" *-1997 ", " *-1998 ") { }') is ':lang(*-1997, *-1998) { }' This is my speculation, yet my own assumption for striping the white spaces inside the "". I'm not sure if this is the expected behavior. Or, is the "" specially dedicated for escaping non-css valid values and has the same behavior with \\? Hence, :lang("<space>*-1997<space>") will give :lang("<space>*-1997<space>") instead of :lang(*-1997)? > LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:17 > +PASS parseThenSerializeRule(':lang(" ") { }') threw exception TypeError: undefined is not an object (evaluating 'styleElement.sheet.cssRules[0].cssText'). Yet another assumption, since the argument only provides white spaces. > LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:19 > +PASS parseThenSerializeRule(':lang("* - 1997") { }') threw exception TypeError: undefined is not an object (evaluating 'styleElement.sheet.cssRules[0].cssText'). This one is an attempt to be consistent with previous assumptions. After left and right stripping, if the value inside "" (the string) still has white spaces, then it is invalid.
Comment on attachment 243358 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=243358&action=review > Source/WebCore/css/CSSGrammar.y.in:1109 > + $1.stripWhiteSpace(); I am not sure we want to do that. I would just use the literal as given. > Source/WebCore/css/CSSGrammar.y.in:1118 > + if ($1.length() > 0) { I am not sure we should have an exception for empty strings. > Source/WebCore/css/CSSGrammar.y.in:1125 > + if ($$ && $5.length() > 0) ditto > Source/WebCore/css/CSSGrammar.y.in:1406 > + if ($4 && $4->size() > 0) { ditto >> LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:13 >> +PASS parseThenSerializeRule(':lang("*-1997", "*-1998") { }') is ':lang(*-1997, *-1998) { }' > > Per David Baron's comment http://lists.w3.org/Archives/Public/www-style/2014Dec/0169.html it can be done correctly. Yup, that's awesome >> LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:14 >> +PASS parseThenSerializeRule(':lang(" *-1997 ", " *-1998 ") { }') is ':lang(*-1997, *-1998) { }' > > This is my speculation, yet my own assumption for striping the white spaces inside the "". I'm not sure if this is the expected behavior. > > Or, is the "" specially dedicated for escaping non-css valid values and has the same behavior with \\? Hence, :lang("<space>*-1997<space>") will give :lang("<space>*-1997<space>") instead of :lang(*-1997)? I would not try to do any additional rules over the string. Anything passed as a string would be interpreted as it is. Using the string as-is gives a lot more power for developers, they can literally use anything for language, including space characters. > LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:16 > +PASS parseThenSerializeRule(':lang("") { }') threw exception TypeError: undefined is not an object (evaluating 'styleElement.sheet.cssRules[0].cssText'). This case is a bit annoying. I am torn between an invalid selector or a selector that never match anything.
Comment on attachment 243358 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=243358&action=review >>> LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:14 >>> +PASS parseThenSerializeRule(':lang(" *-1997 ", " *-1998 ") { }') is ':lang(*-1997, *-1998) { }' >> >> This is my speculation, yet my own assumption for striping the white spaces inside the "". I'm not sure if this is the expected behavior. >> >> Or, is the "" specially dedicated for escaping non-css valid values and has the same behavior with \\? Hence, :lang("<space>*-1997<space>") will give :lang("<space>*-1997<space>") instead of :lang(*-1997)? > > I would not try to do any additional rules over the string. Anything passed as a string would be interpreted as it is. > > Using the string as-is gives a lot more power for developers, they can literally use anything for language, including space characters. Understood. In that case, we can just add: lang_range: LANGRANGE | IDENT | STRING to the grammar. Nice! >> LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:16 >> +PASS parseThenSerializeRule(':lang("") { }') threw exception TypeError: undefined is not an object (evaluating 'styleElement.sheet.cssRules[0].cssText'). > > This case is a bit annoying. I am torn between an invalid selector or a selector that never match anything. Yes. If we consider this as a valid selector, it will give us :lang() which is invalid in "normal" setup
Created attachment 243492 [details] Patch
Created attachment 243596 [details] Patch
Created attachment 243598 [details] Patch
Created attachment 243599 [details] Patch
Created attachment 243600 [details] Patch
Comment on attachment 243600 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=243600&action=review > LayoutTests/fast/selectors/lang-extended-filtering-with-string-arguments-expected.txt:10 > +PASS document.querySelectorAll(':lang("")').length is 0 test on :lang("") to give never matching. > LayoutTests/fast/selectors/lang-extended-filtering-with-string-arguments-expected.txt:12 > +PASS document.querySelectorAll(':lang("", "*-1997")').length is 1 given: en, "", de-1997. :lang("en", "*-1997") gives two, while :lang("", *-1997) gives only one.
Comment on attachment 243600 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=243600&action=review > LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:6 > +PASS parseThenSerializeRule(':lang("a") { }') is ':lang(a) { }' I think ideally the canonicalization would keep the same form as the original token. The canonical version of :lang( a ) would be :lang(a). The canonical version of :lang( "a" ) would be :lang("a"). WebKit has other canonicalization bugs and I doubt this one is critical so I am okay with keeping the internal representation for simplicity. But I think it may be better to always canonicalize :lang() to string instead of identifier. For example, :lang(foo, \*-bar) would be canonicalized to :lang("foo", "*-bar"). Do you have an opinion on this? If we do that, we'll need a massive test result rebaseline... :(
Comment on attachment 243600 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=243600&action=review >> LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:6 >> +PASS parseThenSerializeRule(':lang("a") { }') is ':lang(a) { }' > > I think ideally the canonicalization would keep the same form as the original token. > > The canonical version of :lang( a ) would be :lang(a). The canonical version of :lang( "a" ) would be :lang("a"). > > WebKit has other canonicalization bugs and I doubt this one is critical so I am okay with keeping the internal representation for simplicity. But I think it may be better to always canonicalize :lang() to string instead of identifier. For example, > :lang(foo, \*-bar) would be canonicalized to :lang("foo", "*-bar"). > > Do you have an opinion on this? > If we do that, we'll need a massive test result rebaseline... :( I'm not sure I'm familiar with this canonicalization in css selector context and its impact to selector checker. In order to always canonicalize :lang() to string, does that mean the result of parseThenSerializeRule() should be in string (with quotes)? Should we change the appendArgumentList() in CSSSelector too? Does this specifically only give impact to :lang()? Anyway, I'm ok to do test result rebaseline. > WebKit has other canonicalization bugs Can I get more info about this? :)
Comment on attachment 243600 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=243600&action=review > Source/WebCore/css/SelectorCheckerTestFunctions.h:161 > - if (language.isNull()) > + if (language.isNull() || language.isEmpty()) All null strings are empty, so we don’t need to check both. Should change isNull to isEmpty rather than adding a new check. > Source/WebCore/css/SelectorCheckerTestFunctions.h:173 > + if (range.isEmpty()) > + continue; Why exactly is this needed? Which of the tests fails without this line of code? What kind of failure do we see?
Comment on attachment 243600 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=243600&action=review >> Source/WebCore/css/SelectorCheckerTestFunctions.h:161 >> + if (language.isNull() || language.isEmpty()) > > All null strings are empty, so we don’t need to check both. Should change isNull to isEmpty rather than adding a new check. OK, I'll update it. Thanks. >> Source/WebCore/css/SelectorCheckerTestFunctions.h:173 >> + continue; > > Why exactly is this needed? Which of the tests fails without this line of code? What kind of failure do we see? From my understanding, the "" (empty string) argument for :lang() is always treated as never matching, e.g. if we have :lang("", "*-1997", de-DE), I believe it is better to jump right away to the next range (*-1997) as soon as the current range (e.g. the first range "") is detected as an empty string.
(In reply to comment #14) > > Why exactly is this needed? Which of the tests fails without this line of code? What kind of failure do we see? > > From my understanding, the "" (empty string) argument for :lang() is always > treated as never matching, e.g. if we have :lang("", "*-1997", de-DE), I > believe it is better to jump right away to the next range (*-1997) as soon > as the current range (e.g. the first range "") is detected as an empty > string. Darin's argument is that the empty string would not match anyway in the later code. But I am not sure we can remove the isEmpty() check. I suspect we will crash at rangeSubtags.first() since rangeSubtags would be empty.
(In reply to comment #12) > Comment on attachment 243600 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=243600&action=review > > >> LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:6 > >> +PASS parseThenSerializeRule(':lang("a") { }') is ':lang(a) { }' > > > > I think ideally the canonicalization would keep the same form as the original token. > > > > The canonical version of :lang( a ) would be :lang(a). The canonical version of :lang( "a" ) would be :lang("a"). > > > > WebKit has other canonicalization bugs and I doubt this one is critical so I am okay with keeping the internal representation for simplicity. But I think it may be better to always canonicalize :lang() to string instead of identifier. For example, > > :lang(foo, \*-bar) would be canonicalized to :lang("foo", "*-bar"). > > > > Do you have an opinion on this? > > If we do that, we'll need a massive test result rebaseline... :( > > I'm not sure I'm familiar with this canonicalization in css selector context > and its impact to selector checker. The canonical form is defined by CSSOM: http://dev.w3.org/csswg/cssom/#serializing-selectors I *think* the only relavant entry points in WebKit are through CSSSelector::selectorText() and CSSSelectorList::selectorText(). > In order to always canonicalize :lang() to string, does that mean the result > of parseThenSerializeRule() should be in string (with quotes)? Yep. > Should we > change the appendArgumentList() in CSSSelector too? Does this specifically > only give impact to :lang()? Since appendArgumentList() is only used by :lang(), we can just update its definition. Something like appendLanguageArgumentList/appendLangArgumentList/appendStringArgumentList? > Anyway, I'm ok to do test result rebaseline. > > > WebKit has other canonicalization bugs > Can I get more info about this? :) The canonical form of Selectors is supposed to escape identifiers characters and escape some characters of strings. Take http://codepen.io/anon/pen/vEXOYx for example. Firefox serializes it nicely, WebKit just returns the parsed identifier. - Note that we have at least one more thing to do for :lang() before we can qualify it: evaluate performance. We should make sure the matching is efficient, WebKit should be faster than the other engines.
Comment on attachment 243600 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=243600&action=review >>>> LayoutTests/fast/css/css-lang-selector-with-string-arguments-text-expected.txt:6 >>>> +PASS parseThenSerializeRule(':lang("a") { }') is ':lang(a) { }' >>> >>> I think ideally the canonicalization would keep the same form as the original token. >>> >>> The canonical version of :lang( a ) would be :lang(a). The canonical version of :lang( "a" ) would be :lang("a"). >>> >>> WebKit has other canonicalization bugs and I doubt this one is critical so I am okay with keeping the internal representation for simplicity. But I think it may be better to always canonicalize :lang() to string instead of identifier. For example, >>> :lang(foo, \*-bar) would be canonicalized to :lang("foo", "*-bar"). >>> >>> Do you have an opinion on this? >>> If we do that, we'll need a massive test result rebaseline... :( >> >> I'm not sure I'm familiar with this canonicalization in css selector context and its impact to selector checker. >> >> In order to always canonicalize :lang() to string, does that mean the result of parseThenSerializeRule() should be in string (with quotes)? Should we change the appendArgumentList() in CSSSelector too? Does this specifically only give impact to :lang()? >> >> Anyway, I'm ok to do test result rebaseline. > > The canonical form is defined by CSSOM: http://dev.w3.org/csswg/cssom/#serializing-selectors > > I *think* the only relavant entry points in WebKit are through CSSSelector::selectorText() and CSSSelectorList::selectorText(). Hi Benjamin, I put the ideas to always canonicalize :lang() arguments as strings in following patch: https://bugs.webkit.org/show_bug.cgi?id=139928
Comment on attachment 243600 [details] Patch LGTM
Comment on attachment 243600 [details] Patch Clearing flags on attachment: 243600 Committed r177745: <http://trac.webkit.org/changeset/177745>
All reviewed patches have been landed. Closing bug.
Comment on attachment 243600 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=243600&action=review >>> Source/WebCore/css/SelectorCheckerTestFunctions.h:161 >>> + if (language.isNull() || language.isEmpty()) >> >> All null strings are empty, so we don’t need to check both. Should change isNull to isEmpty rather than adding a new check. > > OK, I'll update it. Thanks. Looks like the patch was landed without the change I requested here.
(In reply to comment #21) > Comment on attachment 243600 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=243600&action=review > > >>> Source/WebCore/css/SelectorCheckerTestFunctions.h:161 > >>> + if (language.isNull() || language.isEmpty()) > >> > >> All null strings are empty, so we don’t need to check both. Should change isNull to isEmpty rather than adding a new check. > > > > OK, I'll update it. Thanks. > > Looks like the patch was landed without the change I requested here. My bad, fixed here: http://trac.webkit.org/changeset/178198