Summary: | escaped characters cause parse errors in idents (for example in @namespace rules) | ||||||
---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Anne van Kesteren <annevk> | ||||
Component: | CSS | Assignee: | Nobody <webkit-unassigned> | ||||
Status: | RESOLVED CONFIGURATION CHANGED | ||||||
Severity: | Normal | CC: | ahmad.saleem792, ap, bdakin, bfulgham, cedricv, daniel, darin, hyatt, koivisto, me, mitz, mjs, rik, rniwa, simon.fraser, tonikitoo, zalan | ||||
Priority: | P2 | ||||||
Version: | 528+ (Nightly build) | ||||||
Hardware: | PC | ||||||
OS: | OS X 10.5 | ||||||
URL: | http://tc.labs.opera.com/css/namespaces/syntax-006.xml | ||||||
Attachments: |
|
Description
Anne van Kesteren
2008-10-08 06:14:30 PDT
Proof of issue shown at w3 testsuite case http://www.w3.org/Style/CSS/Test/CSS3/Namespace/20090210/syntax-006.xml *** Bug 38587 has been marked as a duplicate of this bug. *** Ok I found the root of the problem and it's NOT related to @import, it's related to "url(" !!! The test http://www.w3.org/Style/CSS/Test/CSS3/Namespace/20090210/syntax-006.xml contains the following stylesheet: @namespace x u\00072l("test"); @import url("support/fail.css"); @namespace url("test2"); x|test { background:lime } test { background:red } and the first line fails because of u\00072l("test") !!! So the second line, the @import rule, is applied when it should not. Fundamentally, this bug should have its title changed. @import and @namespace and even url() all are implemented correctly. This bug exists because escaped characters FAIL in most idents. As an example, @i\mport "foo.css"; will always fail and never import the foo.css stylesheet... Created attachment 55223 [details]
proposed fix, tested
(In reply to comment #4) > Created an attachment (id=55223) [details] > proposed fix, tested Shouldn't the tokenizer lazily unescape escaped characters so that it does not potentially complexify/slow down the processing of all stylesheets to handle this (very?) rare use case? Something like: ESCAPED_CHAR \\(0{0,4}[0-9][0-9](\r\n|[ \t\r\n\f])?) | [a-z]) { //unput is a flex function to set the next character to be scanned. //unescape is a function to implement (or likely rather reuse from somewhere) that returns an unescaped character. /*insert unescaped character */ unput(unescape(yytext)); /*eat up escaped sequence*/ } Comment on attachment 55223 [details]
proposed fix, tested
This patch has tabs in it. Please make a new one without tabs.
How did you test the effect of this change on CSS parsing performance? What was the effect?
(In reply to comment #5) > (In reply to comment #4) > > Created an attachment (id=55223) [details] [details] > > proposed fix, tested > > Shouldn't the tokenizer lazily unescape escaped characters so that it does not > potentially complexify/slow down the processing of all stylesheets to handle > this (very?) rare use case? > > Something like: > > ESCAPED_CHAR \\(0{0,4}[0-9][0-9](\r\n|[ \t\r\n\f])?) | [a-z]) > { > //unput is a flex function to set the next character to be scanned. > //unescape is a function to implement (or likely rather reuse from > somewhere) that returns an unescaped character. > > /*insert unescaped character */ > unput(unescape(yytext)); > > /*eat up escaped sequence*/ > } That will probably work better, yes. (In reply to comment #6) > (From update of attachment 55223 [details]) > This patch has tabs in it. Please make a new one without tabs. > > How did you test the effect of this change on CSS parsing performance? What was > the effect? I did not. This is my very first patch in the WebKit world and I'm not yet familiar with the environment (10 years of mozillianism is barely an excuse;-) I agree the proposed fix could be a wrong thing to do on a perf basis. But the diagnosis of the bug is at least the right one. There are two issues here: - escaped chars in tokens - WebCore::CSSParser::text assuming a fixed length for functional notations (In reply to comment #9) > But the diagnosis of the bug is at least the right one. Yes, agreed. Good work on that! Lets find an efficient fix. I think Cedric Vivier’s suggestion may be the best way to approach the tokenizer half of the problem. Yeah, I think the current patch uglifies the tokenizer too much, and as a result will be minusing it. I think what Cedric is suggesting will keep the tokenizer side much cleaner. (In reply to comment #11) > Yeah, I think the current patch uglifies the tokenizer too much, and as a result will be minusing it. I think what Cedric is suggesting will keep the tokenizer side much cleaner. Aaaaah :-) I missed hyatt's r- ;-) BTW, this bug is the last outstanding bug on the CSS 3 Namespaces radar blocking the move to PR. See http://disruptive-innovations.com/zoo/csswg/NAMESPACES_IMP_REPORT/ > http://disruptive-innovations.com/zoo/csswg/NAMESPACES_IMP_REPORT/
What do the green "fail" boxes in Safari column mean? I can't find it explained on this page.
(In reply to comment #13) > > http://disruptive-innovations.com/zoo/csswg/NAMESPACES_IMP_REPORT/ > > What do the green "fail" boxes in Safari column mean? I can't find it explained on this page. D'oh. Thanks for the heads up, I updated the classes on the table cells but forgot the prose. Fixed now. See also: bug 54838. Safari pass all tests in CSS Namespace - https://wpt.fyi/results/css/css-namespaces?label=master&label=experimental&aligned&q=namespace Is anything further needed here? Thanks! Safari even pass on Comment 03. This was probably fixed with the CSS Parser import, @Antti, can you confirm and close? Most certainly fixed by the new CSS parser. |