Bug 21472 - escaped characters cause parse errors in idents (for example in @namespace rules)
Summary: escaped characters cause parse errors in idents (for example in @namespace ru...
Status: RESOLVED CONFIGURATION CHANGED
Alias: None
Product: WebKit
Classification: Unclassified
Component: CSS (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Nobody
URL: http://tc.labs.opera.com/css/namespac...
Keywords:
: 38587 (view as bug list)
Depends on:
Blocks:
 
Reported: 2008-10-08 06:14 PDT by Anne van Kesteren
Modified: 2022-07-23 12:44 PDT (History)
17 users (show)

See Also:


Attachments
proposed fix, tested (14.37 KB, patch)
2010-05-06 03:24 PDT, Daniel Glazman
hyatt: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Anne van Kesteren 2008-10-08 06:14:30 PDT
@import is not ignored despite it coming after @namespace.

http://dev.w3.org/csswg/css3-namespace/#syntax
Comment 1 Simon Hollingshead 2010-02-11 17:01:35 PST
Proof of issue shown at w3 testsuite case http://www.w3.org/Style/CSS/Test/CSS3/Namespace/20090210/syntax-006.xml
Comment 2 Alexey Proskuryakov 2010-05-05 13:57:21 PDT
*** Bug 38587 has been marked as a duplicate of this bug. ***
Comment 3 Daniel Glazman 2010-05-06 01:33:45 PDT
Ok I found the root of the problem and it's NOT related to @import, it's related to "url(" !!!

The test http://www.w3.org/Style/CSS/Test/CSS3/Namespace/20090210/syntax-006.xml
contains the following stylesheet:

   @namespace x u\00072l("test");
   @import url("support/fail.css");
   @namespace url("test2");
   x|test { background:lime }
   test { background:red }

and the first line fails because of u\00072l("test") !!!
So the second line, the @import rule, is applied when it should not.

Fundamentally, this bug should have its title changed. @import and @namespace
and even url() all are implemented correctly. This bug exists because escaped
characters FAIL in most idents. As an example,

  @i\mport "foo.css";

will always fail and never import the foo.css stylesheet...
Comment 4 Daniel Glazman 2010-05-06 03:24:00 PDT
Created attachment 55223 [details]
proposed fix, tested
Comment 5 Cedric Vivier 2010-05-06 04:48:48 PDT
(In reply to comment #4)
> Created an attachment (id=55223) [details]
> proposed fix, tested

Shouldn't the tokenizer lazily unescape escaped characters so that it does not potentially complexify/slow down the processing of all stylesheets to handle this (very?) rare use case?

Something like:

ESCAPED_CHAR    \\(0{0,4}[0-9][0-9](\r\n|[ \t\r\n\f])?) | [a-z])
{
    //unput is a flex function to set the next character to be scanned.
    //unescape is a function to implement (or likely rather reuse from somewhere) that returns an unescaped character.

    /*insert unescaped character */
    unput(unescape(yytext)); 

    /*eat up escaped sequence*/
}
Comment 6 Darin Adler 2010-05-06 07:48:47 PDT
Comment on attachment 55223 [details]
proposed fix, tested

This patch has tabs in it. Please make a new one without tabs.

How did you test the effect of this change on CSS parsing performance? What was the effect?
Comment 7 Darin Adler 2010-05-06 07:49:21 PDT
(In reply to comment #5)
> (In reply to comment #4)
> > Created an attachment (id=55223) [details] [details]
> > proposed fix, tested
> 
> Shouldn't the tokenizer lazily unescape escaped characters so that it does not
> potentially complexify/slow down the processing of all stylesheets to handle
> this (very?) rare use case?
> 
> Something like:
> 
> ESCAPED_CHAR    \\(0{0,4}[0-9][0-9](\r\n|[ \t\r\n\f])?) | [a-z])
> {
>     //unput is a flex function to set the next character to be scanned.
>     //unescape is a function to implement (or likely rather reuse from
> somewhere) that returns an unescaped character.
> 
>     /*insert unescaped character */
>     unput(unescape(yytext)); 
> 
>     /*eat up escaped sequence*/
> }

That will probably work better, yes.
Comment 8 Alexey Proskuryakov 2010-05-06 08:34:33 PDT
See also: bug 32045.
Comment 9 Daniel Glazman 2010-05-06 08:45:20 PDT
(In reply to comment #6)

> (From update of attachment 55223 [details])
> This patch has tabs in it. Please make a new one without tabs.
> 
> How did you test the effect of this change on CSS parsing performance? What was
> the effect?

I did not. This is my very first patch in the WebKit world and I'm not yet
familiar with the environment (10 years of mozillianism is barely an excuse;-)

I agree the proposed fix could be a wrong thing to do on a perf basis. But
the diagnosis of the bug is at least the right one. There are two issues here:

  - escaped chars in tokens
  - WebCore::CSSParser::text assuming a fixed length for functional notations
Comment 10 Darin Adler 2010-05-06 09:32:57 PDT
(In reply to comment #9)
> But the diagnosis of the bug is at least the right one.

Yes, agreed. Good work on that! Lets find an efficient fix. I think Cedric Vivier’s suggestion may be the best way to approach the tokenizer half of the problem.
Comment 11 Dave Hyatt 2010-05-06 18:53:17 PDT
Yeah, I think the current patch uglifies the tokenizer too much, and as a result will be minusing it.  I think what Cedric is suggesting will keep the tokenizer side much cleaner.
Comment 12 Daniel Glazman 2010-05-10 01:19:09 PDT
(In reply to comment #11)

> Yeah, I think the current patch uglifies the tokenizer too much, and as a result will be minusing it.  I think what Cedric is suggesting will keep the tokenizer side much cleaner.

Aaaaah :-) I missed hyatt's r- ;-)

BTW, this bug is the last outstanding bug on the CSS 3 Namespaces radar blocking the move to PR. See

  http://disruptive-innovations.com/zoo/csswg/NAMESPACES_IMP_REPORT/
Comment 13 Alexey Proskuryakov 2010-05-10 09:56:36 PDT
>   http://disruptive-innovations.com/zoo/csswg/NAMESPACES_IMP_REPORT/

What do the green "fail" boxes in Safari column mean? I can't find it explained on this page.
Comment 14 Daniel Glazman 2010-05-10 10:43:36 PDT
(In reply to comment #13)
> >   http://disruptive-innovations.com/zoo/csswg/NAMESPACES_IMP_REPORT/
> 
> What do the green "fail" boxes in Safari column mean? I can't find it explained on this page.

D'oh. Thanks for the heads up, I updated the classes on the table cells
but forgot the prose. Fixed now.
Comment 15 Alexey Proskuryakov 2011-02-21 15:51:21 PST
See also: bug 54838.
Comment 16 Ahmad Saleem 2022-07-23 09:07:25 PDT
Safari pass all tests in CSS Namespace - https://wpt.fyi/results/css/css-namespaces?label=master&label=experimental&aligned&q=namespace

Is anything further needed here? Thanks!

Safari even pass on Comment 03.
Comment 17 Brent Fulgham 2022-07-23 10:32:27 PDT
This was probably fixed with the CSS Parser import, @Antti, can you confirm and close?
Comment 18 Ryosuke Niwa 2022-07-23 12:44:50 PDT
Most certainly fixed by the new CSS parser.