Bug 39363 - Fix url tokenization in CSS
: Fix url tokenization in CSS
Status: NEW
Product: WebKit
Classification: Unclassified
Component: CSS
: 528+ (Nightly build)
: Macintosh Mac OS X 10.5
: P2 Normal
Assigned To: Glenn Adams
http://test.csswg.org/source/contribu...
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2010-05-19 09:22 PDT by Simon Fraser (smfr)
Modified: 2012-10-03 17:59 PDT (History)
4 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Simon Fraser (smfr) 2010-05-19 09:22:56 PDT
We fail a bunch of the URL tests in this testcase:
http://test.csswg.org/source/contributors/mozilla/incoming/reftests/bugs/invalid-url-handling.xht
Comment 1 Simon Fraser (smfr) 2010-05-19 09:24:23 PDT
Note that not all of the tests in that file are necessarily valid tests.
Comment 2 Alexey Proskuryakov 2010-05-19 10:51:52 PDT
See also: bug 28885.
Comment 3 Benjamin Poulain 2012-10-03 12:10:31 PDT
I suggest to make a reduction for fast/url, and upload it with failing results.

For info, there is now a working version (albeit incomplete) of WTFRL in the tree.
Comment 4 Benjamin Poulain 2012-10-03 12:27:03 PDT
Looking at the failing cases, it does not looks like a URL parsing error. All the invalid URL are considered invalid by KURL.

The tests fail due to the way we handle some invalid input in CSS.

Examples:

Test 3:
  #three { background-color: green; }
  #foo { background: url(foo"bar) }
  #three { background-color: red; }
->The test expect the last #three to be ignored due to the error on the second line. I don't see why that test is correct.

Test 4:
  /* not a URI token; the unterminated string ends at end of line */
  #foo { background: url(foo"bar) }
  ) }
  #four { background-color: green; }
 ->The URL is invalid, it is the CSS parsing that is incomplete.

etc.
Comment 5 Glenn Adams 2012-10-03 17:39:45 PDT
(In reply to comment #4)
> Looking at the failing cases, it does not looks like a URL parsing error. All the invalid URL are considered invalid by KURL.
> 
> The tests fail due to the way we handle some invalid input in CSS.
> 
> Examples:
> 
> Test 3:
>   #three { background-color: green; }
>   #foo { background: url(foo"bar) }
>   #three { background-color: red; }
> ->The test expect the last #three to be ignored due to the error on the second line. I don't see why that test is correct.

The second line should tokenize as follows (by CSS2.1 lexer rules):

...
BADURL(foo)
BADSTRING(bar ) })

causing the third line to appear as a continuation of the declaration started in the second line, and thus producing a MalformedDeclaration recovery, in which case the background-color: red is ignored

At present, WK doesn't tokenize in this manner, but instead returns

FUNCTION(url)
IDENT(foo)
"
IDENT(bar)
...

which eventually matches:

function: FUNCTION maybe_space error

since the " token is not a valid operator for combining terms in an expression.

Consequently, WK finishes the declaration on the second line without continuing to third line, and the background-color red is applied.

> 
> Test 4:
>   /* not a URI token; the unterminated string ends at end of line */
>   #foo { background: url(foo"bar) }
>   ) }
>   #four { background-color: green; }
>  ->The URL is invalid, it is the CSS parsing that is incomplete.

The same logic applies as above, but here the ") }" on third line causes the fourth line to be ignored (in WK) as a MalformedStatement. If WK had consumed the string as BADSTRING(bar ) }), then the declaration of second line would complete on the third line's }, in which case the fourth line would parse.

> 
> etc.
Comment 6 Glenn Adams 2012-10-03 17:49:47 PDT
(In reply to comment #3)
> I suggest to make a reduction for fast/url, and upload it with failing results.

Working on it now.

> For info, there is now a working version (albeit incomplete) of WTFRL in the tree.

I guess you mean WTFURL, i.e., WTF/wtf/url/api, yes?
Comment 7 Benjamin Poulain 2012-10-03 17:56:19 PDT
> (In reply to comment #3)
> > I suggest to make a reduction for fast/url, and upload it with failing results.
> 
> Working on it now.
> 
> > For info, there is now a working version (albeit incomplete) of WTFRL in the tree.
> 
> I guess you mean WTFURL, i.e., WTF/wtf/url/api, yes?

Yep, WTFURL.

After you last comment, I doubt this require tests for URLs. It looks like the CSS tokenizer is the issue here. Isn't it?
Comment 8 Glenn Adams 2012-10-03 17:59:01 PDT
(In reply to comment #7)
> > (In reply to comment #3)
> > > I suggest to make a reduction for fast/url, and upload it with failing results.
> > 
> > Working on it now.
> > 
> > > For info, there is now a working version (albeit incomplete) of WTFRL in the tree.
> > 
> > I guess you mean WTFURL, i.e., WTF/wtf/url/api, yes?
> 
> Yep, WTFURL.
> 
> After you last comment, I doubt this require tests for URLs. It looks like the CSS tokenizer is the issue here. Isn't it?

correct