Bug 39363 - Fix url tokenization in CSS
Summary: Fix url tokenization in CSS
Status: RESOLVED CONFIGURATION CHANGED
Alias: None
Product: WebKit
Classification: Unclassified
Component: CSS (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.5
: P2 Normal
Assignee: Glenn Adams
URL: http://test.csswg.org/source/contribu...
Keywords:
Depends on:
Blocks:
 
Reported: 2010-05-19 09:22 PDT by Simon Fraser (smfr)
Modified: 2022-07-21 17:20 PDT (History)
7 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Simon Fraser (smfr) 2010-05-19 09:22:56 PDT
We fail a bunch of the URL tests in this testcase:
http://test.csswg.org/source/contributors/mozilla/incoming/reftests/bugs/invalid-url-handling.xht
Comment 1 Simon Fraser (smfr) 2010-05-19 09:24:23 PDT
Note that not all of the tests in that file are necessarily valid tests.
Comment 2 Alexey Proskuryakov 2010-05-19 10:51:52 PDT
See also: bug 28885.
Comment 3 Benjamin Poulain 2012-10-03 12:10:31 PDT
I suggest to make a reduction for fast/url, and upload it with failing results.

For info, there is now a working version (albeit incomplete) of WTFRL in the tree.
Comment 4 Benjamin Poulain 2012-10-03 12:27:03 PDT
Looking at the failing cases, it does not looks like a URL parsing error. All the invalid URL are considered invalid by KURL.

The tests fail due to the way we handle some invalid input in CSS.

Examples:

Test 3:
  #three { background-color: green; }
  #foo { background: url(foo"bar) }
  #three { background-color: red; }
->The test expect the last #three to be ignored due to the error on the second line. I don't see why that test is correct.

Test 4:
  /* not a URI token; the unterminated string ends at end of line */
  #foo { background: url(foo"bar) }
  ) }
  #four { background-color: green; }
 ->The URL is invalid, it is the CSS parsing that is incomplete.

etc.
Comment 5 Glenn Adams 2012-10-03 17:39:45 PDT
(In reply to comment #4)
> Looking at the failing cases, it does not looks like a URL parsing error. All the invalid URL are considered invalid by KURL.
> 
> The tests fail due to the way we handle some invalid input in CSS.
> 
> Examples:
> 
> Test 3:
>   #three { background-color: green; }
>   #foo { background: url(foo"bar) }
>   #three { background-color: red; }
> ->The test expect the last #three to be ignored due to the error on the second line. I don't see why that test is correct.

The second line should tokenize as follows (by CSS2.1 lexer rules):

...
BADURL(foo)
BADSTRING(bar ) })

causing the third line to appear as a continuation of the declaration started in the second line, and thus producing a MalformedDeclaration recovery, in which case the background-color: red is ignored

At present, WK doesn't tokenize in this manner, but instead returns

FUNCTION(url)
IDENT(foo)
"
IDENT(bar)
...

which eventually matches:

function: FUNCTION maybe_space error

since the " token is not a valid operator for combining terms in an expression.

Consequently, WK finishes the declaration on the second line without continuing to third line, and the background-color red is applied.

> 
> Test 4:
>   /* not a URI token; the unterminated string ends at end of line */
>   #foo { background: url(foo"bar) }
>   ) }
>   #four { background-color: green; }
>  ->The URL is invalid, it is the CSS parsing that is incomplete.

The same logic applies as above, but here the ") }" on third line causes the fourth line to be ignored (in WK) as a MalformedStatement. If WK had consumed the string as BADSTRING(bar ) }), then the declaration of second line would complete on the third line's }, in which case the fourth line would parse.

> 
> etc.
Comment 6 Glenn Adams 2012-10-03 17:49:47 PDT
(In reply to comment #3)
> I suggest to make a reduction for fast/url, and upload it with failing results.

Working on it now.

> For info, there is now a working version (albeit incomplete) of WTFRL in the tree.

I guess you mean WTFURL, i.e., WTF/wtf/url/api, yes?
Comment 7 Benjamin Poulain 2012-10-03 17:56:19 PDT
> (In reply to comment #3)
> > I suggest to make a reduction for fast/url, and upload it with failing results.
> 
> Working on it now.
> 
> > For info, there is now a working version (albeit incomplete) of WTFRL in the tree.
> 
> I guess you mean WTFURL, i.e., WTF/wtf/url/api, yes?

Yep, WTFURL.

After you last comment, I doubt this require tests for URLs. It looks like the CSS tokenizer is the issue here. Isn't it?
Comment 8 Glenn Adams 2012-10-03 17:59:01 PDT
(In reply to comment #7)
> > (In reply to comment #3)
> > > I suggest to make a reduction for fast/url, and upload it with failing results.
> > 
> > Working on it now.
> > 
> > > For info, there is now a working version (albeit incomplete) of WTFRL in the tree.
> > 
> > I guess you mean WTFURL, i.e., WTF/wtf/url/api, yes?
> 
> Yep, WTFURL.
> 
> After you last comment, I doubt this require tests for URLs. It looks like the CSS tokenizer is the issue here. Isn't it?

correct
Comment 9 Ahmad Saleem 2022-07-20 17:00:12 PDT
I took a test case from Mozilla Firefox Source Code repo from below link:

https://searchfox.org/mozilla-central/source/layout/reftests/css-parsing/invalid-url-handling.xhtml

and turned it into following JSFiddle:

https://jsfiddle.net/3a9keLtc/show

Based on above JSFiddle, Safari 15.6 on macOS 12.5 passes all test and have green text and it is same across other browsers (Chrome Canary 105 and Firefox Nightly 104) as well.

Is something else needed to be fixed here? Thanks!
Comment 10 Ryosuke Niwa 2022-07-21 17:20:04 PDT
Yeah, I'm sure this got fixed when we copied Blink's CSS parser.