Bug 110538 - Threaded HTML parser fails resources/plain-text-unsafe.dat
Summary: Threaded HTML parser fails resources/plain-text-unsafe.dat
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Tony Gentilcore
URL:
Keywords:
Depends on:
Blocks: 106127
  Show dependency treegraph
 
Reported: 2013-02-21 17:39 PST by Adam Barth
Modified: 2013-02-22 19:30 PST (History)
5 users (show)

See Also:


Attachments
Hacks to be able to run the html5lib tests (2.67 KB, patch)
2013-02-21 17:54 PST, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (4.47 KB, patch)
2013-02-22 16:38 PST, Tony Gentilcore
no flags Details | Formatted Diff | Diff
Patch for landing (4.90 KB, patch)
2013-02-22 17:33 PST, Tony Gentilcore
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adam Barth 2013-02-21 17:39:24 PST
resources/plain-text-unsafe.dat:
15
16
17
18
21

Looks like a problem with replacing null characters in foreign content.

It's a slight pain to run these tests in the threaded parser at the moment.
Comment 1 Tony Gentilcore 2013-02-21 17:45:32 PST
I've been wondering why we weren't seeing any failures due to the missing setForceNullCharacterReplacement() call. If you clue me in as to the easiest way to run these tests with the threaded parser, I'm happy to fix this tomorrow.
Comment 2 Adam Barth 2013-02-21 17:54:30 PST
Created attachment 189652 [details]
Hacks to be able to run the html5lib tests

On my machine, runner.html times out, so you also need to break it down into smaller pieces.  (The document.write path is faster than the data URL path.)
Comment 3 Tony Gentilcore 2013-02-22 16:38:04 PST
Created attachment 189858 [details]
Patch
Comment 4 Eric Seidel (no email) 2013-02-22 16:42:30 PST
Comment on attachment 189858 [details]
Patch

LGTM.  Don't we also need to do null character replacement in textArea and a couple others?
Comment 5 Eric Seidel (no email) 2013-02-22 16:43:18 PST
Comment on attachment 189858 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=189858&action=review

> Source/WebCore/html/parser/BackgroundHTMLParser.cpp:117
> +    const String& tagName = token.data();

Is there not a name() accessor?
Comment 6 Adam Barth 2013-02-22 16:44:14 PST
Comment on attachment 189858 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=189858&action=review

> Source/WebCore/html/parser/BackgroundHTMLParser.cpp:115
> +static inline bool tokenExitsSVG(const CompactHTMLToken& token)

Darin would tell us that static and inline are redundant here.  :)
Comment 7 Eric Seidel (no email) 2013-02-22 16:45:43 PST
Comment on attachment 189858 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=189858&action=review

>> Source/WebCore/html/parser/BackgroundHTMLParser.cpp:115
>> +static inline bool tokenExitsSVG(const CompactHTMLToken& token)
> 
> Darin would tell us that static and inline are redundant here.  :)

Really?
Comment 8 Tony Gentilcore 2013-02-22 16:49:57 PST
(In reply to comment #7)
> (From update of attachment 189858 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=189858&action=review
> 
> >> Source/WebCore/html/parser/BackgroundHTMLParser.cpp:115
> >> +static inline bool tokenExitsSVG(const CompactHTMLToken& token)
> > 
> > Darin would tell us that static and inline are redundant here.  :)
> 
> Really?

Surprising to me too. I'm interested in the explanation.

> Don't we also need to do null character replacement in textArea and a couple others?

Yeah, those tags kick us into TextMode, hence the remaining FIXME.

> Is there not a name() accessor?

No, these are CompactHTMLTokens, everything is in data(). The other accessors would just be an assertion on the correct type and a return of m_data.
Comment 9 Adam Barth 2013-02-22 17:17:08 PST
(In reply to comment #8)
> (In reply to comment #7)
> > (From update of attachment 189858 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=189858&action=review
> > 
> > >> Source/WebCore/html/parser/BackgroundHTMLParser.cpp:115
> > >> +static inline bool tokenExitsSVG(const CompactHTMLToken& token)
> > > 
> > > Darin would tell us that static and inline are redundant here.  :)
> > 
> > Really?
> 
> Surprising to me too. I'm interested in the explanation.

Yeah, they both just mean "give this function internal linkage" in this context.
Comment 10 Eric Seidel (no email) 2013-02-22 17:19:23 PST
Comment on attachment 189858 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=189858&action=review

>>>>> Source/WebCore/html/parser/BackgroundHTMLParser.cpp:115
>>>>> +static inline bool tokenExitsSVG(const CompactHTMLToken& token)
>>>> 
>>>> Darin would tell us that static and inline are redundant here.  :)
>>> 
>>> Really?
>> 
>> Surprising to me too. I'm interested in the explanation.
> 
> Yeah, they both just mean "give this function internal linkage" in this context.

So what you're saying is that "inline" is meaningless?  (outside of aliasing to static for a free function in a .cpp file?)
Comment 11 Adam Barth 2013-02-22 17:28:24 PST
(In reply to comment #10)
> (From update of attachment 189858 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=189858&action=review
> 
> >>>>> Source/WebCore/html/parser/BackgroundHTMLParser.cpp:115
> >>>>> +static inline bool tokenExitsSVG(const CompactHTMLToken& token)
> >>>> 
> >>>> Darin would tell us that static and inline are redundant here.  :)
> >>> 
> >>> Really?
> >> 
> >> Surprising to me too. I'm interested in the explanation.
> > 
> > Yeah, they both just mean "give this function internal linkage" in this context.
> 
> So what you're saying is that "inline" is meaningless?  (outside of aliasing to static for a free function in a .cpp file?)

Not always.  Consider the following situation:

class A {
  XXX void f();
};

void A::f() { }

We can't use "static" for XXX because that would mean f wouldn't get a |this| parameter.  We can, however, use "inline" to give A::f internal linkage.
Comment 12 Tony Gentilcore 2013-02-22 17:33:19 PST
Created attachment 189873 [details]
Patch for landing
Comment 13 Tony Gentilcore 2013-02-22 17:34:40 PST
Interesting. It makes sense now that I think about it. Anyway, I killed the "inline" in front of those methods.
Comment 14 WebKit Review Bot 2013-02-22 19:30:28 PST
Comment on attachment 189873 [details]
Patch for landing

Clearing flags on attachment: 189873

Committed r143830: <http://trac.webkit.org/changeset/143830>
Comment 15 WebKit Review Bot 2013-02-22 19:30:32 PST
All reviewed patches have been landed.  Closing bug.