Bug 39571

Summary: Teach the HTML5 parser to lex DOCTYPEs
Product: WebKit Reporter: Adam Barth <abarth>
Component: New BugsAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: eric, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Other   
OS: OS X 10.5   
Bug Depends on:    
Bug Blocks: 39259    
Attachments:
Description Flags
Work in progress
none
more work in progress
none
Patch
none
Patch eric: review+

Description Adam Barth 2010-05-23 14:02:05 PDT
Teach the HTML5 parser to lex DOCTYPEs
Comment 1 Adam Barth 2010-05-23 14:02:25 PDT
Created attachment 56828 [details]
Work in progress
Comment 2 Adam Barth 2010-05-23 16:05:44 PDT
Created attachment 56836 [details]
more work in progress
Comment 3 Adam Barth 2010-05-24 14:08:29 PDT
Created attachment 56923 [details]
Patch
Comment 4 WebKit Review Bot 2010-05-24 14:11:32 PDT
Attachment 56923 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style', '--no-squash']" exit_code: 1
WebCore/html/HTML5Lexer.cpp:890:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
WebCore/html/HTML5Lexer.cpp:1072:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
WebCore/html/HTML5Lexer.cpp:1085:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
Total errors found: 3 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 5 Eric Seidel (no email) 2010-05-24 14:26:14 PDT
Comment on attachment 56923 [details]
Patch

WebCore/html/HTML5Lexer.cpp:891
 +                      source.advanceAndASSERTIgnoringCase('d');
Seems it would be simple to write a little helper method (static inline) which did:

static inline void advanceAndASSERTIgnoringCase(SegmentedString& string, char* assertionString)
{
    for (int x = 0;; x++) {
        if (!assertionString[x])
             return;
        string.advanceAndASSERTIngoringCase(assertionString[x]);
     }
}

That would read much cleaner in the callsites.

WebCore/html/HTML5TreeBuilder.cpp:100
 +      if (token.type() == HTML5Token::Uninitialized)
This seems wrong.

WebCore/platform/text/SegmentedString.h:154
 +      void advanceAndASSERTIgnoringCase(UChar expectedCharacter)
We may just want to turn this into a static inline in HTML5Lexer.cpp instead.

I think we need one more round.
Comment 6 Adam Barth 2010-05-24 14:52:54 PDT
Created attachment 56932 [details]
Patch
Comment 7 WebKit Review Bot 2010-05-24 14:55:01 PDT
Attachment 56932 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style', '--no-squash']" exit_code: 1
WebCore/html/HTML5Lexer.cpp:900:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
WebCore/html/HTML5Lexer.cpp:1077:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
WebCore/html/HTML5Lexer.cpp:1085:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
Total errors found: 3 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 8 Eric Seidel (no email) 2010-05-24 14:58:38 PDT
Comment on attachment 56932 [details]
Patch

Would be nice if we had a version of this which took a const String& so we didn't have to write "public" twice:

 1078                         advanceStringAndASSERTIgnoringCase(source, "public");


othewrise looks good.

Seems silly to intentionally violate style just to match the spec wording.
Comment 9 Adam Barth 2010-05-24 15:00:44 PDT
> Would be nice if we had a version of this which took a const String& so we didn't have to write "public" twice:

This might be a premature optimization, but I'm worried about all the string null checks...

> Seems silly to intentionally violate style just to match the spec wording.

My hope is that this problem will be sovled once we convert to using goto.
Comment 10 Adam Barth 2010-05-24 15:06:40 PDT
Committed r60087