Bug 39571 - Teach the HTML5 parser to lex DOCTYPEs
Summary: Teach the HTML5 parser to lex DOCTYPEs
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Other OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 39259
  Show dependency treegraph
 
Reported: 2010-05-23 14:02 PDT by Adam Barth
Modified: 2010-05-24 15:06 PDT (History)
2 users (show)

See Also:


Attachments
Work in progress (4.19 KB, patch)
2010-05-23 14:02 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
more work in progress (19.29 KB, patch)
2010-05-23 16:05 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (28.42 KB, patch)
2010-05-24 14:08 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (27.82 KB, patch)
2010-05-24 14:52 PDT, Adam Barth
eric: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adam Barth 2010-05-23 14:02:05 PDT
Teach the HTML5 parser to lex DOCTYPEs
Comment 1 Adam Barth 2010-05-23 14:02:25 PDT
Created attachment 56828 [details]
Work in progress
Comment 2 Adam Barth 2010-05-23 16:05:44 PDT
Created attachment 56836 [details]
more work in progress
Comment 3 Adam Barth 2010-05-24 14:08:29 PDT
Created attachment 56923 [details]
Patch
Comment 4 WebKit Review Bot 2010-05-24 14:11:32 PDT
Attachment 56923 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style', '--no-squash']" exit_code: 1
WebCore/html/HTML5Lexer.cpp:890:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
WebCore/html/HTML5Lexer.cpp:1072:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
WebCore/html/HTML5Lexer.cpp:1085:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
Total errors found: 3 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 5 Eric Seidel (no email) 2010-05-24 14:26:14 PDT
Comment on attachment 56923 [details]
Patch

WebCore/html/HTML5Lexer.cpp:891
 +                      source.advanceAndASSERTIgnoringCase('d');
Seems it would be simple to write a little helper method (static inline) which did:

static inline void advanceAndASSERTIgnoringCase(SegmentedString& string, char* assertionString)
{
    for (int x = 0;; x++) {
        if (!assertionString[x])
             return;
        string.advanceAndASSERTIngoringCase(assertionString[x]);
     }
}

That would read much cleaner in the callsites.

WebCore/html/HTML5TreeBuilder.cpp:100
 +      if (token.type() == HTML5Token::Uninitialized)
This seems wrong.

WebCore/platform/text/SegmentedString.h:154
 +      void advanceAndASSERTIgnoringCase(UChar expectedCharacter)
We may just want to turn this into a static inline in HTML5Lexer.cpp instead.

I think we need one more round.
Comment 6 Adam Barth 2010-05-24 14:52:54 PDT
Created attachment 56932 [details]
Patch
Comment 7 WebKit Review Bot 2010-05-24 14:55:01 PDT
Attachment 56932 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style', '--no-squash']" exit_code: 1
WebCore/html/HTML5Lexer.cpp:900:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
WebCore/html/HTML5Lexer.cpp:1077:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
WebCore/html/HTML5Lexer.cpp:1085:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
Total errors found: 3 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 8 Eric Seidel (no email) 2010-05-24 14:58:38 PDT
Comment on attachment 56932 [details]
Patch

Would be nice if we had a version of this which took a const String& so we didn't have to write "public" twice:

 1078                         advanceStringAndASSERTIgnoringCase(source, "public");


othewrise looks good.

Seems silly to intentionally violate style just to match the spec wording.
Comment 9 Adam Barth 2010-05-24 15:00:44 PDT
> Would be nice if we had a version of this which took a const String& so we didn't have to write "public" twice:

This might be a premature optimization, but I'm worried about all the string null checks...

> Seems silly to intentionally violate style just to match the spec wording.

My hope is that this problem will be sovled once we convert to using goto.
Comment 10 Adam Barth 2010-05-24 15:06:40 PDT
Committed r60087