RESOLVED FIXED Bug 39537
Teach HTML5 parser how to lex comments correctly
https://bugs.webkit.org/show_bug.cgi?id=39537
Summary Teach HTML5 parser how to lex comments correctly
Adam Barth
Reported 2010-05-22 13:59:42 PDT
Teach HTML5 parser how to lex comments correctly
Attachments
Work in progress (9.32 KB, patch)
2010-05-22 14:00 PDT, Adam Barth
no flags
Patch (15.58 KB, patch)
2010-05-23 10:40 PDT, Adam Barth
no flags
Patch (18.58 KB, patch)
2010-05-23 14:49 PDT, Adam Barth
no flags
Patch for landing (18.62 KB, patch)
2010-05-23 14:59 PDT, Adam Barth
no flags
Adam Barth
Comment 1 2010-05-22 14:00:17 PDT
Created attachment 56789 [details] Work in progress
Adam Barth
Comment 2 2010-05-23 10:40:02 PDT
Eric Seidel (no email)
Comment 3 2010-05-23 13:25:18 PDT
Comment on attachment 56825 [details] Patch WebCore/html/HTML5Lexer.cpp:880 + ASSERT(*source == '-'); We should add an advanceAndASSERT() method to SegmentedString which does this for you. WebCore/html/HTML5Lexer.cpp:879 + if (peek == SegmentedString::DidMatch) { What does DidMatch mean? WebCore/html/HTML5Lexer.cpp:915 + m_token->appendToComment('-'); Where does the '-' come from? WebCore/html/HTML5Token.h:167 + return String(StringImpl::adopt(m_data)); This method can't be called more than once. That seems bad, no? Should we ASSERT that m_data is non-empty here? WebCore/html/HTML5Tokenizer.cpp:69 + default: We should remove this default: and instead list all the missing states. WebCore/platform/text/SegmentedString.h:120 + ASSERT_NOT_REACHED(); SegmentedString seems like a rather mature API, seems lame to add an imature method to it. I think this needs one more round. SEe above.
Adam Barth
Comment 4 2010-05-23 14:49:36 PDT
WebKit Review Bot
Comment 5 2010-05-23 14:53:41 PDT
Attachment 56831 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style', '--no-squash']" exit_code: 1 WebCore/html/HTML5Lexer.cpp:879: An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement. [readability/control_flow] [4] Total errors found: 1 in 9 files If any of these errors are false positives, please file a bug against check-webkit-style.
Eric Seidel (no email)
Comment 6 2010-05-23 14:57:08 PDT
Comment on attachment 56831 [details] Patch WebCore/html/HTML5Token.h:159 + return cachedData(); cachedDataString()? WebCore/html/HTML5Tokenizer.cpp:46 + case HTML5Token::EndOfFile: Sooooo much better! WebCore/platform/text/SegmentedString.h:135 + void advanceAndASSERT(UChar expectedCharacter) This will fail in RElease builds. We need UNUSED_PARAM I think. Otherwise looks OK.
Adam Barth
Comment 7 2010-05-23 14:59:55 PDT
Created attachment 56832 [details] Patch for landing
Eric Seidel (no email)
Comment 8 2010-05-23 15:01:25 PDT
WebKit Commit Bot
Comment 9 2010-05-23 15:23:05 PDT
Comment on attachment 56832 [details] Patch for landing Clearing flags on attachment: 56832 Committed r60051: <http://trac.webkit.org/changeset/60051>
WebKit Commit Bot
Comment 10 2010-05-23 15:23:11 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.