Bug 40199

Summary: HTML5 parser should normalize line endings
Product: WebKit Reporter: Adam Barth <abarth>
Component: New BugsAssignee: Adam Barth <abarth>
Status: RESOLVED FIXED    
Severity: Normal CC: eric
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Other   
OS: OS X 10.5   
Bug Depends on:    
Bug Blocks: 39259    
Attachments:
Description Flags
Patch
none
Patch eric: review+

Description Adam Barth 2010-06-04 19:31:28 PDT
HTML5 parser should normalize line endings
Comment 1 Adam Barth 2010-06-04 19:48:14 PDT
Created attachment 57947 [details]
Patch
Comment 2 Eric Seidel (no email) 2010-06-04 22:50:08 PDT
Comment on attachment 57947 [details]
Patch

WebCore/html/HTML5Lexer.cpp:335
 +  #define PEEK_AND_RECONSUME_IN(StateName)                                   \
Please add comments to these macros about when to use which.  It's not clear when to use PEEK_AND_ vs. normal RECONSUME

WebCore/html/HTML5Lexer.cpp:338
 +          if (!m_inputStreamPreprocessor.peek(source, m_lineNumber))         \
This might want to be a macro itself, sicne it's used other places.  EXIT_IF_PEEK_FAIL or something?

WebCore/html/HTML5Lexer.cpp:413
 +      if (m_skipLeadingNewLineForListing) {
We might want to document in the header why both the input stream preprocessor "skip next newline" and this m_skipLeadingNewLineForListing are needed.  They're both needed.  Mostly to cover the <pre>\r\n case cleanly.

WebCore/html/HTML5Lexer.cpp:1160
 +              // We ignore the return value because it's checked by the loop.
This comment is not clear.  What's the return value?  How is it checked by which loop?

WebCore/html/HTML5Lexer.h:138
 +              UChar nextInputCharacter() { return m_nextInputCharacter; }
const?

WebCore/html/HTML5Lexer.h:136
 +              InputStreamPreprocessor() : m_nextInputCharacter('\0'), m_skipNextNewLine(false) { }
separate lines?

WebCore/html/HTML5Lexer.h:147
 +                          return false;
So I'm not sure what a false vs. true return here mean.  False means that there is not enough data in the stream to be able to do the \r\n handling?

This just isn't clear.  You talked about adding more ASSERTs in InputStreamPreprocessor to prevent double-peeking too.
Comment 3 Adam Barth 2010-06-07 11:19:52 PDT
Created attachment 58051 [details]
Patch
Comment 4 Eric Seidel (no email) 2010-06-07 11:30:53 PDT
Comment on attachment 58051 [details]
Patch

Looks OK.  Thank you for all the extra commenting.
Comment 5 Adam Barth 2010-06-07 11:34:18 PDT
Committed r60790: <http://trac.webkit.org/changeset/60790>