HTML5Tokenizer needs to tell the InspectorTimelineAgent before and after it writes
I'm not 100% certain if we're supposed to call the willWrite/didWrite methods before every time we pump the lexer, or just during the write() call. a write() call might not actually finish the block in question. If the timeline agent is trying to know every time we're pumping the lexer it will want slightly different calls. Then again, I dont' think the old tokenizer "correctly" recorded resumes etc. either, so I think this code may just need some further tweaking from the inspector timeline folks. It would be easy to move these calls to inside pumpLexer() if that's what's desired.
Created attachment 58359 [details] Patch
Comment on attachment 58359 [details] Patch Thinking about this more, I think we do want to move this to inside pumpLexer. We'd need a more complicated test which caused the parser to pause due to script loads to see any difference. I think the old code reported the wrong source since when the parser resumed it would report an empty SegmentedString to the inspector.
Comment on attachment 58359 [details] Patch http://wkrietveld.appspot.com/40417/diff/1/3 File WebCore/html/HTML5Tokenizer.cpp (right): http://wkrietveld.appspot.com/40417/diff/1/3#newcode160 WebCore/html/HTML5Tokenizer.cpp:160: willWriteHTML(source); Are we supposed to call this when appendData is true? If so, why don't we call when write() is nested and appendData is true?
I think we're supposed to call this in pumpLexer. The question is what SegementedString we're supposed to pass it then. The old tokenizer used write() as the pump lexer, basically. I also think that even in the old system this code commonly got strange results for the SegementedString.
Created attachment 58431 [details] Another wip patch
This current WIP causes this diff: --- /tmp/layout-test-results/inspector/timeline-script-tag-1-expected.txt 2010-06-10 18:18:48.000000000 -0700 +++ /tmp/layout-test-results/inspector/timeline-script-tag-1-actual.txt 2010-06-10 18:18:48.000000000 -0700 @@ -1,6 +1,7 @@ Tests the Timeline API instrumentation of an HTML script tag. ParseHTML +ParseHTML ----> EvaluateScript --------> MarkTimeline : SCRIPT TAG EvaluateScript Properties: @@ -16,4 +17,5 @@ usedHeapSize : <number> totalHeapSize : <number> } +ParseHTML I think we could avoid that by changing the pump loop into a do {} while, and having it check to see if we have any tokens to process before we inform the timeline agent. Then again, I'm still not really sure what the InspectorTimelineAgent is looking for here. CCing some inspector folks.
The intent is to bracket HTML processing (initiated from the network or document.write() or .innerHTML) with willwrite/didwrite calls so that something reasonable can be placed on the timeline. The string itself is just used for char count, which is supposed to be the number of characters processed but I'm pretty sure it's buggy in some cases due to pause/resumes. The char count is of marginal utility anyway - the important bit is having some way for the developer looking at the timeline to know what piece of HTML was being processed if it ended up taking a long time.
I talked with James on #webkit. He agrees we should give the inspector information about every pump and it should ignore any 0-length pumps, etc. So the current patch is correct. I'll upload a copy with more comments/explanation. James and I also discussed that eventually the inspector should move the charCount from willWrite to didWrite at which time the HTML5Lexer could provide exact numbers of how many chars it processed in that pump, instead of the current number which is always the count of how many chars it has left in its buffer. The old Tokenizer always provided the number of chars passed to write() which was 0 in the case of parser resumes. So both are wrong in different ways with the current behavior of passing the charCount to willWrite() instead of didWrite(). But that's a separate issue. :)
Created attachment 58512 [details] Updated after conversation with James, ready for review
Comment on attachment 58512 [details] Updated after conversation with James, ready for review okiedokes
Comment on attachment 58512 [details] Updated after conversation with James, ready for review Clearing flags on attachment: 58512 Committed r61055: <http://trac.webkit.org/changeset/61055>
All reviewed patches have been landed. Closing bug.