40417 – HTML5Tokenizer needs to tell the InspectorTimelineAgent before and after it writes

RESOLVED FIXED 40417

HTML5Tokenizer needs to tell the InspectorTimelineAgent before and after it writes

https://bugs.webkit.org/show_bug.cgi?id=40417

Summary HTML5Tokenizer needs to tell the InspectorTimelineAgent before and after it w...

Eric Seidel (no email)

Reported 2010-06-10 03:57:05 PDT

HTML5Tokenizer needs to tell the InspectorTimelineAgent before and after it writes

Attachments
Patch (2.92 KB, patch) 2010-06-10 04:01 PDT, Eric Seidel (no email)	no flags	Details Formatted Diff Diff
Another wip patch (3.03 KB, patch) 2010-06-10 18:25 PDT, Eric Seidel (no email)	no flags	Details Formatted Diff Diff
Updated after conversation with James, ready for review (5.18 KB, patch) 2010-06-11 15:16 PDT, Eric Seidel (no email)	no flags	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

Eric Seidel (no email)

Comment 1 2010-06-10 04:00:54 PDT

I'm not 100% certain if we're supposed to call the willWrite/didWrite methods before every time we pump the lexer, or just during the write() call. a write() call might not actually finish the block in question. If the timeline agent is trying to know every time we're pumping the lexer it will want slightly different calls. Then again, I dont' think the old tokenizer "correctly" recorded resumes etc. either, so I think this code may just need some further tweaking from the inspector timeline folks. It would be easy to move these calls to inside pumpLexer() if that's what's desired.

Eric Seidel (no email)

Comment 2 2010-06-10 04:01:03 PDT

Created attachment 58359 [details] Patch

Eric Seidel (no email)

Comment 3 2010-06-10 04:03:55 PDT

Comment on attachment 58359 [details] Patch Thinking about this more, I think we do want to move this to inside pumpLexer. We'd need a more complicated test which caused the parser to pause due to script loads to see any difference. I think the old code reported the wrong source since when the parser resumed it would report an empty SegmentedString to the inspector.

Adam Barth

Comment 4 2010-06-10 10:16:37 PDT

Comment on attachment 58359 [details] Patch http://wkrietveld.appspot.com/40417/diff/1/3 File WebCore/html/HTML5Tokenizer.cpp (right): http://wkrietveld.appspot.com/40417/diff/1/3#newcode160 WebCore/html/HTML5Tokenizer.cpp:160: willWriteHTML(source); Are we supposed to call this when appendData is true? If so, why don't we call when write() is nested and appendData is true?

Eric Seidel (no email)

Comment 5 2010-06-10 11:16:43 PDT

I think we're supposed to call this in pumpLexer. The question is what SegementedString we're supposed to pass it then. The old tokenizer used write() as the pump lexer, basically. I also think that even in the old system this code commonly got strange results for the SegementedString.

Eric Seidel (no email)

Comment 6 2010-06-10 18:25:02 PDT

Created attachment 58431 [details] Another wip patch

Eric Seidel (no email)

Comment 7 2010-06-10 18:28:37 PDT

This current WIP causes this diff: --- /tmp/layout-test-results/inspector/timeline-script-tag-1-expected.txt 2010-06-10 18:18:48.000000000 -0700 +++ /tmp/layout-test-results/inspector/timeline-script-tag-1-actual.txt 2010-06-10 18:18:48.000000000 -0700 @@ -1,6 +1,7 @@ Tests the Timeline API instrumentation of an HTML script tag. ParseHTML +ParseHTML ----> EvaluateScript --------> MarkTimeline : SCRIPT TAG EvaluateScript Properties: @@ -16,4 +17,5 @@ usedHeapSize : <number> totalHeapSize : <number> } +ParseHTML I think we could avoid that by changing the pump loop into a do {} while, and having it check to see if we have any tokens to process before we inform the timeline agent. Then again, I'm still not really sure what the InspectorTimelineAgent is looking for here. CCing some inspector folks.

James Robinson

Comment 8 2010-06-11 13:16:58 PDT

The intent is to bracket HTML processing (initiated from the network or document.write() or .innerHTML) with willwrite/didwrite calls so that something reasonable can be placed on the timeline. The string itself is just used for char count, which is supposed to be the number of characters processed but I'm pretty sure it's buggy in some cases due to pause/resumes. The char count is of marginal utility anyway - the important bit is having some way for the developer looking at the timeline to know what piece of HTML was being processed if it ended up taking a long time.

Eric Seidel (no email)

Comment 9 2010-06-11 14:05:27 PDT

I talked with James on #webkit. He agrees we should give the inspector information about every pump and it should ignore any 0-length pumps, etc. So the current patch is correct. I'll upload a copy with more comments/explanation. James and I also discussed that eventually the inspector should move the charCount from willWrite to didWrite at which time the HTML5Lexer could provide exact numbers of how many chars it processed in that pump, instead of the current number which is always the count of how many chars it has left in its buffer. The old Tokenizer always provided the number of chars passed to write() which was 0 in the case of parser resumes. So both are wrong in different ways with the current behavior of passing the charCount to willWrite() instead of didWrite(). But that's a separate issue. :)

Eric Seidel (no email)

Comment 10 2010-06-11 15:16:30 PDT

Created attachment 58512 [details] Updated after conversation with James, ready for review

Adam Barth

Comment 11 2010-06-11 15:39:20 PDT

Comment on attachment 58512 [details] Updated after conversation with James, ready for review okiedokes

WebKit Commit Bot

Comment 12 2010-06-12 00:23:29 PDT

Comment on attachment 58512 [details] Updated after conversation with James, ready for review Clearing flags on attachment: 58512 Committed r61055: <http://trac.webkit.org/changeset/61055>

WebKit Commit Bot

Comment 13 2010-06-12 00:23:37 PDT

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Other

OS OS X 10.5

Product WebKit

Component New Bugs

Assignee

Eric Seidel (no email)

Reported

2010-06-10 03:57 PDT

Modified

2010-06-12 00:23 PDT History

CC List

6 users Show

URL

Keywords

Depends on

Blocks

39259

Dependencies

tree graph