RESOLVED FIXED 40417
HTML5Tokenizer needs to tell the InspectorTimelineAgent before and after it writes
https://bugs.webkit.org/show_bug.cgi?id=40417
Summary HTML5Tokenizer needs to tell the InspectorTimelineAgent before and after it w...
Eric Seidel (no email)
Reported 2010-06-10 03:57:05 PDT
HTML5Tokenizer needs to tell the InspectorTimelineAgent before and after it writes
Attachments
Patch (2.92 KB, patch)
2010-06-10 04:01 PDT, Eric Seidel (no email)
no flags
Another wip patch (3.03 KB, patch)
2010-06-10 18:25 PDT, Eric Seidel (no email)
no flags
Updated after conversation with James, ready for review (5.18 KB, patch)
2010-06-11 15:16 PDT, Eric Seidel (no email)
no flags
Eric Seidel (no email)
Comment 1 2010-06-10 04:00:54 PDT
I'm not 100% certain if we're supposed to call the willWrite/didWrite methods before every time we pump the lexer, or just during the write() call. a write() call might not actually finish the block in question. If the timeline agent is trying to know every time we're pumping the lexer it will want slightly different calls. Then again, I dont' think the old tokenizer "correctly" recorded resumes etc. either, so I think this code may just need some further tweaking from the inspector timeline folks. It would be easy to move these calls to inside pumpLexer() if that's what's desired.
Eric Seidel (no email)
Comment 2 2010-06-10 04:01:03 PDT
Eric Seidel (no email)
Comment 3 2010-06-10 04:03:55 PDT
Comment on attachment 58359 [details] Patch Thinking about this more, I think we do want to move this to inside pumpLexer. We'd need a more complicated test which caused the parser to pause due to script loads to see any difference. I think the old code reported the wrong source since when the parser resumed it would report an empty SegmentedString to the inspector.
Adam Barth
Comment 4 2010-06-10 10:16:37 PDT
Comment on attachment 58359 [details] Patch http://wkrietveld.appspot.com/40417/diff/1/3 File WebCore/html/HTML5Tokenizer.cpp (right): http://wkrietveld.appspot.com/40417/diff/1/3#newcode160 WebCore/html/HTML5Tokenizer.cpp:160: willWriteHTML(source); Are we supposed to call this when appendData is true? If so, why don't we call when write() is nested and appendData is true?
Eric Seidel (no email)
Comment 5 2010-06-10 11:16:43 PDT
I think we're supposed to call this in pumpLexer. The question is what SegementedString we're supposed to pass it then. The old tokenizer used write() as the pump lexer, basically. I also think that even in the old system this code commonly got strange results for the SegementedString.
Eric Seidel (no email)
Comment 6 2010-06-10 18:25:02 PDT
Created attachment 58431 [details] Another wip patch
Eric Seidel (no email)
Comment 7 2010-06-10 18:28:37 PDT
This current WIP causes this diff: --- /tmp/layout-test-results/inspector/timeline-script-tag-1-expected.txt 2010-06-10 18:18:48.000000000 -0700 +++ /tmp/layout-test-results/inspector/timeline-script-tag-1-actual.txt 2010-06-10 18:18:48.000000000 -0700 @@ -1,6 +1,7 @@ Tests the Timeline API instrumentation of an HTML script tag. ParseHTML +ParseHTML ----> EvaluateScript --------> MarkTimeline : SCRIPT TAG EvaluateScript Properties: @@ -16,4 +17,5 @@ usedHeapSize : <number> totalHeapSize : <number> } +ParseHTML I think we could avoid that by changing the pump loop into a do {} while, and having it check to see if we have any tokens to process before we inform the timeline agent. Then again, I'm still not really sure what the InspectorTimelineAgent is looking for here. CCing some inspector folks.
James Robinson
Comment 8 2010-06-11 13:16:58 PDT
The intent is to bracket HTML processing (initiated from the network or document.write() or .innerHTML) with willwrite/didwrite calls so that something reasonable can be placed on the timeline. The string itself is just used for char count, which is supposed to be the number of characters processed but I'm pretty sure it's buggy in some cases due to pause/resumes. The char count is of marginal utility anyway - the important bit is having some way for the developer looking at the timeline to know what piece of HTML was being processed if it ended up taking a long time.
Eric Seidel (no email)
Comment 9 2010-06-11 14:05:27 PDT
I talked with James on #webkit. He agrees we should give the inspector information about every pump and it should ignore any 0-length pumps, etc. So the current patch is correct. I'll upload a copy with more comments/explanation. James and I also discussed that eventually the inspector should move the charCount from willWrite to didWrite at which time the HTML5Lexer could provide exact numbers of how many chars it processed in that pump, instead of the current number which is always the count of how many chars it has left in its buffer. The old Tokenizer always provided the number of chars passed to write() which was 0 in the case of parser resumes. So both are wrong in different ways with the current behavior of passing the charCount to willWrite() instead of didWrite(). But that's a separate issue. :)
Eric Seidel (no email)
Comment 10 2010-06-11 15:16:30 PDT
Created attachment 58512 [details] Updated after conversation with James, ready for review
Adam Barth
Comment 11 2010-06-11 15:39:20 PDT
Comment on attachment 58512 [details] Updated after conversation with James, ready for review okiedokes
WebKit Commit Bot
Comment 12 2010-06-12 00:23:29 PDT
Comment on attachment 58512 [details] Updated after conversation with James, ready for review Clearing flags on attachment: 58512 Committed r61055: <http://trac.webkit.org/changeset/61055>
WebKit Commit Bot
Comment 13 2010-06-12 00:23:37 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.