Bug 110547 - Free up background parser's checkpoints when speculation succeeds
Status: RESOLVED DUPLICATE of bug 112069
https://bugs.webkit.org/show_bug.cgi?id=110547
Tony Gentilcore
Reported 2013-02-21 19:56:30 PST
When speculation fails, we send a resumeFrom message to the background thread parser which causes it to free up all its checkpoints. We should send a message when speculation succeeds so that it can free up memory used for previous checkpoints.
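The bookkeeping the report describes can be illustrated with a minimal single-threaded sketch. The class and method names below are hypothetical, not the actual WebCore API: each append() stores a segment of source, createCheckpoint() records how many segments existed so a failed speculation can rewind, and invalidateCheckpointsUpThrough() is the "speculation succeeded" message that lets the earlier copies go.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the background parser's input stream. Each append()
// stores a segment of source text; createCheckpoint() records how many
// segments existed at that point so the parser can rewind (resumeFrom) if
// the main thread's speculation fails. When the main thread reports that a
// checkpoint was passed successfully, every segment that checkpoint could
// rewind to is no longer needed and can be released.
class InputStreamModel {
public:
    void append(const std::string& segment) { m_segments.push_back(segment); }

    size_t createCheckpoint()
    {
        m_checkpoints.push_back({m_segments.size()});
        return m_checkpoints.size() - 1;
    }

    // Called when speculation up to checkpointIndex succeeded: drop the
    // source copies held only for the possibility of rewinding.
    void invalidateCheckpointsUpThrough(size_t checkpointIndex)
    {
        size_t segmentLimit = m_checkpoints[checkpointIndex].numberOfSegmentsAlreadyAppended;
        for (size_t i = 0; i < segmentLimit; ++i)
            m_segments[i].clear(); // releases that segment's buffer
    }

    size_t retainedBytes() const
    {
        size_t total = 0;
        for (const std::string& s : m_segments)
            total += s.size();
        return total;
    }

private:
    struct Checkpoint {
        size_t numberOfSegmentsAlreadyAppended;
    };
    std::vector<std::string> m_segments;
    std::vector<Checkpoint> m_checkpoints;
};
```

The point of the bug is the timing: without the success message, every segment stays retained until the parse finishes or a speculation fails.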
Attachments
wip (6.68 KB, patch)
2013-03-05 02:14 PST, Eric Seidel (no email)
no flags
Builds, but crashes fast/preloader tests (6.54 KB, patch)
2013-03-05 02:52 PST, Eric Seidel (no email)
no flags
Patch (7.59 KB, patch)
2013-03-05 03:50 PST, Eric Seidel (no email)
no flags
Hopefully addressed Adam's concerns (8.75 KB, patch)
2013-03-08 15:16 PST, Eric Seidel (no email)
no flags
Eric Seidel (no email)
Comment 1 2013-03-05 02:14:07 PST
Eric Seidel (no email)
Comment 2 2013-03-05 02:14:38 PST
I'm not even sure that compiles yet (my laptop is slow). But I welcome your feedback. I'm likely to go to sleep before finishing it tonight.
Eric Seidel (no email)
Comment 3 2013-03-05 02:52:43 PST
Created attachment 191448 [details] Builds, but crashes fast/preloader tests
Eric Seidel (no email)
Comment 4 2013-03-05 02:53:13 PST
Comment on attachment 191448 [details] Builds, but crashes fast/preloader tests

Old habits. :) Didn't mean to mark for review.
Eric Seidel (no email)
Comment 5 2013-03-05 03:50:09 PST
Created attachment 191458 [details] Patch
Eric Seidel (no email)
Comment 6 2013-03-05 04:07:28 PST
Comment on attachment 191458 [details] Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=191458&action=review

> Source/WebCore/ChangeLog:10
> +    pages, as we will no longer hold multiple copies of every
> +    source byte during the whole parse.

This is slightly misleading, as this is really only letting go of one copy. The m_segments and m_checkpoint.input hold onto refcounted StringImpls, so there is just one shared copy which we're now releasing earlier. I expect this will only affect particularly large loads.
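Eric's refcounting point can be modeled with std::shared_ptr (a rough analogy only; WTF::String's StringImpl refcounting is intrusive and differs in detail). Two holders, standing in for m_segments and m_checkpoint.input, share one buffer; resetting one holder just drops a reference, and the bytes are freed only when the last holder lets go.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Analogy for a refcounted StringImpl: several holders can share one
// underlying buffer, so "multiple copies" are really multiple references.
using SharedBuffer = std::shared_ptr<const std::string>;

// Stand-ins for the two places the patch discussion mentions holding the
// source: the segment list and the checkpoint's saved input.
struct StreamHolders {
    std::vector<SharedBuffer> segments;
    std::vector<SharedBuffer> checkpointInputs;
};

// Releasing the checkpoint's reference does not free the bytes while the
// segment list still points at them; it only lets them be freed earlier.
long useCountAfterCheckpointRelease(StreamHolders& holders, size_t index)
{
    holders.checkpointInputs[index].reset();
    return holders.segments[index].use_count();
}
```

This is why the patch releases memory earlier rather than eliminating duplicate copies that never existed.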
Eric Seidel (no email)
Comment 7 2013-03-06 22:55:50 PST
I believe this patch is ready for review and landing?
Daniel Bates
Comment 8 2013-03-06 23:45:18 PST
Comment on attachment 191458 [details] Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=191458&action=review

> Source/WebCore/ChangeLog:12
> +    Many LayoutTests exercise this code path, and I've manually (debugger and printf)

I think you're missing the word "verified" in this sentence.

> Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:58
> +    const Checkpoint& checkpoint = m_checkpoints[checkpointIndex];

Would it make sense to assert the invariant checkpointIndex < m_checkpoints.size() in this function?
Adam Barth
Comment 9 2013-03-07 00:06:31 PST
Comment on attachment 191458 [details] Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=191458&action=review

I should look at this again when I'm less tired. It's not as beautiful as I had hoped.

> Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:60
> +    for (size_t i = 0; i < checkpoint.numberOfSegmentsAlreadyAppended; ++i)
> +        m_segments[i] = String();

Does this work correctly in the case where there are multiple checkpoints per append? Probably.

Also, this algorithm is n^2. Should we keep some state for the smallest index that has actual data and start the loop from there?

> Source/WebCore/html/parser/HTMLDocumentParser.cpp:462
> +    checkpointIndex = std::max(chunk->inputCheckpoint, checkpointIndex);

Why is this needed?

> Source/WebCore/html/parser/HTMLDocumentParser.cpp:470
> +    // Processing a chunk can call document.write, causing us to invalidate any remaining speculations.
> +    // Similarly, if we're stopped just abort and don't bother to tell the parser about passing this checkpoint.
> +    if (m_speculations.isEmpty() || isStopped()) {
> +        checkpointIndex = 0;
> +        break;
> +    }

This is pretty ugly....
Adam Barth
Comment 10 2013-03-08 11:16:28 PST
Comment on attachment 191458 [details] Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=191458&action=review

I realize my complaints below are mostly aesthetic...

> Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:56
> +void BackgroundHTMLInputStream::releaseDataUpThroughCheckpoint(HTMLInputCheckpoint checkpointIndex)

invalidateCheckpointsBefore ?

>> Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:58
>> +    const Checkpoint& checkpoint = m_checkpoints[checkpointIndex];
>
> Would it make sense to assert the invariant checkpointIndex < m_checkpoints.size() in this function?

We should ASSERT that checkpoint hasn't been invalidated yet.

>> Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:60
>> +        m_segments[i] = String();
>
> Does this work correctly in the case where there are multiple checkpoints per append? Probably.
>
> Also, this algorithm is n^2. Should we keep some state for the smallest index that has actual data and start the loop from there?

Please fix the n^2 issue.

>> Source/WebCore/html/parser/HTMLDocumentParser.cpp:470
>> +    // Processing a chunk can call document.write, causing us to invalidate any remaining speculations.
>> +    // Similarly, if we're stopped just abort and don't bother to tell the parser about passing this checkpoint.
>> +    if (m_speculations.isEmpty() || isStopped()) {
>> +        checkpointIndex = 0;
>> +        break;
>> +    }
>
> This is pretty ugly....

There must be a way to structure this code so that it's less subtle. Maybe we should do this work in validateSpeculations instead?
Eric Seidel (no email)
Comment 11 2013-03-08 11:36:18 PST
Comment on attachment 191458 [details] Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=191458&action=review

>> Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:56
>> +void BackgroundHTMLInputStream::releaseDataUpThroughCheckpoint(HTMLInputCheckpoint checkpointIndex)
>
> invalidateCheckpointsBefore ?

Happy to name it whatever. The tricky part for me was that it's inclusive of the checkpoint you pass it. invalidateCheckpointsUntil(checkpoint)?

>>> Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:60
>>> +        m_segments[i] = String();
>>
>> Does this work correctly in the case where there are multiple checkpoints per append? Probably.
>>
>> Also, this algorithm is n^2. Should we keep some state for the smallest index that has actual data and start the loop from there?
>
> Please fix the n^2 issue.

OK. I'll keep a m_lastInvalidSegment or similar.

>> Source/WebCore/html/parser/HTMLDocumentParser.cpp:462
>> +    checkpointIndex = std::max(chunk->inputCheckpoint, checkpointIndex);
>
> Why is this needed?

Since we can assume the chunks have checkpoints in ascending order, it's not. We just need the last one.

>>> Source/WebCore/html/parser/HTMLDocumentParser.cpp:470
>>> +    }
>>
>> This is pretty ugly....
>
> There must be a way to structure this code so that it's less subtle. Maybe we should do this work in validateSpeculations instead?

We need to know when the chunk is processed. I'll have to check when validateSpeculations is called.
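The amortized-linear fix discussed above (remember the lowest segment index that still holds data, and start each clearing pass there) might look roughly like the sketch below. The class and member names are hypothetical, not taken from the landed patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sketch of the fix for the n^2 clearing loop: track the first segment
// index that may still hold data, and begin each invalidation pass there
// instead of at zero. Across the whole parse every segment is then cleared
// at most once, so repeated invalidations never rescan already-empty slots.
class SegmentReleaser {
public:
    void append(const std::string& segment) { m_segments.push_back(segment); }

    void releaseSegmentsBefore(size_t limit)
    {
        for (size_t i = m_firstValidSegment; i < limit; ++i)
            m_segments[i].clear();
        if (limit > m_firstValidSegment)
            m_firstValidSegment = limit;
    }

    size_t firstValidSegment() const { return m_firstValidSegment; }

    size_t nonEmptySegmentCount() const
    {
        size_t count = 0;
        for (const std::string& s : m_segments)
            count += !s.empty();
        return count;
    }

private:
    std::vector<std::string> m_segments;
    size_t m_firstValidSegment = 0; // lowest index that may still hold data
};
```

Because checkpoints arrive in ascending order (as noted above), the watermark only ever moves forward, which is what makes the single pass safe.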
Eric Seidel (no email)
Comment 12 2013-03-08 15:16:24 PST
Created attachment 192294 [details] Hopefully addressed Adam's concerns
Adam Barth
Comment 13 2013-03-08 15:21:02 PST
Comment on attachment 192294 [details] Hopefully addressed Adam's concerns
View in context: https://bugs.webkit.org/attachment.cgi?id=192294&action=review

I can't tell if you changed HTMLDocumentParser::pumpPendingSpeculations or if I've just read it enough times to understand what it's doing. :)

> Source/WebCore/html/parser/BackgroundHTMLInputStream.h:72
> +    // Note: These indicies may === vector.size(), in which case there are no valid checkpoints/segments at this time.

=== -> == (Triple-equals is JavaScript silliness.)
Adam Barth
Comment 14 2013-03-08 15:22:36 PST
Comment on attachment 192294 [details] Hopefully addressed Adam's concerns
View in context: https://bugs.webkit.org/attachment.cgi?id=192294&action=review

> Source/WebCore/html/parser/HTMLDocumentParser.cpp:484
> +    HTMLParserThread::shared()->postTask(bind(&BackgroundHTMLParser::passedCheckpoint, m_backgroundParser, lastCheckpointPassed));

The only problem with this patch is that we'll never call passedCheckpoint if we never call pumpPendingSpeculations. We only call pumpPendingSpeculations if we ended up queuing speculations. It's entirely possible that we can process every chunk from the background parser immediately after receiving them.
Eric Seidel (no email)
Comment 15 2013-03-08 15:27:51 PST
(In reply to comment #14)
> (From update of attachment 192294 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=192294&action=review
>
> > Source/WebCore/html/parser/HTMLDocumentParser.cpp:484
> > +    HTMLParserThread::shared()->postTask(bind(&BackgroundHTMLParser::passedCheckpoint, m_backgroundParser, lastCheckpointPassed));
>
> The only problem with this patch is that we'll never call passedCheckpoint if we never call pumpPendingSpeculations. We only call pumpPendingSpeculations if we ended up queuing speculations. It's entirely possible that we can process every chunk from the background parser immediately after receiving them.

That's true. I'm not sure if I should do that in a follow-up, or go another round with this one. I'm open to either.
Eric Seidel (no email)
Comment 16 2013-03-08 15:29:04 PST
(In reply to comment #15)
> That's true. I'm not sure if I should do that in a follow-up, or go another round with this one. I'm open to either.

It seems inefficient to tell the background parser to release a chunk after each chunk. But maybe I'm optimizing prematurely. I have no data to suggest that messaging is expensive at all. :)
Adam Barth
Comment 17 2013-03-08 15:32:14 PST
Followup patch is fine.
WebKit Review Bot
Comment 18 2013-03-08 15:41:32 PST
Comment on attachment 192294 [details] Hopefully addressed Adam's concerns

Clearing flags on attachment: 192294

Committed r145277: <http://trac.webkit.org/changeset/145277>
WebKit Review Bot
Comment 19 2013-03-08 15:41:36 PST
All reviewed patches have been landed. Closing bug.
Eric Seidel (no email)
Comment 20 2013-03-11 17:08:50 PDT
This work continued in bug 112069.
WebKit Review Bot
Comment 21 2013-03-12 11:12:05 PDT
Re-opened since this is blocked by bug 112170
Eric Seidel (no email)
Comment 22 2013-03-14 02:35:20 PDT
*** This bug has been marked as a duplicate of bug 112069 ***