Bug 110547

Summary:

Free up background parser's checkpoints when speculation succeeds

Product:

WebKit

Reporter:

Tony Gentilcore <tonyg>

Component:

WebCore Misc.

Assignee:

Eric Seidel (no email) <eric>

Status:

RESOLVED DUPLICATE

Severity:

Normal

CC:

abarth, eric, esprehn+autocc, ojan.autocc, webkit.review.bot

Priority:

Version:

528+ (Nightly build)

Hardware:

Unspecified

OS:

Unspecified

Bug Depends on:

112170

Bug Blocks:

111645, 111893, 112069

Attachments:

Description	Flags
wip	none
Builds, but crashes fast/preloader tests	none
Patch	none
Hopefully addressed Adam's concerns	none

Tony Gentilcore

Reported 2013-02-21 19:56:30 PST

When speculation fails, we send a resumeFrom message to the background thread parser which causes it to free up all its checkpoints. We should send a message when speculation succeeds so that it can free up memory used for previous checkpoints.

Attachments
wip (6.68 KB, patch) 2013-03-05 02:14 PST, Eric Seidel (no email)	no flags	Details Formatted Diff Diff
Builds, but crashes fast/preloader tests (6.54 KB, patch) 2013-03-05 02:52 PST, Eric Seidel (no email)	no flags	Details Formatted Diff Diff
Patch (7.59 KB, patch) 2013-03-05 03:50 PST, Eric Seidel (no email)	no flags	Details Formatted Diff Diff
Hopefully addressed Adam's concerns (8.75 KB, patch) 2013-03-08 15:16 PST, Eric Seidel (no email)	no flags	Details Formatted Diff Diff
Show Obsolete (3) View All Add attachment proposed patch, testcase, etc.

Eric Seidel (no email)

Comment 1 2013-03-05 02:14:07 PST

Created attachment 191440 [details] wip

Eric Seidel (no email)

Comment 2 2013-03-05 02:14:38 PST

I'm not even sure that compiles yet (my laptop is slow). But I welcome your feedback. I'm likely to go to sleep before finishing it tonight.

Eric Seidel (no email)

Comment 3 2013-03-05 02:52:43 PST

Created attachment 191448 [details] Builds, but crashes fast/preloader tests

Eric Seidel (no email)

Comment 4 2013-03-05 02:53:13 PST

Comment on attachment 191448 [details] Builds, but crashes fast/preloader tests Old habits. :) Didn't mean to mark for review.

Eric Seidel (no email)

Comment 5 2013-03-05 03:50:09 PST

Created attachment 191458 [details] Patch

Eric Seidel (no email)

Comment 6 2013-03-05 04:07:28 PST

Comment on attachment 191458 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=191458&action=review > Source/WebCore/ChangeLog:10 > + pages, as we will no longer hold multiple copies of every > + source byte during the whole parse. This is slightly misleading, as this is really only letting go of one copy. The m_segments and m_checkpoint.input hold onto refcounted StringImpls, so there is just one shared copy which we're now releasing earlier. I expect this will only effect particularly large loads.

Eric Seidel (no email)

Comment 7 2013-03-06 22:55:50 PST

I believe this patch is ready for review and landing?

Daniel Bates

Comment 8 2013-03-06 23:45:18 PST

Comment on attachment 191458 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=191458&action=review > Source/WebCore/ChangeLog:12 > + Many LayoutTests exercise this code path, and I've manually (debugger and printf) I think you're missing the word "verified" in this sentence. > Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:58 > + const Checkpoint& checkpoint = m_checkpoints[checkpointIndex]; Would it make sense to assert the invariant checkpointIndex < m_checkpoints.size() in this function?

Adam Barth

Comment 9 2013-03-07 00:06:31 PST

Comment on attachment 191458 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=191458&action=review I should look at this again when I'm less tired. It's not as beautiful as I had hoped. > Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:60 > + for (size_t i = 0; i < checkpoint.numberOfSegmentsAlreadyAppended; ++i) > + m_segments[i] = String(); Does this work correctly in the case where there are multiple checkpoints per append? Probably. Also, this algorithm is n^2. Should we keep some state for the smallest index that has actual data and start the loop from there? > Source/WebCore/html/parser/HTMLDocumentParser.cpp:462 > + checkpointIndex = std::max(chunk->inputCheckpoint, checkpointIndex); Why is this needed? > Source/WebCore/html/parser/HTMLDocumentParser.cpp:470 > + // Processing a chunk can call document.write, causing us to invalidate any remaining speculations. > + // Similarly, if we're stopped just abort and don't bother to tell the parser about passing this checkpoint. > + if (m_speculations.isEmpty() || isStopped()) { > + checkpointIndex = 0; > + break; > + } This is pretty ugly....

Adam Barth

Comment 10 2013-03-08 11:16:28 PST

Comment on attachment 191458 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=191458&action=review I realize my complaints below are mostly aesthetic... > Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:56 > +void BackgroundHTMLInputStream::releaseDataUpThroughCheckpoint(HTMLInputCheckpoint checkpointIndex) invalidateCheckpointsBefore ? >> Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:58 >> + const Checkpoint& checkpoint = m_checkpoints[checkpointIndex]; > > Would it make sense to assert the invariant checkpointIndex < m_checkpoints.size() in this function? We should ASSERT that checkpoint hasn't been invalidated yet. >> Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:60 >> + m_segments[i] = String(); > > Does this work correctly in the case where there are multiple checkpoints per append? Probably. > > Also, this algorithm is n^2. Should we keep some state for the smallest index that has actual data and start the loop from there? Please fix the n^2 issue. >> Source/WebCore/html/parser/HTMLDocumentParser.cpp:470 >> + // Processing a chunk can call document.write, causing us to invalidate any remaining speculations. >> + // Similarly, if we're stopped just abort and don't bother to tell the parser about passing this checkpoint. >> + if (m_speculations.isEmpty() || isStopped()) { >> + checkpointIndex = 0; >> + break; >> + } > > This is pretty ugly.... There must be a way to structure this code so that it's less subtle. Maybe we should do this work in validateSpeculations instead?

Eric Seidel (no email)

Comment 11 2013-03-08 11:36:18 PST

Comment on attachment 191458 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=191458&action=review >> Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:56 >> +void BackgroundHTMLInputStream::releaseDataUpThroughCheckpoint(HTMLInputCheckpoint checkpointIndex) > > invalidateCheckpointsBefore ? Happy to name it whatever. The tricky part for me was that it's inclusive of the checkpoint you pass it invalidateCheckpointsUntil(checkpoint)? >>> Source/WebCore/html/parser/BackgroundHTMLInputStream.cpp:60 >>> + m_segments[i] = String(); >> >> Does this work correctly in the case where there are multiple checkpoints per append? Probably. >> >> Also, this algorithm is n^2. Should we keep some state for the smallest index that has actual data and start the loop from there? > > Please fix the n^2 issue. OK. I'll keep a m_lastInvalidSegment or similar. >> Source/WebCore/html/parser/HTMLDocumentParser.cpp:462 >> + checkpointIndex = std::max(chunk->inputCheckpoint, checkpointIndex); > > Why is this needed? Since we can assume the chunks have checkpoints in ascending order, it's not. We just need the last one. >>> Source/WebCore/html/parser/HTMLDocumentParser.cpp:470 >>> + } >> >> This is pretty ugly.... > > There must be a way to structure this code so that it's less subtle. Maybe we should do this work in validateSpeculations instead? We need to know when the chunk is processed. I'll have to check when validateSpeculations is called.

Eric Seidel (no email)

Comment 12 2013-03-08 15:16:24 PST

Created attachment 192294 [details] Hopefully addressed Adam's concerns

Adam Barth

Comment 13 2013-03-08 15:21:02 PST

Comment on attachment 192294 [details] Hopefully addressed Adam's concerns View in context: https://bugs.webkit.org/attachment.cgi?id=192294&action=review I can't tell if you changed HTMLDocumentParser::pumpPendingSpeculations or if I've just read it enough times to understand what it's doing. :) > Source/WebCore/html/parser/BackgroundHTMLInputStream.h:72 > + // Note: These indicies may === vector.size(), in which case there are no valid checkpoints/segments at this time. === -> == (Triple-equals is JavaScript silliness.)

Adam Barth

Comment 14 2013-03-08 15:22:36 PST

Comment on attachment 192294 [details] Hopefully addressed Adam's concerns View in context: https://bugs.webkit.org/attachment.cgi?id=192294&action=review > Source/WebCore/html/parser/HTMLDocumentParser.cpp:484 > + HTMLParserThread::shared()->postTask(bind(&BackgroundHTMLParser::passedCheckpoint, m_backgroundParser, lastCheckpointPassed)); The only problem with this patch is that we'll never call passedCheckpoint if we never call pumpPendingSpeculations. We only call pumpPendingSpeculations if we ended up queuing speculations. It's entirely possible that we can process every chunk from the background parser immediately after receiving them.

Eric Seidel (no email)

Comment 15 2013-03-08 15:27:51 PST

(In reply to comment #14) > (From update of attachment 192294 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=192294&action=review > > > Source/WebCore/html/parser/HTMLDocumentParser.cpp:484 > > + HTMLParserThread::shared()->postTask(bind(&BackgroundHTMLParser::passedCheckpoint, m_backgroundParser, lastCheckpointPassed)); > > The only problem with this patch is that we'll never call passedCheckpoint if we never call pumpPendingSpeculations. We only call pumpPendingSpeculations if we ended up queuing speculations. It's entirely possible that we can process every chunk from the background parser immediately after receiving them. That's true. I'm not sure if I should do that in a follow-up, or go another round with this one. I'm open to either.

Eric Seidel (no email)

Comment 16 2013-03-08 15:29:04 PST

(In reply to comment #15) > (In reply to comment #14) > > (From update of attachment 192294 [details] [details]) > > View in context: https://bugs.webkit.org/attachment.cgi?id=192294&action=review > > > > > Source/WebCore/html/parser/HTMLDocumentParser.cpp:484 > > > + HTMLParserThread::shared()->postTask(bind(&BackgroundHTMLParser::passedCheckpoint, m_backgroundParser, lastCheckpointPassed)); > > > > The only problem with this patch is that we'll never call passedCheckpoint if we never call pumpPendingSpeculations. We only call pumpPendingSpeculations if we ended up queuing speculations. It's entirely possible that we can process every chunk from the background parser immediately after receiving them. > > That's true. I'm not sure if I should do that in a follow-up, or go another round with this one. I'm open to either. It seems inefficient to tell the background parser to release a chunk after each chunk. But maybe I'm optimizing prematurely. I have no data to suggest that messaging is expensive at all. :)

Adam Barth

Comment 17 2013-03-08 15:32:14 PST

Followup patch is fine.

WebKit Review Bot

Comment 18 2013-03-08 15:41:32 PST

Comment on attachment 192294 [details] Hopefully addressed Adam's concerns Clearing flags on attachment: 192294 Committed r145277: <http://trac.webkit.org/changeset/145277>

WebKit Review Bot

Comment 19 2013-03-08 15:41:36 PST

All reviewed patches have been landed. Closing bug.

Eric Seidel (no email)

Comment 20 2013-03-11 17:08:50 PDT

This work continued in bug 112069.

WebKit Review Bot

Comment 21 2013-03-12 11:12:05 PDT

Re-opened since this is blocked by bug 112170

Eric Seidel (no email)

Comment 22 2013-03-14 02:35:20 PDT

*** This bug has been marked as a duplicate of bug 112069 ***

Note You need to log in before you can comment on or make changes to this bug.