Bug 85254

Summary: PageCache autorelease should not wait until 3 seconds and 42 pages
Product: WebKit
Component: History
Status: RESOLVED FIXED
Severity: Normal
Priority: P2
Version: 528+ (Nightly build)
Hardware: All
OS: All
Reporter: Filip Pizlo <fpizlo>
Assignee: Filip Pizlo <fpizlo>
CC: ap, beidson, ggaren
Keywords: InRadar

Attachments:
the patch (no flags)
the patch (beidson: review+)

Filip Pizlo
Reported 2012-04-30 18:21:35 PDT
This causes horrible pathologies when a script decides to trigger navigations. <rdar://problem/11349613>
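For context, here is a small standalone model of the deferred-release policy named in the summary. This is only a sketch: the 42-page and 3-second constants come from the summary, but every name and structural detail below is an assumption for illustration, not the actual WebCore/history/PageCache.cpp code.

    // Standalone model of the deferral policy (illustrative only, not WebKit code).
    // A cached page queued for autorelease is torn down only once 42 pages have
    // piled up or 3 seconds have passed since the last release.
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    struct CachedPage { int id; };

    struct PageCacheModel {
        static constexpr std::size_t pageThreshold = 42;  // release once this many pile up...
        static constexpr double intervalSeconds = 3.0;    // ...or once this much time has passed

        std::vector<CachedPage> pending;
        double lastReleaseTime = 0;

        void autorelease(CachedPage page, double now)
        {
            pending.push_back(page);
            if (pending.size() >= pageThreshold || now - lastReleaseTime >= intervalSeconds)
                releaseNow(now);
        }

        void releaseNow(double now)
        {
            std::printf("releasing %zu cached pages at t=%.2fs\n", pending.size(), now);
            pending.clear();
            lastReleaseTime = now;
        }
    };

    int main()
    {
        PageCacheModel cache;
        // A script-driven navigation loop: one navigation every 50 ms.
        for (int i = 0; i < 100; ++i)
            cache.autorelease({ i }, i * 0.05);
        std::printf("%zu dead pages still waiting for teardown\n", cache.pending.size());
        return 0;
    }

Under this policy a script that navigates in a tight loop strands up to 41 dead pages before any teardown happens, and whatever is pending when the script pauses sits there for seconds. The direction of the patch, per the summary, is to stop waiting on these thresholds so that teardown cannot pile up behind script-driven navigations.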
Attachments
the patch (3.50 KB, patch)
2012-04-30 18:23 PDT, Filip Pizlo
no flags
the patch (3.64 KB, patch)
2012-04-30 18:24 PDT, Filip Pizlo
beidson: review+
Filip Pizlo
Comment 1 2012-04-30 18:23:05 PDT
Created attachment 139565 [details]
the patch
Filip Pizlo
Comment 2 2012-04-30 18:24:57 PDT
Created attachment 139566 [details]
the patch

And now, with a change log!
Geoffrey Garen
Comment 3 2012-04-30 23:06:43 PDT
Comment on attachment 139566 [details]
the patch

r=me
Alexey Proskuryakov
Comment 4 2012-05-01 10:05:00 PDT
Pathologies notwithstanding, have you investigated the reasons for the existing behavior? Geoff, I realize that you wrote this code originally, but there must be some tradeoff here.
Geoffrey Garen
Comment 5 2012-05-01 11:17:19 PDT
> Geoff, I realize that you wrote this code originally, but there must be some tradeoff here.

Actually, I didn't write the code: Ken Kocienda / Richard Williamson wrote it many years ago, and I ported it to C++. I believe the original code was solely a hack to hide page teardown costs from PLT1. Hence the number 42: one greater than the number of pages in PLT1.

Phil and Stephanie are going to test with PLT3 to verify that the original hack was not a good idea.
Brady Eidson
Comment 6 2012-05-02 14:40:27 PDT
Comment on attachment 139566 [details]
the patch

Results of the new testing notwithstanding, this seems too aggressive.

If a call to autorelease comes in during a new page load, we will now be performing the work of the autorelease while also performing the work of the new page load.

The user cares about the new page coming in hot, not about the old suspended page being torn down "now".

I think we need to look at this more closely.
Brady Eidson
Comment 7 2012-05-02 14:44:13 PDT
(In reply to comment #6)
> If a call to autorelease comes in during a new page load...

And to be clear, the call to autorelease almost always comes in during a new page load.
Filip Pizlo
Comment 8 2012-05-02 14:44:33 PDT
(In reply to comment #6)
> Results of the new testing notwithstanding, this seems too aggressive.
>
> If a call to autorelease comes in during a new page load, we will now be performing the work of the autorelease while also performing the work of the new page load.
>
> The user cares about the new page coming in hot, not about the old suspended page being torn down "now".

They certainly will care very much about the page being torn down, if the GC now has twice as much work during page load.

> I think we need to look at this more closely.

Measuring the performance is, I think, the right approach here, rather than rejecting the change out of hand.
Brady Eidson
Comment 9 2012-05-02 14:53:19 PDT
(In reply to comment #8)
> They certainly will care very much about the page being torn down, if the GC now has twice as much work during page load.

In the past we've achieved better page load performance by deferring things not related to page loading. These things include tearing down suspended pages and performing GC, amongst many others.

You're arguing the difference between performing 1x GC and 2x GC during page load. I think we should be performing 0x GC if CPU time is needed for the load.

> Measuring the performance is, I think, the right approach here, rather than rejecting the change out of hand.

Benchmark-driven performance *improvements* are absolutely fantastic. But one benchmark alone cannot prove that a change is not a regression.
Brady Eidson
Comment 10 2012-05-02 14:53:53 PDT
(In reply to comment #9)
> But one benchmark alone cannot prove that a change is not a regression.

s/regression/real world regression/
Filip Pizlo
Comment 11 2012-05-02 14:59:34 PDT
(In reply to comment #9)
> In the past we've achieved better page load performance by deferring things not related to page loading. These things include tearing down suspended pages and performing GC, amongst many others.

It should, if it prevents us from running out of memory.

> You're arguing the difference between performing 1x GC and 2x GC during page load. I think we should be performing 0x GC if CPU time is needed for the load.

Not if it increases memory usage.

> But one benchmark alone cannot prove that a change is not a regression.

You might notice that the patch has yet to land even despite your r-. That's because we have no intention of landing it based on only one benchmark; it will land once a broader set of evidence is collected.
Geoffrey Garen
Comment 12 2012-05-02 16:14:32 PDT
Brady, are you referring to measurable evidence that the current autorelease behavior is a performance improvement, or are you stating your intuition?
Brady Eidson
Comment 13 2012-05-02 16:41:35 PDT
(In reply to comment #11)
> It should, if it prevents us from running out of memory.

Not running out of memory and avoiding using memory at all costs are two different things. This change seems decidedly tilted towards the latter.

> Not if it increases memory usage.

This is a broad statement. Having a temporarily higher watermark (assuming there actually is free memory) has historically been a worthwhile tradeoff for faster page loads.

What priority has changed such that we avoid temporarily using more available memory even when it would help us load faster?

> You might notice that the patch has yet to land even despite your r-. That's because we have no intention of landing it based on only one benchmark; it will land once a broader set of evidence is collected.

The only information about plans I had in this bug was that we were going to test with PLT3 and then call it a day, and that seems insufficient to me. Getting a broader set of evidence (as well as having a more in-depth discussion) is all I was asking for.
Brady Eidson
Comment 14 2012-05-02 16:44:07 PDT
(In reply to comment #12)
> Brady, are you referring to measurable evidence that the current autorelease behavior is a performance improvement, or are you stating your intuition?

Both.

There is measurable evidence with PLT1. Since any measurable evidence beyond that is currently lacking, my intuition (as well as a long history with this code) is all I have to go on.

I've asked Stephanie to get a few specific numbers from the new PLT run, which will help me move forward one way or another with this.
Filip Pizlo
Comment 15 2012-05-02 16:49:40 PDT
(In reply to comment #13)
> The only information about plans I had in this bug was that we were going to test with PLT3 and then call it a day, and that seems insufficient to me.
>
> Getting a broader set of evidence (as well as having a more in-depth discussion) is all I was asking for.

What evidence, other than the benchmarks we are already running, would you consider sufficient?
Filip Pizlo
Comment 16 2012-05-02 16:50:17 PDT
(In reply to comment #14)
> Since any measurable evidence beyond that is currently lacking, my intuition (as well as a long history with this code) is all I have to go on.
>
> I've asked Stephanie to get a few specific numbers from the new PLT run, which will help me move forward one way or another with this.

So this was a speculative r- in case the numbers, which are still being gathered, are in line with your guess?
Brady Eidson
Comment 17 2012-05-02 17:07:36 PDT
(In reply to comment #16)
> So this was a speculative r- in case the numbers, which are still being gathered, are in line with your guess?

No, it was an r- based on:

1 - My understanding of the issues involved with the code as-is, as well as its history
2 - My desire to have a discussion, on the record, of the negative side effects of the change, which hadn't yet taken place
3 - My desire to have a discussion of alternatives

Note that the single benchmark we're running it against might not show a regression (which would be a supporting argument for the change) but that would not entirely resolve any of my concerns.
Geoffrey Garen
Comment 18 2012-05-02 17:51:13 PDT
> There is measurable evidence with PLT1.

To be clear, what PLT1 shows is:

- If you load exactly 41 pages and then stop for more than 3 seconds, a policy of waiting for 42 pages and 3 seconds is a speedup.
- If you load more than 41 pages, or stop for less than 3 seconds, a policy of waiting for 42 pages and 3 seconds is a slowdown.

That seems like pretty weak evidence in favor of this code.
Brady Eidson
Comment 19 2012-05-02 18:17:46 PDT
(In reply to comment #18)
> That seems like pretty weak evidence in favor of this code.

Well, it shows that. But it also shows that:

- With this code in place, the runtime is A.
- With the code removed, the runtime is A + B.

I have verified this at least 3 times since joining the project. The most recent was in early 2010, and at that time B was substantial.

B can be seen as sum(C) over 41 C's. What this patch does is introduce a potential stutter of C at the start of each page load. On average, C is B/41, which might not be substantial. But some C's are much worse than others, and modern complex web pages are liable to create bigger C's.

The additional numbers I've asked Stephanie to gather when testing this patch will tell us what C is for each page load, giving us actual numerical data. If we find that the patch doesn't largely regress the new B and that all values of C are way too small to be noticed by a real-world user, that would be fantastic.
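To put that in symbols (the numbers below are purely illustrative, not measurements):

    B = \sum_{i=1}^{41} C_i, \qquad \bar{C} = B / 41

For example, if B measured 410 ms in total on PLT1, the average stutter would be about 10 ms per navigation, but an individual C_i for a large, complex page could easily be several times that.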
Brady Eidson
Comment 20 2012-05-03 11:46:43 PDT
Stephanie and Filip and I have discussed everything in email and in person and think this patch is the way to go for now. Restoring r+ on Geoff's behalf :)
Filip Pizlo
Comment 21 2012-05-03 15:36:21 PDT