Bug 165963 - The GC should be able to reflect upon its pipeline of work
Summary: The GC should be able to reflect upon its pipeline of work
Status: RESOLVED DUPLICATE of bug 165910
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: All All
: P2 Normal
Assignee: Filip Pizlo
URL:
Keywords: InRadar
Depends on:
Blocks: 165910
  Show dependency treegraph
 
Reported: 2016-12-16 11:14 PST by Filip Pizlo
Modified: 2017-01-08 11:34 PST (History)
2 users (show)

See Also:


Attachments
a little something like this (29.95 KB, patch)
2016-12-17 10:37 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
a bit more (31.01 KB, patch)
2016-12-18 11:14 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
so fun (50.36 KB, patch)
2016-12-22 15:05 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
more (69.84 KB, patch)
2016-12-30 17:24 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (116.13 KB, patch)
2017-01-02 13:30 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
more (117.99 KB, patch)
2017-01-05 21:51 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
it compiles! (120.11 KB, patch)
2017-01-06 08:29 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
seems to work sorta (127.54 KB, patch)
2017-01-06 11:17 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
even more (133.77 KB, patch)
2017-01-06 15:18 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
giving up on the new scheduler (134.40 KB, patch)
2017-01-06 19:44 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
reverted to the old scheduling algorithm (132.87 KB, patch)
2017-01-07 11:05 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Filip Pizlo 2016-12-16 11:14:33 PST
Patch forthcoming.
Comment 1 Filip Pizlo 2016-12-17 10:37:35 PST
Created attachment 297409 [details]
a little something like this

Work in progress.
Comment 2 Filip Pizlo 2016-12-17 10:38:26 PST
Comment on attachment 297409 [details]
a little something like this

View in context: https://bugs.webkit.org/attachment.cgi?id=297409&action=review

> Source/JavaScriptCore/heap/SpaceTimeScheduler.cpp:22
> - * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * (INCLUDING NEGfLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE

Oops.  Reverted.
Comment 3 Filip Pizlo 2016-12-18 11:14:33 PST
Created attachment 297442 [details]
a bit more
Comment 4 Filip Pizlo 2016-12-22 15:05:23 PST
Created attachment 297678 [details]
so fun
Comment 5 Filip Pizlo 2016-12-30 17:24:28 PST
Created attachment 297865 [details]
more
Comment 6 Filip Pizlo 2017-01-02 13:30:09 PST
Created attachment 297909 [details]
the patch
Comment 7 WebKit Commit Bot 2017-01-02 13:32:14 PST
Attachment 297909 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/heap/MarkingConstraint.cpp:38:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/heap/MarkingConstraint.cpp:38:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/heap/MarkingConstraint.cpp:39:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/heap/MarkingConstraint.cpp:40:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
Total errors found: 4 in 25 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 8 Filip Pizlo 2017-01-02 13:36:17 PST
Comment on attachment 297909 [details]
the patch

Ooops, did not mean to set r?
Comment 9 Filip Pizlo 2017-01-05 21:51:18 PST
Created attachment 298173 [details]
more
Comment 10 Filip Pizlo 2017-01-06 08:29:40 PST
Created attachment 298202 [details]
it compiles!
Comment 11 Filip Pizlo 2017-01-06 11:17:12 PST
Created attachment 298216 [details]
seems to work sorta
Comment 12 Filip Pizlo 2017-01-06 15:18:28 PST
Created attachment 298232 [details]
even more

This is getting fun.
Comment 13 Radar WebKit Bug Importer 2017-01-06 15:29:00 PST
<rdar://problem/29909896>
Comment 14 Filip Pizlo 2017-01-06 15:38:38 PST
The initial implementation boosted splay-latency but hurt on retreating wavefront worst-case pathologies like hash-map, because the only scheduler was a tad bit quicker to choose a longer pause.  It looks like the old scheduler was getting luckier on hash-map, but was still having a bad time - lots of wasted iterations before finishing a GC cycle.

I'm now adding some exponential backoff to the space-time scheduler.  This appears to preserve the patch's advantage on splay-latency while also progressing hash-map.

Ultimately, the goal of this patch is to create a framework that can be more easily extended, since it doesn't hide the cost of constraint evaluation from the scheduler.
Comment 15 Filip Pizlo 2017-01-06 19:44:59 PST
Created attachment 298250 [details]
giving up on the new scheduler

Going to go back to the old scheduler.  The new one is empirically worse.
Comment 16 Filip Pizlo 2017-01-06 19:45:32 PST
(In reply to comment #15)
> Created attachment 298250 [details]
> giving up on the new scheduler
> 
> Going to go back to the old scheduler.  The new one is empirically worse.

I should clarify: the same configuration that led to better splay-latency performance also killed Speedometer performance.  We don't want that.
Comment 17 Filip Pizlo 2017-01-07 11:05:59 PST
Created attachment 298280 [details]
reverted to the old scheduling algorithm

I think that what made the old scheduling algorithm so effective is that its decisions were so time-driven. It stayed in phase except when we "snapped" phase after a round of constraint solving. This meant that if we encountered jank while in some mode (resumed or suspended) and overran our budget then the scheduler would degrade to something like a coin flip. It turns out that having this kind of non-determinism is _good_ for the GC, since it protects us from many run-away scenarios. For example if a deterministic scheduler encounters a huge object, then we'd yield to mutator unless the scheduler had some mechanism to detect this case.

Altogether it seems that the good things about the old scheduler are:
- Eventually stop-the-world, which we arrive at somewhat elastically.
- Periodic stop-the-world increments that have a chance of getting lucky and declaring termination.
- Prioritize collector progress. In a retreating wavefront collector, the mutator has the speed advantage - so we need to give the collector some unfair advantage.
- Allow constraint solving to take as long as it needs to take to generate work.
- Degrade to coin flip when we encounter jank.

I can't easily think of another scheduling algorithm that achieves these things while staying so simple. Every time I've tried to dramatically change this scheduler, I've gotten regressions all over the place. I think I'll stick to making only incremental improvements. Probably if I do come back to improving the scheduler, then I'll try to make the scheduler's period adapt to the time it took to run constraints. We should use a longer period when we took longer to run constraints so that the mutator has more time to itself.
Comment 18 Filip Pizlo 2017-01-07 11:06:27 PST
(In reply to comment #17)
> Created attachment 298280 [details]
> reverted to the old scheduling algorithm
> 
> I think that what made the old scheduling algorithm so effective is that its
> decisions were so time-driven. It stayed in phase except when we "snapped"
> phase after a round of constraint solving. This meant that if we encountered
> jank while in some mode (resumed or suspended) and overran our budget then
> the scheduler would degrade to something like a coin flip. It turns out that
> having this kind of non-determinism is _good_ for the GC, since it protects
> us from many run-away scenarios. For example if a deterministic scheduler
> encounters a huge object, then we'd yield to mutator unless the scheduler
> had some mechanism to detect this case.
> 
> Altogether it seems that the good things about the old scheduler are:
> - Eventually stop-the-world, which we arrive at somewhat elastically.
> - Periodic stop-the-world increments that have a chance of getting lucky and
> declaring termination.
> - Prioritize collector progress. In a retreating wavefront collector, the
> mutator has the speed advantage - so we need to give the collector some
> unfair advantage.
> - Allow constraint solving to take as long as it needs to take to generate
> work.
> - Degrade to coin flip when we encounter jank.
> 
> I can't easily think of another scheduling algorithm that achieves these
> things while staying so simple. Every time I've tried to dramatically change
> this scheduler, I've gotten regressions all over the place. I think I'll
> stick to making only incremental improvements. Probably if I do come back to
> improving the scheduler, then I'll try to make the scheduler's period adapt
> to the time it took to run constraints. We should use a longer period when
> we took longer to run constraints so that the mutator has more time to
> itself.

Still testing this new scheduler... so far I just confirmed that it does the right things on splay and hash-map.
Comment 19 Filip Pizlo 2017-01-08 11:34:10 PST
The patch is actually fixing this and its parent bug.

*** This bug has been marked as a duplicate of bug 165910 ***