Bug 156457 - PolymorphicAccess should buffer AccessCases before regenerating
Summary: PolymorphicAccess should buffer AccessCases before regenerating
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: All All
: P2 Normal
Assignee: Filip Pizlo
URL:
Keywords:
Depends on: 156454 156470
Blocks:
  Show dependency treegraph
 
Reported: 2016-04-10 15:54 PDT by Filip Pizlo
Modified: 2016-08-03 13:38 PDT (History)
10 users (show)

See Also:


Attachments
non-working work in progress (25.72 KB, patch)
2016-04-11 13:48 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
it seems to compile and stuff (26.81 KB, patch)
2016-04-11 14:31 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
it passes so many tests (32.97 KB, patch)
2016-04-11 17:13 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (39.03 KB, patch)
2016-04-11 19:16 PDT, Filip Pizlo
benjamin: review+
buildbot: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ews101 for mac-yosemite (1.02 MB, application/zip)
2016-04-11 20:22 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews105 for mac-yosemite-wk2 (939.63 KB, application/zip)
2016-04-11 20:58 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews112 for mac-yosemite (1.19 MB, application/zip)
2016-04-11 21:02 PDT, Build Bot
no flags Details
patch for landing (39.30 KB, patch)
2016-04-11 22:00 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Filip Pizlo 2016-04-10 15:54:31 PDT
...
Comment 1 Filip Pizlo 2016-04-11 13:48:11 PDT
Created attachment 276171 [details]
non-working work in progress

Oh man.  This code is weird.
Comment 2 Filip Pizlo 2016-04-11 14:31:14 PDT
Created attachment 276174 [details]
it seems to compile and stuff
Comment 3 Filip Pizlo 2016-04-11 14:48:05 PDT
This seems to sort of be kind of working.  The heuristics are evil.  I'll have to spend some time on it.
Comment 4 Filip Pizlo 2016-04-11 17:13:34 PDT
Created attachment 276192 [details]
it passes so many tests
Comment 5 Filip Pizlo 2016-04-11 18:45:39 PDT
Looks like this is neutral on JS benchmarks.  That's pretty cool.  I'll test page load next.
Comment 6 Filip Pizlo 2016-04-11 19:16:18 PDT
Created attachment 276199 [details]
the patch
Comment 7 Benjamin Poulain 2016-04-11 20:16:17 PDT
Comment on attachment 276199 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=276199&action=review

> Source/JavaScriptCore/ChangeLog:22
> +        This change ensures that the DFG still gets the same kind of profiling 

Missing period.

> Source/JavaScriptCore/bytecode/PolymorphicAccess.cpp:1567
> +        // FIXME: Do we have to clone cases that aren't generated? Maybe we can just take those
> +        // from m_list, since we don't have to keep those alive if we fail.

Forgot this or intentional?

> Source/JavaScriptCore/bytecode/PolymorphicAccess.cpp:1611
> +    

Empty line.
Comment 8 Build Bot 2016-04-11 20:22:48 PDT
Comment on attachment 276199 [details]
the patch

Attachment 276199 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/1139417

Number of test failures exceeded the failure limit.
Comment 9 Build Bot 2016-04-11 20:22:53 PDT
Created attachment 276205 [details]
Archive of layout-test-results from ews101 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews101  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 10 Build Bot 2016-04-11 20:58:41 PDT
Comment on attachment 276199 [details]
the patch

Attachment 276199 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/1139520

New failing tests:
inspector/heap/garbageCollected.html
Comment 11 Build Bot 2016-04-11 20:58:44 PDT
Created attachment 276208 [details]
Archive of layout-test-results from ews105 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews105  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 12 Build Bot 2016-04-11 21:02:38 PDT
Comment on attachment 276199 [details]
the patch

Attachment 276199 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/1139477

New failing tests:
imported/w3c/web-platform-tests/streams/readable-streams/templated.https.html
inspector/formatting/formatting-javascript.html
Comment 13 Build Bot 2016-04-11 21:02:41 PDT
Created attachment 276210 [details]
Archive of layout-test-results from ews112 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews112  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 14 Filip Pizlo 2016-04-11 21:47:54 PDT
Those test failures all appear to be due to a missing null check in aboutToDie().
Comment 15 Filip Pizlo 2016-04-11 21:59:09 PDT
(In reply to comment #7)
> Comment on attachment 276199 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=276199&action=review
> 
> > Source/JavaScriptCore/ChangeLog:22
> > +        This change ensures that the DFG still gets the same kind of profiling 
> 
> Missing period.

fixed.

> 
> > Source/JavaScriptCore/bytecode/PolymorphicAccess.cpp:1567
> > +        // FIXME: Do we have to clone cases that aren't generated? Maybe we can just take those
> > +        // from m_list, since we don't have to keep those alive if we fail.
> 
> Forgot this or intentional?

Meant to file a bug, and link it here. Filed: https://bugs.webkit.org/show_bug.cgi?id=156493

> 
> > Source/JavaScriptCore/bytecode/PolymorphicAccess.cpp:1611
> > +    
> 
> Empty line.

Fixed.
Comment 16 Filip Pizlo 2016-04-11 22:00:19 PDT
Created attachment 276218 [details]
patch for landing
Comment 17 Filip Pizlo 2016-04-11 22:37:12 PDT
The bots are green but I am still looking into some JSC stress test failures.
Comment 18 Geoffrey Garen 2016-04-12 11:12:36 PDT
Comment on attachment 276218 [details]
patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=276218&action=review

> Source/JavaScriptCore/ChangeLog:20
> +        new code. We simply guarantee that after we buffer a case, we will take at most
> +        Options::repatchBufferingCountdown() slow path calls before generating code for it. That
> +        option is currently 7. Taking 7 more slow paths means that we have 7 more opportunities to
> +        gather more access cases, or to realize that this IC is too crazy to bother with.

Why not take a number of slow paths proportional to the size of the IC? Cost of regeneration is proportional to size of IC, so it should be OK to regenerate right away for small ICs and back off once the IC gets large.
Comment 19 Filip Pizlo 2016-04-12 11:51:48 PDT
(In reply to comment #18)
> Comment on attachment 276218 [details]
> patch for landing
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=276218&action=review
> 
> > Source/JavaScriptCore/ChangeLog:20
> > +        new code. We simply guarantee that after we buffer a case, we will take at most
> > +        Options::repatchBufferingCountdown() slow path calls before generating code for it. That
> > +        option is currently 7. Taking 7 more slow paths means that we have 7 more opportunities to
> > +        gather more access cases, or to realize that this IC is too crazy to bother with.
> 
> Why not take a number of slow paths proportional to the size of the IC? Cost
> of regeneration is proportional to size of IC, so it should be OK to
> regenerate right away for small ICs and back off once the IC gets large.

I don't think we need to get that fancy.  "7" is picked because:

- Most stubs never see that many cacheable cases.  They either don't see 7 cases ever - as in, after patching in fewer than 7 cases, we never take slow path again.  Or they see an uncacheable case before they hit 7, at which point the IC gives up completely.

- Even if you get past 7 cases, the next thing that happens is usually that you hit the IC stub size limit and the IC disables itself entirely.

Currently, for the vast majority of stubs, we will do this:

1) On second execution they repatch (either self cache or a monomorphic stub).

2) After 7 more executions after that one, we repatch for the final time, this time generating a nice stub that has many cases.  Or, before the 7th execution we give up because of an uncacheable slot.

Reducing the threshold from 7 to something smaller isn't going to help.  Code that runs for more than 7 executions will not benefit from having a temporary stub for fewer than 7 cases that gets replaced with a bigger stub sometime around the 7th execution.  Code that runs for less than 7 executions won't regret not having a stub; in fact it probably benefits in this patch from not wastefully generating a stub we don't use.

If I were to improve this code, I'd probably want to explore reducing repatchings even more, for example by removing the rule that the second execution always gets to repatch.

I think that a proportional approach to buffering would make sense if we ever allowed our ICs to have more than ~10 cases.  But so long as the max number of cases is small, I think that we want to converge to an approach where you generate the stub exactly once - you regenerate it once you have exactly all of the cases you will ever want to handle, and then after that, you want to stop repatching unless the IC is reset (due to watchpoints, GC, whatever).  That's why my next move is probably to remove (1) (always do a monomorphic repatch on second execution).

Note that we also already have other heuristics to throttle repatching if it happens a lot, and those heuristics do exponential backoff.  Those heuristics are meant to be like a rearguard, and they enable us to be sloppy in other places - for example the generated code in the stub is sometimes unsure about whether it wants to call the C++ slow path for repatching or the C++ slow path for just a plain polymprphic access.  In case of doubt they can call the repatching one because if it happens often then the repatching slow path will behave like the plain one because of the exponential backoff.
Comment 20 Filip Pizlo 2016-04-12 13:06:42 PDT
Landed in http://trac.webkit.org/changeset/199382
Comment 21 Chris Dumez 2016-04-13 17:24:53 PDT
I see a ~1% Cold PLT progression on iPhone 5 after this change.
Comment 22 Chris Dumez 2016-04-14 08:58:47 PDT
(In reply to comment #21)
> I see a ~1% Cold PLT progression on iPhone 5 after this change.

Looks like a potential 2% progression on Speedometer / Mac as well.
Comment 23 Chris Dumez 2016-04-14 09:12:58 PDT
(In reply to comment #22)
> (In reply to comment #21)
> > I see a ~1% Cold PLT progression on iPhone 5 after this change.
> 
> Looks like a potential 2% progression on Speedometer / Mac as well.

And a ~3% progression on Speedometer / iOS as well.
Comment 24 Kyle Seely 2016-08-03 10:15:00 PDT
This bug is affecting my application in really bad ways and I need to find a workaround for the current version of safari because this fix isn't there. I did a bisect of webkit and it led me to the commit that fixed this bug. Here is a jsfiddle https://jsfiddle.net/ohofdets/10/ that will reproduce the error in safari 9.1.2. The fiddle works as expected in the nightly build of webkit and the bisect led me to this commit. To give you some context of what the fiddle is doing we use a library called watchjs that uses Object.defineProperty to call callbacks when an object's property changes. Do you have any ideas of how I can work around this bug in javascript to eliminate this bug for safari users?