RESOLVED WONTFIX 164823
Slight Octane regression from concurrent GC's eager object zero-fill
https://bugs.webkit.org/show_bug.cgi?id=164823
Summary Slight Octane regression from concurrent GC's eager object zero-fill
Filip Pizlo
Reported 2016-11-16 09:41:37 PST
Patch forthcoming.
Attachments
one possible approach (5.04 KB, patch)
2016-11-16 09:49 PST, Filip Pizlo
no flags
the patch (5.09 KB, patch)
2016-11-16 13:02 PST, Filip Pizlo
ggaren: review+
Filip Pizlo
Comment 1 2016-11-16 09:49:58 PST
Created attachment 294946 [details] one possible approach This fixes the eager zero-fill performance problems by doing the zero-fill conditionally. But I'm also going to attempt a fix where the GC manages the zero-filling.
Filip Pizlo
Comment 2 2016-11-16 13:02:10 PST
(In reply to comment #1) > Created attachment 294946 [details] > one possible approach > > This fixes the eager zero-fill performance problems by doing the zero-fill > conditionally. > > But I'm also going to attempt a fix where the GC manages the zero-filling. Thinking about it more, doing the zero-filling in GC doesn't work. It fails to protect us from the case where the collector races with a transition on an object that was allocated without eager zero-fill. Also, the GC zero-fill patch was not really faster.
Filip Pizlo
Comment 3 2016-11-16 13:02:53 PST
Created attachment 294967 [details] the patch
Geoffrey Garen
Comment 4 2016-11-16 13:05:59 PST
Comment on attachment 294967 [details] the patch r=me
Filip Pizlo
Comment 5 2016-11-16 14:25:13 PST
Filip Pizlo
Comment 6 2016-11-16 15:50:38 PST
I'm beginning to think that no version of this optimization is sound: it's possible that an object allocated before GC started, and so without zero-fill, could be part of a memory ordering race while the GC is running. I think that a better approach would be to remove object initializations and then put fences on PutStructure. To make that efficient on ARM, we would have to do redundant PutStructure elimination in DFG IR.
Filip Pizlo
Comment 7 2016-11-16 16:02:20 PST
(In reply to comment #6) > I'm beginning to think that no version of this optimization is sound: it's > possible that an object allocated before GC started, and so without > zero-fill, could be part of a memory ordering race while the GC is running. > > I think that a better approach would be to remove object initializations and > then put fences on PutStructure. To make that efficient on ARM, we would > have to do redundant PutStructure elimination in DFG IR. Rolled out in r208822.
Chris Dumez
Comment 8 2016-11-16 21:02:10 PST
(In reply to comment #5) > Landed in https://trac.webkit.org/changeset/208811 FYI, one of our Mac bots has caught up with r208811 and there was no progression on Octane.
Chris Dumez
Comment 9 2016-11-18 09:30:20 PST
(In reply to comment #8) > (In reply to comment #5) > > Landed in https://trac.webkit.org/changeset/208811 > > FYI, one of our Mac bots has caught up with r208811 and there was no > progression on Octane. FYI, A/B testing shows there was no Speedometer progression from r208811 on ARM either.
Filip Pizlo
Comment 10 2016-12-13 10:13:52 PST
I don't think that we know that there was actually any regression due to this specifically, based on subsequent testing.
Note You need to log in before you can comment on or make changes to this bug.