Bug 186916 - We can't remove code after ForceOSRExit until after FixupPhase
Summary: We can't remove code after ForceOSRExit until after FixupPhase
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: Safari 11
Hardware: iPhone / iPad iOS 11
: P2 Normal
Assignee: Saam Barati
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2018-06-22 05:54 PDT by d.lukashenko
Modified: 2019-03-18 14:51 PDT (History)
17 users (show)

See Also:


Attachments
test case (3.29 KB, text/html)
2018-06-22 05:55 PDT, d.lukashenko
no flags Details
Mac test case (2.31 KB, text/html)
2019-01-20 14:10 PST, Saam Barati
no flags Details
patch (3.22 KB, patch)
2019-01-20 16:34 PST, Saam Barati
no flags Details | Formatted Diff | Diff
patch (3.88 KB, patch)
2019-01-20 16:38 PST, Saam Barati
no flags Details | Formatted Diff | Diff
patch (3.91 KB, patch)
2019-01-20 16:41 PST, Saam Barati
ysuzuki: review+
ews: commit-queue-
Details | Formatted Diff | Diff
patch for landing (5.54 KB, patch)
2019-01-20 18:43 PST, Saam Barati
no flags Details | Formatted Diff | Diff
WIP (6.05 KB, patch)
2019-03-08 12:21 PST, Saam Barati
ews: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ews106 for mac-highsierra-wk2 (2.79 MB, application/zip)
2019-03-08 14:28 PST, Build Bot
no flags Details
patch (9.14 KB, patch)
2019-03-13 12:50 PDT, Saam Barati
ysuzuki: review+
Details | Formatted Diff | Diff
patch for landing (9.15 KB, patch)
2019-03-14 19:39 PDT, Saam Barati
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description d.lukashenko 2018-06-22 05:54:43 PDT
We encounter issue when division calculation result assigned to variable results in variable having 0 value when it shouln't.

I've attached test html which demonstrates this issue (probably could be reduced to much smaller test case).
Steps to reproduce:
1. load page
2. in top left corner there will be canvas and you need to draw something using touch
3. click button
It will read image from canvas, calculate number of filled dots and divide by number of all dots to receive fill factor.
This operation performed 100 times and log attempts, when fill factor resulted in 0. For us it fails 5-6 first times. Note that divisor and divider populated properly and it is only result not saved properly to variable.
To repeat test you need to reload test page.

Environment details:
we can reproduce it on iOS 11.3-11.4 iPhone8/iPad
we cannot reproduce it on iOS 11.0 iPhone8
Comment 1 d.lukashenko 2018-06-22 05:55:42 PDT
Created attachment 343322 [details]
test case
Comment 2 Alexey Proskuryakov 2018-06-23 04:57:15 PDT
Does this reproduce for you with iOS 12 beta?
Comment 3 Radar WebKit Bug Importer 2018-06-23 04:57:25 PDT
<rdar://problem/41396612>
Comment 4 d.lukashenko 2018-06-23 06:05:53 PDT
Sorry, I'm afraid we don't have such device available, as we mostly use 3rd-party test environment providers and they don't have this version available.
Comment 5 d.lukashenko 2018-10-05 04:25:14 PDT
Just tested it on iOS 12/iPhone and it still fails to produce proper results first ~10 iterations
Comment 6 Saam Barati 2019-01-20 13:55:27 PST
This happens in all tiers. If I only use the LLInt, it still happens. If I just use baseline, it still happens. If I use baselineJIT+, it still happens. If I turn off GC, this still happens.

Doesn't feel like a JSC bug to me, but perhaps it is. I wonder if it could be something w/ how canvas is filling in the buffers.
Comment 7 Saam Barati 2019-01-20 13:56:52 PST
(In reply to Saam Barati from comment #6)
> This happens in all tiers. If I only use the LLInt, it still happens. If I
> just use baseline, it still happens. If I use baselineJIT+, it still
> happens. If I turn off GC, this still happens.
> 
> Doesn't feel like a JSC bug to me, but perhaps it is. I wonder if it could
> be something w/ how canvas is filling in the buffers.

Weird, I am now reading the code. And it's literally the same calculation in two different places... Doesn't look like a canvas bug.
Comment 8 Saam Barati 2019-01-20 13:59:18 PST
(In reply to Saam Barati from comment #6)
> This happens in all tiers. If I only use the LLInt, it still happens. If I
> just use baseline, it still happens. If I use baselineJIT+, it still
> happens. If I turn off GC, this still happens.
> 
> Doesn't feel like a JSC bug to me, but perhaps it is. I wonder if it could
> be something w/ how canvas is filling in the buffers.

It looks like I was incorrectly setting environment variables. So my test wasn't valid.
Comment 9 Saam Barati 2019-01-20 14:02:45 PST
This is a DFG bug. It happens if I turn off the FTL. It doesn't happen if I turn off the DFG
Comment 10 Saam Barati 2019-01-20 14:10:54 PST
Created attachment 359650 [details]
Mac test case
Comment 11 Saam Barati 2019-01-20 14:30:47 PST
I know what's going on. We end up with IR like:

a: ArithDiv
b: MovHint(@a)
c: ForceOSRExit

However, for @a, we think we don't need to check that the remainder is non-zero. I need to figure out why backwards propagation is failing to say we care about this value.
Comment 12 Saam Barati 2019-01-20 14:34:52 PST
MovHint does nothing in backwards propagation. I wonder why... That feels wrong.
Comment 13 Saam Barati 2019-01-20 14:36:33 PST
(In reply to Saam Barati from comment #11)
> I know what's going on. We end up with IR like:
> 
> a: ArithDiv
> b: MovHint(@a)
> c: ForceOSRExit
> 
> However, for @a, we think we don't need to check that the remainder is
> non-zero. I need to figure out why backwards propagation is failing to say
> we care about this value.

This code is what we generate in the "var fillRatio = filled / pixels;"
Comment 14 Saam Barati 2019-01-20 16:20:47 PST
Reduced test for JSC shell:

```
function foo(v, a, b) {
    if (v) {
        let r = a / b;
        OSRExit();
        return r;
    }
}
noInline(foo);

for (let i = 0; i < 10000; ++i) {
    let r = foo(true, 4, 4);
    if (r !== 1)
        throw new Error("Bad!");
}
if (foo(true, 1, 4) !== 0.25)
    throw new Error("Bad!");
```
Comment 15 Saam Barati 2019-01-20 16:34:43 PST
Created attachment 359658 [details]
patch
Comment 16 Build Bot 2019-01-20 16:37:02 PST
Attachment 359658 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ChangeLog:8:  Line contains tab character.  [whitespace/tab] [5]
Total errors found: 1 in 3 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 17 Saam Barati 2019-01-20 16:38:50 PST
Created attachment 359659 [details]
patch
Comment 18 Saam Barati 2019-01-20 16:41:51 PST
Created attachment 359660 [details]
patch
Comment 19 Yusuke Suzuki 2019-01-20 16:42:20 PST
Comment on attachment 359660 [details]
patch

r=me
Comment 20 Build Bot 2019-01-20 18:26:26 PST
Comment on attachment 359660 [details]
patch

Attachment 359660 [details] did not pass jsc-ews (mac):
Output: https://webkit-queues.webkit.org/results/10823730

New failing tests:
stress/arith-abs-to-arith-negate-range-optimizaton.js.no-cjit-validate-phases
stress/arith-abs-to-arith-negate-range-optimizaton.js.no-ftl
stress/arith-abs-to-arith-negate-range-optimizaton.js.default
stress/arith-abs-to-arith-negate-range-optimizaton.js.ftl-no-cjit-no-inline-validate
stress/arith-abs-to-arith-negate-range-optimizaton.js.no-cjit-collect-continuously
stress/arith-abs-to-arith-negate-range-optimizaton.js.ftl-no-cjit-validate-sampling-profiler
stress/arith-abs-to-arith-negate-range-optimizaton.js.ftl-no-cjit-b3o1
stress/arith-abs-to-arith-negate-range-optimizaton.js.no-llint
apiTests
Comment 21 Saam Barati 2019-01-20 18:43:34 PST
Created attachment 359664 [details]
patch for landing
Comment 22 WebKit Commit Bot 2019-01-20 19:54:22 PST
Comment on attachment 359664 [details]
patch for landing

Clearing flags on attachment: 359664

Committed r240223: <https://trac.webkit.org/changeset/240223>
Comment 23 WebKit Commit Bot 2019-01-20 19:54:24 PST
All reviewed patches have been landed.  Closing bug.
Comment 24 Saam Barati 2019-01-22 11:31:30 PST
This is a 1% regression on JetStream 2. It regresses float-mm, crypto, octane-zlib by quite a lot. We need to think about how to do this without affecting performance. (Thought AFAICT, code is just wrong without this patch, so it may take some creativity to teach OSR how to materialize the bits it needs)
Comment 25 Saam Barati 2019-01-22 11:41:26 PST
rolled out in:
https://trac.webkit.org/changeset/240269/webkit
Comment 26 Saam Barati 2019-01-22 11:49:20 PST
I'm beginning to think that this analysis is only sound over bytecode. Alternatively, we could teach the DFG to recover the original value and only perform this optimization if it's able to recover the value. Like OSR exit can easily replay an `add` without doing side effects.

The reason I think this regressed is that I added code into BytecodeParser that made us emit Unreachable after ForceOSRExit. Before that, we'd be able to better "analyze" what the bytecode was doing because we saw past these ForceOSRExit nodes. However, relying on that is super hacky. We should probably solve this in a general way.
Comment 27 Filip Pizlo 2019-03-07 16:14:46 PST
(In reply to Saam Barati from comment #26)
> I'm beginning to think that this analysis is only sound over bytecode.
> Alternatively, we could teach the DFG to recover the original value and only
> perform this optimization if it's able to recover the value. Like OSR exit
> can easily replay an `add` without doing side effects.

Right now it's sound because the ByteCodeParser creates DFG IR that has a complete view of exits, either because we emit real Nodes for all uses in bytecode, or we emit Phantoms to simulate those uses.

Based on this, our current approach of backwards-analyzing DFG IR is sound, except if you do the Unreachable code killing thing in the ByteCodeParser.  It's generally not OK to do optimizations in ByteCodeParser.  We can do some, but we have to be super careful.

> 
> The reason I think this regressed is that I added code into BytecodeParser
> that made us emit Unreachable after ForceOSRExit. Before that, we'd be able
> to better "analyze" what the bytecode was doing because we saw past these
> ForceOSRExit nodes. However, relying on that is super hacky. We should
> probably solve this in a general way.

Philosophically, I support eventually doing the BackwardsPropagationPhase over bytecode.  But we should be careful about making that move.  I think that there are perf benefits to doing the analysis over DFG IR, since it means that the results of the analysis are immediately expressed inside the DFG IR data structures and not somewhere else.

So, it may be that the way we do it right now is the best.
Comment 28 Saam Barati 2019-03-07 19:38:08 PST
(In reply to Filip Pizlo from comment #27)
> (In reply to Saam Barati from comment #26)
> > I'm beginning to think that this analysis is only sound over bytecode.
> > Alternatively, we could teach the DFG to recover the original value and only
> > perform this optimization if it's able to recover the value. Like OSR exit
> > can easily replay an `add` without doing side effects.
> 
> Right now it's sound because the ByteCodeParser creates DFG IR that has a
> complete view of exits, either because we emit real Nodes for all uses in
> bytecode, or we emit Phantoms to simulate those uses.
> 
> Based on this, our current approach of backwards-analyzing DFG IR is sound,
> except if you do the Unreachable code killing thing in the ByteCodeParser. 
> It's generally not OK to do optimizations in ByteCodeParser.  We can do
> some, but we have to be super careful.

Maybe we should make this optimization just emit Phantoms. I'll first run some perf numbers to see if we can just remove this optimization.

> 
> > 
> > The reason I think this regressed is that I added code into BytecodeParser
> > that made us emit Unreachable after ForceOSRExit. Before that, we'd be able
> > to better "analyze" what the bytecode was doing because we saw past these
> > ForceOSRExit nodes. However, relying on that is super hacky. We should
> > probably solve this in a general way.
> 
> Philosophically, I support eventually doing the BackwardsPropagationPhase
> over bytecode.  But we should be careful about making that move.  I think
> that there are perf benefits to doing the analysis over DFG IR, since it
> means that the results of the analysis are immediately expressed inside the
> DFG IR data structures and not somewhere else.
> 
> So, it may be that the way we do it right now is the best.

Right. If we want to keep this optimization, I see two potential paths forward:
1. Make it use Phantom and keep it in bytecode parser.
2. Make it a helper phase that we run after fixup.
Comment 29 Filip Pizlo 2019-03-07 20:31:33 PST
(In reply to Saam Barati from comment #28)
> (In reply to Filip Pizlo from comment #27)
> > (In reply to Saam Barati from comment #26)
> > > I'm beginning to think that this analysis is only sound over bytecode.
> > > Alternatively, we could teach the DFG to recover the original value and only
> > > perform this optimization if it's able to recover the value. Like OSR exit
> > > can easily replay an `add` without doing side effects.
> > 
> > Right now it's sound because the ByteCodeParser creates DFG IR that has a
> > complete view of exits, either because we emit real Nodes for all uses in
> > bytecode, or we emit Phantoms to simulate those uses.
> > 
> > Based on this, our current approach of backwards-analyzing DFG IR is sound,
> > except if you do the Unreachable code killing thing in the ByteCodeParser. 
> > It's generally not OK to do optimizations in ByteCodeParser.  We can do
> > some, but we have to be super careful.
> 
> Maybe we should make this optimization just emit Phantoms. I'll first run
> some perf numbers to see if we can just remove this optimization.
> 
> > 
> > > 
> > > The reason I think this regressed is that I added code into BytecodeParser
> > > that made us emit Unreachable after ForceOSRExit. Before that, we'd be able
> > > to better "analyze" what the bytecode was doing because we saw past these
> > > ForceOSRExit nodes. However, relying on that is super hacky. We should
> > > probably solve this in a general way.
> > 
> > Philosophically, I support eventually doing the BackwardsPropagationPhase
> > over bytecode.  But we should be careful about making that move.  I think
> > that there are perf benefits to doing the analysis over DFG IR, since it
> > means that the results of the analysis are immediately expressed inside the
> > DFG IR data structures and not somewhere else.
> > 
> > So, it may be that the way we do it right now is the best.
> 
> Right. If we want to keep this optimization, I see two potential paths
> forward:
> 1. Make it use Phantom and keep it in bytecode parser.
> 2. Make it a helper phase that we run after fixup.

The phantom approach is going to be faster. So we should do that. 

Ugh. I don’t like bytecode parser optimizations at all but perf is perf and the third tier is not about being pretty.
Comment 30 Saam Barati 2019-03-08 12:21:30 PST
Created attachment 364048 [details]
WIP

I think this should work. I'm first going to run an A/B test to see if we can just completely get rid of this code instead of making this change. If the A/B tests say we need this code, I'll prepare a full patch for review.
Comment 31 Build Bot 2019-03-08 12:27:30 PST
Attachment 364048 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:7329:  This { should be at the end of the previous line  [whitespace/braces] [4]
Total errors found: 1 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 32 Build Bot 2019-03-08 14:28:04 PST
Comment on attachment 364048 [details]
WIP

Attachment 364048 [details] did not pass mac-wk2-ews (mac-wk2):
Output: https://webkit-queues.webkit.org/results/11430768

New failing tests:
accessibility/mac/selection-notification-focus-change.html
Comment 33 Build Bot 2019-03-08 14:28:06 PST
Created attachment 364066 [details]
Archive of layout-test-results from ews106 for mac-highsierra-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews106  Port: mac-highsierra-wk2  Platform: Mac OS X 10.13.6
Comment 34 Saam Barati 2019-03-11 11:05:51 PDT
Mac data so far:
Up to 0.5% worse on Speedometer2, but not on every device.
Neutral on JS2 on iMac. I'm kicking off some tasks to run it on more Mac HW models.


I don't yet have iOS data as some tasks I kicked off did not succeed. Hopefully I will have that data soon.
Comment 35 Saam Barati 2019-03-13 12:50:16 PDT
Created attachment 364562 [details]
patch
Comment 36 Yusuke Suzuki 2019-03-14 19:35:16 PDT
Comment on attachment 364562 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=364562&action=review

r=me

> Source/JavaScriptCore/ChangeLog:16
> +        There was an optimization on the bytecode parser I added that converted blocks
> +        with ForceOSRExit in them to remove all IR after the ForceOSRExit. However,
> +        this is incorrect because it breaks backwards propagation. For example, it
> +        could incorrectly lead us to think it's safe to not check for overflow in
> +        an Add because such Add has no non-int uses. Backwards propagation relies on
> +        having a view over bytecode uses, and this optimization broke that. This patch
> +        rolls out that optimization, as initial perf data shows it may no longer be
> +        needed.

You can mention to r232742 here. This patch is logically the revert of r232742.
Comment 37 Saam Barati 2019-03-14 19:36:21 PDT
Comment on attachment 364562 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=364562&action=review

>> Source/JavaScriptCore/ChangeLog:16
>> +        needed.
> 
> You can mention to r232742 here. This patch is logically the revert of r232742.

Good point. Will do.
Comment 38 Saam Barati 2019-03-14 19:39:23 PDT
Created attachment 364757 [details]
patch for landing
Comment 39 WebKit Commit Bot 2019-03-14 21:31:58 PDT
Comment on attachment 364757 [details]
patch for landing

Clearing flags on attachment: 364757

Committed r242989: <https://trac.webkit.org/changeset/242989>
Comment 40 WebKit Commit Bot 2019-03-14 21:32:00 PDT
All reviewed patches have been landed.  Closing bug.