Bug 236037 - Wasm crash on https://copy.sh/v86/?profile=dsl
Summary: Wasm crash on https://copy.sh/v86/?profile=dsl
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Saam Barati
URL:
Keywords: InRadar
Depends on:
Blocks: 236269
  Show dependency treegraph
 
Reported: 2022-02-02 13:54 PST by Yusuke Suzuki
Modified: 2022-02-07 21:56 PST (History)
7 users (show)

See Also:


Attachments
asm.txt (16.18 MB, text/plain)
2022-02-06 20:42 PST, Saam Barati
no flags Details
quick-and-dirty-fix (2.82 KB, patch)
2022-02-06 21:05 PST, Saam Barati
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
patch (5.32 KB, patch)
2022-02-07 15:57 PST, Saam Barati
mark.lam: review+
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
patch for landing (5.33 KB, patch)
2022-02-07 18:16 PST, Saam Barati
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yusuke Suzuki 2022-02-02 13:54:55 PST
And in the other examples in https://github.com/copy/v86
Comment 1 Alexey Proskuryakov 2022-02-03 16:45:22 PST
rdar://88358719
Comment 2 Saam Barati 2022-02-04 10:06:40 PST
crash in https://copy.sh/v86/?profile=dsl
Comment 3 Saam Barati 2022-02-04 11:38:54 PST
Turning off OMG makes the crash go away. Will try to configure more options to see what reproduces and what doesn't.
Comment 4 Saam Barati 2022-02-04 11:58:07 PST
it seems like an issue with OMG+fast memory. If I turn on OMG, and turn off fast memory, I don't crash.
Comment 5 Saam Barati 2022-02-04 15:42:31 PST
Reproduces with concurrent JIT off.
Comment 6 Saam Barati 2022-02-04 15:43:15 PST
Reproduces with OSR entry turned off
Comment 7 Saam Barati 2022-02-05 11:03:21 PST
seems like we need to be running with webAssemblyOMGOptimizationLevel=2 for this to reproduce. It also doesn't seem to be a bug in IRC, since running with O1 and forcing on IRC does not crash either
Comment 8 Yusuke Suzuki 2022-02-05 11:15:43 PST
(In reply to Saam Barati from comment #7)
> seems like we need to be running with webAssemblyOMGOptimizationLevel=2 for
> this to reproduce. It also doesn't seem to be a bug in IRC, since running
> with O1 and forcing on IRC does not crash either

Nice! In that case, typically I comment out several path in B3Generate.cpp to see which pass can trigger this problem :)

   84     if (procedure.optLevel() >= 2) {
   85         reduceDoubleToFloat(procedure);
   86         reduceStrength(procedure);
   87         // FIXME: Re-enable B3 hoistLoopInvariantValues
   88         // https://bugs.webkit.org/show_bug.cgi?id=212651
   89         if (Options::useB3HoistLoopInvariantValues())
   90             hoistLoopInvariantValues(procedure);
   91         if (eliminateCommonSubexpressions(procedure))
   92             eliminateCommonSubexpressions(procedure);
   93         eliminateDeadCode(procedure);
   94         inferSwitches(procedure);
   95         reduceLoopStrength(procedure);
   96         if (Options::useB3TailDup())
   97             duplicateTails(procedure);
   98         fixSSA(procedure);
   99         foldPathConstants(procedure);
  100         // FIXME: Add more optimizations here.
  101         // https://bugs.webkit.org/show_bug.cgi?id=150507
Comment 9 Saam Barati 2022-02-06 20:33:38 PST
(In reply to Yusuke Suzuki from comment #8)
> (In reply to Saam Barati from comment #7)
> > seems like we need to be running with webAssemblyOMGOptimizationLevel=2 for
> > this to reproduce. It also doesn't seem to be a bug in IRC, since running
> > with O1 and forcing on IRC does not crash either
> 
> Nice! In that case, typically I comment out several path in B3Generate.cpp
> to see which pass can trigger this problem :)
> 
>    84     if (procedure.optLevel() >= 2) {
>    85         reduceDoubleToFloat(procedure);
>    86         reduceStrength(procedure);
>    87         // FIXME: Re-enable B3 hoistLoopInvariantValues
>    88         // https://bugs.webkit.org/show_bug.cgi?id=212651
>    89         if (Options::useB3HoistLoopInvariantValues())
>    90             hoistLoopInvariantValues(procedure);
>    91         if (eliminateCommonSubexpressions(procedure))
>    92             eliminateCommonSubexpressions(procedure);
>    93         eliminateDeadCode(procedure);
>    94         inferSwitches(procedure);
>    95         reduceLoopStrength(procedure);
>    96         if (Options::useB3TailDup())
>    97             duplicateTails(procedure);
>    98         fixSSA(procedure);
>    99         foldPathConstants(procedure);
>   100         // FIXME: Add more optimizations here.
>   101         // https://bugs.webkit.org/show_bug.cgi?id=150507

Yeah, I've been doing this. No luck so far. Though commenting out some stuff seems to change how frequently the crash reproduces.
Comment 10 Saam Barati 2022-02-06 20:40:42 PST
Unrelated, but we emit some silly code we should fix:

```
         0x120b5282c:    str      x0, [sp, #17128]
         0x120b52830:    ldr      x0, [sp, #17128]
```
Comment 11 Saam Barati 2022-02-06 20:42:50 PST
Created attachment 451062 [details]
asm.txt

Attaching asm of a function we're crashing in. Noted which instruction we're crashing in with "CRASH".

We have this code sequence:

         0x120b52850:    ldr      x0, [sp, #17096]
         0x120b52854:    add      x0, x22, x0
         0x120b52858:    movz     x16, #0xc640
         0x120b5285c:    add      x16, sp, x16, uxtx
         0x120b52860:    ldur     x1, [x16]
         0x120b52864:    ldr      w20, [x0, x1] <--- CRASH


The interesting thing here is I can't find where we store to [sp, #17096], which makes me wonder if it's uninitialized. But there's a chance we're storing it somehow that I haven't been able to figure out by statically looking at the asm.
Comment 12 Saam Barati 2022-02-06 20:46:41 PST
(In reply to Saam Barati from comment #11)
> Created attachment 451062 [details]
> asm.txt
> 
> Attaching asm of a function we're crashing in. Noted which instruction we're
> crashing in with "CRASH".
> 
> We have this code sequence:
> 
>          0x120b52850:    ldr      x0, [sp, #17096]
>          0x120b52854:    add      x0, x22, x0
>          0x120b52858:    movz     x16, #0xc640
>          0x120b5285c:    add      x16, sp, x16, uxtx
>          0x120b52860:    ldur     x1, [x16]
>          0x120b52864:    ldr      w20, [x0, x1] <--- CRASH
> 
> 
> The interesting thing here is I can't find where we store to [sp, #17096],
> which makes me wonder if it's uninitialized. But there's a chance we're
> storing it somehow that I haven't been able to figure out by statically
> looking at the asm.

Nevermind, was happening right in front of my face:
         0x120b52840:    movz     x16, #0x42c8
         0x120b52844:    add      x16, sp, x16, uxtx
         0x120b52848:    stur     w0, [x16]
         0x120b5284c:    stur     wzr, [x16]


in:

         0x120b52840:    movz     x16, #0x42c8
         0x120b52844:    add      x16, sp, x16, uxtx
         0x120b52848:    stur     w0, [x16]
         0x120b5284c:    stur     wzr, [x16]
         0x120b52850:    ldr      x0, [sp, #17096]
         0x120b52854:    add      x0, x22, x0
         0x120b52858:    movz     x16, #0xc640
         0x120b5285c:    add      x16, sp, x16, uxtx
         0x120b52860:    ldur     x1, [x16]
         0x120b52864:    ldr      w20, [x0, x1] <--- CRASH


But it seems like we're storing zero to it right after storing something else. So weird.
Comment 13 Saam Barati 2022-02-06 20:51:41 PST
Check out these first two moves of constants followed by stores:
         0x120b5283c:    movz     x16, #0x42cc
         0x120b52840:    movz     x16, #0x42c8
         0x120b52844:    add      x16, sp, x16, uxtx
         0x120b52848:    stur     w0, [x16]
         0x120b5284c:    stur     wzr, [x16]
         0x120b52850:    ldr      x0, [sp, #17096]
         0x120b52854:    add      x0, x22, x0
         0x120b52858:    movz     x16, #0xc640
         0x120b5285c:    add      x16, sp, x16, uxtx
         0x120b52860:    ldur     x1, [x16]
         0x120b52864:    ldr      w20, [x0, x1] <--- CRASH


Can't get over the fact that it seems like we shouldn't be moving into x16 back to back like that, and somehow the code should've been ordered differently so we store to both locations independently.
Comment 14 Saam Barati 2022-02-06 20:56:18 PST
I'm pretty suspicious of a bug inside AirLowerStackArgs.
Comment 15 Saam Barati 2022-02-06 21:02:11 PST
(In reply to Saam Barati from comment #14)
> I'm pretty suspicious of a bug inside AirLowerStackArgs.

I found a bug inside AirLowerStackArgs. But I don't know if it's our crash or not.
Comment 16 Saam Barati 2022-02-06 21:05:32 PST
Created attachment 451064 [details]
quick-and-dirty-fix

this is the bug I spotted. Will make a nicer patch once I test it.
Comment 17 Saam Barati 2022-02-06 21:30:17 PST
(In reply to Saam Barati from comment #16)
> Created attachment 451064 [details]
> quick-and-dirty-fix
> 
> this is the bug I spotted. Will make a nicer patch once I test it.

This is the right fix. It also fixes a bug where the site just hangs when loading.

We used to crash or hang 100% of the time, and this fixes that issue.

However, I did encounter one crash that I still need to investigate. The crash happened ~1/20 loads. So maybe it's a different issue.

I'll upload a patch to fix this tomorrow. Ima try to think about how to write a test case, too. It might be tricky.

I'll open a separate bug to investigate the other issue tomorrow.
Comment 18 Saam Barati 2022-02-07 15:57:44 PST
Created attachment 451165 [details]
patch
Comment 19 Mark Lam 2022-02-07 17:11:59 PST
Comment on attachment 451165 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=451165&action=review

r=me

> Source/JavaScriptCore/ChangeLog:22
> +        mov x16, k
> +        mov x16, k2
> +        mov x1, [x16]
> +        mov zr, [x16]

Did you mean this instead?
```
    mov k, x16
    mov k2, x16
    mov x1, [x16]
    mov zr, [x16]
```
I find this example confusing: the first 2 instructions imply the dest is the left operand, while the last 2 instructions imply that the dest is the right operand.  Can you make it consistent?

> Source/JavaScriptCore/ChangeLog:28
> +        mov x16, k
> +        mov x1, [x16]
> +        mov x16, k2
> +        mov zr, [x16]

Ditto.
Comment 20 Saam Barati 2022-02-07 18:12:13 PST
Comment on attachment 451165 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=451165&action=review

>> Source/JavaScriptCore/ChangeLog:22
>> +        mov zr, [x16]
> 
> Did you mean this instead?
> ```
>     mov k, x16
>     mov k2, x16
>     mov x1, [x16]
>     mov zr, [x16]
> ```
> I find this example confusing: the first 2 instructions imply the dest is the left operand, while the last 2 instructions imply that the dest is the right operand.  Can you make it consistent?

We're moving constants k and k2 into x16. Let me switch to using arm syntax and it'll clear things up.
Comment 21 Saam Barati 2022-02-07 18:16:24 PST
Created attachment 451189 [details]
patch for landing
Comment 22 EWS 2022-02-07 19:01:40 PST
Committed r289354 (246942@main): <https://commits.webkit.org/246942@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 451189 [details].
Comment 23 Saam Barati 2022-02-07 21:56:49 PST
<rdar://88611690>