Bug 150083 - Tail call optimizations lead to crashes on ARM Thumb + Linux
Summary: Tail call optimizations lead to crashes on ARM Thumb + Linux
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Zan Dobersek
URL:
Keywords:
: 154857 155790 (view as bug list)
Depends on:
Blocks: 108645
  Show dependency treegraph
 
Reported: 2015-10-13 00:21 PDT by Zan Dobersek
Modified: 2016-04-15 06:32 PDT (History)
11 users (show)

See Also:


Attachments
JSC_showDisassembly=true output on google.com (9.72 MB, text/plain)
2015-10-13 00:27 PDT, Zan Dobersek
no flags Details
Naive fixes (1.57 KB, patch)
2016-01-15 01:01 PST, Zan Dobersek
no flags Details | Formatted Diff | Diff
Patch (3.17 KB, patch)
2016-04-06 11:25 PDT, Zan Dobersek
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Zan Dobersek 2015-10-13 00:21:11 PDT
With tail calls enabled in JSC, a SIGILL is hit on some sites when running on ARM Thumb and (if at all relevant due to possibly different call procedures) a Linux OS.

Here's a short gdb inspection of the situation on a release build, loading google.com:

Program received signal SIGILL, Illegal instruction.
0x719c90e4 in ?? ()
(gdb) info registers 
r0             0x7eeb2ee0	2129342176
r1             0x0	0
r2             0x73427f60	1933737824
r3             0x10	16
r4             0x73427f60	1933737824
r5             0x6ee8c53f	1860748607
r6             0x0	0
r7             0x7eeb2f78	2129342328
r8             0xfffffffb	4294967291
r9             0x6a8d53c0	1787646912
r10            0xfffffffb	4294967291
r11            0x7eeb33f8	2129343480
r12            0x719c90e2	1906086114
sp             0x7eeb2ee0	0x7eeb2ee0
lr             0x719a7c9d	1905949853
pc             0x719c90e4	0x719c90e4
cpsr           0x600b0010	1611333648
(gdb) x /20i $pc
=> 0x719c90e4:			; <UNDEFINED> instruction: 0xf04f466f
   0x719c90e8:			; <UNDEFINED> instruction: 0xf8c70c00
   0x719c90ec:			; <UNDEFINED> instruction: 0xf645c008
   0x719c90f0:	vsubhn.i16	d22, <illegal reg q11.5>, q4
   0x719c90f4:	eorsvs	r3, r7, r0, asr #12
   0x719c90f8:	ldmdbvs	r9!, {r3, r4, r5, r9, r10, lr}
   0x719c90fc:	vmla.i8	q11, q6, q5
   0x719c9100:	vmull.s8	<illegal reg q9.5>, d23, d1
   0x719c9104:			; <UNDEFINED> instruction: 0x47e06c10
   0x719c9108:			; <UNDEFINED> instruction: 0xf646469e
   0x719c910c:			; <UNDEFINED> instruction: 0xf2c766d4
   0x719c9110:	ldmdavs	r6!, {r6, r9, r10, r12, sp}
   0x719c9114:	tstle	r3, r0, lsl #28
   0x719c9118:	pop	{r0, r2, r3, r4, r5, r7, r9, r10, lr}
   0x719c911c:	ldrbmi	r4, [r0, -r0, lsl #1]!
   0x719c9120:	strvs	pc, [r8], r5, asr #12
   0x719c9124:	strbcc	pc, [r0], -r7, asr #5	; <UNPREDICTABLE>
   0x719c9128:	ldmdavs	r8!, {r0, r1, r2, r4, r5, sp, lr}
   0x719c912c:	cmnpl	sp, #78643200	; 0x4b00000
   0x719c9130:	movwvs	pc, #8903	; 0x22c7	; <UNPREDICTABLE>
(gdb) x /20i $pc-0x3
   0x719c90e1:	stmdb	sp!, {r7, lr}
   0x719c90e5:	mov	r7, sp
   0x719c90e7:	mov.w	r12, #0
   0x719c90eb:	str.w	r12, [r7, #8]
   0x719c90ef:	movw	r6, #24200	; 0x5e88
   0x719c90f3:	movt	r6, #29504	; 0x7340
   0x719c90f7:	str	r7, [r6, #0]
   0x719c90f9:	mov	r0, r7
   0x719c90fb:	ldr	r1, [r7, #16]
   0x719c90fd:	ldr	r2, [r1, #20]
   0x719c90ff:	movw	r12, #50049	; 0xc381
   0x719c9103:	movt	r12, #30224	; 0x7610
   0x719c9107:	blx	r12
   0x719c9109:	mov	lr, r3
   0x719c910b:	movw	r6, #28372	; 0x6ed4
   0x719c910f:	movt	r6, #29504	; 0x7340
   0x719c9113:	ldr	r6, [r6, #0]
   0x719c9115:	cmp	r6, #0
   0x719c9117:	bne.n	0x719c9120
   0x719c9119:	mov	sp, r7
(gdb) bt
#0  0x719c90e4 in ?? ()
#1  0x719a7c9c in ?? ()
#2  0x719a7c9c in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Comment 1 Zan Dobersek 2015-10-13 00:27:00 PDT
Created attachment 262978 [details]
JSC_showDisassembly=true output on google.com

This is all the output of the disassembled code that's generated in the same instance that's inspected in comment #0.

I assume what's supposed to be called is the JIT CTI native call at 0x719c90e0.
Comment 2 Zan Dobersek 2015-10-13 00:31:01 PDT
In debug build, a probably relevant assert is hit:

ASSERTION FAILED: !(reinterpret_cast<intptr_t>(to) & 1)
../Source/JavaScriptCore/assembler/ARMv7Assembler.h(2205) : static void JSC::ARMv7Assembler::relinkJump(void*, void*)
1   0x7408650e WTFCrash
2   0x729b6bd2 JSC::ARMv7Assembler::relinkJump(void*, void*)
3   0x729dbc32 JSC::AbstractMacroAssembler<JSC::ARMv7Assembler, JSC::MacroAssemblerARMv7>::repatchNearCall(JSC::CodeLocationNearCall, JSC::CodeLocationLabel)
4   0x729d773a JSC::linkFor(JSC::ExecState*, JSC::CallLinkInfo&, JSC::CodeBlock*, JSC::JSFunction*, JSC::MacroAssemblerCodePtr)
5   0x729bb8a8
Comment 3 Zan Dobersek 2016-01-15 01:01:52 PST
Created attachment 269042 [details]
Naive fixes

At this point there are two issues in JSC code that affect tail calls on ARM Thumb, both revolving around tail call code location (data location vs executable address).

The first one is in AbstractMacroAssembler<>::repatchNearCall(), where the destination's data location, instead of executable address, should be passed in to relinkJump(). This was previously causing assertions in relinkJump() for the `to` value.

The second one is in linkPolymorphicCall(). When calling LinkBuffer::link() for a tail call, the data location of the destination should be passed in, since in that case MacroAssembler::linkCall() will be linking a jump instead of a call. This too was hitting an assertion later down the call stack, in ARMv7::linkJumpAbsolute().

The attached patch just shows the quick-fix I've done for the two problems, and  some feedback would already be appreciated. The first fix looks straightforward, but the second one is a bit convoluted.
Comment 4 Csaba Osztrogonác 2016-04-06 08:30:27 PDT
(In reply to comment #3)
> Created attachment 269042 [details]
> Naive fixes
> 
> At this point there are two issues in JSC code that affect tail calls on ARM
> Thumb, both revolving around tail call code location (data location vs
> executable address).
> 
> The first one is in AbstractMacroAssembler<>::repatchNearCall(), where the
> destination's data location, instead of executable address, should be passed
> in to relinkJump(). This was previously causing assertions in relinkJump()
> for the `to` value.
> 
> The second one is in linkPolymorphicCall(). When calling LinkBuffer::link()
> for a tail call, the data location of the destination should be passed in,
> since in that case MacroAssembler::linkCall() will be linking a jump instead
> of a call. This too was hitting an assertion later down the call stack, in
> ARMv7::linkJumpAbsolute().
> 
> The attached patch just shows the quick-fix I've done for the two problems,
> and  some feedback would already be appreciated. The first fix looks
> straightforward, but the second one is a bit convoluted.

I tried this patch and fixes bug155790 on Thumb2. It would be 
great if JSC experts can check if this fix is proper or not.
Comment 5 Zan Dobersek 2016-04-06 11:25:09 PDT
Created attachment 275788 [details]
Patch

Now in reviewable form.
Comment 6 Csaba Osztrogonác 2016-04-12 01:35:23 PDT
(In reply to comment #5)
> Created attachment 275788 [details]
> Patch
> 
> Now in reviewable form.

(In reply to comment #4)
> I tried this patch and fixes bug155790 on Thumb2. It would be 
> great if JSC experts can check if this fix is proper or not.

Ping?
Comment 7 Csaba Osztrogonác 2016-04-12 01:39:14 PDT
*** Bug 155790 has been marked as a duplicate of this bug. ***
Comment 8 Csaba Osztrogonác 2016-04-13 00:44:54 PDT
(In reply to comment #5)
> Created attachment 275788 [details]
> Patch
> 
> Now in reviewable form.

(In reply to comment #4)
> I tried this patch and fixes bug155790 on Thumb2. It would be 
> great if JSC experts can check if this fix is proper or not.

Ping?
Comment 9 Csaba Osztrogonác 2016-04-15 01:09:51 PDT
*** Bug 154857 has been marked as a duplicate of this bug. ***
Comment 10 Csaba Osztrogonác 2016-04-15 01:14:36 PDT
Comment on attachment 275788 [details]
Patch

It seems 3 months wasn't enough for JSC maintainers to check this serious bug, which 
is a 7 months old regression caused by http://trac.webkit.org/changeset/189884 .

I tested it, checked the implementation and looks good to me, r=me, 
let's land it as is. But I'm not sure if it is the best way to fix 
this regression. Please let use know if you have better fix for it.
Comment 11 WebKit Commit Bot 2016-04-15 02:07:26 PDT
Comment on attachment 275788 [details]
Patch

Clearing flags on attachment: 275788

Committed r199586: <http://trac.webkit.org/changeset/199586>
Comment 12 WebKit Commit Bot 2016-04-15 02:07:31 PDT
All reviewed patches have been landed.  Closing bug.