Bug 103146

Summary: ARMv7 replaceWithJump ASSERT failure after r135330.
Product: WebKit Reporter: Balazs Kilvady <kilvadyb>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: allan.jensen, barraclough, cgarcia, ctruta, fpizlo, fu, galpeter, gergely, noam, oliver, ossy, palfia, rgabor, webkit.review.bot, xan.lopez, zherczeg
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Other   
OS: Linux   
Bug Depends on:    
Bug Blocks: 108645, 102662, 103747    
Attachments:
Description Flags
patch for ARMv7 none

Description Balazs Kilvady 2012-11-23 08:16:33 PST
Running v8 test v7 with jsc in debug mode on our ARMv7 board I received this ASSERT failure:

Starting program: /data/kilvadyb/webkit-arm/webkit/WebKitBuild/Debug/bin/jsc run.js
[Thread debugging using libthread_db enabled]
[New Thread 0x42dc4400 (LWP 1111)]
Richards: 161
DeltaBlue: 21.8
Crypto: 110
RayTrace: 101
ASSERTION FAILED: canBeJumpT4(instruction, target)
/data/kilvadyb/webkit-arm/webkit/Source/JavaScriptCore/assembler/ARMv7Assembler.h(2475) : static void JSC::ARMv7Assembler::linkJumpT4(uint16_t*, void*)

Program received signal SIGSEGV, Segmentation fault.
0x4040b38c in JSC::ARMv7Assembler::linkJumpT4 (instruction=0x455695f6, target=0x43599dc0)
    at /data/kilvadyb/webkit-arm/webkit/Source/JavaScriptCore/assembler/ARMv7Assembler.h:2475
2475	        ASSERT(canBeJumpT4(instruction, target));
(gdb)

I had similar problem on MIPS where a replaceWithJump would be easier to implement with direct jump instead of jump via register.
Comment 1 Filip Pizlo 2012-11-23 15:23:20 PST
(In reply to comment #0)
> Running v8 test v7 with jsc in debug mode on our ARMv7 board I received this ASSERT failure:
> 
> Starting program: /data/kilvadyb/webkit-arm/webkit/WebKitBuild/Debug/bin/jsc run.js
> [Thread debugging using libthread_db enabled]
> [New Thread 0x42dc4400 (LWP 1111)]
> Richards: 161
> DeltaBlue: 21.8
> Crypto: 110
> RayTrace: 101
> ASSERTION FAILED: canBeJumpT4(instruction, target)
> /data/kilvadyb/webkit-arm/webkit/Source/JavaScriptCore/assembler/ARMv7Assembler.h(2475) : static void JSC::ARMv7Assembler::linkJumpT4(uint16_t*, void*)
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x4040b38c in JSC::ARMv7Assembler::linkJumpT4 (instruction=0x455695f6, target=0x43599dc0)
>     at /data/kilvadyb/webkit-arm/webkit/Source/JavaScriptCore/assembler/ARMv7Assembler.h:2475
> 2475            ASSERT(canBeJumpT4(instruction, target));
> (gdb)
> 
> I had similar problem on MIPS where a replaceWithJump would be easier to implement with direct jump instead of jump via register.

Actually, it looks like that assert is just wrong. I'll look more in a bit.
Comment 2 Balazs Kilvady 2012-11-24 07:06:30 PST
(In reply to comment #1)
> (In reply to comment #0)
> > Running v8 test v7 with jsc in debug mode on our ARMv7 board I received this ASSERT failure:
> > 
> > Starting program: /data/kilvadyb/webkit-arm/webkit/WebKitBuild/Debug/bin/jsc run.js
> > [Thread debugging using libthread_db enabled]
> > [New Thread 0x42dc4400 (LWP 1111)]
> > Richards: 161
> > DeltaBlue: 21.8
> > Crypto: 110
> > RayTrace: 101
> > ASSERTION FAILED: canBeJumpT4(instruction, target)
> > /data/kilvadyb/webkit-arm/webkit/Source/JavaScriptCore/assembler/ARMv7Assembler.h(2475) : static void JSC::ARMv7Assembler::linkJumpT4(uint16_t*, void*)
> > 
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x4040b38c in JSC::ARMv7Assembler::linkJumpT4 (instruction=0x455695f6, target=0x43599dc0)
> >     at /data/kilvadyb/webkit-arm/webkit/Source/JavaScriptCore/assembler/ARMv7Assembler.h:2475
> > 2475            ASSERT(canBeJumpT4(instruction, target));
> > (gdb)
> > 
> > I had similar problem on MIPS where a replaceWithJump would be easier to implement with direct jump instead of jump via register.
> 
> Actually, it looks like that assert is just wrong. I'll look more in a bit.

Thank you for checking it. On MIPS I have this problem (at exactly the same test step) when the "jump to a direct address" one word (4 bytes) instruction has boundaries and the target address of the jump is out of these boundaries. I guess the same happens on ARM when the target range of T4 type jump is not enough. On MIPS I could use
target -> register
jump via register
but it would take 4 word (4 bytes == word) instructions so replaceWithJump might overwrite some useful already generated instructions and it would cause problems when replace-jump code should be reverted. I could solve it only with using nop instruction words to make place the 4 words replace-jump to make it possible to revert.

On MIPS the usually replaced/overwritten code:
lui t0, 0xXXXX
ori t0, t0, 0xYYYY
bne t0, t4, address
nop

(The same usually replaced code on ARM:
movw r12, YYYY
movt r12, XXXX
cmp r12, rN
bne address
)

and the jump via register:

lui t0, 0xXXXX
ori t0, t0, 0xXXXX
jr t0
nop ; necessary in "jump/branch slot".

So this kind of jump replacement overwrites the "bne reg1, reg0, address" instruction and address value would be lost.
Comment 3 Xan Lopez 2013-02-28 04:21:53 PST
(In reply to comment #1)
> (In reply to comment #0)
> > Running v8 test v7 with jsc in debug mode on our ARMv7 board I received this ASSERT failure:
> > 
> > Starting program: /data/kilvadyb/webkit-arm/webkit/WebKitBuild/Debug/bin/jsc run.js
> 
> Actually, it looks like that assert is just wrong. I'll look more in a bit.

I get exactly the same behavior by reverting the entire patch or commenting that single ASSERT, so indeed it seems we are ASSERTing the wrong thing in there.
Comment 4 Gabor Rapcsanyi 2013-03-12 01:53:03 PDT
Is there any progress on this?
Comment 5 Zoltan Herczeg 2013-03-13 07:48:16 PDT
Hey Filip,

it seems many systems are plagued by this issue, and we need some help to fix it. The issue is that replaceWithJump() always use linkJumpT4(), which is limited to 4 byte long instruction space. Instead we need something like in linkJumpAbsolute() 10 byte long space area for doing a jump. I don't really understand why linkJump* functions uses instruction indexes below zero. Is it safe to use 5 instructions in replaceWithJump?

This is the backtrace. It is clear that target-instruction is bigger than the allowed 24 bit difference:

#0 0x000675f4 in JSC::ARMv7Assembler::linkJumpT4 (instruction=0xb4902a76, target=0xb6fd3f80)
at /home/rgabor/commit/DFG/Source/JavaScriptCore/assembler/ARMv7Assembler.h:2521
#1 0x00199174 in JSC::ARMv7Assembler::replaceWithJump (instructionStart=0xb4902a72, to=0xb6fd3f80)
at /home/rgabor/commit/DFG/Source/JavaScriptCore/assembler/ARMv7Assembler.h:2165
#2 0x001995b2 in JSC::MacroAssemblerARMv7::replaceWithJump (instructionStart=..., destination=...)
at /home/rgabor/commit/DFG/Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h:1230
#3 0x0019b098 in JSC::RepatchBuffer::replaceWithJump (this=0xbeffea48, instructionStart=..., destination=...)
at /home/rgabor/commit/DFG/Source/JavaScriptCore/assembler/RepatchBuffer.h:156
#4 0x0020d074 in JSC::JIT::privateCompileClosureCall (this=0xbeffeb40, callLinkInfo=0x7be514, calleeCodeBlock=0x7b9940, expectedStructure=0xb493f878,
expectedExecutable=0xb42b8d78, codePtr=...) at /home/rgabor/commit/DFG/Source/JavaScriptCore/jit/JITCall32_64.cpp:348
Comment 6 Zoltan Herczeg 2013-03-18 07:45:46 PDT
Created attachment 193562 [details]
patch for ARMv7
Comment 7 Csaba Osztrogonác 2013-03-18 10:30:08 PDT
*** Bug 111704 has been marked as a duplicate of this bug. ***
Comment 8 Csaba Osztrogonác 2013-03-18 10:33:50 PDT
(In reply to comment #6)
> Created an attachment (id=193562) [details]
> patch for ARMv7

It fixes the inspector crashes mentioned in https://bugs.webkit.org/show_bug.cgi?id=111704: 

http://build.webkit.sed.hu/builders/ARMv7%20Linux%20Qt5%20Release%20%28Test%29/builds/8083
Comment 9 Balazs Kilvady 2013-03-20 09:02:16 PDT
*** Bug 112816 has been marked as a duplicate of this bug. ***
Comment 10 Zoltan Herczeg 2013-03-20 14:48:00 PDT
I would be grateful if somebody would review this patch. Thanks.
Comment 11 Xan Lopez 2013-03-20 14:56:40 PDT
(In reply to comment #10)
> I would be grateful if somebody would review this patch. Thanks.

FWIW I was seeing this crash in QNX, not Linux, so I suspect the fix should be more generic.
Comment 12 WebKit Review Bot 2013-03-20 15:09:46 PDT
Comment on attachment 193562 [details]
patch for ARMv7

Clearing flags on attachment: 193562

Committed r146396: <http://trac.webkit.org/changeset/146396>
Comment 13 WebKit Review Bot 2013-03-20 15:09:53 PDT
All reviewed patches have been landed.  Closing bug.
Comment 14 Zoltan Herczeg 2013-03-20 20:58:54 PDT
> FWIW I was seeing this crash in QNX, not Linux, so I suspect the fix should be more generic.

Did you try the patch? Does it fix your platform? If so, we can make it more generic by adding your OS to the list.
Comment 15 Filip Pizlo 2013-03-20 22:15:19 PDT
(In reply to comment #14)
> > FWIW I was seeing this crash in QNX, not Linux, so I suspect the fix should be more generic.
> 
> Did you try the patch? Does it fix your platform? If so, we can make it more generic by adding your OS to the list.

I think that the reason why you guys are seeing this badness is that you're using the ExecutableAllocator and not ExecutableAllocatorFixedVMPool, or whatever it is called.

We used the FixedVMPool on both X86_64 and ARMv7.  We do it for two simple reasons:

- About ~16MB is more than enough, since JSC only JITs things when it absolutely needs to.

- Having all JIT memory within a confined pool means that we can consistently use compact jumps.

Maybe the right thing for y'all is to switch to using the fixed VM pool?

Maybe the right thing for the project is to switch *everything* over to the fixed pool, including x86-32, so we can simplify the code and all share the invariant that for a given JSGlobalData, any two slabs of JIT code will always be within a small enough distance from each other that a relatively compact jump can be emitted?  (Note that for example the choice of 16MB on ARMv7, and 1GB on X86_64, is a direct consequent of the largest jumpable distance using a single jump instruction, on those platforms.)
Comment 16 Zoltan Herczeg 2013-03-21 04:15:21 PDT
> Maybe the right thing for y'all is to switch to using the fixed VM pool?

I have no objections. We will try this pool in due course, and let you know whether it works.
Comment 17 Xan Lopez 2013-03-21 05:02:48 PDT
(In reply to comment #14)
> > FWIW I was seeing this crash in QNX, not Linux, so I suspect the fix should be more generic.
> 
> Did you try the patch? Does it fix your platform? If so, we can make it more generic by adding your OS to the list.

Nope, I haven't, but if I read it right most of the changes are guarded by OS(LINUX) stuff. I guess maybe the nowp() thing should work, since is the only generic change? Anyway I'll try to test it ASAP.
Comment 18 Cosmin Truta 2013-03-21 11:17:32 PDT
(In reply to comment #17)
> > Did you try the patch? Does it fix your platform? If so, we can make it more generic by adding your OS to the list.
> 
> Nope, I haven't, but if I read it right most of the changes are guarded by OS(LINUX) stuff. I guess maybe the nowp() thing should work, since is the only generic change? Anyway I'll try to test it ASAP.

Indeed, I added OS(QNX) besides OS(LINUX), and that solved the problem. See bug 112863.