Bug 175617

Summary: Enhance MacroAssembler::probe() to support an initializeStackFunction callback.
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: JavaScriptCoreAssignee: Mark Lam <mark.lam>
Status: RESOLVED FIXED    
Severity: Normal CC: fpizlo, jfbastien, keith_miller, msaboff, ossy, pvollan, rmorisset, saam, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 174645    
Attachments:
Description Flags
proposed patch.
none
proposed patch. jfbastien: review+

Description Mark Lam 2017-08-15 21:30:33 PDT
This allows the probe function to provide a second ProbeFunction callback to initialize the stack values after the stack pointer has been adjusted.
Comment 1 Radar WebKit Bug Importer 2017-08-15 21:31:10 PDT
<rdar://problem/33912104>
Comment 2 Mark Lam 2017-08-15 22:34:10 PDT
Created attachment 318234 [details]
proposed patch.

Let's try this on the EWS.
Comment 3 Mark Lam 2017-08-16 11:42:05 PDT
Created attachment 318278 [details]
proposed patch.

This patch has passed the testmasm test for x86, x86_64, arm64, and armv7s.  It's ready for a review.
Comment 4 JF Bastien 2017-08-16 13:12:29 PDT
Comment on attachment 318278 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=318278&action=review

r=me


On x86-64 I think you should do a follow-up patch to avoid setting eflags with pushf / popf. It's slow, and historically has been the source of some exploits. I have code and benchmark for it here: https://github.com/jfbastien/benchmark-x86-flags  I'd use seto+lahf / addb+sahf instead.

> Source/JavaScriptCore/ChangeLog:60
> +           we're moving the ProbeContext, we're always be moving it to a lower address.

"we're always be moving it"
*we'll ?

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:358
> +    // Initialize ProbeContex::initializeStackFunction to zero.

ProbeContext

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:370
> +    "add       x2, x27, #" STRINGIZE_VALUE_OF(PROBE_SIZE_PLUS_EXTRAS + OUT_SIZE) "\n" // End of ProveContext + buffer.

ProbeContext

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:376
> +    "bic       x1, x1, #0xf" "\n" // The ARM EABI specifies that the stack needs to be 16 byte aligned.

IIUC x1 is:
RESULT_SP - (PROBE_SIZE_PLUS_EXTRA + OUT_SIZE)
Right?

So why do you need to align it? It seems like OUT_SIZE is the only thing that could force misalignment, no? Why not assert it's always a value that won't mis-align SP?

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:377
> +    "mov       sp, x1" "\n" // Set the new sp to protect that memory from interrupts before we copy the ProbeContext.

I think you should also set FP here, so that backtraces will look fine.

> Source/JavaScriptCore/assembler/MacroAssemblerARM.cpp:264
>      "bic       r3, r3, #0xf" "\n"

Ditto on mask.

> Source/JavaScriptCore/assembler/MacroAssemblerARM.cpp:265
> +    "mov       sp, r3" "\n" // Set the sp to protect the ProbeContext from interrupts before we initialize it.

Ditto on FP.

> Source/JavaScriptCore/assembler/MacroAssemblerARM.cpp:292
> +    "vstmia.64 ip!, { d16-d31 }" "\n"

Wait... The previous code was just wrong, lost half the values, overwrote the other half, and kept garbage?

> Source/JavaScriptCore/assembler/MacroAssemblerARM.cpp:296
> +    // Initialize ProbeContex::initializeStackFunction to zero.

ProbeContext

> Source/JavaScriptCore/assembler/MacroAssemblerARM.cpp:314
> +    "bic       r1, r1, #0xf" "\n" // The ARM EABI specifies that the stack needs to be 16 byte aligned.

Ditto

> Source/JavaScriptCore/assembler/MacroAssemblerARM.cpp:328
> +    "str       r4, [r6], #4" "\n"

You could use load / store pair here, but I'm not sure it's worth sweating the details on ARM32. You could also use load / store multiple to avoid looping entirely and have less code, but it's not necessarily faster.

> Source/JavaScriptCore/assembler/MacroAssemblerARMv7.cpp:235
> +    // Initialize ProbeContex::initializeStackFunction to zero.

ProbeContext

> Source/JavaScriptCore/assembler/MacroAssemblerARMv7.cpp:254
> +    "bic       r1, r1, #0xf" "\n" // The ARM EABI specifies that the stack needs to be 16 byte aligned.

Ditto
Comment 5 Mark Lam 2017-08-16 13:34:50 PDT
Thanks for the review.

(In reply to JF Bastien from comment #4)
> On x86-64 I think you should do a follow-up patch to avoid setting eflags
> with pushf / popf. It's slow, and historically has been the source of some
> exploits. I have code and benchmark for it here:
> https://github.com/jfbastien/benchmark-x86-flags  I'd use seto+lahf /
> addb+sahf instead.

I filed https://bugs.webkit.org/show_bug.cgi?id=175636 to address this later.

> > Source/JavaScriptCore/ChangeLog:60
> > +           we're moving the ProbeContext, we're always be moving it to a lower address.
> 
> "we're always be moving it"
> *we'll ?

Fixed.

> > Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:358
> > +    // Initialize ProbeContex::initializeStackFunction to zero.
> 
> ProbeContext

Fixed.

> > Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:370
> > +    "add       x2, x27, #" STRINGIZE_VALUE_OF(PROBE_SIZE_PLUS_EXTRAS + OUT_SIZE) "\n" // End of ProveContext + buffer.
> 
> ProbeContext

Fixed.

> > Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:376
> > +    "bic       x1, x1, #0xf" "\n" // The ARM EABI specifies that the stack needs to be 16 byte aligned.
> 
> IIUC x1 is:
> RESULT_SP - (PROBE_SIZE_PLUS_EXTRA + OUT_SIZE)
> Right?

RESULT_SP is set by the probe function.  Technically, it should never set it to an unaligned value, but if it does due to a bug, I don't what the crash to happen in the trampoline code.

> So why do you need to align it? It seems like OUT_SIZE is the only thing
> that could force misalignment, no? Why not assert it's always a value that
> won't mis-align SP?

The probe code is not meant for use in performance hot paths, plus an explicit mask here makes the probe trampoline more robust and impervious to a bug in the probe function setting a misaligned sp.

> > Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:377
> > +    "mov       sp, x1" "\n" // Set the new sp to protect that memory from interrupts before we copy the ProbeContext.
> 
> I think you should also set FP here, so that backtraces will look fine.

Not strictly needed but is nice to have.  I filed https://bugs.webkit.org/show_bug.cgi?id=175637 to address this later.

> > Source/JavaScriptCore/assembler/MacroAssemblerARM.cpp:264
> >      "bic       r3, r3, #0xf" "\n"
> 
> Ditto on mask.

Same reason as above: not perf critical, and more robust (does not require PROBE_SIZE and OUT_SIZE to be carefully packed for alignment).

> > Source/JavaScriptCore/assembler/MacroAssemblerARM.cpp:265
> > +    "mov       sp, r3" "\n" // Set the sp to protect the ProbeContext from interrupts before we initialize it.
> 
> Ditto on FP.

Will address later in https://bugs.webkit.org/show_bug.cgi?id=175637.

> > Source/JavaScriptCore/assembler/MacroAssemblerARM.cpp:292
> > +    "vstmia.64 ip!, { d16-d31 }" "\n"
> 
> Wait... The previous code was just wrong, lost half the values, overwrote
> the other half, and kept garbage?

Yep.  This is a bug fix.

> > Source/JavaScriptCore/assembler/MacroAssemblerARM.cpp:296
> > +    // Initialize ProbeContex::initializeStackFunction to zero.
> 
> ProbeContext

Fixed.

> > Source/JavaScriptCore/assembler/MacroAssemblerARM.cpp:314
> > +    "bic       r1, r1, #0xf" "\n" // The ARM EABI specifies that the stack needs to be 16 byte aligned.
> 
> Ditto

Same as above.

> > Source/JavaScriptCore/assembler/MacroAssemblerARM.cpp:328
> > +    "str       r4, [r6], #4" "\n"
> 
> You could use load / store pair here, but I'm not sure it's worth sweating
> the details on ARM32. You could also use load / store multiple to avoid
> looping entirely and have less code, but it's not necessarily faster.

Yeah.  I'm keeping it simple as is.  The ARM_TRADITIONAL folks can choose to optimize if they see a need later.  Since this is not hot path code, I don't think there'll be a need.

> > Source/JavaScriptCore/assembler/MacroAssemblerARMv7.cpp:235
> > +    // Initialize ProbeContex::initializeStackFunction to zero.
> 
> ProbeContext

Fixed.

> > Source/JavaScriptCore/assembler/MacroAssemblerARMv7.cpp:254
> > +    "bic       r1, r1, #0xf" "\n" // The ARM EABI specifies that the stack needs to be 16 byte aligned.
> 
> Ditto

Same as above.
Comment 6 Mark Lam 2017-08-16 13:43:42 PDT
Landed in r220807: <http://trac.webkit.org/r220807>.
Comment 7 Csaba Osztrogonác 2017-08-16 23:55:08 PDT
(In reply to Mark Lam from comment #6)
> Landed in r220807: <http://trac.webkit.org/r220807>.

It broke the ARMv7 traditional build:
standard input}:76: Error: selected processor does not support ARM mode `cbz r2,.LctiMasmProbeTrampolineRestoreRegisters'

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/Cjaghefc.html

It says that cbz is 16 bit thumb1 instruction,
there is no cbz 32 bit ARM instruction.
Comment 8 Mark Lam 2017-08-17 00:05:52 PDT
(In reply to Csaba Osztrogonác_OOO_until_21st_Aug from comment #7)
> It broke the ARMv7 traditional build:
> standard input}:76: Error: selected processor does not support ARM mode `cbz
> r2,.LctiMasmProbeTrampolineRestoreRegisters'

Fixed in r220834: <http://trac.webkit.org/r220834>.
Comment 9 Filip Pizlo 2017-08-18 13:24:17 PDT
Comment on attachment 318278 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=318278&action=review

> Source/JavaScriptCore/assembler/testmasm.cpp:608
> +            context->initializeStackFunction = fillStack;
> +            context->initializeStackArg = &data;

This is a super weird API.  Why is this needed?
Comment 10 Mark Lam 2017-08-18 13:25:59 PDT
(In reply to Filip Pizlo from comment #9)
> Comment on attachment 318278 [details]
> proposed patch.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=318278&action=review
> 
> > Source/JavaScriptCore/assembler/testmasm.cpp:608
> > +            context->initializeStackFunction = fillStack;
> > +            context->initializeStackArg = &data;
> 
> This is a super weird API.  Why is this needed?

This is needed by the low level implementation because we can't write things to the stack until after we move the stack pointer.  That said, this is not the API you're looking for.  What you want will be coming in https://bugs.webkit.org/show_bug.cgi?id=175688 soon.