Bug 174697

Summary: Add the ability to change sp and pc to the ARM64 JIT probe.
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: JavaScriptCoreAssignee: Mark Lam <mark.lam>
Status: RESOLVED FIXED    
Severity: Normal CC: buildbot, fpizlo, jfbastien, keith_miller, msaboff, sbarati, webkit-bug-importer, ysuzuki
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 174645    
Attachments:
Description Flags
proposed patch.
none
proposed patch.
none
proposed patch. jfbastien: review+

Description Mark Lam 2017-07-20 16:33:44 PDT
It's not possible to change the value of the pc on ARM64 without using a gpr.  For this reason and because previous use cases of JIT probes aim first to preserve register values, this ability to change sp and pc was not supported on the ARM64 JIT probe.  

However, we'll now need this in order to use the JIT probes for implementing OSR exits.  So, we'll add it by using the lr register for the pc dispatch.  In JIT code, lr is never used for any other purpose than to modify pc anyway.  Additionally, at all OSR exit sites, we are guaranteed that lr will not contain any interesting values that we'll need to preserve.  So, it is ok to let the JIT probe use lr for changing the pc.

<rdar://problem/33436965>
Comment 1 Mark Lam 2017-07-24 20:52:10 PDT
Created attachment 316346 [details]
proposed patch.
Comment 2 Build Bot 2017-07-24 20:53:40 PDT
Attachment 316346 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/assembler/testmasm.cpp:226:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/assembler/testmasm.cpp:271:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/assembler/testmasm.cpp:272:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/assembler/testmasm.cpp:318:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/assembler/testmasm.cpp:423:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/assembler/testmasm.cpp:474:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
Total errors found: 6 in 5 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Mark Lam 2017-07-24 20:53:55 PDT
Comment on attachment 316346 [details]
proposed patch.

Added patch to the wrong bug.
Comment 4 Mark Lam 2017-07-25 20:16:14 PDT
Created attachment 316423 [details]
proposed patch.
Comment 5 Mark Lam 2017-07-25 20:32:28 PDT
Created attachment 316425 [details]
proposed patch.
Comment 6 JF Bastien 2017-07-25 22:06:51 PDT
Comment on attachment 316425 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=316425&action=review

r=me

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:239
> +#define IN_ERROR_FUNCTION_OFFSET   (6 * PTR_SIZE)

Wouldn't offsetof be more natural to use instead of the macros? Especially given the static_assert below. Or is this because you're passing it to the assembler?

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:282
> +    ".align 4" "\n"

.align is ambiguous between assemblers. .balign and .p2align are better IMO.

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:304
> +    "mrs       x1, fpsr" "\n" // Preload fpsr.

I think you should issue one of the two mrs right after the stp x0, x1, and the other a few cycles later.

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:316
> +    "stp       x22, x23,[sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_X22_OFFSET) "]" "\n"

Missing a space before the bracket.

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:324
> +    "str       lr, [sp, #" STRINGIZE_VALUE_OF(SAVED_PROBE_RETURN_PC_OFFSET) "]" "\n"

Can the above two be stp lr, x6 [sp, PC_OFFSET] ?

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:333
> +    "add       x9, sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_Q0_OFFSET) "\n"

x9 is only used for indexing, right? Can you add Q0_OFFSET to sp below instead of using x9?

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:476
> +    "beq     " LOCAL_LABEL_STRING(ctiMasmProbeTrampolineEnd) "\n"

Here you can use cbz / cbnz.

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:478
> +    // In order to restore lr, we need to do the restorationation at the probe return site.

"restorationation"

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:518
> +    "add       sp, sp, #" STRINGIZE_VALUE_OF(OUT_SIZE) "\n"

Here I think you could have used incrementing ldp instead of adding at the end.
Comment 7 Mark Lam 2017-07-26 07:52:54 PDT
Comment on attachment 316425 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=316425&action=review

>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:239
>> +#define IN_ERROR_FUNCTION_OFFSET   (6 * PTR_SIZE)
> 
> Wouldn't offsetof be more natural to use instead of the macros? Especially given the static_assert below. Or is this because you're passing it to the assembler?

This is used to embed offset values in the inline asm below.  We can't use offsetof in inline asm constants.

>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:282
>> +    ".align 4" "\n"
> 
> .align is ambiguous between assemblers. .balign and .p2align are better IMO.

I'll go with ".balign 16".

>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:304
>> +    "mrs       x1, fpsr" "\n" // Preload fpsr.
> 
> I think you should issue one of the two mrs right after the stp x0, x1, and the other a few cycles later.

I presume you're suggesting this to optimize for instruction scheduling?  I'll apply the change.

>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:316
>> +    "stp       x22, x23,[sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_X22_OFFSET) "]" "\n"
> 
> Missing a space before the bracket.

Will fix.

>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:324
>> +    "str       lr, [sp, #" STRINGIZE_VALUE_OF(SAVED_PROBE_RETURN_PC_OFFSET) "]" "\n"
> 
> Can the above two be stp lr, x6 [sp, PC_OFFSET] ?

I can give it a try.  It depends on the alignment, but I should be able to make it work one way or the other.

>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:333
>> +    "add       x9, sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_Q0_OFFSET) "\n"
> 
> x9 is only used for indexing, right? Can you add Q0_OFFSET to sp below instead of using x9?

I used to do the indexing off of sp.  The whole reason for introducing a new base register x9 is because your favorite "stp" instruction is limited in what offsets it can take.  As a reasult, some of the last stores below (maybe 3 of them) cannot be encoded in a stp instruction if I use sp.  So, I switched to x9 just for consistency of how we store to the set of saved fprs.

So, no.  We cannot index off of sp.

>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:476
>> +    "beq     " LOCAL_LABEL_STRING(ctiMasmProbeTrampolineEnd) "\n"
> 
> Here you can use cbz / cbnz.

I'll look into it.

>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:478
>> +    // In order to restore lr, we need to do the restorationation at the probe return site.
> 
> "restorationation"

Will fix.

>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:518
>> +    "add       sp, sp, #" STRINGIZE_VALUE_OF(OUT_SIZE) "\n"
> 
> Here I think you could have used incrementing ldp instead of adding at the end.

How do I do that?  OUT_SIZE is 6 * 8 bytes.  Doesn't post-increment only increment by the size of values read i.e. 16 bytes in this case?

Oh ... you mean change all 3 ldps to be post-increment.  I'll look into it.
Comment 8 Mark Lam 2017-07-26 10:47:32 PDT
Comment on attachment 316425 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=316425&action=review

>>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.cpp:324
>>> +    "str       lr, [sp, #" STRINGIZE_VALUE_OF(SAVED_PROBE_RETURN_PC_OFFSET) "]" "\n"
>> 
>> Can the above two be stp lr, x6 [sp, PC_OFFSET] ?
> 
> I can give it a try.  It depends on the alignment, but I should be able to make it work one way or the other.

Turns out I can't do this either because stp offsets have limited range: "error: index must be a multiple of 8 in range [-512, 504]."
Comment 9 Mark Lam 2017-07-26 11:32:11 PDT
Thanks for the review.  Landed in r219951: <http://trac.webkit.org/r219951>.