Bug 172767

Summary: [JSC][ARMv6] Fix ARMv6 JIT support
Product: WebKit Reporter: Caio Lima <ticaiolima>
Component: JavaScriptCoreAssignee: Caio Lima <ticaiolima>
Status: NEW ---    
Severity: Normal CC: buildbot, fpizlo, guijemont, jacob.bramley+webkit, keith_miller, mark.lam, msaboff, ossy, saam
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 108645, 172765    
Attachments:
Description Flags
Patch
none
Patch mark.lam: review-

Description Caio Lima 2017-05-31 12:08:19 PDT
...
Comment 1 Caio Lima 2017-05-31 12:30:52 PDT
Created attachment 311619 [details]
Patch
Comment 2 Build Bot 2017-05-31 12:32:37 PDT
Attachment 311619 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/assembler/ARMAssembler.h:218:  enum members should use InterCaps with an initial capital letter or initial 'k' for C-style enums.  [readability/enum_casing] [4]
Total errors found: 1 in 6 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Mark Lam 2017-06-07 10:28:23 PDT
Comment on attachment 311619 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=311619&action=review

> Source/JavaScriptCore/assembler/ARMAssembler.h:218
> +            // mcr     15, 0, r6, cr7, cr10, {5}
> +            ARM6_MEMFENCE = 0xee076fba,

I think this may not be correct.  Someone who knows ARM better should take a closer look.  Here's why:

1. The instruction you're replacing is DMBSY.
    The spec for DMBSY (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0646a/BABDFABI.html) says:
    "DMB acts as a data memory barrier. It ensures that all explicit memory accesses that appear, in program order, before the DMB instruction are completed before any explicit memory accesses that appear, in program order, after the DMB instruction. DMB does not affect the ordering or execution of instructions that do not access memory."
    Specifically, it ensures that memory access are completed in program order.

2. You're replacing it with "mcr 15, 0, r6, cr7, cr10, {5}".
    The spec for MCR (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0360f/I1014942.html) says:
    "... As such, DMB ensures the apparent order of the explicit memory operations before and after the DMB instruction, but does not ensure the completion of those memory operations."

    I suspect what you really want is a "Data Synchronization Barrier", but I'm not certain.  Someone who knows better needs to review this.

Also, you picked r6 for the <Rd> argument in the instruction.  What's the reason for this?  It's not clear from the spec what <Rd> is for in terms of the "Data Memory Barrier" or "Data Synchronization Barrier".  Maybe it doesn't matter, but someone who knows traditional ARM better should review this.

> Source/JavaScriptCore/jit/RegisterSet.cpp:163
> +#elif CPU(ARM)
> +    result.set(ARMRegisters::r4);
> +    result.set(ARMRegisters::r5);
> +    result.set(ARMRegisters::r6);
> +    result.set(ARMRegisters::r7);
> +    result.set(ARMRegisters::r8);
> +    result.set(ARMRegisters::r9);
> +    result.set(ARMRegisters::r10);

There's already a section for CPU(ARM_TRADITIONAL) above.  This is not needed.  Please remove and remember to update your ChangeLog.

Also, I think this set is incomplete.  https://stackoverflow.com/questions/261419/arm-to-c-calling-convention-registers-to-save says the callee saved regs are r4-r8, r9 (possibly), and r10-r11.  Your new set is missing r11.
Comment 4 Caio Lima 2017-06-14 08:16:47 PDT
Comment on attachment 311619 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=311619&action=review

Thank you for the review

>> Source/JavaScriptCore/assembler/ARMAssembler.h:218
>> +            ARM6_MEMFENCE = 0xee076fba,
> 
> I think this may not be correct.  Someone who knows ARM better should take a closer look.  Here's why:
> 
> 1. The instruction you're replacing is DMBSY.
>     The spec for DMBSY (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0646a/BABDFABI.html) says:
>     "DMB acts as a data memory barrier. It ensures that all explicit memory accesses that appear, in program order, before the DMB instruction are completed before any explicit memory accesses that appear, in program order, after the DMB instruction. DMB does not affect the ordering or execution of instructions that do not access memory."
>     Specifically, it ensures that memory access are completed in program order.
> 
> 2. You're replacing it with "mcr 15, 0, r6, cr7, cr10, {5}".
>     The spec for MCR (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0360f/I1014942.html) says:
>     "... As such, DMB ensures the apparent order of the explicit memory operations before and after the DMB instruction, but does not ensure the completion of those memory operations."
> 
>     I suspect what you really want is a "Data Synchronization Barrier", but I'm not certain.  Someone who knows better needs to review this.
> 
> Also, you picked r6 for the <Rd> argument in the instruction.  What's the reason for this?  It's not clear from the spec what <Rd> is for in terms of the "Data Memory Barrier" or "Data Synchronization Barrier".  Maybe it doesn't matter, but someone who knows traditional ARM better should review this.

Yes. I was researching about your questions and I'm not able to give you 100% of certain about them, so I just pinged the original authors of the Patch and they are going to give clarifications here soon.

One thing that I noticed between DSB and DMB in ARM11 is that instructions after DSB don't execute until it finishes. If I didn't misunderstood, "dmb sy"doesn't impose it and the code is more performant in practice. Following the ARM spec, It seems that we probably don't have a "dmb sy" equivalent instruction into ARM11 and we will probably need to choose which one is better.

>> Source/JavaScriptCore/jit/RegisterSet.cpp:163
>> +    result.set(ARMRegisters::r10);
> 
> There's already a section for CPU(ARM_TRADITIONAL) above.  This is not needed.  Please remove and remember to update your ChangeLog.
> 
> Also, I think this set is incomplete.  https://stackoverflow.com/questions/261419/arm-to-c-calling-convention-registers-to-save says the callee saved regs are r4-r8, r9 (possibly), and r10-r11.  Your new set is missing r11.

Ok.
Comment 5 Guillaume Emont 2017-06-21 15:27:56 PDT
Comment on attachment 311619 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=311619&action=review

>>> Source/JavaScriptCore/assembler/ARMAssembler.h:218
>>> +            ARM6_MEMFENCE = 0xee076fba,
>> 
>> I think this may not be correct.  Someone who knows ARM better should take a closer look.  Here's why:
>> 
>> 1. The instruction you're replacing is DMBSY.
>>     The spec for DMBSY (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0646a/BABDFABI.html) says:
>>     "DMB acts as a data memory barrier. It ensures that all explicit memory accesses that appear, in program order, before the DMB instruction are completed before any explicit memory accesses that appear, in program order, after the DMB instruction. DMB does not affect the ordering or execution of instructions that do not access memory."
>>     Specifically, it ensures that memory access are completed in program order.
>> 
>> 2. You're replacing it with "mcr 15, 0, r6, cr7, cr10, {5}".
>>     The spec for MCR (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0360f/I1014942.html) says:
>>     "... As such, DMB ensures the apparent order of the explicit memory operations before and after the DMB instruction, but does not ensure the completion of those memory operations."
>> 
>>     I suspect what you really want is a "Data Synchronization Barrier", but I'm not certain.  Someone who knows better needs to review this.
>> 
>> Also, you picked r6 for the <Rd> argument in the instruction.  What's the reason for this?  It's not clear from the spec what <Rd> is for in terms of the "Data Memory Barrier" or "Data Synchronization Barrier".  Maybe it doesn't matter, but someone who knows traditional ARM better should review this.
> 
> Yes. I was researching about your questions and I'm not able to give you 100% of certain about them, so I just pinged the original authors of the Patch and they are going to give clarifications here soon.
> 
> One thing that I noticed between DSB and DMB in ARM11 is that instructions after DSB don't execute until it finishes. If I didn't misunderstood, "dmb sy"doesn't impose it and the code is more performant in practice. Following the ARM spec, It seems that we probably don't have a "dmb sy" equivalent instruction into ARM11 and we will probably need to choose which one is better.

Original author of the patch here, sorry for the delay.

Regarding difference of DMB/DSB: I think the wording "but does not ensure the completion of those memory operations." means that the previous memory operations are not guaranteed to finish before the MCR finishes, but it provides a guarantee that they will be finished before other memory accesses after the MCR. My impression is that this second guarantee is enough for our needs, but I cannot claim I understand all the (present and future) uses of memoryBarrier in JavaScriptCore, so it could be good to have somebody else (not sure whom) weigh in on what are the needs here.

Also, we do have a Data Synchronisation Barrier similar to DMB SY: it's the same instruction, but with cr7, cr10, {4}, though in that case I think the register passed has a meaning, which I don't fully understand.

Regarding the use of $r6, my understanding from the documentation is that it is not modified, and I could confirm that on my device with gdb, so I just picked a random register.
Comment 6 Caio Lima 2017-06-26 08:07:29 PDT
Hi all, here are some updates that I've got in a new round of research.

I've found the following reference of ARM11:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0301h/Babhejba.html

"The purpose of the Data Memory Barrier operation is to ensure that all outstanding explicit memory transactions complete before any following explicit memory transactions begin. This ensures that data in memory is up to date before any memory transaction that depends on it."

Also, we can check the last ARMv6-M Arch Reference (http://infocenter.arm.com/help/topic/com.arm.doc.ddi0419d/DDI0419D_armv6m_arm.pdf) the following statement in page A3-59:

"The DMB instruction is a data memory barrier. DMB exhibits the following behavior:
- All explicit memory accesses by instructions occurring in program order before this instruction are globally observed before any explicit memory accesses because of instructions occurring in program order after this instruction are observed.
- The DMB instruction has no effect on the ordering of other instructions executing on the processor.
As such, DMB ensures the apparent order of the explicit memory operations before and after the instruction, without
ensuring their completion. For details on the DMB instruction, see DMB on page A6-121."

In the same manual, we can find at page A3-57:

"For all memory, a read or write is defined to be complete when it is globally observed:
- A branch predictor maintenance operation is defined to be complete when the effects of operation are globally observed."

About the statement:

"DMB ensures the apparent order of the explicit memory operations before and after the instruction, without ensuring their completion."

I understand that the "dmb" doesn't wait until the completion of memory access. The instruction that assures that is "dsb", since it's completion just happens when the memory access before it are complete. We can check that in http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0646a/CHDDGICF.html

Given that, I think we really want "mcr 15, 0, <rd>, cr7, cr10, {5}" in ARMv6 since it's equivalent of ARMv7 "dmb sy;".

PS.: The manual I'm pointing in ARM11 is the manual of the processor I'm testing the ARMv6 support. I could find the "dmb sy" instruction in ARMv6-M reference, however it isn't present in the ARM11 reference.

Who could we ask this kind of question that is ARM related?
Comment 7 Filip Pizlo 2017-07-03 18:01:58 PDT
Comment on attachment 311619 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=311619&action=review

>>>> Source/JavaScriptCore/assembler/ARMAssembler.h:218
>>>> +            ARM6_MEMFENCE = 0xee076fba,
>>> 
>>> I think this may not be correct.  Someone who knows ARM better should take a closer look.  Here's why:
>>> 
>>> 1. The instruction you're replacing is DMBSY.
>>>     The spec for DMBSY (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0646a/BABDFABI.html) says:
>>>     "DMB acts as a data memory barrier. It ensures that all explicit memory accesses that appear, in program order, before the DMB instruction are completed before any explicit memory accesses that appear, in program order, after the DMB instruction. DMB does not affect the ordering or execution of instructions that do not access memory."
>>>     Specifically, it ensures that memory access are completed in program order.
>>> 
>>> 2. You're replacing it with "mcr 15, 0, r6, cr7, cr10, {5}".
>>>     The spec for MCR (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0360f/I1014942.html) says:
>>>     "... As such, DMB ensures the apparent order of the explicit memory operations before and after the DMB instruction, but does not ensure the completion of those memory operations."
>>> 
>>>     I suspect what you really want is a "Data Synchronization Barrier", but I'm not certain.  Someone who knows better needs to review this.
>>> 
>>> Also, you picked r6 for the <Rd> argument in the instruction.  What's the reason for this?  It's not clear from the spec what <Rd> is for in terms of the "Data Memory Barrier" or "Data Synchronization Barrier".  Maybe it doesn't matter, but someone who knows traditional ARM better should review this.
>> 
>> Yes. I was researching about your questions and I'm not able to give you 100% of certain about them, so I just pinged the original authors of the Patch and they are going to give clarifications here soon.
>> 
>> One thing that I noticed between DSB and DMB in ARM11 is that instructions after DSB don't execute until it finishes. If I didn't misunderstood, "dmb sy"doesn't impose it and the code is more performant in practice. Following the ARM spec, It seems that we probably don't have a "dmb sy" equivalent instruction into ARM11 and we will probably need to choose which one is better.
> 
> Original author of the patch here, sorry for the delay.
> 
> Regarding difference of DMB/DSB: I think the wording "but does not ensure the completion of those memory operations." means that the previous memory operations are not guaranteed to finish before the MCR finishes, but it provides a guarantee that they will be finished before other memory accesses after the MCR. My impression is that this second guarantee is enough for our needs, but I cannot claim I understand all the (present and future) uses of memoryBarrier in JavaScriptCore, so it could be good to have somebody else (not sure whom) weigh in on what are the needs here.
> 
> Also, we do have a Data Synchronisation Barrier similar to DMB SY: it's the same instruction, but with cr7, cr10, {4}, though in that case I think the register passed has a meaning, which I don't fully understand.
> 
> Regarding the use of $r6, my understanding from the documentation is that it is not modified, and I could confirm that on my device with gdb, so I just picked a random register.

Hi everyone,

I'm probably the best point of reference for the meaning of the various fences in JSC.  MacroAssembler::memoryFence() is supposed to mean an acqrel fence: all accesses from before the acqrel will indeed happen before all of the accesses after the acqrel.  This is indeed weaker than the DSB fence, which also forces the entire execution of those prior instructions to finish before any execution of those after it start.  DSB is probably meant for what we call the WTF::crossModifyingCodeFence(), in <wtf/Atomics.h>.

Here are two code snippets that I read in order to reach the conclusion that MacroAssembler::memoryFence() means acqrel.  I never remember this stuff, but I do remember where to look.

In MacroAssemblerARM64.h, we have:

    // We take memoryFence to mean acqrel. This has acqrel semantics on ARM64.
    void memoryFence()
    {
        m_assembler.dmbISH();
    }

Look at that great comment!  I like looking at the ARM64 assembler because our memory model is essentially the philosophical superset of x86-TSO and ARM64 memory models.  Most of the hard stuff is because of ARM64, not x86.  So, typically, you can find out the most about crazy memory model stuff by just seeing what the ARM64 assembler does.

Another place to look is B3LowerToAir.  Note that Air instructions are almost always MacroAssembler method names, so "MemoryFence" means MacroAssembler::memoryFence().

        case Fence: {
            FenceValue* fence = m_value->as<FenceValue>();
            if (!fence->write && !fence->read)
                return;
            if (!fence->write) {
                // A fence that reads but does not write is for protecting motion of stores.
                append(StoreFence);
                return;
            }
            if (!fence->read) {
                // A fence that writes but does not read is for protecting motion of loads.
                append(LoadFence);
                return;
            }
            append(MemoryFence);
            return;
        }

This is the one mention of MemoryFence.  Here we see that we select MemoryFence from B3::Fence when we want to protect the motion of both reads and writes.  Sounds about right and DMB does that.

So, yeah, this should be DMB.
Comment 8 Caio Lima 2017-07-03 20:53:36 PDT
Created attachment 314551 [details]
Patch
Comment 9 Build Bot 2017-07-03 20:54:47 PDT
Attachment 314551 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/assembler/ARMAssembler.h:218:  enum members should use InterCaps with an initial capital letter or initial 'k' for C-style enums.  [readability/enum_casing] [4]
Total errors found: 1 in 5 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 10 Mark Lam 2017-07-06 17:32:05 PDT
Comment on attachment 311619 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=311619&action=review

>>>>> Source/JavaScriptCore/assembler/ARMAssembler.h:218
>>>>> +            ARM6_MEMFENCE = 0xee076fba,
>>>> 
>>>> I think this may not be correct.  Someone who knows ARM better should take a closer look.  Here's why:
>>>> 
>>>> 1. The instruction you're replacing is DMBSY.
>>>>     The spec for DMBSY (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0646a/BABDFABI.html) says:
>>>>     "DMB acts as a data memory barrier. It ensures that all explicit memory accesses that appear, in program order, before the DMB instruction are completed before any explicit memory accesses that appear, in program order, after the DMB instruction. DMB does not affect the ordering or execution of instructions that do not access memory."
>>>>     Specifically, it ensures that memory access are completed in program order.
>>>> 
>>>> 2. You're replacing it with "mcr 15, 0, r6, cr7, cr10, {5}".
>>>>     The spec for MCR (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0360f/I1014942.html) says:
>>>>     "... As such, DMB ensures the apparent order of the explicit memory operations before and after the DMB instruction, but does not ensure the completion of those memory operations."
>>>> 
>>>>     I suspect what you really want is a "Data Synchronization Barrier", but I'm not certain.  Someone who knows better needs to review this.
>>>> 
>>>> Also, you picked r6 for the <Rd> argument in the instruction.  What's the reason for this?  It's not clear from the spec what <Rd> is for in terms of the "Data Memory Barrier" or "Data Synchronization Barrier".  Maybe it doesn't matter, but someone who knows traditional ARM better should review this.
>>> 
>>> Yes. I was researching about your questions and I'm not able to give you 100% of certain about them, so I just pinged the original authors of the Patch and they are going to give clarifications here soon.
>>> 
>>> One thing that I noticed between DSB and DMB in ARM11 is that instructions after DSB don't execute until it finishes. If I didn't misunderstood, "dmb sy"doesn't impose it and the code is more performant in practice. Following the ARM spec, It seems that we probably don't have a "dmb sy" equivalent instruction into ARM11 and we will probably need to choose which one is better.
>> 
>> Original author of the patch here, sorry for the delay.
>> 
>> Regarding difference of DMB/DSB: I think the wording "but does not ensure the completion of those memory operations." means that the previous memory operations are not guaranteed to finish before the MCR finishes, but it provides a guarantee that they will be finished before other memory accesses after the MCR. My impression is that this second guarantee is enough for our needs, but I cannot claim I understand all the (present and future) uses of memoryBarrier in JavaScriptCore, so it could be good to have somebody else (not sure whom) weigh in on what are the needs here.
>> 
>> Also, we do have a Data Synchronisation Barrier similar to DMB SY: it's the same instruction, but with cr7, cr10, {4}, though in that case I think the register passed has a meaning, which I don't fully understand.
>> 
>> Regarding the use of $r6, my understanding from the documentation is that it is not modified, and I could confirm that on my device with gdb, so I just picked a random register.
> 
> Hi everyone,
> 
> I'm probably the best point of reference for the meaning of the various fences in JSC.  MacroAssembler::memoryFence() is supposed to mean an acqrel fence: all accesses from before the acqrel will indeed happen before all of the accesses after the acqrel.  This is indeed weaker than the DSB fence, which also forces the entire execution of those prior instructions to finish before any execution of those after it start.  DSB is probably meant for what we call the WTF::crossModifyingCodeFence(), in <wtf/Atomics.h>.
> 
> Here are two code snippets that I read in order to reach the conclusion that MacroAssembler::memoryFence() means acqrel.  I never remember this stuff, but I do remember where to look.
> 
> In MacroAssemblerARM64.h, we have:
> 
>     // We take memoryFence to mean acqrel. This has acqrel semantics on ARM64.
>     void memoryFence()
>     {
>         m_assembler.dmbISH();
>     }
> 
> Look at that great comment!  I like looking at the ARM64 assembler because our memory model is essentially the philosophical superset of x86-TSO and ARM64 memory models.  Most of the hard stuff is because of ARM64, not x86.  So, typically, you can find out the most about crazy memory model stuff by just seeing what the ARM64 assembler does.
> 
> Another place to look is B3LowerToAir.  Note that Air instructions are almost always MacroAssembler method names, so "MemoryFence" means MacroAssembler::memoryFence().
> 
>         case Fence: {
>             FenceValue* fence = m_value->as<FenceValue>();
>             if (!fence->write && !fence->read)
>                 return;
>             if (!fence->write) {
>                 // A fence that reads but does not write is for protecting motion of stores.
>                 append(StoreFence);
>                 return;
>             }
>             if (!fence->read) {
>                 // A fence that writes but does not read is for protecting motion of loads.
>                 append(LoadFence);
>                 return;
>             }
>             append(MemoryFence);
>             return;
>         }
> 
> This is the one mention of MemoryFence.  Here we see that we select MemoryFence from B3::Fence when we want to protect the motion of both reads and writes.  Sounds about right and DMB does that.
> 
> So, yeah, this should be DMB.

@Guillaume, can you point me to the documentation (url) about the use of <Rd> in the mcr instruction for DMB?  You're passing r6.  r6 may not be modified, but is it supposed to provide some sort of input value?  I have not yet seen ARM documentation that explains what the expectation is (other than <Rd> cannot be PC).
Comment 11 Jacob Bramley 2017-07-11 08:51:37 PDT
Hello all,

I hope you don't mind me chipping in. I think most of the issues in the discussion were resolved but it looks like there were one or two unanswered questions which I might be able to help with. I've had to interpret the barrier documentation enough times that some of it has actually stuck.

@Filip, based on your description of memoryFence, your conclusion that DMB is appropriate is correct. I wonder if the SY (system) shareability domain is necessary, though, or if something like ISH (inner shareable) would suffice. This depends on the system design, and how memoryFence is used.

> @Guillaume, can you point me to the documentation (url) about the use of
> <Rd> in the mcr instruction for DMB?  You're passing r6.  r6 may not be
> modified, but is it supposed to provide some sort of input value?  I have
> not yet seen ARM documentation that explains what the expectation is (other
> than <Rd> cannot be PC).

The `mcr` instruction takes a register (as an input) but for the DMB operation it is ignored (and doesn't have to be initialised to anything in particular). This is stated at the bottom of page B6-1945 of the ARMv7-AR ARM (edition 0406C.c): https://developer.arm.com/docs/ddi0406/latest/arm-architecture-reference-manual-armv7-a-and-armv7-r-edition


@Caio:
> Also, we can check the last ARMv6-M Arch Reference (http://infocenter.arm.com/help/topic/com.arm.doc.ddi0419d/DDI0419D_armv6m_arm.pdf) the following statement in page A3-59

Although I think the wording you quoted still applies, note that the ARMv6-M document refers only to M-class (microcontroller) systems with, no MMU and suchlike. It doesn't refer to the likes of ARM11. At one time I think there was an "ARMv6" reference manual but now it's all covered in the ARMv7-AR one that I mentioned above.


Also note that the ARMv6 MCR-based DMB performs the same operation as the ARMv7 DMB. The wording is different in the various different documents but the operation is the same either way.


@Guillaume:
> Also, we do have a Data Synchronisation Barrier similar to DMB SY: it's the same instruction, but with cr7, cr10, {4}, though in that case I think the register passed has a meaning, which I don't fully understand.

Actually the register has no meaning to DSB (nor for ISB) as far as I'm aware. I'm curious, though: where did you find that information?


Thanks,
Jacob
Comment 12 Mark Lam 2017-07-11 10:04:11 PDT
(In reply to Jacob Bramley from comment #11)
> The `mcr` instruction takes a register (as an input) but for the DMB
> operation it is ignored (and doesn't have to be initialised to anything in
> particular). This is stated at the bottom of page B6-1945 of the ARMv7-AR
> ARM (edition 0406C.c):
> https://developer.arm.com/docs/ddi0406/latest/arm-architecture-reference-
> manual-armv7-a-and-armv7-r-edition

Jacob, thanks.
Comment 13 Mark Lam 2017-07-11 10:19:09 PDT
Comment on attachment 314551 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=314551&action=review

I think this patch is almost ready now that we have the spec so that we know it's doing the right thing.  Please apply the changes.  r- for now.  Thanks.

> Source/JavaScriptCore/ChangeLog:8
> +        Original author is Guillaume Emont.

Any specific reason why Guillaume isn't posting the patch himself?

> Source/JavaScriptCore/assembler/ARMAssembler.h:217
> +            // mcr     15, 0, r6, cr7, cr10, {5}

I think it's helpful to add a comment here saying:
    // The r6 argument register is ignored by this operation. Hence, it does not need to be initialized.
    // See section B6.2.2 in page B6-1945 of https://static.docs.arm.com/ddi0406/c/DDI0406C_C_arm_architecture_reference_manual.pdf.

It also seems quite arbitrary and out of left field to choose r6 (as if there's some special reason for its choice).  Why not just choose r0?

> Source/JavaScriptCore/assembler/ARMAssembler.h:218
> +            ARM6_MEMFENCE = 0xee076fba,

The spec for the in section B6.2.2 page B6-1946 of https://static.docs.arm.com/ddi0406/c/DDI0406C_C_arm_architecture_reference_manual.pdf says:
"CP15DMB, Data Memory Barrier operation
In ARMv7, use the DMB instruction to perform a Data Memory Barrier, see DMB on page A8-378.
The deprecated CP15 c7 encoding for a Data Memory Barrier is an MCR instruction with <opc1> set to 0, <CRn> set to c7, <CRm> set to c10, and <opc2> set to 5. This operation performs the full system barrier performed by the DMB instruction."

Note the part that says "full system barrier performed by the DMB instruction".

Hence, let's rename ARM6_MEMFENCE as ARM6_DMB_SY.

> Source/JavaScriptCore/assembler/ARMAssembler.h:733
> -        void dmbSY()
> +        void memoryFence()

Let's undo this.

> Source/JavaScriptCore/assembler/ARMAssembler.h:738
> +            m_buffer.putInt(ARM6_MEMFENCE);

Rename to ARM6_DMB_SY.

> Source/JavaScriptCore/assembler/MacroAssemblerARM.h:1489
> -        m_assembler.dmbSY();
> +        m_assembler.memoryFence();

Undo this.
Comment 14 Guillaume Emont 2017-07-11 14:28:08 PDT
Comment on attachment 314551 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=314551&action=review

>> Source/JavaScriptCore/ChangeLog:8
>> +        Original author is Guillaume Emont.
> 
> Any specific reason why Guillaume isn't posting the patch himself?

I initially made this as a dirty downstream patch, Caio is kindly cleaning it up and sending it here. The main motivation for this is to help him understand this code. A secondary motivation is that I am busy with other things and without his help, who knows when I would have found the time.

>> Source/JavaScriptCore/assembler/ARMAssembler.h:217
>> +            // mcr     15, 0, r6, cr7, cr10, {5}
> 
> I think it's helpful to add a comment here saying:
>     // The r6 argument register is ignored by this operation. Hence, it does not need to be initialized.
>     // See section B6.2.2 in page B6-1945 of https://static.docs.arm.com/ddi0406/c/DDI0406C_C_arm_architecture_reference_manual.pdf.
> 
> It also seems quite arbitrary and out of left field to choose r6 (as if there's some special reason for its choice).  Why not just choose r0?

My bad, I think I indeed chose $r6 arbitrarily. I think $r0 should be fine.
Comment 15 Guillaume Emont 2017-07-11 14:36:26 PDT
(In reply to Jacob Bramley from comment #11)
> 
> @Guillaume:
> > Also, we do have a Data Synchronisation Barrier similar to DMB SY: it's the same instruction, but with cr7, cr10, {4}, though in that case I think the register passed has a meaning, which I don't fully understand.
> 
> Actually the register has no meaning to DSB (nor for ISB) as far as I'm
> aware. I'm curious, though: where did you find that information?

I have an old pdf lying around tiled "ARM Architecture
Reference Manual" that seems to be from 2005. It has a table on page B6-21 and B6-22 describing the mcr 15 cr7 operations. for (c10, 4), it says "SBZ" in the data field. I did not understand what that means. I just did some more research in the document, and it seems to mean "Should Be Zero". I think that therefore we definitely want to use $r0 instead of $r6.
Comment 16 Mark Lam 2017-07-11 14:48:27 PDT
(In reply to Guillaume Emont from comment #15)
> (In reply to Jacob Bramley from comment #11)
> > 
> > @Guillaume:
> > > Also, we do have a Data Synchronisation Barrier similar to DMB SY: it's the same instruction, but with cr7, cr10, {4}, though in that case I think the register passed has a meaning, which I don't fully understand.
> > 
> > Actually the register has no meaning to DSB (nor for ISB) as far as I'm
> > aware. I'm curious, though: where did you find that information?
> 
> I have an old pdf lying around tiled "ARM Architecture
> Reference Manual" that seems to be from 2005. It has a table on page B6-21
> and B6-22 describing the mcr 15 cr7 operations. for (c10, 4), it says "SBZ"
> in the data field. I did not understand what that means. I just did some
> more research in the document, and it seems to mean "Should Be Zero". I
> think that therefore we definitely want to use $r0 instead of $r6.

The documentation isn't very clear about what SBZ means but I quite sure it doesn't means using r0.  I think I read somewhere that SBZ implies that the input should have a zero value.  Regardless, that is irrelevant here.  The docs that Jacob point to is clear on the fact that the Rd argument register is ignored.  So, r0 or r6 is fine.  I just thought r0 is better because using r6 implies more purpose than it has.
Comment 17 Caio Lima 2017-07-11 16:17:08 PDT
@Jacob

Thanks for the support here. Do you mind answer another question we discussed in the mailing list? 
Is there any ARMv6 processor that doesn't support the "mcr" operations? The problem here is because if the support is limited to some family of chips, our solution here isn't enough.
Comment 18 Jacob Bramley 2017-07-12 03:42:04 PDT
(In reply to Mark Lam from comment #16)
> (In reply to Guillaume Emont from comment #15)
> > [...]
> > 
> > I have an old pdf lying around tiled "ARM Architecture
> > Reference Manual" that seems to be from 2005. It has a table on page B6-21
> > and B6-22 describing the mcr 15 cr7 operations. for (c10, 4), it says "SBZ"
> > in the data field. I did not understand what that means. I just did some
> > more research in the document, and it seems to mean "Should Be Zero". I
> > think that therefore we definitely want to use $r0 instead of $r6.
> 
> The documentation isn't very clear about what SBZ means but I quite sure it
> doesn't means using r0.  I think I read somewhere that SBZ implies that the
> input should have a zero value.  Regardless, that is irrelevant here.  The
> docs that Jacob point to is clear on the fact that the Rd argument register
> is ignored.  So, r0 or r6 is fine.  I just thought r0 is better because
> using r6 implies more purpose than it has.

I think I've found that table (in document 0100I), and I interpret it the same way that you do. I would speculate that the SBZ specification was relaxed for ARMv7, and then applied retrospectively based on real implementations of ARMv6.

In any case, the ARMv7-AR document I linked to is the latest official documentation for ARMv6 so I would prefer the explanation in that one.

(In reply to Caio Lima from comment #17)
> @Jacob
> 
> Thanks for the support here. Do you mind answer another question we
> discussed in the mailing list? 
> Is there any ARMv6 processor that doesn't support the "mcr" operations? The
> problem here is because if the support is limited to some family of chips,
> our solution here isn't enough.

As far as I'm aware, all ARMv6 processors support these `mcr` operations. If that were not the case, the document would typically refer to something like "ARMv6K".

Also note that the ARMv6 barriers were deprecated in ARMv7, and privileged code can deny access to them, so on newer architectures you may have problems if they are used in some lowest-common-supported-platform situation.

Thanks,
Jacob
Comment 19 Mark Lam 2017-07-12 11:15:24 PDT
(In reply to Mark Lam from comment #16)
> (In reply to Guillaume Emont from comment #15)
> > (In reply to Jacob Bramley from comment #11)
> > > 
> > > @Guillaume:
> > > > Also, we do have a Data Synchronisation Barrier similar to DMB SY: it's the same instruction, but with cr7, cr10, {4}, though in that case I think the register passed has a meaning, which I don't fully understand.
> > > 
> > > Actually the register has no meaning to DSB (nor for ISB) as far as I'm
> > > aware. I'm curious, though: where did you find that information?
> > 
> > I have an old pdf lying around tiled "ARM Architecture
> > Reference Manual" that seems to be from 2005. It has a table on page B6-21
> > and B6-22 describing the mcr 15 cr7 operations. for (c10, 4), it says "SBZ"
> > in the data field. I did not understand what that means. I just did some
> > more research in the document, and it seems to mean "Should Be Zero". I
> > think that therefore we definitely want to use $r0 instead of $r6.
> 
> The documentation isn't very clear about what SBZ means but I quite sure it
> doesn't means using r0.  I think I read somewhere that SBZ implies that the
> input should have a zero value.  Regardless, that is irrelevant here.  The
> docs that Jacob point to is clear on the fact that the Rd argument register
> is ignored.  So, r0 or r6 is fine.  I just thought r0 is better because
> using r6 implies more purpose than it has.

@Guillaume, I apologize.  You were right: if the doc says SBZ, then the field should be 0.  In the register field, this means r0.

That said, I think the latest document is clear that the register value doesn't matter (except that it cannot be r15/pc).  But it also doesn't hurt to set it to r0.  Thanks.