Bug 214737 - for..of intrinsics implementation for 32bits
Summary: for..of intrinsics implementation for 32bits
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Paulo Matos
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-07-24 08:34 PDT by Paulo Matos
Modified: 2020-07-30 11:45 PDT (History)
10 users (show)

See Also:


Attachments
Patch (35.95 KB, patch)
2020-07-24 08:41 PDT, Paulo Matos
no flags Details | Formatted Diff | Diff
Patch (38.33 KB, patch)
2020-07-24 09:23 PDT, Paulo Matos
no flags Details | Formatted Diff | Diff
Patch (38.95 KB, patch)
2020-07-24 09:30 PDT, Paulo Matos
no flags Details | Formatted Diff | Diff
Patch (39.23 KB, patch)
2020-07-26 23:15 PDT, Paulo Matos
no flags Details | Formatted Diff | Diff
Patch (42.09 KB, patch)
2020-07-27 10:45 PDT, Paulo Matos
no flags Details | Formatted Diff | Diff
Patch (41.47 KB, patch)
2020-07-29 02:31 PDT, Paulo Matos
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Paulo Matos 2020-07-24 08:34:26 PDT
for..of intrinsics implementation for 32bits
Comment 1 Paulo Matos 2020-07-24 08:41:06 PDT
Created attachment 405145 [details]
Patch
Comment 2 Paulo Matos 2020-07-24 09:23:49 PDT
Created attachment 405152 [details]
Patch
Comment 3 Paulo Matos 2020-07-24 09:30:58 PDT
Created attachment 405156 [details]
Patch
Comment 4 Paulo Matos 2020-07-26 23:13:15 PDT
There are some mips failures due to a broken implementation of or8 for mips. I am working on a fix.
Comment 5 Paulo Matos 2020-07-26 23:15:23 PDT
Created attachment 405260 [details]
Patch
Comment 6 Paulo Matos 2020-07-27 01:21:09 PDT
Seems ready for review!
Comment 7 Adrian Perez 2020-07-27 03:35:10 PDT
Comment on attachment 405260 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=405260&action=review

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:496
> +    

The MIPS parts of the patch LGTM 👍️
Comment 8 Paulo Matos 2020-07-27 06:25:01 PDT
(In reply to Adrian Perez from comment #7)
> Comment on attachment 405260 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=405260&action=review
> 
> > Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:496
> > +    
> 
> The MIPS parts of the patch LGTM 👍️

Thanks taking a look!
Comment 9 Caio Lima 2020-07-27 06:36:17 PDT
Comment on attachment 405260 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=405260&action=review

I've left some comments on current patch. Since I'm co-author of it, I don't think it counts as a review.

> Source/JavaScriptCore/ChangeLog:11
> +        Adds or8 instruction to ARMv7 and MIPS Macro Assembler.

I think we could add such comment bellow the file we changed.

> Source/JavaScriptCore/ChangeLog:24
> +        (JSC::MacroAssemblerMIPS::store8):

For example, here we could have "Adds or8 instruction to ARMv7 and MIPS Macro Assembler."

> Source/JavaScriptCore/ChangeLog:28
> +        (JSC::DFG::OSRExit::compileExit):

Ande here we could have:

```
Fixes DFG OSR Exit bug, where checkpoint temporary value is
incorrectly recreated for Baseline.
Refactors code in DFG OSR Exit to be easier to modify and
maintain by separating the switch cases for 32 and 64bits.
```

> Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h:380
> +        }

Could you confirm if we have `testmasm` covering this addition? Our bots aren't running `testmasm` yet, but we can potentially start running those tests and it would be nice to have `or8` covered there.

> Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h:941
> +    }

ditto.

> Source/JavaScriptCore/jit/JITCall32_64.cpp:480
> +    JSValueRegs v(regT1, regT0);

nit: I think we should call it `nextRegs`

> Source/JavaScriptCore/jit/JITCall32_64.cpp:509
> +        emitJumpSlowCaseIfNotJSCell(vr, tagIterResultGPR);

The usage of `vr` here doesn't look quite right. However I don't see the necessity of having `vr` here, since the intention is to jump to slow case if `tagIterResultGPR` is not a JSCell tag, regardless `vr` being a constant or not (I don't even think that this is possible and that's probably why we are passing tests). We could follow 64-bits and add new `JIT::emitJumpSlowCaseIfNotJSCell(RegisterID tagReg)` function.

> Source/JavaScriptCore/jit/JITCall32_64.cpp:531
> +

nit: There is an extra \newline here

> Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:1417
> +macro performGetByIDHelper(opcodeStruct, modeMetadataName, valueProfileName, slowLabel, size, metadata, return)

This one is very close with `macro performGetByIDHelper` in LowLevelInterpeter64.asm. The major difference is on `loadPropertyAtVariableOffset`, `valueProfile` and `return`. Maybe we would like to merge them (here or in another patch). If we decide to that that later I think we should add a FIXME her.
Comment 10 Paulo Matos 2020-07-27 10:45:27 PDT
Created attachment 405291 [details]
Patch
Comment 11 Paulo Matos 2020-07-27 10:46:22 PDT
Thanks Caio, the newest patch should meet all your concerns.
Comment 12 Yusuke Suzuki 2020-07-28 14:58:11 PDT
Comment on attachment 405291 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=405291&action=review

r=me with comments

> Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h:714
> +    void store8(RegisterID src, const ArmAddress address)

We do not need to have `const`.

> Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h:908
> +    void store8(RegisterID src, const Address address)

Ditto.

> Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h:913
> +    void store8(RegisterID src, const BaseIndex address)

Ditto.

> Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h:938
> +    void store8(const RegisterID src, const RegisterID addrreg)

Ditto.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:633
> +                    case Constant:
> +                        sideState->tmps[i] = recovery.constant();
> +                        break;

This is the same implementation. Put it outside of ifdef.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:639
> +                    case UnboxedInt32InGPR:
> +                    case Int32DisplacedInJSStack: {
> +                        sideState->tmps[i] = jsNumber(static_cast<int32_t>(tmpScratch[i + tmpOffset]));
> +                        break;
> +                    }

Ditto.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:644
> +                    case InPair:
> +                    case DisplacedInJSStack: {
>                          sideState->tmps[i] = reinterpret_cast<JSValue*>(tmpScratch)[i + tmpOffset];
> -#else
> +                        break;

Ditto.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:652
> +                    case CellDisplacedInJSStack:
> +                    case UnboxedCellInGPR: {
>                          EncodedValueDescriptor* valueDescriptor = bitwise_cast<EncodedValueDescriptor*>(tmpScratch + i + tmpOffset);
>                          sideState->tmps[i] = JSValue(JSValue::CellTag, valueDescriptor->asBits.payload);
> -#endif
>                          break;
>                      }

Ditto.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:654
> +                    case BooleanDisplacedInJSStack:

Why?

> Source/JavaScriptCore/jit/JITCall32_64.cpp:392
> +    VirtualRegister vr = destinationFor(bytecode, m_bytecodeIndex.checkpoint()).virtualRegister();

Remove it.

> Source/JavaScriptCore/jit/JITCall32_64.cpp:398
> +    emitJumpSlowCaseIfNotJSCell(vr, regT1);

Let's use `emitJumpSlowCaseIfNotJSCell(regT1)`.

> Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:2773
> +    bieq t1, UndefinedTag, .iteratorNextGeneric
> +    btinz t0, .iteratorNextGeneric

I think this is not correct. We should have JSEmpty check here.
Comment 13 Paulo Matos 2020-07-29 02:31:48 PDT
Created attachment 405446 [details]
Patch
Comment 14 Paulo Matos 2020-07-29 02:36:47 PDT
(In reply to Yusuke Suzuki from comment #12)
> Comment on attachment 405291 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=405291&action=review
> 
> r=me with comments
> 
> > Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h:714
> > +    void store8(RegisterID src, const ArmAddress address)
> 
> We do not need to have `const`.
> 

Removed.

> > Source/JavaScriptCore/dfg/DFGOSRExit.cpp:633
> > +                    case Constant:
> > +                        sideState->tmps[i] = recovery.constant();
> > +                        break;
> 
> This is the same implementation. Put it outside of ifdef.
>

I simplified the ones I could but they were not all the same.

> 
> > Source/JavaScriptCore/dfg/DFGOSRExit.cpp:654
> > +                    case BooleanDisplacedInJSStack:
> 
> Why?
>

This can be loaded as a boolean. The problem is that we the tmp can only encode the payload, so sometimes we need to hardcode the tag for baseline, as it's done in the case of CellDisplacedInJSStack. But this is not necessary if it's a boolean.
 
> > Source/JavaScriptCore/jit/JITCall32_64.cpp:392
> > +    VirtualRegister vr = destinationFor(bytecode, m_bytecodeIndex.checkpoint()).virtualRegister();
> 
> Remove it.
>

Done.
 
> > Source/JavaScriptCore/jit/JITCall32_64.cpp:398
> > +    emitJumpSlowCaseIfNotJSCell(vr, regT1);
> 
> Let's use `emitJumpSlowCaseIfNotJSCell(regT1)`.
> 

Done.

> > Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:2773
> > +    bieq t1, UndefinedTag, .iteratorNextGeneric
> > +    btinz t0, .iteratorNextGeneric
> 
> I think this is not correct. We should have JSEmpty check here.

That's what btinz is doing afaict. Done the same way in 64bits.
Comment 15 EWS 2020-07-29 03:32:31 PDT
Committed r265036: <https://trac.webkit.org/changeset/265036>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 405446 [details].
Comment 16 Radar WebKit Bug Importer 2020-07-29 03:33:18 PDT
<rdar://problem/66265361>
Comment 17 Caio Lima 2020-07-29 04:01:36 PDT
Comment on attachment 405291 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=405291&action=review

>>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:654
>>> +                    case BooleanDisplacedInJSStack:
>> 
>> Why?
> 
> This can be loaded as a boolean. The problem is that we the tmp can only encode the payload, so sometimes we need to hardcode the tag for baseline, as it's done in the case of CellDisplacedInJSStack. But this is not necessary if it's a boolean.

I don't understand what you mean by "But this is not necessary if it's a boolean". Could you clarify? This case is very similar with `CellDisplacedInJSStack` and `Int32DisplacedInJSStack`. `BooleanDisplacedInJSStack` on 32-bits means that we can only trust the payload value stored in scratch buffer, while tag will probably contain garbage. Since we have proper payload and we know it is a Boolean, we then store it as `jsBoolean(static_cast<bool>(payload))`, and the `jsBoolean` will add the JSValue::BooleanTag for us.
Comment 18 Paulo Matos 2020-07-29 05:53:59 PDT
(In reply to Caio Lima from comment #17)
> Comment on attachment 405291 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=405291&action=review
> 
> >>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:654
> >>> +                    case BooleanDisplacedInJSStack:
> >> 
> >> Why?
> > 
> > This can be loaded as a boolean. The problem is that we the tmp can only encode the payload, so sometimes we need to hardcode the tag for baseline, as it's done in the case of CellDisplacedInJSStack. But this is not necessary if it's a boolean.
> 
> I don't understand what you mean by "But this is not necessary if it's a
> boolean". Could you clarify? This case is very similar with
> `CellDisplacedInJSStack` and `Int32DisplacedInJSStack`.
> `BooleanDisplacedInJSStack` on 32-bits means that we can only trust the
> payload value stored in scratch buffer, while tag will probably contain
> garbage. Since we have proper payload and we know it is a Boolean, we then
> store it as `jsBoolean(static_cast<bool>(payload))`, and the `jsBoolean`
> will add the JSValue::BooleanTag for us.

I miscommunicated here. Your clarification is correct.
Comment 19 Caio Lima 2020-07-29 06:30:02 PDT
Comment on attachment 405291 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=405291&action=review

>>> Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:2773
>>> +    btinz t0, .iteratorNextGeneric
>> 
>> I think this is not correct. We should have JSEmpty check here.
> 
> That's what btinz is doing afaict. Done the same way in 64bits.

I think Yusuke is right here. The `EmptyValueTag` is 0xfffffffa, while `JSValue::ValueEmpty` in 64-bits is 0x0. The 64-bit versions checks `btqnz t0`, which makes this branch to `.iteratorNextGeneric` if `m_next != ValueEmpty`. Every other case will fall through `.iteratorNextGeneric`. Checking for UndefinedTag means that we are never executing `iterator_next_try_fast*` path on 32-bits and always going through `.iteratorNextGeneric`, making this operation slower than necessary. We can verify it if we execute a microbenchmark that's using for-of in an ordinary Array.
Comment 20 Yusuke Suzuki 2020-07-29 14:28:52 PDT
Comment on attachment 405291 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=405291&action=review

>>>> Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:2773
>>>> +    btinz t0, .iteratorNextGeneric
>>> 
>>> I think this is not correct. We should have JSEmpty check here.
>> 
>> That's what btinz is doing afaict. Done the same way in 64bits.
> 
> I think Yusuke is right here. The `EmptyValueTag` is 0xfffffffa, while `JSValue::ValueEmpty` in 64-bits is 0x0. The 64-bit versions checks `btqnz t0`, which makes this branch to `.iteratorNextGeneric` if `m_next != ValueEmpty`. Every other case will fall through `.iteratorNextGeneric`. Checking for UndefinedTag means that we are never executing `iterator_next_try_fast*` path on 32-bits and always going through `.iteratorNextGeneric`, making this operation slower than necessary. We can verify it if we execute a microbenchmark that's using for-of in an ordinary Array.

Yes, JSEmpty representation is different in 64bit and 32bit. And this is not an empty check. We should check empty tag here.
Can you fix it?
Comment 21 Yusuke Suzuki 2020-07-30 11:32:22 PDT
Comment on attachment 405291 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=405291&action=review

>>> Source/JavaScriptCore/jit/JITCall32_64.cpp:392
>>> +    VirtualRegister vr = destinationFor(bytecode, m_bytecodeIndex.checkpoint()).virtualRegister();
>> 
>> Remove it.
> 
> Done.

It seems that landed patch is not including this change, can you fix it?

>>> Source/JavaScriptCore/jit/JITCall32_64.cpp:398
>>> +    emitJumpSlowCaseIfNotJSCell(vr, regT1);
>> 
>> Let's use `emitJumpSlowCaseIfNotJSCell(regT1)`.
> 
> Done.

It seems that landed patch is not including this change, can you fix it?
Comment 22 Yusuke Suzuki 2020-07-30 11:45:15 PDT
Comment on attachment 405446 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=405446&action=review

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:654
> +
>                      case Constant:
>                          sideState->tmps[i] = recovery.constant();
>                          break;
> -
> +                        
> +#if USE(JSVALUE64)
>                      case UnboxedInt32InGPR:
>                      case Int32DisplacedInJSStack: {
>                          sideState->tmps[i] = jsNumber(static_cast<int32_t>(tmpScratch[i + tmpOffset]));
>                          break;
>                      }
>  
> -#if USE(JSVALUE32_64)
> -                    case InPair:
> -#endif
> -                    case InGPR:
>                      case BooleanDisplacedInJSStack:
>                      case CellDisplacedInJSStack:
> +                    case UnboxedCellInGPR:
> +                    case InGPR:
>                      case DisplacedInJSStack: {
>                          sideState->tmps[i] = reinterpret_cast<JSValue*>(tmpScratch)[i + tmpOffset];
>                          break;
>                      }
>  
> -                    case UnboxedCellInGPR: {
> -#if USE(JSVALUE64)
> +                    case UnboxedBooleanInGPR: {
> +                        sideState->tmps[i] = jsBoolean(static_cast<bool>(tmpScratch[i + tmpOffset]));
> +                        break;
> +                    }
> +
> +#else // USE(JSVALUE32_64)
> +                    case UnboxedInt32InGPR:
> +                    case Int32DisplacedInJSStack: {
> +                        sideState->tmps[i] = jsNumber(static_cast<int32_t>(tmpScratch[i + tmpOffset]));
> +                        break;
> +                    }
> +
> +                    case InPair:
> +                    case DisplacedInJSStack: {
>                          sideState->tmps[i] = reinterpret_cast<JSValue*>(tmpScratch)[i + tmpOffset];
> -#else
> +                        break;
> +                    }
> +
> +                    case CellDisplacedInJSStack:
> +                    case UnboxedCellInGPR: {
>                          EncodedValueDescriptor* valueDescriptor = bitwise_cast<EncodedValueDescriptor*>(tmpScratch + i + tmpOffset);
>                          sideState->tmps[i] = JSValue(JSValue::CellTag, valueDescriptor->asBits.payload);
> -#endif
>                          break;
>                      }
>  
> +                    case BooleanDisplacedInJSStack:
>                      case UnboxedBooleanInGPR: {
>                          sideState->tmps[i] = jsBoolean(static_cast<bool>(tmpScratch[i + tmpOffset]));
>                          break;
>                      }
> +                        
> +#endif // USE(JSVALUE64)

Actually, I think this change is not necessary because old implementation just worked. Can you revert this part?