Bug 190047

Summary: [JSC] Optimize Object.keys by caching own keys results in StructureRareData
Product: WebKit Reporter: Yusuke Suzuki <ysuzuki>
Component: JavaScriptCoreAssignee: Yusuke Suzuki <ysuzuki>
Status: RESOLVED FIXED    
Severity: Normal CC: commit-queue, ews-watchlist, keith_miller, mark.lam, msaboff, saam, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=194050
Bug Depends on: 192715    
Bug Blocks:    
Attachments:
Description Flags
WIP
none
WIP
none
WIP
none
Patch
none
Patch
none
Patch
none
Patch saam: review+

Description Yusuke Suzuki 2018-09-27 11:12:19 PDT
Let's cache the own keys in StructureRareData and reuse it to optimize Object.keys.
We will put JSImmutableButterfly in StructureRareData. And Object.keys will generate JSArray wrapping it!
Comment 1 Yusuke Suzuki 2018-09-27 11:13:04 PDT
Object.keys are super heavily called in web-tooling-benchmarks.
Comment 2 Saam Barati 2018-09-27 11:32:48 PDT
When is it safe to cache this? Object.keys is just self keys, right? Not prototype chain? So I guess if we're not a dictionary we can cache?
Comment 3 Saam Barati 2018-09-27 11:33:04 PDT
(In reply to Saam Barati from comment #2)
> When is it safe to cache this? Object.keys is just self keys, right? Not
> prototype chain? So I guess if we're not a dictionary we can cache?

We should also have some story of cleaning up this value.
Comment 4 Yusuke Suzuki 2018-09-28 01:11:02 PDT
(In reply to Saam Barati from comment #3)
> (In reply to Saam Barati from comment #2)
> > When is it safe to cache this? Object.keys is just self keys, right? Not
> > prototype chain? So I guess if we're not a dictionary we can cache?
> 
> We should also have some story of cleaning up this value.

I think we can align the lifetime of the cache to the JSPropertyEnumerator cache (for-in cache).
But currently, we do not clear JSPropertyEnumerator cache.
Comment 5 Yusuke Suzuki 2018-09-28 05:56:34 PDT
Created attachment 351069 [details]
WIP

Initial patch
Comment 6 Yusuke Suzuki 2018-09-28 06:08:39 PDT
(In reply to Yusuke Suzuki from comment #4)
> (In reply to Saam Barati from comment #3)
> > (In reply to Saam Barati from comment #2)
> > > When is it safe to cache this? Object.keys is just self keys, right? Not
> > > prototype chain? So I guess if we're not a dictionary we can cache?
> > 
> > We should also have some story of cleaning up this value.
> 
> I think we can align the lifetime of the cache to the JSPropertyEnumerator
> cache (for-in cache).
> But currently, we do not clear JSPropertyEnumerator cache.

Possible ways.

1. Do not add clearing path.

Anyway, when Structure is collected, the cache is also collected.
This is the same lifetime to the existing for-in cache.

2. Use Weak.

Once the cached JSImmutableButterfly is not reachable from elsewhere, it will be collected.

3. Clear cache in Heap::finalize()

It would be a bit costly to iterate StructureRareData (iterating IsoHeap), but we can clear these data when finalizing the collector.
Comment 7 Yusuke Suzuki 2018-09-28 06:09:26 PDT
This is an initial attempt. It improves two benchmarks in SixSpeed since both use Object.keys.

                            baseline                  patched

object-assign.es5      333.2122+-4.8252     ^    219.6000+-3.3567        ^ definitely 1.5174x faster
for-of-object.es6      272.0637+-3.3468     ^    128.9627+-3.7837        ^ definitely 2.1096x faster

And it improves web-tooling-benchmark/lebab since it heavily uses Object.keys.

Before:  lebab:  5.76 runs/s
After:   lebab:  6.84 runs/s
Comment 8 Yusuke Suzuki 2018-09-28 06:11:00 PDT
Created attachment 351070 [details]
WIP
Comment 9 Yusuke Suzuki 2018-12-09 09:53:05 PST
Created attachment 356924 [details]
WIP
Comment 10 Yusuke Suzuki 2018-12-10 03:14:24 PST
Created attachment 356953 [details]
Patch
Comment 11 Yusuke Suzuki 2018-12-11 21:19:20 PST
Ping?
Comment 12 Keith Miller 2018-12-12 10:45:39 PST
Comment on attachment 356953 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=356953&action=review

r=me. Although, I agree with Saam that we should probably have a story for collecting stale enumerations. Can you file a bug?

Maybe something like: clear on FullGC + Structure is only referenced via a transition? Not sure how we would determine the latter though...

> Source/JavaScriptCore/runtime/StructureRareDataInlines.h:62
> +    return m_cachedOwnKeys.get();

Do we have an ASSERT(!compilationOrGCThread())?
Comment 13 Yusuke Suzuki 2018-12-12 23:12:25 PST
Comment on attachment 356953 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=356953&action=review

Thanks. I've opened the bug for tracking GC enhancement. https://bugs.webkit.org/show_bug.cgi?id=192659

>> Source/JavaScriptCore/runtime/StructureRareDataInlines.h:62
>> +    return m_cachedOwnKeys.get();
> 
> Do we have an ASSERT(!compilationOrGCThread())?

Sounds nice! Fixed.
Comment 14 Yusuke Suzuki 2018-12-12 23:13:43 PST
Committed r239153: <https://trac.webkit.org/changeset/239153>
Comment 15 Radar WebKit Bug Importer 2018-12-12 23:14:27 PST
<rdar://problem/46688655>
Comment 16 Yusuke Suzuki 2018-12-13 00:06:35 PST
Committed r239154: <https://trac.webkit.org/changeset/239154>
Comment 17 Yusuke Suzuki 2018-12-13 00:37:40 PST
Committed r239155: <https://trac.webkit.org/changeset/239155>
Comment 18 Saam Barati 2018-12-13 01:38:18 PST
Comment on attachment 356953 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=356953&action=review

I have a few comments.

> Source/JavaScriptCore/dfg/DFGConstantFoldingPhase.cpp:769
> +            case ObjectKeys: {

I really wish we had a way of duplicating less code between AI and constant folding

> Source/JavaScriptCore/runtime/ObjectConstructor.cpp:956
> +                            structure->setCachedOwnKeys(vm, jsCast<JSImmutableButterfly*>(vm.sentinelImmutableButterfly.get()));

Is the whole point of this just to cache after we ask for keys more than once? Why not just store a sentinel using the number 1? The getter can then just return nullptr to people when it’s really 1. You could then get a rid of a lot of pointer compares to this sentinel value too

> Source/JavaScriptCore/runtime/Structure.h:331
> +        WTF::loadLoadFence();

Why? All your loads are dependent loads based off of this

> Source/JavaScriptCore/runtime/StructureRareDataInlines.h:68
> +    WTF::loadLoadFence();

Ditto as to why

> Source/JavaScriptCore/runtime/StructureRareDataInlines.h:74
> +    WTF::storeStoreFence();

This one makes sense and is required
Comment 19 Saam Barati 2018-12-13 01:52:04 PST
(In reply to Keith Miller from comment #12)
> Comment on attachment 356953 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=356953&action=review
> 
> r=me. Although, I agree with Saam that we should probably have a story for
> collecting stale enumerations. Can you file a bug?
> 
> Maybe something like: clear on FullGC + Structure is only referenced via a
> transition? Not sure how we would determine the latter though...
We could do an even simpler thing:
Delete it if it hasn’t been used since the last GC. We just need to store a bit when we use it.
> 
> > Source/JavaScriptCore/runtime/StructureRareDataInlines.h:62
> > +    return m_cachedOwnKeys.get();
> 
> Do we have an ASSERT(!compilationOrGCThread())?
Comment 20 WebKit Commit Bot 2018-12-14 13:30:33 PST
Re-opened since this is blocked by bug 192715
Comment 21 Ryan Haddad 2018-12-14 13:40:39 PST
*** Bug 192673 has been marked as a duplicate of this bug. ***
Comment 22 Yusuke Suzuki 2018-12-15 04:43:27 PST
(In reply to Ryan Haddad from comment #21)
> *** Bug 192673 has been marked as a duplicate of this bug. ***

OK, I think I found the issue.
When executing Object.keys, we first create JSImmutableButterfly, then store JSString keys into this immutable butterfly.
The problem is that we allocates JSString* after creating JSImmutableButterfly.
So, when the collector is waken up in this JSString* allocation, it will see JSImmutableButterfly which holds many uninitialized hidden fields.
If the marker want to mark them, then we have a bad time.

I think calling `setStartingValue(JSValue())` for Contiguous shape case would be good way to fix the issue.
Comment 23 Yusuke Suzuki 2018-12-15 05:23:24 PST
Comment on attachment 356953 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=356953&action=review

>> Source/JavaScriptCore/runtime/ObjectConstructor.cpp:956
>> +                            structure->setCachedOwnKeys(vm, jsCast<JSImmutableButterfly*>(vm.sentinelImmutableButterfly.get()));
> 
> Is the whole point of this just to cache after we ask for keys more than once? Why not just store a sentinel using the number 1? The getter can then just return nullptr to people when it’s really 1. You could then get a rid of a lot of pointer compares to this sentinel value too

This a bit simplifies the marking code in StructureRareData::visitChildren and the user code of cachedOwnKeys.
I'll attempt to change it to some dummy value.

>> Source/JavaScriptCore/runtime/Structure.h:331
>> +        WTF::loadLoadFence();
> 
> Why? All your loads are dependent loads based off of this

Right, removed.

>> Source/JavaScriptCore/runtime/StructureRareDataInlines.h:68
>> +    WTF::loadLoadFence();
> 
> Ditto as to why

Ah, after considering about it, I think it is not necessary. No load-load barrier is required.
Comment 24 Yusuke Suzuki 2018-12-15 05:43:42 PST
Created attachment 357399 [details]
Patch
Comment 25 Saam Barati 2018-12-15 12:02:49 PST
Comment on attachment 357399 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=357399&action=review

> Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:2587
> +            if (structureSet.isFinite() && structureSet.size() == 1) {

Do we ever think we’d see two structures with the same ownKeys array? We could serialize this across >1 structure somehow if we ever found it useful

> Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:2590
> +                    auto* immutableButterfly = rareData->cachedOwnKeysConcurrently();

Why not just make the getter return nullptr if it’s the sentinel? Then you could have some function like getIgnoringSentinel for the only getter you call that cares if it’s actually the sentinel value

> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2695
> +    case ObjectKeysIntrinsic: {

Is Object.keys observable if the input isn’t a Proxy?

> Source/JavaScriptCore/dfg/DFGNode.cpp:229
> +    setOpAndDefaultFlags(NewArrayBuffer);

Does this node do the right thing if we’re having a bad time?

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:12394
> +            slowCases.append(m_jit.branch32(CCallHelpers::Equal, CCallHelpers::Address(scratchGPR, JSCell::structureIDOffset()), TrustedImm32(bitwise_cast<int32_t>(m_jit.vm()->structureStructure->structureID()))));

This is how we determine its RareData? Does StructureRareData have a a JSType?

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:12399
> +            slowCases.append(m_jit.branchPtr(CCallHelpers::Equal, scratchGPR, TrustedImmPtr(bitwise_cast<void*>(StructureRareData::cachedOwnKeysSentinel()))));

You can combine both branches here if you assert the sentinel is “1”, and branch BelowOrEqual

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5531
> +                m_out.branch(m_out.notEqual(cachedOwnKeys, m_out.constIntPtr(bitwise_cast<void*>(StructureRareData::cachedOwnKeysSentinel()))), unsure(useCacheCase), unsure(slowCase));

Ditto about combining branches

> Source/JavaScriptCore/runtime/JSImmutableButterfly.h:123
> +        if (hasContiguous(indexingType())) {

Why is this needed?

> Source/JavaScriptCore/runtime/ObjectConstructor.cpp:939
> +                        auto* cachedButterfly = structure->cachedOwnKeys();

Based on earlier comment, this would be the only call to something like “cachedOwnKeysIgnorningSentinel”
Comment 26 Yusuke Suzuki 2018-12-15 20:40:35 PST
Comment on attachment 357399 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=357399&action=review

>> Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:2587
>> +            if (structureSet.isFinite() && structureSet.size() == 1) {
> 
> Do we ever think we’d see two structures with the same ownKeys array? We could serialize this across >1 structure somehow if we ever found it useful

Currently, I think we do not have the use cases that receive benefit from it. Once it is required, we need to consider the optimization.
But even in that case, ObjectKeys is accelerated if the given Structure has the cache.

>> Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:2590
>> +                    auto* immutableButterfly = rareData->cachedOwnKeysConcurrently();
> 
> Why not just make the getter return nullptr if it’s the sentinel? Then you could have some function like getIgnoringSentinel for the only getter you call that cares if it’s actually the sentinel value

Interesting. Basically, all the concurrent functions do not care about sentinel since they do not set a new JSImmutableButterfly. So, only the function called from the main thread matters.
Fixed.

>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2695
>> +    case ObjectKeysIntrinsic: {
> 
> Is Object.keys observable if the input isn’t a Proxy?

No, it's not observable if the input isn't a Proxy. But as a first step, I think the current implementation is OK.
We could optimize it further by adding special care to clobberizing rules in the subsequent patch.

Moreover, if ObjectKeys only takes one Structure which is proven in AI, then it is converted into NewArrayBuffer, which tells us that it does not have any side effect.
So, I think large part of the use cases is already covered by this.

>> Source/JavaScriptCore/dfg/DFGNode.cpp:229
>> +    setOpAndDefaultFlags(NewArrayBuffer);
> 
> Does this node do the right thing if we’re having a bad time?

We convert ObjectKeys into NewArrayBuffer only when we can watch the HavingABadTimeWatchpoint.

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:12394
>> +            slowCases.append(m_jit.branch32(CCallHelpers::Equal, CCallHelpers::Address(scratchGPR, JSCell::structureIDOffset()), TrustedImm32(bitwise_cast<int32_t>(m_jit.vm()->structureStructure->structureID()))));
> 
> This is how we determine its RareData? Does StructureRareData have a a JSType?

Yes. You can see Structure::isRareData function. And StructureRareData does not have a JSType.

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:12399
>> +            slowCases.append(m_jit.branchPtr(CCallHelpers::Equal, scratchGPR, TrustedImmPtr(bitwise_cast<void*>(StructureRareData::cachedOwnKeysSentinel()))));
> 
> You can combine both branches here if you assert the sentinel is “1”, and branch BelowOrEqual

Nice. fixed.

>> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5531
>> +                m_out.branch(m_out.notEqual(cachedOwnKeys, m_out.constIntPtr(bitwise_cast<void*>(StructureRareData::cachedOwnKeysSentinel()))), unsure(useCacheCase), unsure(slowCase));
> 
> Ditto about combining branches

Fixed.

>> Source/JavaScriptCore/runtime/JSImmutableButterfly.h:123
>> +        if (hasContiguous(indexingType())) {
> 
> Why is this needed?

If this does not exist, the storage of JSImmutableButterfly (Contiguous) remains uninitialized when it is created.
This leads to the crash if we perform allocation after JSImmutableButterfly is created and before all the storage is filled.
This patch is rolled out because of this GC issue.

Two possible fixes are considered,
(1) Put DeferGC on ObjectKeys' appropriate path. But since ObjectKeys would create so many JSString*, it is not so desirable.
(2) Fill JSImmutableButterfly Contiguous storage fields.

In this patch, I took (2). This approach can fix the above scenario, which can easily happen.
And since JSValue() is JSEmpty (nullptr in 64bit), the following code could be compiled into very efficient one.

>> Source/JavaScriptCore/runtime/ObjectConstructor.cpp:939
>> +                        auto* cachedButterfly = structure->cachedOwnKeys();
> 
> Based on earlier comment, this would be the only call to something like “cachedOwnKeysIgnorningSentinel”

Fixed.
Comment 27 Yusuke Suzuki 2018-12-15 21:15:39 PST
Created attachment 357412 [details]
Patch
Comment 28 Yusuke Suzuki 2018-12-16 21:05:00 PST
Created attachment 357427 [details]
Patch
Comment 29 Saam Barati 2018-12-17 14:40:02 PST
Comment on attachment 357427 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=357427&action=review

r=me. Nice!

> Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:2591
> +                    auto* immutableButterfly = rareData->cachedOwnKeysConcurrently();
> +                    if (immutableButterfly && immutableButterfly != StructureRareData::cachedOwnKeysSentinel()) {

This should now just be:
if (auto* immutableButterfly = rareData->cachedOwnKeysConcurrently()) { ... }

> Source/JavaScriptCore/dfg/DFGConstantFoldingPhase.cpp:776
> +                            auto* immutableButterfly = rareData->cachedOwnKeysConcurrently();
> +                            if (immutableButterfly && immutableButterfly != StructureRareData::cachedOwnKeysSentinel()) {

ditto

> Source/JavaScriptCore/runtime/JSImmutableButterfly.h:126
> +        if (hasContiguous(indexingType())) {
> +            for (unsigned index = 0; index < length; ++index)
> +                toButterfly()->contiguous().at(this, index).setStartingValue(JSValue());
> +        }

Why not only do this for Object.keys? Seems like this might affect other callers too? Seems like there should be some kind of ::create that has this in its name?

> Source/JavaScriptCore/runtime/ObjectConstructor.cpp:939
> +                            // Cache the immutable butterfly!

nit: This comment has no purpose.

> Source/JavaScriptCore/runtime/StructureRareDataInlines.h:85
> +    WTF::storeStoreFence();

nit: Only needs to be done on the "else"
Comment 30 Yusuke Suzuki 2018-12-17 22:51:52 PST
Comment on attachment 357427 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=357427&action=review

Thank you!

>> Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:2591
>> +                    if (immutableButterfly && immutableButterfly != StructureRareData::cachedOwnKeysSentinel()) {
> 
> This should now just be:
> if (auto* immutableButterfly = rareData->cachedOwnKeysConcurrently()) { ... }

Fixed.

>> Source/JavaScriptCore/dfg/DFGConstantFoldingPhase.cpp:776
>> +                            if (immutableButterfly && immutableButterfly != StructureRareData::cachedOwnKeysSentinel()) {
> 
> ditto

Fixed.

>> Source/JavaScriptCore/runtime/JSImmutableButterfly.h:126
>> +        }
> 
> Why not only do this for Object.keys? Seems like this might affect other callers too? Seems like there should be some kind of ::create that has this in its name?

I think this is not bad. If the code creates JSImmutableButterfly and does some allocations before filling up the storage, then we have a bad time if this code does not exist.
If we would like to remove this code, I think we should put `DisallowGC` on all the places creating JSImmutableButterfly, putting some optimized path for that use cases.

>> Source/JavaScriptCore/runtime/ObjectConstructor.cpp:939
>> +                            // Cache the immutable butterfly!
> 
> nit: This comment has no purpose.

Removed.

>> Source/JavaScriptCore/runtime/StructureRareDataInlines.h:85
>> +    WTF::storeStoreFence();
> 
> nit: Only needs to be done on the "else"

OK, fixed.
Comment 31 Yusuke Suzuki 2018-12-17 22:54:54 PST
Committed r239324: <https://trac.webkit.org/changeset/239324>