Bug 230078

Summary: Move some profiling to UnlinkedCodeBlock
Product: WebKit Reporter: Saam Barati <saam>
Component: JavaScriptCoreAssignee: Saam Barati <saam>
Status: RESOLVED FIXED    
Severity: Normal CC: commit-queue, ews-watchlist, keith_miller, mark.lam, msaboff, tzagallo, webkit-bug-importer, xan.lopez, ysuzuki
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 230358    
Bug Blocks: 230316    
Attachments:
Description Flags
WIP
none
patch
ews-feeder: commit-queue-
patch
ews-feeder: commit-queue-
patch
ews-feeder: commit-queue-
patch
ews-feeder: commit-queue-
patch
none
patch
ysuzuki: review+, ews-feeder: commit-queue-
patch for landing
ews-feeder: commit-queue-
WIP
none
WIP (alternate implementation)
ews-feeder: commit-queue-
Patch
ysuzuki: review+
patch for landing
none
patch for landing
none
Patch
ysuzuki: review+, ews-feeder: commit-queue-
patch for landing
ews-feeder: commit-queue-
[fast-cq] patch for landing none

Description Saam Barati 2021-09-08 20:20:15 PDT
...
Comment 1 Saam Barati 2021-09-09 12:24:57 PDT
Created attachment 437766 [details]
WIP
Comment 2 Saam Barati 2021-09-09 13:17:47 PDT
Created attachment 437774 [details]
patch
Comment 3 Saam Barati 2021-09-09 13:55:23 PDT
Created attachment 437778 [details]
patch
Comment 4 Saam Barati 2021-09-09 14:32:26 PDT
Created attachment 437787 [details]
patch
Comment 5 Saam Barati 2021-09-09 14:40:42 PDT
Created attachment 437788 [details]
patch
Comment 6 Saam Barati 2021-09-09 14:51:59 PDT
Created attachment 437790 [details]
patch
Comment 7 Saam Barati 2021-09-09 14:54:19 PDT
Created attachment 437791 [details]
patch
Comment 8 Yusuke Suzuki 2021-09-09 14:55:22 PDT
Comment on attachment 437778 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=437778&action=review

r=me. I hope we will deploy 32bit pointers so that we do not have bunch of pointer width things in metadata.

> Source/JavaScriptCore/bytecode/BytecodeList.rb:1069
> +      profile: ValueProfile.*,

indentation looks wrong.

> Source/JavaScriptCore/bytecode/BytecodeList.rb:1257
>          iterationMetadata: IterationModeMetadata,
> -        iterableProfile: ValueProfile,
> +        iterableProfile: ValueProfile.*,
>          callLinkInfo: LLIntCallLinkInfo,
> -        iteratorProfile: ValueProfile,
> +        iteratorProfile: ValueProfile.*,
>          modeMetadata: GetByIdModeMetadata,
> -        nextProfile: ValueProfile,
> +        nextProfile: ValueProfile.*,

We should order these fields to make metadata small.

> Source/JavaScriptCore/bytecode/BytecodeList.rb:1282
>          iterationMetadata: IterationModeMetadata,
> -        iterableProfile: ArrayProfile,
> +        iterableProfile: ArrayProfile.*,
>          callLinkInfo: LLIntCallLinkInfo,
> -        nextResultProfile: ValueProfile,
> +        nextResultProfile: ValueProfile.*,
>          doneModeMetadata: GetByIdModeMetadata,
> -        doneProfile: ValueProfile,
> +        doneProfile: ValueProfile.*,
>          valueModeMetadata: GetByIdModeMetadata,
> -        valueProfile: ValueProfile,
> +        valueProfile: ValueProfile.*,

Ditto.

> Source/JavaScriptCore/bytecode/CodeBlock.cpp:782
> +        unsigned size = numberOfArgumentValueProfiles();
> +        if (m_metadata) {
> +#define COUNT(__op) \
> +            size += m_metadata->numEntries<__op>();
> +            FOR_EACH_OPCODE_WITH_VALUE_PROFILE(COUNT)
> +#undef COUNT
> +            size += m_metadata->numEntries<OpIteratorOpen>() * 3;
> +            size += m_metadata->numEntries<OpIteratorNext>() * 3;
> +        }
> +
> +        RELEASE_ASSERT(size);
> +
> +        auto& valueProfiles = unlinkedCodeBlock->ensureValueProfiles(size);

Can we allocate it apriori when creating UnlinkedCodeBlock? The size won't be changed based on CodeBlock, so allocating that when UnlinkedCodeBlock is created is simpler.

> Source/JavaScriptCore/bytecode/CodeBlock.cpp:834
> +        unsigned size = 0;
> +
> +        if (m_metadata) {
> +#define COUNT(__op) \
> +            size += m_metadata->numEntries<__op>();
> +            FOR_EACH_OPCODE_WITH_ARRAY_PROFILE(COUNT)
> +#undef COUNT
> +            size += m_metadata->numEntries<OpIteratorNext>();
> +        }
> +
> +        if (size) {
> +            auto& arrayProfiles = unlinkedCodeBlock->ensureArrayProfiles(size);
> +            unsigned index = 0;

Ditto.
Comment 9 Saam Barati 2021-09-09 15:17:48 PDT
Comment on attachment 437778 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=437778&action=review

>> Source/JavaScriptCore/bytecode/BytecodeList.rb:1069
>> +      profile: ValueProfile.*,
> 
> indentation looks wrong.

will fix.

>> Source/JavaScriptCore/bytecode/BytecodeList.rb:1257
>> +        nextProfile: ValueProfile.*,
> 
> We should order these fields to make metadata small.

Will do.

>> Source/JavaScriptCore/bytecode/CodeBlock.cpp:782
>> +        auto& valueProfiles = unlinkedCodeBlock->ensureValueProfiles(size);
> 
> Can we allocate it apriori when creating UnlinkedCodeBlock? The size won't be changed based on CodeBlock, so allocating that when UnlinkedCodeBlock is created is simpler.

yeah, we can put this in `UnlinkedCodeBlockGenerator::finalize` per your suggestion on Slack.
Comment 10 Saam Barati 2021-09-09 15:34:14 PDT
Regarding the size of pointers. In the future, we could make these pointer fields an index into the profile vectors.
Comment 11 Saam Barati 2021-09-09 15:34:35 PDT
(In reply to Saam Barati from comment #10)
> Regarding the size of pointers. In the future, we could make these pointer
> fields an index into the profile vectors.

That way, it'd live in the bytecode instead of the metadata, and it'd be smaller.
Comment 12 Saam Barati 2021-09-09 15:52:59 PDT
Created attachment 437796 [details]
patch for landing
Comment 13 Radar WebKit Bug Importer 2021-09-09 15:57:02 PDT
<rdar://problem/82947571>
Comment 14 Saam Barati 2021-09-10 10:47:40 PDT
I believe I found the crashers out. The issue is like this:

Say we have an UnlinkedCodeBlock UCB and two CodeBlocks that point to it, CB1 and CB2.

Let's say CB1 is in old space, and CB2 is in new space. CB1 doesn't run. But CB2 runs, and does some profiling, then dies.

Now CB1 points to some things. It's not dead, but it also wasn't visited. Therefore we won't finalize it.

So I'll have to adjust our finalization algorithm to finalize all live code blocks, not just all visited in this GC cycle code blocks.
Comment 15 Saam Barati 2021-09-10 20:13:10 PDT
Created attachment 437938 [details]
WIP

This version fixes the GC issues, I believe (EWS will tell us). But it seems just a touch slower than the other version (with the GC bugs).

I'm uploading this here while I try out a totally new version of a similar. I have an idea of a different way to share profiling, and see if it'll be a bigger win than we have now.

To resolve the GC issue, I had to teach the code to finalize UCBs that "ran" as CodeBlocks ran. I think this finalization costs us maybe 0.25% or so on Speedometer.

We still have a win overall though, but it's around 0.5-1% instead of a solid 1%. So we'll take this if the other version doesn't work out. But let's see if we can get back to a solid 1%
Comment 16 Saam Barati 2021-09-13 10:23:30 PDT
Created attachment 438051 [details]
WIP (alternate implementation)
Comment 17 Saam Barati 2021-09-13 15:04:41 PDT
Ok, the original implementation seems more performant than the simple implementation, unfortunately. But we'll go with what performs best.
Comment 18 Saam Barati 2021-09-13 15:46:28 PDT
Created attachment 438082 [details]
Patch
Comment 19 Yusuke Suzuki 2021-09-13 17:10:49 PDT
Comment on attachment 438082 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=438082&action=review

> Source/JavaScriptCore/bytecode/ArrayProfile.cpp:170
> +    JSGlobalObject* globalObject = lastSeenStructure->globalObject();
> +    if (lexicalGlobalObject && globalObject && globalObject != lexicalGlobalObject)
> +        m_observedDifferentGlobalObject = true;
> +    if (globalObject
> +        && !globalObject->isOriginalArrayStructure(lastSeenStructure)

Hmm, I have a question.

Two CodeBlocks ("A" and "B") from different JSGlobalObject can use the same UnlinkedCodeBlock at the same time. And in that case, it is possible that

1. "A" stores StructureID and JSGlobalObject to ArrayProfile
2. Getting prediction from "B".

Then, "B" taints this profile as,

    m_observedDifferentGlobalObject = true
    m_usesOriginalArrayStructures = false

But this is not correct since "A" and "B" are using its own JSGlobalObject and using their original Arrays, correct?

And tainting this value can cause an effect on the result of DFG since it makes CheckStructure to CheckArray.
Comment 20 Yusuke Suzuki 2021-09-13 17:15:49 PDT
Comment on attachment 438082 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=438082&action=review

>> Source/JavaScriptCore/bytecode/ArrayProfile.cpp:170
>> +        && !globalObject->isOriginalArrayStructure(lastSeenStructure)
> 
> Hmm, I have a question.
> 
> Two CodeBlocks ("A" and "B") from different JSGlobalObject can use the same UnlinkedCodeBlock at the same time. And in that case, it is possible that
> 
> 1. "A" stores StructureID and JSGlobalObject to ArrayProfile
> 2. Getting prediction from "B".
> 
> Then, "B" taints this profile as,
> 
>     m_observedDifferentGlobalObject = true
>     m_usesOriginalArrayStructures = false
> 
> But this is not correct since "A" and "B" are using its own JSGlobalObject and using their original Arrays, correct?
> 
> And tainting this value can cause an effect on the result of DFG since it makes CheckStructure to CheckArray.

Discussed with Saam, I missed that we are using globalObject from Structure instead of CodeBlock. So when setting m_usesOriginalArrayStructures, we don't care about Realm.
It is cared in DFG side.
Comment 21 Yusuke Suzuki 2021-09-13 17:59:00 PDT
Comment on attachment 438082 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=438082&action=review

r=me with one issue.

> Source/JavaScriptCore/runtime/JSScope.cpp:84
> +            catchScope.releaseAssertNoException();

Unfortunately, I think we can have TerminatedException.
Comment 22 Xan Lopez 2021-09-14 01:48:12 PDT
One of the failures EWS claims this introduces is:

Starting program: /home/igalia/xlopez/WebKit/WebKitBuild/Debug/bin/jsc --forceMiniVMMode=true -f JSTests/stress/has-own-property-name-cache-symbols-and-strings.js
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
[New Thread 0xf3fc0440 (LWP 3478)]

Thread 1 "jsc" received signal SIGABRT, Aborted.
__libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47
47	../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S: No such file or directory.
(gdb) bt
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47
#1  0xf5e75ea0 in __libc_signal_restore_set (set=0xfffed324) at ../sysdeps/unix/sysv/linux/internal-signals.h:86
#2  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:48
#3  0xf5e667a2 in __GI_abort () at abort.c:79
#4  0xf6cc46f8 in JSC::ScratchRegisterAllocator::allocateScratch<JSC::GPRInfo> (this=0xfffed648) at ../../Source/JavaScriptCore/jit/ScratchRegisterAllocator.cpp:97
#5  0xf6cbb558 in JSC::ScratchRegisterAllocator::allocateScratchGPR (this=0xfffed648) at ../../Source/JavaScriptCore/jit/ScratchRegisterAllocator.cpp:102
#6  0xf62d9948 in JSC::AccessCase::generateImpl (this=0xf3699460, state=...) at ../../Source/JavaScriptCore/bytecode/AccessCase.cpp:2276
#7  0xf62d78b2 in JSC::AccessCase::generateWithGuard (this=0xf3699460, state=..., fallThrough=...) at ../../Source/JavaScriptCore/bytecode/AccessCase.cpp:1755
#8  0xf63b9862 in JSC::PolymorphicAccess::regenerate (this=0xf369cf80, locker=..., vm=..., globalObject=0xf20f9038, codeBlock=0xf1eac000, ecmaMode=..., stubInfo=...)
    at ../../Source/JavaScriptCore/bytecode/PolymorphicAccess.cpp:672
#9  0xf63e1720 in operator() (__closure=0xfffeec9c) at ../../Source/JavaScriptCore/bytecode/StructureStubInfo.cpp:221
#10 0xf63e187a in JSC::StructureStubInfo::addAccessCase (this=0xf36f69a0, locker=..., globalObject=0xf20f9038, codeBlock=0xf1eac000, ecmaMode=..., ident=..., 
    accessCase=...) at ../../Source/JavaScriptCore/bytecode/StructureStubInfo.cpp:245
#11 0xf6cb6b96 in JSC::tryCachePutBy (globalObject=0xf20f9038, codeBlock=0xf1eac000, baseValue=..., oldStructure=0xf1efb840, propertyName=..., slot=..., stubInfo=..., 
    putByKind=JSC::PutByKind::ByVal, putKind=JSC::PutKind::NotDirect) at ../../Source/JavaScriptCore/jit/Repatch.cpp:877
#12 0xf6cb6d56 in JSC::repatchPutBy (globalObject=0xf20f9038, codeBlock=0xf1eac000, baseValue=..., oldStructure=0xf1efb840, propertyName=..., slot=..., stubInfo=..., 
    putByKind=JSC::PutByKind::ByVal, putKind=JSC::PutKind::NotDirect) at ../../Source/JavaScriptCore/jit/Repatch.cpp:897
#13 0xf6c5a820 in JSC::putByValOptimize (globalObject=0xf20f9038, codeBlock=0xf1eac000, baseValue=..., subscript=..., value=..., stubInfo=0xf36f69a0, profile=0xf36e8e30, 
    ecmaMode=...) at ../../Source/JavaScriptCore/jit/JITOperations.cpp:1053
#14 0xf6c5a9d2 in JSC::operationPutByValNonStrictOptimize (globalObject=0xf20f9038, encodedBaseValue=-17416011648, encodedSubscript=-17415806464, encodedValue=-4294967271, 
    stubInfo=0xf36f69a0, profile=0xf36e8e30) at ../../Source/JavaScriptCore/jit/JITOperations.cpp:1085
#15 0xf35fd662 in ?? ()

Pretty sure this is bug #230215, we are working on a fix.

Trying to get a trace for the other failure, but it's also in a for-in test (for-in-cached.js in LayouTests/js), so I suspect something similar is going on.
Comment 23 Saam Barati 2021-09-15 11:49:42 PDT
(In reply to Yusuke Suzuki from comment #21)
> Comment on attachment 438082 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=438082&action=review
> 
> r=me with one issue.
> 
> > Source/JavaScriptCore/runtime/JSScope.cpp:84
> > +            catchScope.releaseAssertNoException();
> 
> Unfortunately, I think we can have TerminatedException.

Right. Spoke to Yusuke on Slack. We're going to use DeferTerminationForAWhile inside abstractAccess.
Comment 24 Saam Barati 2021-09-15 11:54:51 PDT
Created attachment 438270 [details]
patch for landing
Comment 25 Saam Barati 2021-09-15 12:09:50 PDT
Comment on attachment 438270 [details]
patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=438270&action=review

> Source/JavaScriptCore/bytecode/UnlinkedMetadataTable.cpp:58
> +            buffer[i] = offset; // We align when we access this.

This fixes a bug that this patch exposes, where we would potentially think we had more metadata table entries than we really had in practice. This is because "forEach" used the next opcode's start pointer. That pointer is aligned to that opcode's metadata alignment. So that might make the previous opcode think it had an extra 1-7 entries (depending on size, alignment, etc). 

This patch fixes that by having the next opcode's start offset in the table always be the end offset of the previous opcode. This fixes the bug in forEach.

Then, when we access the metadata, we round up to alignment.
Comment 26 Saam Barati 2021-09-15 12:21:26 PDT
Created attachment 438274 [details]
patch for landing
Comment 27 Saam Barati 2021-09-15 12:35:54 PDT
(In reply to Saam Barati from comment #25)
> Comment on attachment 438270 [details]
> patch for landing
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=438270&action=review
> 
> > Source/JavaScriptCore/bytecode/UnlinkedMetadataTable.cpp:58
> > +            buffer[i] = offset; // We align when we access this.
> 
> This fixes a bug that this patch exposes, where we would potentially think
> we had more metadata table entries than we really had in practice. This is
> because "forEach" used the next opcode's start pointer. That pointer is
> aligned to that opcode's metadata alignment. So that might make the previous
> opcode think it had an extra 1-7 entries (depending on size, alignment,
> etc). 
> 
> This patch fixes that by having the next opcode's start offset in the table
> always be the end offset of the previous opcode. This fixes the bug in
> forEach.
> 
> Then, when we access the metadata, we round up to alignment.

Another approach in the future would be to generate the opcodeIDs for byte codes in descending order of alignment. So all things with alignment 8 come first, followed by 4, 2, 1. That would then allow us to never align (and instead assert we're always aligned).

We could do that in the ruby code in two ways:
1. figure out the alignment somehow. Perhaps compile some dummy code and return alignof of dummy structs
2. Just specify in the ruby bytecodelist.rb what the alignment is (perhaps of type, or per each metadata). Then we'd just have to keep that updated.
Comment 28 EWS 2021-09-15 15:19:03 PDT
Committed r282478 (241726@main): <https://commits.webkit.org/241726@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 438274 [details].
Comment 29 WebKit Commit Bot 2021-09-16 10:16:58 PDT
Re-opened since this is blocked by bug 230358
Comment 30 Saam Barati 2021-09-16 10:23:56 PDT
(In reply to WebKit Commit Bot from comment #29)
> Re-opened since this is blocked by bug 230358

It was a speedup on Speedo2. But slowdown on JetStream2. Gonna see if the simple version of the patch in prior versions is a speedup or neutral everywhere.
Comment 31 Saam Barati 2021-09-16 13:08:36 PDT
Created attachment 438391 [details]
Patch
Comment 32 Yusuke Suzuki 2021-09-16 13:17:54 PDT
Comment on attachment 438391 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=438391&action=review

r=me

> Source/JavaScriptCore/bytecode/ValueProfile.h:220
> +struct UnlinkedValueProfile {

I think `class` is better in this case.

> Source/JavaScriptCore/runtime/CachedTypes.cpp:2164
> +    m_valueProfiles = FixedVector<UnlinkedValueProfile>(cachedCodeBlock.numValueProfiles());
> +    m_arrayProfiles = FixedVector<UnlinkedArrayProfile>(cachedCodeBlock.numArrayProfiles());

Why not initializing them in the initializer list?

, m_valueProfiles(cachedCodeBlock.numValueProfiles())
Comment 33 Saam Barati 2021-09-16 13:20:03 PDT
(In reply to Yusuke Suzuki from comment #32)
> Comment on attachment 438391 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=438391&action=review
> 
> r=me
> 
> > Source/JavaScriptCore/bytecode/ValueProfile.h:220
> > +struct UnlinkedValueProfile {
> 
> I think `class` is better in this case.
> 
> > Source/JavaScriptCore/runtime/CachedTypes.cpp:2164
> > +    m_valueProfiles = FixedVector<UnlinkedValueProfile>(cachedCodeBlock.numValueProfiles());
> > +    m_arrayProfiles = FixedVector<UnlinkedArrayProfile>(cachedCodeBlock.numArrayProfiles());
> 
> Why not initializing them in the initializer list?
> 
> , m_valueProfiles(cachedCodeBlock.numValueProfiles())

Will fix.
Comment 34 Saam Barati 2021-09-16 13:29:19 PDT
Created attachment 438393 [details]
patch for landing
Comment 35 Saam Barati 2021-09-16 14:32:58 PDT
Created attachment 438403 [details]
[fast-cq] patch for landing
Comment 36 EWS 2021-09-16 17:11:31 PDT
Committed r282621 (241779@main): <https://commits.webkit.org/241779@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 438403 [details].