WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED DUPLICATE of
bug 277807
229175
[ARM64] Fix pre-index address mode
https://bugs.webkit.org/show_bug.cgi?id=229175
Summary
[ARM64] Fix pre-index address mode
Yijia Huang
Reported
2021-08-16 20:04:52 PDT
...
Attachments
Patch
(18.10 KB, patch)
2021-08-17 08:13 PDT
,
Yijia Huang
no flags
Details
Formatted Diff
Diff
Patch
(18.02 KB, patch)
2021-08-17 08:16 PDT
,
Yijia Huang
no flags
Details
Formatted Diff
Diff
Patch
(21.64 KB, patch)
2021-08-17 13:04 PDT
,
Yijia Huang
no flags
Details
Formatted Diff
Diff
Patch
(25.05 KB, patch)
2021-08-21 04:02 PDT
,
Yijia Huang
no flags
Details
Formatted Diff
Diff
Patch
(26.49 KB, patch)
2021-08-21 10:29 PDT
,
Yijia Huang
no flags
Details
Formatted Diff
Diff
Patch
(27.11 KB, patch)
2021-08-23 09:09 PDT
,
Yijia Huang
no flags
Details
Formatted Diff
Diff
Patch
(27.09 KB, patch)
2021-08-23 11:01 PDT
,
Yijia Huang
no flags
Details
Formatted Diff
Diff
Patch
(27.61 KB, patch)
2021-08-23 13:10 PDT
,
Yijia Huang
no flags
Details
Formatted Diff
Diff
Patch
(28.44 KB, patch)
2021-08-24 10:54 PDT
,
Yijia Huang
no flags
Details
Formatted Diff
Diff
Patch
(27.48 KB, patch)
2021-08-24 16:37 PDT
,
Yijia Huang
no flags
Details
Formatted Diff
Diff
Patch
(27.81 KB, patch)
2021-08-24 23:40 PDT
,
Yijia Huang
no flags
Details
Formatted Diff
Diff
Patch
(28.19 KB, patch)
2021-08-25 08:58 PDT
,
Yijia Huang
no flags
Details
Formatted Diff
Diff
Patch
(28.20 KB, patch)
2021-08-25 10:03 PDT
,
Yijia Huang
saam
: review+
Details
Formatted Diff
Diff
Patch for landing
(28.72 KB, patch)
2021-08-25 13:22 PDT
,
Yijia Huang
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Show Obsolete
(12)
View All
Add attachment
proposed patch, testcase, etc.
Yijia Huang
Comment 1
2021-08-17 08:13:21 PDT
Created
attachment 435686
[details]
Patch
Yijia Huang
Comment 2
2021-08-17 08:16:33 PDT
Created
attachment 435688
[details]
Patch
Yijia Huang
Comment 3
2021-08-17 13:04:26 PDT
Created
attachment 435703
[details]
Patch
Yijia Huang
Comment 4
2021-08-21 04:02:06 PDT
Created
attachment 436068
[details]
Patch
Yijia Huang
Comment 5
2021-08-21 10:29:58 PDT
Created
attachment 436077
[details]
Patch
Yijia Huang
Comment 6
2021-08-23 09:09:36 PDT
Created
attachment 436201
[details]
Patch
Yijia Huang
Comment 7
2021-08-23 11:01:16 PDT
Created
attachment 436211
[details]
Patch
Yijia Huang
Comment 8
2021-08-23 13:10:59 PDT
Created
attachment 436226
[details]
Patch
Radar WebKit Bug Importer
Comment 9
2021-08-23 20:05:29 PDT
<
rdar://problem/82273878
>
Yijia Huang
Comment 10
2021-08-24 10:54:47 PDT
Created
attachment 436308
[details]
Patch
Saam Barati
Comment 11
2021-08-24 14:15:33 PDT
Comment on
attachment 436308
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=436308&action=review
Mostly LGTM, but I have a lot of minor comments/suggestions, and a bigger comment/suggestion on the main dominance check algorithm.
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:61 > + HashMap<Value*, Vector<Value*>> addToParents;
name nit: I'd call this something like "addUses" or "usersOfAdd"
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:77 > + // Pre-Index Load Pattern:
Might also be worth a comment at the top of this file that strength reduction will leave the IR in the form we're matching now. It'll take an add(x, constant) that's an address and move the offset into the load itself, and that's why we can match this redundancy.
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:89 > + if (!baseOffsetToAddresses.contains(baseOffsetkey)) > + break; > + for (Value* address : baseOffsetToAddresses.get(baseOffsetkey)) {
this is doing two hash table lookups. I believe the canonical way to do this in WK is: auto iter = baseOffsetToAddresses.find(baseOffsetkey); if (iter == baseOffsetToAddresses.end()) break; for (Value* address : iter->value) { ... }
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:92 > + if (addToParents.contains(address) && addToParents.find(address)->value.size() > 0) > + continue;
ditto two hash table lookups here Also, more importantly, this algorithm looks a bit iffy. Currently, you may be saved by the controlEquivalent check, but I think a better algorithm here would just be to check that the load dominates all users of the add, perhaps in some separate check below after gathering all candidates. Let me give you an example that's saved by your controlEquivalent check, but is the reason why I don't love the current algorithm. ``` bb#0: x = add() branch(#1, #2) bb#1: load(x) jump#3 bb#2: use(x) jump #3 bb#3: return ``` Since we're iterating the graph in pre-order, we're guaranteed to visit #0 before #1 or #2. But, we're not guaranteed to visit #2 before #1. If we visit #1 before #2. I think it's much nicer to just directly check that all users of the add are dominated by the load we're moving to.
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:126 > + // Post-Index Load Pattern: > + // memory = Load(base, 0) > + // ... > + // address = Add(base, offset) > + // > + // Post-Index Store Pattern: > + // memory = Store(value1, base, 0) > + // ... > + // address = Add(base, offset)
nit: I think no need to write both load/store variants in all your comments, you can just be like: ``` Post-Index Pattern: memory = MemoryValue(base, 0) ... address = Add(base, offset) ```
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:129 > + if (!baseToMemories.contains(left)) > + break; > + for (MemoryValue* memory : baseToMemories.get(left)) {
ditto: two hash table lookups here, too
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:142 > + if (addToParents.contains(child)) > + addToParents.find(child)->value.append(value);
ditto, two hash table lookups here
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:187 > + Value* child = parent->child(i); > + if (child == address) > + parent->child(i) = newAddress;
small nit: I think you could write this without invoking parent->child(i) twice. Not crucial, but a touch nicer IMO, if you just make it: Value*& child = parent->child(i); if (child == address) child = newAddress;
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:208 > + // Convert Post-Index Load Pattern to the Canonical Form: > + // ... newOffset = Constant > + // ... newAddress = Add(base, newOffset) > + // memory = Load(base, 0) memory = Load(base, 0) > + // ... --> ... > + // address = Add(base, offset) address = Identity(newAddress) > + // > + // Convert Post-Index Store Pattern to the Canonical Form: > + // ... newOffset = Constant > + // ... newAddress = Add(base, newOffset) > + // memory = Store(value, base, 0) memory = Store(value, base, 0) > + // ... --> ... > + // address = Add(base, offset) address = Identity(newAddress)
ditto, let's just make this one comment abstracting the difference between store/load, since there is no need to double the size of the comment. Same goes to elsewhere in this file.
Yijia Huang
Comment 12
2021-08-24 16:37:32 PDT
Created
attachment 436345
[details]
Patch
Saam Barati
Comment 13
2021-08-24 19:17:04 PDT
Comment on
attachment 436345
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=436345&action=review
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:163 > + auto dominateUses = [&] () -> bool {
name nit: dominatesAllAddUses
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:165 > + if (iter == addUses.end() || !iter->value.size())
I think this will return false if the memory value is the only user of the add. Is that right? If so, that seems wrong. Actually, re-reading the code, I think the Load will always be put as a user of this. So size() is guaranteed to be >= 1. But see below for comments on dominance.
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:169 > + if (!dominators.dominates(memory->owner, parent->owner)) > + return false;
So this is just checking if the basic block of memory dominates parent, but this won't give you a precise answer if memory->owner == parent->owner. So if they're in the same block, we need to check that the position in that block dominates. However, maybe we can do this when visiting the Load instruction above. When visiting the Load, if we see that addUses isn't empty, it means there was a user in the same block. So we can remove it from being a candidate.
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:212 > + // This will reset values' indexes if there are any insertions.
why does this matter?
Yijia Huang
Comment 14
2021-08-24 22:49:22 PDT
Comment on
attachment 436345
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=436345&action=review
>> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:212 >> + // This will reset values' indexes if there are any insertions. > > why does this matter?
Because we preprocess the index information in the memoryToIndex map. Suppose there are several matched patterns in the same basic block. If we process them one by one in the sequence, the latter real value indexes will be changed with a displacement. And the mechanism of updating the insertions in the basic block is basic-block-based. So, we need to collect all matched patterns in one basic block and update the insertions at one time. And this is one of the challenges to implement this algorithm.
Yijia Huang
Comment 15
2021-08-24 22:58:40 PDT
Comment on
attachment 436345
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=436345&action=review
Because we preprocess the index information in the memoryToIndex map. Suppose there are several matched patterns in the same basic block. If we process them one by one in the sequence, the latter real value indexes will be changed with a displacement. And the mechanism of updating the insertions in the basic block is basic-block-based. So, we need to collect all matched patterns in one basic block and update the insertions at one time. And this is one of the challenges to implement this algorithm.
>>> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:212 >>> + // This will reset values' indexes if there are any insertions. >> >> why does this matter? > > Because we preprocess the index information in the memoryToIndex map. Suppose there are several matched patterns in the same basic block. If we process them one by one in the sequence, the latter real value indexes will be changed with a displacement. And the mechanism of updating the insertions in the basic block is basic-block-based. So, we need to collect all matched patterns in one basic block and update the insertions at one time. And this is one of the challenges to implement this algorithm.
So, insertionSet.insert is just collecting all inserted values but does not actually do the insertions. And insertionSet.execute(basicBlock); is the execution to insert those collected values to the insertionIndex in the target basic block. This is tricky. It took me some time to figure that out since we initially would think that insertionSet.insert will do the real insertions.
Yijia Huang
Comment 16
2021-08-24 23:17:23 PDT
Comment on
attachment 436345
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=436345&action=review
>> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:165 >> + if (iter == addUses.end() || !iter->value.size()) > > I think this will return false if the memory value is the only user of the add. Is that right? If so, that seems wrong. > > Actually, re-reading the code, I think the Load will always be put as a user of this. So size() is guaranteed to be >= 1. But see below for comments on dominance.
After strength reduction, the address is still in the B3 IR which indicates there must be other users. Here, I want to make sure this scenario is always true. Maybe I should an assertion here.
Yijia Huang
Comment 17
2021-08-24 23:40:53 PDT
Created
attachment 436371
[details]
Patch
Yijia Huang
Comment 18
2021-08-25 08:58:56 PDT
Created
attachment 436393
[details]
Patch
Yijia Huang
Comment 19
2021-08-25 10:03:28 PDT
Created
attachment 436398
[details]
Patch
Saam Barati
Comment 20
2021-08-25 12:06:45 PDT
Comment on
attachment 436398
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=436398&action=review
r=me with comments
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:93 > + auto addUsesIter = addUses.find(address); > + if (addUsesIter != addUses.end() && addUsesIter->value.size() > 0) > + continue;
I think this is a bit more subtle, and deserves a subtle comment. You may want to refine, but something like: ``` The goal is to move the Add to this Load/Store. We only move the Add to this Memory value if we guarantee it dominates all uses. If there are already other uses of the Add, it guarantees this Memory value doesn't dominate those uses. This is because we're visiting the program in pre-order traversal, so we visit this Memory value before all the things it dominates. So if the Add has other users, we must not dominate those users. Therefore, this MemoryValue won't be a candidate. ``` An alternative approach could be to just record indices of all users of the Add. And in your dominance check, you can also add a same-block dominance check, where the index of the memory value is < index of the Add user if they are in the same basic block. That algorithm requires more space, but is less subtle, and more explicit in what it's doing.
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:160 > + // Convert Pre-Index Load/Store Pattern to the Canonical Form:
both loops inside this loop check for "handledValues.contains(memory)". But that check should go here before the loops below.
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:185 > + Value* newAddress = insertionSet.insert<Value>(insertionIndex, Add, memory->origin(), address->child(0), address->child(1));
nit: I forgot about the nicer way to do this. We have a clone(Value*) function on Procedure. So I think the slightly more idiomatic way to do this is: Value* newAddress = proc.clone(address);
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:186 > + // update all address references with the newAddress
nit: remove this comment, since it's just saying exactly what the code below is doing. We avoid these kinds of comments in WebKit, like: // do X X
> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:219 > + // This will reset values' indexes if there are any insertions.
I see what you're saying now. I'd just say that after this executes, memoryToIndex no longer contains up to date information for this block. But that's ok because we never touch this block again.
> Source/JavaScriptCore/b3/B3ValueKeyInlines.h:41 > + // The observation is that when both child->index() and offset are 0's, > + // HashMap would not accept the ValueKey.
Isn't it sufficient to just do child->index() + 1? Why also offset + 1?
Yijia Huang
Comment 21
2021-08-25 12:22:32 PDT
Comment on
attachment 436398
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=436398&action=review
>> Source/JavaScriptCore/b3/B3ValueKeyInlines.h:41 >> + // HashMap would not accept the ValueKey. > > Isn't it sufficient to just do child->index() + 1? Why also offset + 1?
Actually, this update is for the Load/Store pair. When we have MemoryValue with base and offset, we can search for: ValueKey(base, offset + bytes) and ValueKey(base, offset - bytes) Here, the offset could be 0.
Yijia Huang
Comment 22
2021-08-25 13:22:44 PDT
Created
attachment 436412
[details]
Patch for landing
EWS
Comment 23
2021-08-25 14:19:38 PDT
Committed
r281587
(
240951@main
): <
https://commits.webkit.org/240951@main
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 436412
[details]
.
WebKit Commit Bot
Comment 24
2021-08-27 07:40:38 PDT
Re-opened since this is blocked by
bug 229609
Yijia Huang
Comment 25
2021-08-27 17:08:36 PDT
Yusuke found that stress/put-to-proto-chain-overrides-put.js.ftl-eager is failing only in OSS bot (
https://build.webkit.org/#/builders/102
). I bisected and found that this is caused by
https://trac.webkit.org/changeset/281587/webkit
. The failure only happens on ARM64. Not in ARM64E (That’s why OSS bot is failing while Internal bot is not failing). The way to reproduce this crash is, In one shell, building WebKit to stress the machine In the other multiple shells (4~ shells), run source testing.zsh The testing.zsh is: ``` for ((i = 0; i < 10000; i++)); do echo "Run $i" VM=~/dev/OpenSource/WebKitBuild/Release; DYLD_FRAMEWORK_PATH=$VM $VM/jsc --useFTLJIT\=false --useFunctionDotArguments\=true --validateExceptionChecks\=true --useDollarVM\=true --maxPerThreadStackUsage\=1572864 --airForceBriggsAllocator\=true --useRandomizingExecutableIslandAllocation\=true --forcePolyProto\=true --useDataIC\=true --useFTLJIT\=true --thresholdForJITAfterWarmUp\=10 --thresholdForJITSoon\=10 --thresholdForOptimizeAfterWarmUp\=20 --thresholdForOptimizeAfterLongWarmUp\=20 --thresholdForOptimizeSoon\=20 --thresholdForFTLOptimizeAfterWarmUp\=20 --thresholdForFTLOptimizeSoon\=20 --thresholdForOMGOptimizeAfterWarmUp\=20 --thresholdForOMGOptimizeSoon\=20 --maximumEvalCacheableSourceLength\=150000 --useEagerCodeBlockJettisonTiming\=true --repatchBufferingCountdown\=0 --collectContinuously\=true --useGenerationalGC\=false --verifyGC\=true JSTests/stress/put-to-proto-chain-overrides-put.js done ``` In the investigation, the problem is related to store-with-post-increment. I have confirmed that the canonicalization works fine. So, I suspect that the problem is due to the application of store-with-post-increment in the B3LowerToAir.cpp ``` incrementArg = Arg::postIndex(tmp(address), smallOffset); ```
Yijia Huang
Comment 26
2021-08-27 17:09:31 PDT
The detailed discussion is in this slack thread:
https://a1391192.slack.com/archives/GM9EK0UGL/p1630059992114800
Yijia Huang
Comment 27
2024-08-09 14:53:19 PDT
*** This bug has been marked as a duplicate of
bug 277807
***
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug