Summary: | [B3] Fusing immediates into test instructions should work again | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Filip Pizlo <fpizlo> | ||||||||||
Component: | JavaScriptCore | Assignee: | Filip Pizlo <fpizlo> | ||||||||||
Status: | RESOLVED FIXED | ||||||||||||
Severity: | Normal | CC: | benjamin, commit-queue, keith_miller, mark.lam, msaboff, saam | ||||||||||
Priority: | P2 | ||||||||||||
Version: | WebKit Nightly Build | ||||||||||||
Hardware: | All | ||||||||||||
OS: | All | ||||||||||||
Attachments: |
|
Description
Filip Pizlo
2016-07-21 23:05:37 PDT
Created attachment 284312 [details]
I don't know if I'm doing this right
The logic for deciding what sorts of immediates can be used for Branch(BitAnd) fusion was broken pretty badly. I'm trying to fix it.
Created attachment 284378 [details]
the patch
NOTE: I haven't yet had a chance to perf test this latest version, which is why it's not r? yet.
Created attachment 284400 [details]
the patch
Created attachment 284401 [details]
performance
It's neutral, except on LongSpider, where it's a regression on math-cordic! I investigated this regression in detail. It's hilarious. The benchmark is very small, so it's sensitive to weird stuff. And how weird this is.
Previously, there was code like this:
0x4ea9fda0347e: test %edx, %r10d
Where %r10 is initialized to a constant, which gets hoisted. Normally this would be bad codegen. And this patch fixes it to be:
0x365b91a03457: test $0x7fffffff, %edx
But by doing this, we perturb regalloc throughout the entire function. Lots of things change in subtle ways. None of the other effects look, on the surface, like either good or bad. But in this case, occupying %r10 and so perturbing everything is worth a 9% speed-up.
To me this means that we should ignore the math-cordic regression. The reason why we're seeing such a weird effect on this test is that this test contains only one hot function. This function doesn't even do a particularly good job of staying in FTL anyway.
Landed in https://trac.webkit.org/changeset/203666 Follow-up fix: https://trac.webkit.org/changeset/203668 Late in the game I removed the Imm->BitImm changes in a few places to recover perf regressions, and I went too far by removing it from BranchTest8. Also, I forgot to spin the load-branch fusion tests on not-x86. Rolled out in: http://trac.webkit.org/changeset/203758 Relanded in https://trac.webkit.org/changeset/203996 |