Bug 152365

Summary: B3->Air lowering incorrectly copy-propagates over ZExt32's
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Filip Pizlo <fpizlo>
Status: RESOLVED FIXED    
Severity: Normal CC: barraclough, benjamin, commit-queue, ggaren, keith_miller, mark.lam, mhahnenb, msaboff, nrotem, oliver, saam, sam
Priority: P2    
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
Bug Depends on:    
Bug Blocks: 150279, 152478    
Attachments:
Description Flags
work in progress
none
more
none
probably ok
none
starting to work
none
the patch
benjamin: review-
rebased
none
the patch
none
the patch benjamin: review+

Description Filip Pizlo 2015-12-16 15:31:17 PST
Copy propagating over ZExt32's seemed pretty clever at the time, but this is not something that belongs in the lowering.  The problem is that it assumes that certain operations zero-extend for you.  But that's not the case if the operation spills.

Here are some possible solutions:

1) Just turn off copy propagation: This will work, but it will cause pointless zero-extension operations.

2) (1) + add copy propagation to Air: We could teach Air which operations do a 32-bit zero-extend as part of their Def if the destination is a register.  Like, we could add DefZExt32 and UseDefZExt32.  This means that the copy propagation would run after register allocation.

3) (2) + teach IRC to coalesce Move32's but without removing them.  Then instead of full-blown copy propagation, we can just have a forward analysis that proves which locations (registers, spill slots, etc) have a value that is already zero-extended.  The part of this that deals with spill slots could be flow-insensitive while the register part is flow-sensitive.

4) (3) + teach IRC that a Move32 is only coalescable if all of the values that flow into it are already zero-extended based on some forward flow analysis.  I think that this will work because coalescing only happens if both variables get a register, which means that the meaning of DefZExt32 and UseDefZExt32 is preserved.

5) Mandate that DefZExt32 and UseDefZExt32 don't admitsStack(), and keep the copy propagation in the lowering.  This means substantially less optimal code on x86.

Of these options, I think that I like (4) best, but I still need to convince myself that it's right.
Comment 1 Filip Pizlo 2015-12-16 15:46:44 PST
I should clarify that (4) does not require any copy propagation later.  I think that this makes this approach the most scalable as well because it doesn't require a flow-sensitive analysis.  Tmp's tend to have an SSA-like quality to them, so flow sensitivity buys relatively little.

We could even use this to convert more Move's into Move32's.  Currently the instruction selector uses Move's for 32-bit values because Move32's are not coalescable.  But they would become coalescable if we had an analysis that showed that the value is already zero-extended.  There are two ways we could go about this:

i) Have the instruction selector emit Move32's for Int32's.  Then coalesce Move32's that are proven to only see zero-extended values.  This means that if we screw up our zero-extension analysis in Air (by being too conservative), we'll end up with Move32's that didn't get coalesced.

ii) Have the instruction selector still emit Move's for Int32's, but then have an analysis that converts the Move's into coalescable Move32's.

The second option, (ii), has the interesting property that we could convert Move's to Move32's using either backward or forward reasoning.

Forward reasoning: As I already described above.

Backward reasoning: Add the notion of Use32 and make UseDefZExt32 imply that the use is  like Use32.  If the destination of a Move is only used by Use32's then we can use Move32 instead.

Either way, the newly minted Move32 will be register-coalescable, in the sense that if the definitions of the Move32's source all use a register then the Move32 behaves like Move. From the standpoint of IRC, it means that if the source of the Move32 doesn't get spilled then it's coalescable, which in turn means that it's just coalescable no matter what.

Probably what this means is an analysis that IRC uses, which tells it which Move's and Move32's are really register-coalescable Move32's. IRC will either remove them, or turn them into Move's.
Comment 2 Filip Pizlo 2015-12-18 11:41:41 PST
Created attachment 267646 [details]
work in progress
Comment 3 Filip Pizlo 2015-12-18 14:18:06 PST
Created attachment 267650 [details]
more
Comment 4 Filip Pizlo 2015-12-18 15:27:19 PST
Created attachment 267652 [details]
probably ok

I haven't tested it yet.
Comment 5 Filip Pizlo 2015-12-18 19:52:13 PST
Created attachment 267677 [details]
starting to work
Comment 6 Filip Pizlo 2015-12-18 21:34:35 PST
Created attachment 267681 [details]
the patch
Comment 7 WebKit Commit Bot 2015-12-18 21:36:55 PST
Attachment 267681 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/air/AirInst.h:113:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirInstInlines.h:58:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirInstInlines.h:102:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirUseCounts.h:80:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6080:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6097:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6097:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6099:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6099:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6101:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6103:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6105:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/air/AirLiveness.h:93:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirLiveness.h:219:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirLiveness.h:226:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirLiveness.h:235:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirInst.cpp:41:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirSpillEverything.cpp:70:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirSpillEverything.cpp:122:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirEliminateDeadCode.cpp:83:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirTmpWidth.cpp:40:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirTmpWidth.cpp:77:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirAllocateStack.cpp:107:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirAllocateStack.cpp:151:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:60:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:106:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:188:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:358:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:429:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 29 in 24 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 8 Benjamin Poulain 2015-12-19 00:32:57 PST
Comment on attachment 267681 [details]
the patch

This creates a mega class that knows about all the states, all the time. This is against WebKit's policies of reducing dependencies. Classes of limited scopes are typically less buggy and easier to hack on.
Comment 9 Filip Pizlo 2015-12-19 06:48:55 PST
(In reply to comment #8)
> Comment on attachment 267681 [details]
> the patch
> 
> This creates a mega class that knows about all the states, all the time.
> This is against WebKit's policies of reducing dependencies. Classes of
> limited scopes are typically less buggy and easier to hack on.

I don't believe that's a WebKit policy, and if it is, it surely doesn't apply to cases where we use classes to prevent threading lots of parameters trough helper functions. You don't need to do such threading if those helpers already have access to some shared state object. Its not useful to apply policies about dependencies to classes that are used as a short-lived scope. 

In the DFG we have always written phases as classes. This follows a long tradition of doing so in other compilers. For example most llvm phases so this as well. 

It also follows a long tradition of using classes this way in JSC:

- JSGlobalObject knows about all of the state of a realm.
- VM knows about all of the state of the Vm. 
- DFG::Graph knows about all of the state of DFG IR. 
- FTL::State knows about all of the state of an FTL compilation. 
- JIT knows about all of the state of a baseline compilation. 
- PolymoprphicAccess knows about all of the state of an inline cache compilation. 
- etc. 

I'm surprised that you don't want to follow the established style of writing compiler phases in WebKit. The use of classes to hold the state needed during a compile and then making the helpers be private instance methods goes back to the days of the baseline JIT - i.e. it predates either of our involvement in the VM. The DFG followed this style consistently, and most of the B3 code follows it with the exception of:

- phases that can get by just using closures. I'm still not sure if that's a good style but nobody has objected.
- IRC, which deliberately goes against the established style. 

Consistency is important. Code is easier to hack if it all flows in a similar way. I think that your argument about controlling dependencies is ultimately weaker than the need for consistency, and if your argument were to be accepted then we would probably need to refactor existing compiler code to follow it. We're not going to do that since it would be a tremendous amount of code churn and it's not clear what it would improve.
Comment 10 Benjamin Poulain 2015-12-19 07:58:54 PST
I am not arguing against a class for the phase. That is probably a good idea for consistency.

What I said is we should not have a single class that knows all the states of the program. You can have a class with the phase and still have a separate class for coloring. 99% of the states of IteratedRegisterCoalescingAllocator are only useful in the narrow context of coloring. It is already complicated as it is. It will only get worse as we improve its perf. We should move things out of it, not make it even more complicated.
Comment 11 Filip Pizlo 2015-12-19 08:12:06 PST
(In reply to comment #10)
> I am not arguing against a class for the phase. That is probably a good idea
> for consistency.
> 
> What I said is we should not have a single class that knows all the states
> of the program. You can have a class with the phase and still have a
> separate class for coloring. 99% of the states of
> IteratedRegisterCoalescingAllocator are only useful in the narrow context of
> coloring. It is already complicated as it is. It will only get worse as we
> improve its perf. We should move things out of it, not make it even more
> complicated.

Ah!  I'm fine with this approach. 

Do you want to do this in your refactoring patch, or should I do it in this patch?  I'd be fine waiting for your refactoring if you think it's something you can do today.
Comment 12 Benjamin Poulain 2015-12-19 08:35:45 PST
(In reply to comment #11)
> Ah!  I'm fine with this approach. 
> 
> Do you want to do this in your refactoring patch, or should I do it in this
> patch?  I'd be fine waiting for your refactoring if you think it's something
> you can do today.

It will probably be easier to merge this change over the refactoring than the other way around. I'll add a phase class to the refactoring.

I'll rest a bit first, Jet-lag is kicking hard. I'll try to look before PST evening.
Comment 13 Filip Pizlo 2015-12-19 13:32:24 PST
Created attachment 267703 [details]
rebased

I'm not sure if this works yet.
Comment 14 Filip Pizlo 2015-12-19 14:06:42 PST
Created attachment 267704 [details]
the patch
Comment 15 WebKit Commit Bot 2015-12-19 14:08:00 PST
Attachment 267704 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/air/AirInst.h:113:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirInstInlines.h:58:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirInstInlines.h:102:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirUseCounts.h:80:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6398:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6415:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6415:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6417:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6417:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6419:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6421:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6423:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/air/AirLiveness.h:93:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirLiveness.h:219:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirLiveness.h:226:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirLiveness.h:235:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirInst.cpp:41:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/ftl/FTLOSRExitHandle.cpp:44:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp:4228:  Extra space before )  [whitespace/parens] [2]
ERROR: Source/JavaScriptCore/b3/air/AirSpillEverything.cpp:70:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirSpillEverything.cpp:122:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirEliminateDeadCode.cpp:83:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirTmpWidth.cpp:55:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirTmpWidth.cpp:92:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirAllocateStack.cpp:107:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirAllocateStack.cpp:151:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:783:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:852:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 28 in 27 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 16 Filip Pizlo 2015-12-19 14:08:36 PST
Created attachment 267705 [details]
the patch
Comment 17 WebKit Commit Bot 2015-12-19 14:10:44 PST
Attachment 267705 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/air/AirInst.h:113:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirInstInlines.h:58:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirInstInlines.h:102:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirUseCounts.h:80:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6398:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6415:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6415:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6417:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6417:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6419:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6421:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:6423:  Consider using CHECK_EQ instead of CHECK(a == b)  [readability/check] [2]
ERROR: Source/JavaScriptCore/b3/air/AirLiveness.h:93:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirLiveness.h:219:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirLiveness.h:226:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirLiveness.h:235:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirInst.cpp:41:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/ftl/FTLOSRExitHandle.cpp:44:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirSpillEverything.cpp:70:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirSpillEverything.cpp:122:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirEliminateDeadCode.cpp:83:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirTmpWidth.cpp:55:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirTmpWidth.cpp:92:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirAllocateStack.cpp:107:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirAllocateStack.cpp:151:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:783:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:852:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 27 in 27 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 18 Benjamin Poulain 2015-12-20 06:30:12 PST
Comment on attachment 267705 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=267705&action=review

That's pretty neat. A few questions bellow:

> Source/JavaScriptCore/ChangeLog:43
> +        could make CCalls on slow paths use a variant of CCallSpecial that promises not to clobber any
> +        registers, and then have it emit spill code around the call itself. LLVM probably gets this
> +        optimization from its live range splitting.

Tell me when the benchmark runs, I'll have a look too.

> Source/JavaScriptCore/b3/air/AirArg.h:810
> +            functor(m_base, Use, GP, argRole == UseAddr ? argWidth : pointerWidth());

I am not following this.

Why is UseAddr different from actual Add/Index use?

> Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:661
> -        return mayBeCoalescable(inst) && inst.args[0].tmp() == inst.args[1].tmp();
> +        return mayBeCoalescableImpl(inst, nullptr) && inst.args[0].tmp() == inst.args[1].tmp();

It may be easier to just not use mayBeCoalescableImpl() and just check if the opcode is a move and the temps are equals.

> Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:1090
> +            // FIXME: One way to optimize this code is to remove the recomputation inside the fixpoint.
> +            // We need to recompute because spilling adds tmps, but we could just update tmpWidth when we
> +            // add those tmps.
> +            m_tmpWidth.recompute(m_code);

I thought you would only do that after spilling.
Often, FP does not spill so you could reuse everything already computed.

Updating just the new tmps is a good idea.

> Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:1141
> +                if (mayBeCoalescable && inst.args[0].isTmp() && inst.args[1].isTmp()

You don't really need to check "inst.args[0].isTmp() && inst.args[1].isTmp()" since it is implied by mayBeCoalescable.

> Source/JavaScriptCore/b3/air/AirTmpWidth.cpp:80
> +                    if (inst.args[0].value() <= std::numeric_limits<int8_t>::max())
> +                        widths.def = std::max(widths.def, Arg::Width8);

Not getting this. If I have a large negative number, wouldn't it end up as Width8?
Comment 19 Filip Pizlo 2015-12-20 07:38:44 PST
(In reply to comment #18)
> Comment on attachment 267705 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=267705&action=review
> 
> That's pretty neat. A few questions bellow:
> 
> > Source/JavaScriptCore/ChangeLog:43
> > +        could make CCalls on slow paths use a variant of CCallSpecial that promises not to clobber any
> > +        registers, and then have it emit spill code around the call itself. LLVM probably gets this
> > +        optimization from its live range splitting.
> 
> Tell me when the benchmark runs, I'll have a look too.

It runs if you apply this patch. 

> 
> > Source/JavaScriptCore/b3/air/AirArg.h:810
> > +            functor(m_base, Use, GP, argRole == UseAddr ? argWidth : pointerWidth());
> 
> I am not following this.
> 
> Why is UseAddr different from actual Add/Index use?

UseAddr is what Lea does. You could have a 32-bit Lea. A 32-bit Lea will have a Width32 UseAddr. If you do:

Lea32 4(%rax), %edx

Then you're actually only using the low 32 bits of %rax. This is unlike all other occurrences of 4(%rax), which use pointerWidth bits. 

> 
> > Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:661
> > -        return mayBeCoalescable(inst) && inst.args[0].tmp() == inst.args[1].tmp();
> > +        return mayBeCoalescableImpl(inst, nullptr) && inst.args[0].tmp() == inst.args[1].tmp();
> 
> It may be easier to just not use mayBeCoalescableImpl() and just check if
> the opcode is a move and the temps are equals.

There is some small benefit to making the Impl variant private. In particular the protocol of TmpWidth* being null is currently encapsulated, which is nice. 

> 
> > Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:1090
> > +            // FIXME: One way to optimize this code is to remove the recomputation inside the fixpoint.
> > +            // We need to recompute because spilling adds tmps, but we could just update tmpWidth when we
> > +            // add those tmps.
> > +            m_tmpWidth.recompute(m_code);
> 
> I thought you would only do that after spilling.
> Often, FP does not spill so you could reuse everything already computed.
> 
> Updating just the new tmps is a good idea.

I want to add all of those optimization a but since this is a bug fix patch and since it doesn't hurt our performance on any benchmark, I think it's better to land this naive version and optimize it later. How you update TmpWidth when you add temps in the spiller is not super obvious, so it's the kind of thing that is safer to do in a follow up patch. 

> 
> > Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:1141
> > +                if (mayBeCoalescable && inst.args[0].isTmp() && inst.args[1].isTmp()
> 
> You don't really need to check "inst.args[0].isTmp() &&
> inst.args[1].isTmp()" since it is implied by mayBeCoalescable.
> 

True. 

> > Source/JavaScriptCore/b3/air/AirTmpWidth.cpp:80
> > +                    if (inst.args[0].value() <= std::numeric_limits<int8_t>::max())
> > +                        widths.def = std::max(widths.def, Arg::Width8);
> 
> Not getting this. If I have a large negative number, wouldn't it end up as
> Width8?

There is a non-negative check that guards all of this.
Comment 20 Benjamin Poulain 2015-12-21 02:11:49 PST
Comment on attachment 267705 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=267705&action=review

Looks good to me.

The Lea case is tricky. I am glad you thought about it right away.

> Source/JavaScriptCore/ChangeLog:18
> +          ascribe a type to Tmp's. We could ascribe types to Tmp's, but then coalescing would become

Not sure what the possessive form refers to there. "Tmp's" -> "Tmps"?
Comment 21 Filip Pizlo 2015-12-21 07:49:18 PST
(In reply to comment #20)
> Comment on attachment 267705 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=267705&action=review
> 
> Looks good to me.
> 
> The Lea case is tricky. I am glad you thought about it right away.
> 
> > Source/JavaScriptCore/ChangeLog:18
> > +          ascribe a type to Tmp's. We could ascribe types to Tmp's, but then coalescing would become
> 
> Not sure what the possessive form refers to there. "Tmp's" -> "Tmps"?

Yeah, that's what I meant.
Comment 22 Filip Pizlo 2015-12-21 08:17:00 PST
Landed in http://trac.webkit.org/changeset/194331