Spending all our time in stack graph coloring.

<rdar://problem/88933599>

I found two bottlenecks: - The creation of the interference graph has cubic complexity with a large constant factor: there are n**2 edges, and adding a new edge can require shifting right the BitVector form of LikelyDenseUnsignedIntegerSet. It just so happens that this example not only has n**2 edges, but also requires this shift for almost every single one! - The actual call to assign() (in AirStackAllocation.cpp) is also responsible for a cubic blow-up, as for each stack slot (n of them) we try up to one possible offset per other stack slot (n of them), and test each possible offset against all other stack slots (n of them). The second bottleneck can be lessened to O(n**2 * log(n)) by sorting the stack slots by offset doing the tests smartly. The first bottleneck I can improve the constant factor by a lot by always rounding the offset of the bit vector to the next lowest multiple of 64. This means 64x fewer shifts, and the shifts can be much cheaper since they become mere memcpy. I intend to land these two changes in separate patches.

First patch: https://bugs.webkit.org/show_bug.cgi?id=236997

*** This bug has been marked as a duplicate of bug 236610 ***

Ignore the dupe -- this was completed as two checkins (see related bugs).