Bug 125740

Summary: Create a DFG SSA backend with global register allocation
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Filip Pizlo <fpizlo>
Status: NEW    
Severity: Normal CC: barraclough, benjamin, fpizlo, ggaren, mark.lam, mhahnenberg, msaboff, nrotem, oliver, sam
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
work in progress
none
moar
none
implemented the RA
none
for real this time
none
more
none
more
none
less
none
more none

Filip Pizlo
Reported 2013-12-14 12:15:12 PST
A lot of the speed-up from the FTL is from optimizations implemented on the DFG side. Maybe we should just lower to machine code directly from DFG IR, but just incrementally improve how we do it.
Attachments
work in progress (29.92 KB, patch)
2013-12-14 13:44 PST, Filip Pizlo
no flags
moar (38.01 KB, patch)
2013-12-14 15:08 PST, Filip Pizlo
no flags
implemented the RA (17.34 KB, patch)
2013-12-14 23:19 PST, Filip Pizlo
no flags
for real this time (51.50 KB, patch)
2013-12-15 14:13 PST, Filip Pizlo
no flags
more (62.85 KB, patch)
2013-12-15 18:20 PST, Filip Pizlo
no flags
more (67.63 KB, patch)
2013-12-15 20:49 PST, Filip Pizlo
no flags
less (60.41 KB, patch)
2013-12-16 22:01 PST, Filip Pizlo
no flags
more (61.56 KB, patch)
2014-03-02 14:56 PST, Filip Pizlo
no flags
Filip Pizlo
Comment 1 2013-12-14 12:25:30 PST
This is going to be kind of dirty at first, but I think it'll work. Here's what we need: 1) Compute an estimate of the data format and scratch register count for each node. These really are estimates, for the most part, and we have a lot of room for slop here. More on that below. 2) Compute the liveness that includes OSR availability. This is kind of fun because it combines two backwards analyses (DFG liveness and BC liveness), one forward analysis (OSR availability), and some CFA results (canExit). I'll refer to this as liveness-including-availability, or LIA. 3) Compute interference and use that to do coloring for both spill slots and registers, globally. Use a quick-and-dirty coloring algorithm for spills and a smart algorithm for registers. 4) Institute a policy that at the top of each basic block, all live nodes (according to LIA) should be in the register and stored according to the format/register/spill that those nodes report. 5) Code generation proceeds independently for each basic block and runs forward, as now. If the code generator wants more scratch registers, it may spill nodes. It may also decide to change the representation of a node. But it is required to return all nodes to their original formats at the end of the basic block. (5) is the reason why (1) can be sloppy. For example we may compute that a node is formatted as DataFormatJS and uses 1 scratch register, but then that node may actually return a DataFormatInt32 and use 2 scratch registers. Worst case, this causes one local spill (that gets restored by the end of the block) and causes the code generator to have to reformat the node at the end of the block.
Filip Pizlo
Comment 2 2013-12-14 13:44:58 PST
Created attachment 219257 [details] work in progress
Filip Pizlo
Comment 3 2013-12-14 15:08:43 PST
Filip Pizlo
Comment 4 2013-12-14 23:19:39 PST
Created attachment 219274 [details] implemented the RA I wrote Chaitin's algorithm. It was enjoyable.
Oliver Hunt
Comment 5 2013-12-15 13:45:39 PST
Comment on attachment 219258 [details] moar View in context: https://bugs.webkit.org/attachment.cgi?id=219258&action=review > Source/JavaScriptCore/dfg/DFGLocationAllocationPhase.cpp:199 > + HashMap<Node*, HashSet<Node*>> m_interference; unused? > Source/JavaScriptCore/dfg/DFGNode.h:1563 > + ASSERT(static_cast<unsigned>(m_scratches) == num); this cast shouldn't be necessary as both types are unsigned. If your goal is to avoid undefined behaviour shenanigans your best approach would be to do ASSERT(!(num & ~0xff)), but none of tat should matter as asserts are only on in debug builds so no edge case optimisation should occur.
Oliver Hunt
Comment 6 2013-12-15 13:46:13 PST
(In reply to comment #4) > Created an attachment (id=219274) [details] > implemented the RA > > I wrote Chaitin's algorithm. It was enjoyable. wrong patch?
Filip Pizlo
Comment 7 2013-12-15 14:13:28 PST
(In reply to comment #6) > (In reply to comment #4) > > Created an attachment (id=219274) [details] [details] > > implemented the RA > > > > I wrote Chaitin's algorithm. It was enjoyable. > > wrong patch? Yeah.
Filip Pizlo
Comment 8 2013-12-15 14:13:53 PST
Created attachment 219282 [details] for real this time
Filip Pizlo
Comment 9 2013-12-15 18:20:06 PST
Filip Pizlo
Comment 10 2013-12-15 20:49:36 PST
Oliver Hunt
Comment 11 2013-12-15 21:08:33 PST
Comment on attachment 219293 [details] more View in context: https://bugs.webkit.org/attachment.cgi?id=219293&action=review > Source/JavaScriptCore/dfg/DFGAllocationUnit.h:106 > + uintptr_t m_word; Can we make this a bitfield rather than manually doing it? Or do you foresee these varying at runtime in future?
Filip Pizlo
Comment 12 2013-12-15 21:27:45 PST
(In reply to comment #11) > (From update of attachment 219293 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=219293&action=review > > > Source/JavaScriptCore/dfg/DFGAllocationUnit.h:106 > > + uintptr_t m_word; > > Can we make this a bitfield rather than manually doing it? Or do you foresee these varying at runtime in future? How would this be a bitfield? How do you envision a bitfield for pointers?
Oliver Hunt
Comment 13 2013-12-16 09:46:45 PST
Comment on attachment 219293 [details] more View in context: https://bugs.webkit.org/attachment.cgi?id=219293&action=review >>> Source/JavaScriptCore/dfg/DFGAllocationUnit.h:106 >>> + uintptr_t m_word; >> >> Can we make this a bitfield rather than manually doing it? Or do you foresee these varying at runtime in future? > > How would this be a bitfield? How do you envision a bitfield for pointers? Sorry, i missed it being purely a pointer - i just saw it being used in shift operations. My bad
Filip Pizlo
Comment 14 2013-12-16 22:01:14 PST
Created attachment 219397 [details] less I think I have a better way of dealing with scratch registers.
Filip Pizlo
Comment 15 2014-03-02 14:56:15 PST
Note You need to log in before you can comment on or make changes to this bug.