Bug 125881

Summary: frameRegisterCount() should include maxFrameExtentForSlowPathCall.
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: JavaScriptCoreAssignee: Mark Lam <mark.lam>
Status: RESOLVED FIXED    
Severity: Normal CC: fpizlo, ggaren, msaboff
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
the patch.
ggaren: review-
patch 2: applied requested fixes.
ggaren: review+
Patch for commit. Extra bits reviewed by msaboff. none

Mark Lam
Reported 2013-12-17 16:07:21 PST
In https://bugs.webkit.org/show_bug.cgi?id=125849#c8, Pizlo said, "The *entire point* of frameRegisterCount() is that it should already include maxFrameExtentForSlowPathCall." Patch coming to make frameRegisterCount() include maxFrameExtentForSlowPathCall.
Attachments
the patch. (8.52 KB, patch)
2013-12-17 16:14 PST, Mark Lam
ggaren: review-
patch 2: applied requested fixes. (8.10 KB, patch)
2013-12-17 17:29 PST, Mark Lam
ggaren: review+
Patch for commit. Extra bits reviewed by msaboff. (12.77 KB, patch)
2013-12-17 18:45 PST, Mark Lam
no flags
Mark Lam
Comment 1 2013-12-17 16:14:20 PST
Created attachment 219463 [details] the patch.
Filip Pizlo
Comment 2 2013-12-17 16:26:09 PST
Comment on attachment 219463 [details] the patch. View in context: https://bugs.webkit.org/attachment.cgi?id=219463&action=review > Source/JavaScriptCore/bytecode/CodeBlock.cpp:3376 > + return jitCode()->dfgCommon()->frameRegisterCountWithSlowPathFrameExtent(); > + This is wrong. This doesn't actually change the DFG's stack layout behavior. The DFG never calls CodeBlock::frameRegisterCount(). (Nor should it - the DFG knows that it's the DFG so it will call Graph::frameRegisterCount() directly. Also, during DFG compilation we don't have a JITCode yet, so this method would just segfault.) Just change Graph::frameRegisterCount() to include the parameter slow path thingy. And make sure that it does the right thing for both JS calls and slow path calls. Right now the DFG lays out a parameter area for itself already. The only thing missing is that it only accounts for the number of parameter slots needed for JS->JS calls (Graph::m_parameterSlots). So, just use something like std::max(m_parameterSlots, maxFrameExtentForSlowPathCall). Of course there will probably be some alignment or other arithmetic. > Source/JavaScriptCore/dfg/DFGCommonData.cpp:83 > +unsigned CommonData::frameRegisterCountWithSlowPathFrameExtent() > +{ > + size_t registerCount = frameRegisterCount + maxFrameExtentForSlowPathCallInRegisters; > + ASSERT(registerCount == WTF::roundUpToMultipleOf(stackAlignmentRegisters(), registerCount)); > + return registerCount; > +} > + Remove this. > Source/JavaScriptCore/dfg/DFGCommonData.h:94 > + unsigned frameRegisterCountWithSlowPathFrameExtent(); > + Remove. > Source/JavaScriptCore/jit/JIT.h:251 > + static unsigned frameRegisterCountInBytesFor(CodeBlock* codeBlock) > { > - ASSERT(!(codeBlock->m_numCalleeRegisters & 1)); > - return codeBlock->m_numCalleeRegisters; > + return -virtualRegisterForLocal(frameRegisterCountFor(codeBlock) - 1).offset() * sizeof(Register); > } Why does this exist? Why can't you just always say frameRegisterCountFor(codeBlock) * sizeof(Register)?
Geoffrey Garen
Comment 3 2013-12-17 16:59:12 PST
Comment on attachment 219463 [details] the patch. View in context: https://bugs.webkit.org/attachment.cgi?id=219463&action=review > Source/JavaScriptCore/bytecode/CodeBlock.cpp:3378 > + return jitCode()->dfgCommon()->frameRegisterCountWithSlowPathFrameExtent(); > + > case JITCode::FTLJIT: > return jitCode()->dfgCommon()->frameRegisterCount; This smells fishy. I don't think the DFG wants a frameRegisterCount, and a separate frameRegisterCountWithSlowPathFrameExtent helper function, where the former is an invitation to do things wrong, and the latter is the right answer. I think we just want to put the right answer into frameRegisterCount. My understanding is that's the whole point of that data member. frameRegisterCount is already initialized for us. The bug you need to fix is that DFG::Graph::frameRegisterCount() does not add in the max slow path thingy. (This also means that the DFG is using the wrong value for SP right now.) > Source/JavaScriptCore/jit/JIT.cpp:785 > + ASSERT(!(codeBlock->m_numCalleeRegisters & 1)); Let's remove this. > Source/JavaScriptCore/jit/JIT.h:250 > + return -virtualRegisterForLocal(frameRegisterCountFor(codeBlock) - 1).offset() * sizeof(Register); Instead of frameRegisterCountInBytes, it look like what we need is for VirtualRegister to provide an offsetInBytes() function. We don't want to directly encode the "-" here, because that makes an assumption about the number line used by VirtualRegister, and the whole point of virtualRegisterForLocal is to encapsulate that number line. > Source/JavaScriptCore/jit/JITOpcodes.cpp:640 > + size_t frameExtent = JIT::frameRegisterCountInBytesFor(codeBlock()); Let's rename the LHS to match the RHS.
Filip Pizlo
Comment 4 2013-12-17 17:20:23 PST
(In reply to comment #3) > (From update of attachment 219463 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=219463&action=review > > > Source/JavaScriptCore/bytecode/CodeBlock.cpp:3378 > > + return jitCode()->dfgCommon()->frameRegisterCountWithSlowPathFrameExtent(); > > + > > case JITCode::FTLJIT: > > return jitCode()->dfgCommon()->frameRegisterCount; > > This smells fishy. > > I don't think the DFG wants a frameRegisterCount, and a separate frameRegisterCountWithSlowPathFrameExtent helper function, where the former is an invitation to do things wrong, and the latter is the right answer. I think we just want to put the right answer into frameRegisterCount. My understanding is that's the whole point of that data member. Yup. > > frameRegisterCount is already initialized for us. The bug you need to fix is that DFG::Graph::frameRegisterCount() does not add in the max slow path thingy. (This also means that the DFG is using the wrong value for SP right now.) Yes. > > > Source/JavaScriptCore/jit/JIT.cpp:785 > > + ASSERT(!(codeBlock->m_numCalleeRegisters & 1)); > > Let's remove this. > > > Source/JavaScriptCore/jit/JIT.h:250 > > + return -virtualRegisterForLocal(frameRegisterCountFor(codeBlock) - 1).offset() * sizeof(Register); > > Instead of frameRegisterCountInBytes, it look like what we need is for VirtualRegister to provide an offsetInBytes() function. > > We don't want to directly encode the "-" here, because that makes an assumption about the number line used by VirtualRegister, and the whole point of virtualRegisterForLocal is to encapsulate that number line. > > > Source/JavaScriptCore/jit/JITOpcodes.cpp:640 > > + size_t frameExtent = JIT::frameRegisterCountInBytesFor(codeBlock()); > > Let's rename the LHS to match the RHS.
Mark Lam
Comment 5 2013-12-17 17:29:44 PST
Created attachment 219479 [details] patch 2: applied requested fixes.
Geoffrey Garen
Comment 6 2013-12-17 17:41:42 PST
Comment on attachment 219479 [details] patch 2: applied requested fixes. Let's remove the JSStack::CallerFrameAndPCSize line. The slow path helper, by definition, must include those. (That value seems to have been added in <http://trac.webkit.org/changeset/160506> as a crude estimate of maxFrameExtentForSlowPathCallInRegisters.)
Michael Saboff
Comment 7 2013-12-17 18:15:06 PST
(In reply to comment #6) > (From update of attachment 219479 [details]) > Let's remove the JSStack::CallerFrameAndPCSize line. The slow path helper, by definition, must include those. (That value seems to have been added in <http://trac.webkit.org/changeset/160506> as a crude estimate of maxFrameExtentForSlowPathCallInRegisters.) CallerFrameAndPCSize was added in <http://trac.webkit.org/changeset/160506> to account for the caller's frame pointer and return PC for outgoing calls. If we remove CallerFrameAndPCSize from this place, then maxFrameExtentForSlowPathCallInRegisters should be adjusted up on all platforms to account for the same. I believe that Mark will do this in a separate patch.
Filip Pizlo
Comment 8 2013-12-17 18:26:02 PST
Comment on attachment 219479 [details] patch 2: applied requested fixes. View in context: https://bugs.webkit.org/attachment.cgi?id=219479&action=review I smell weirdness. Can you comment on what is going on in the DFG with CallerFrameAndPCSize, and why the other engines don't do similar? > Source/JavaScriptCore/dfg/DFGGraph.cpp:708 > + unsigned maxOutgoingArgsSlots = std::max(m_parameterSlots, static_cast<unsigned>(JSStack::CallerFrameAndPCSize)); Isn't the point here that you're trying to account for the outgoing frame header size for calls on platforms that can pass all arguments in registers? In that case, why isn't this just part of maxFrameExtentForSlowPathCallInRegisters? And why doesn't the baseline JIT also use this as part of its frameRegisterCount() calculation? > Source/JavaScriptCore/jit/JIT.cpp:787 > + size_t registerCount = codeBlock->m_numCalleeRegisters + maxFrameExtentForSlowPathCallInRegisters; > + ASSERT(registerCount == WTF::roundUpToMultipleOf(stackAlignmentRegisters(), registerCount)); > + return registerCount; This feels like it should also take into account JSStack::CallerFrameAndPCSize, just like you're doing in the DFG. Or, even better, neither should take it into account because it would be factored into maxFrameExtentForSlowPathCallInRegisters or some other constant.
Filip Pizlo
Comment 9 2013-12-17 18:27:26 PST
(In reply to comment #7) > (In reply to comment #6) > > (From update of attachment 219479 [details] [details]) > > Let's remove the JSStack::CallerFrameAndPCSize line. The slow path helper, by definition, must include those. (That value seems to have been added in <http://trac.webkit.org/changeset/160506> as a crude estimate of maxFrameExtentForSlowPathCallInRegisters.) > > CallerFrameAndPCSize was added in <http://trac.webkit.org/changeset/160506> to account for the caller's frame pointer and return PC for outgoing calls. If we remove CallerFrameAndPCSize from this place, then maxFrameExtentForSlowPathCallInRegisters should be adjusted up on all platforms to account for the same. I believe that Mark will do this in a separate patch. OK, but it would really be better to do this as part of this patch. I don't like how this patch leaves the code in an inconsistent state. It'll make revision archeology very confusing.
Mark Lam
Comment 10 2013-12-17 18:45:37 PST
Created attachment 219488 [details] Patch for commit. Extra bits reviewed by msaboff.
Filip Pizlo
Comment 11 2013-12-17 18:46:37 PST
Comment on attachment 219488 [details] Patch for commit. Extra bits reviewed by msaboff. View in context: https://bugs.webkit.org/attachment.cgi?id=219488&action=review > Source/JavaScriptCore/dfg/DFGGraph.cpp:708 > - unsigned result = m_nextMachineLocal + std::max(m_parameterSlots, static_cast<unsigned>(JSStack::CallerFrameAndPCSize)); > + unsigned result = m_nextMachineLocal + std::max(m_parameterSlots, static_cast<unsigned>(maxFrameExtentForSlowPathCallInRegisters)); OMG so much better!
Filip Pizlo
Comment 12 2013-12-17 18:46:50 PST
Comment on attachment 219488 [details] Patch for commit. Extra bits reviewed by msaboff. r=me too
Mark Lam
Comment 13 2013-12-17 19:35:53 PST
Thanks for all the reviews. Landed in r160745: <http://trac.webkit.org/r160745>.
Geoffrey Garen
Comment 14 2013-12-17 20:18:12 PST
> CallerFrameAndPCSize was added in <http://trac.webkit.org/changeset/160506> to account for the caller's frame pointer and return PC for outgoing calls. Do you mean outgoing calls to VM helper functions, or outgoing calls to JavaScript functions? If we have an outgoing call to a JavaScript function, m_parameterSlots should be non-0, and maxing with CallerFrameAndPCSize should be a no-op. Also, it would be supremely weird to encode the stack frame needs of a JavaScript function call in a variable named maxFrameExtentForSlowPathCall. If we have an outgoing call to a VM helper function, the return PC and return BP of the host frame are pushed after the call, and so they do not influence what frameRegisterCount should be set to, since frameRegisterCount is the value that sp should hold before the call instruction. Either way, it smells like this patch covered up a really nasty bug by arbitrarily making the stack frame a tiny bit bigger.
Michael Saboff
Comment 15 2013-12-17 21:15:51 PST
(In reply to comment #14) > > CallerFrameAndPCSize was added in <http://trac.webkit.org/changeset/160506> to account for the caller's frame pointer and return PC for outgoing calls. > > Do you mean outgoing calls to VM helper functions, or outgoing calls to JavaScript functions? > > If we have an outgoing call to a JavaScript function, m_parameterSlots should be non-0, and maxing with CallerFrameAndPCSize should be a no-op. Also, it would be supremely weird to encode the stack frame needs of a JavaScript function call in a variable named maxFrameExtentForSlowPathCall. > > If we have an outgoing call to a VM helper function, the return PC and return BP of the host frame are pushed after the call, and so they do not influence what frameRegisterCount should be set to, since frameRegisterCount is the value that sp should hold before the call instruction. > > Either way, it smells like this patch covered up a really nasty bug by arbitrarily making the stack frame a tiny bit bigger. It was only for outgoing calls to VM helper functions. frameRegisterCount needs to include the size of the largest outgoing frame and it includes the return PC and return BP that the callee will push. When there aren't any outgoing JS calls, we need to include space for the return PC and BP of any outgoing VM helper calls. Therefore the max() of m_parameterSlots and CallerFrameAndPCSize. When the stack pointer is set up, we don't include (i.e. subtract out) CallerFrameAndPCSize as the callee will push return PC and return BP. The current patch rightly doesn't change the setting of the stack pointer, we are still subtracting out CallerFrameAndPCSize. The current patch didn't change any of the calculations results for the DFG, it just moved CallerFrameAndPCSize to be part of maxFrameExtentForSlowPathCall. If frameRegisterCount is the value that sp should set to before the call, we create a stack frame that has two extra slots. This is due to m_parameterSlots including the size of a full frame header. See ByteCodeParser::addCall() where we have: m_parameterSlots = JSStack::ThisArgument + argCount The stack check must check at least frameRegisterCount without subtracting CallerFrameAndPCSize.
Geoffrey Garen
Comment 16 2013-12-17 21:58:52 PST
> If frameRegisterCount is the value that sp should set to before the call, we create a stack frame that has > two extra slots. This is due to m_parameterSlots including the size of a full frame header. See > ByteCodeParser::addCall() where we have: > m_parameterSlots = JSStack::ThisArgument + argCount Can we fix this by initializing m_parameterSlots to a more appropriate value (JSStack::CallFrameHeaderSize - CallerFrameAndPCSize + argCount)? What's the motivation behind pretending that we have two more outgoing parameter slots than we have actually allocated in our stack frame? If somebody ever tried to iterate from 0 to m_parameterSlots, they would corrupt the stack. That doesn't seem so good. > The stack check must check at least frameRegisterCount without subtracting CallerFrameAndPCSize. Sure, the stack check has special needs. But the stack check computation is distinct from setting sp. The purpose of frameRegisterCount is to set sp. I believe our stated plan for accommodating CallerFrameAndPCSize in C++ helper functions was just to fold it into in the redzone we already need for C++ functions generally. Knowing that you have 16 bytes free on the stack does you no good if you're about to call a C++ function that uses that 16 bytes for CallerFrameAndPCSize plus an additional 1kB for its local variables.
Note You need to log in before you can comment on or make changes to this bug.