Bug 126320

Summary: CStack: Need a separate stack limit for the JS stack and the C stack
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: JavaScriptCoreAssignee: Mark Lam <mark.lam>
Status: RESOLVED FIXED    
Severity: Normal CC: fpizlo, ggaren, mhahnenberg, msaboff, oliver
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 126266, 126321, 126328, 126331, 126334, 126335, 126487    
Bug Blocks: 125928    
Attachments:
Description Flags
the patch.
ggaren: review-
patch 2: work in progress. Uploading for personal inspection. Not ready for review yet.
none
patch 3: with lots of bug fixes and refinements.
none
patch 4: fixed some C loop bit rot from patch 3.
ggaren: review+
patch 5: updated patch for landing. none

Mark Lam
Reported 2013-12-30 20:52:52 PST
Though the JS stack is made up of frames on the C stack and the OS may allow a large C stack, we want to cap the amount of memory that the JS stack use to a more modest number. One reason for this is so that bad JS code with runaway recursion won't take too long to hit a StackOverflowError and fail out, thereby ensuring that the browser remains relatively interactive (i.e. not appear to hang in the presence of such bad code).
Attachments
the patch. (16.27 KB, patch)
2013-12-31 02:15 PST, Mark Lam
ggaren: review-
patch 2: work in progress. Uploading for personal inspection. Not ready for review yet. (49.70 KB, patch)
2014-01-07 18:36 PST, Mark Lam
no flags
patch 3: with lots of bug fixes and refinements. (69.60 KB, patch)
2014-01-08 17:02 PST, Mark Lam
no flags
patch 4: fixed some C loop bit rot from patch 3. (69.40 KB, patch)
2014-01-09 09:27 PST, Mark Lam
ggaren: review+
patch 5: updated patch for landing. (68.48 KB, patch)
2014-01-09 13:01 PST, Mark Lam
no flags
Mark Lam
Comment 1 2013-12-31 02:15:54 PST
Created attachment 220154 [details] the patch.
Mark Lam
Comment 2 2013-12-31 02:23:54 PST
Landed in r161180 on the jsCStack branch: <http://trac.webkit.org/r161180>.
Geoffrey Garen
Comment 3 2014-01-02 15:33:18 PST
Comment on attachment 220154 [details] the patch. This patch looks way too over-engineered. There doesn't need to be a chunk-based stack limit per VM entry. That's also likely to be wrong, since we need to account for intermediate C stack usage by the VM. This patch should be one line of code that puts a call to "std::max" around our existing calculation for stack bounds. That's all that's required to prevent runaway recursion.
Mark Lam
Comment 4 2014-01-02 19:26:27 PST
(In reply to comment #3) > (From update of attachment 220154 [details]) > This patch looks way too over-engineered. > > There doesn't need to be a chunk-based stack limit per VM entry. That's also likely to be wrong, since we need to account for intermediate C stack usage by the VM. Any C stack usage by a helper function that doesn’t do deep recursion should be covered by the host zone allowance. Any C stack usage by a C function that does deep recursion should have its own C stack recursion checks. The only stack usage we’re trying to account for in this patch is C stack usage by JS CallFrames and their locals / args. > This patch should be one line of code that puts a call to "std::max" around our existing calculation for stack bounds. That's all that's required to prevent runaway recursion. Please clarify how “std::max” solves the problem of needing a separate limit for JS stack growth. Consider the following scenario: Scenario 1: A bit of JS on each end. 1. Let’s say we want a JS stack size limit of 4M. 2. At the start of the C program that is using JSC, we execute a bit of JS code and use a bit of JS stack. 3. Then we call a C function that recurses deeply and uses almost all of the C stack. 4. Then we call back into JS code and discover that we are now at the 4M limit (if we measure stack usage by a simple comparison against the start of the C stack). Consider yet another scenario: Scenario 2: A bit of JS after a lot of C. 1. Let’s say we want a JS stack size limit of 4M. 2. At the start of the C program, it recurses deeply and uses up almost 4M of stack space. 3. Then we call into JS code and discover that we are now at the 4M limit. In both of these cases, hardly any JS code has executed. Do we throw a StackOverflowError there even though we might have another 4M of C stack space available? How are you proposing that we handle this scenario without some form of tracking per VMEntryScope usage of stack space (hence, chunks)? Or are you saying that the 4M limit is an absolute limit, and if the program has used up most of it for C function recursion, then it is OK to not allow JS code to use any more stack space even though there is plenty more C stack space available? I came up with the chunk tracking approach because both Filip and I are of the opinion that while we want to limit JS stack usage, we should not penalize it for stack used by C functions. This behavior is consistent with how JSC behaved back when the JS stack was in its own separate region of memory. Please explain your view of how things should work here, and how you want me to proceed. Thanks.
Geoffrey Garen
Comment 5 2014-01-03 12:17:32 PST
> Consider the following scenario: > > Scenario 1: A bit of JS on each end. > 1. Let’s say we want a JS stack size limit of 4M. > 2. At the start of the C program that is using JSC, we execute a bit of JS code and use a bit of JS stack. > 3. Then we call a C function that recurses deeply and uses almost all of the C stack. > 4. Then we call back into JS code and discover that we are now at the 4M limit (if we measure stack usage by a simple comparison against the start of the C stack). In this scenario, JavaScriptCore should throw a stack overflow exception. > Consider yet another scenario: > > Scenario 2: A bit of JS after a lot of C. > 1. Let’s say we want a JS stack size limit of 4M. > 2. At the start of the C program, it recurses deeply and uses up almost 4M of stack space. > 3. Then we call into JS code and discover that we are now at the 4M limit. In this scenario, JavaScriptCore should throw a stack overflow exception. > In both of these cases, hardly any JS code has executed. Do we throw a StackOverflowError there even though we might have another 4M of C stack space available? How are you proposing that we handle this scenario without some form of tracking per VMEntryScope usage of stack space (hence, chunks)? A global cap on stack extent. > Or are you saying that the 4M limit is an absolute limit, and if the program has used up most of it for C function recursion, then it is OK to not allow JS code to use any more stack space even though there is plenty more C stack space available? Define "plenty more C stack space available". We can make our stack limit as high as we like. But, once we hit the limit, there is not, in point of fact, "plenty more": there is, in point of fact, none more. > I came up with the chunk tracking approach because both Filip and I are of the opinion that while we want to limit JS stack usage, we should not penalize it for stack used by C functions. This behavior is consistent with how JSC behaved back when the JS stack was in its own separate region of memory. Consistency with arbitrary effects of the old way of doing things is a non-goal.
Mark Lam
Comment 6 2014-01-07 18:36:05 PST
Created attachment 220579 [details] patch 2: work in progress. Uploading for personal inspection. Not ready for review yet.
Mark Lam
Comment 7 2014-01-08 17:02:59 PST
Created attachment 220677 [details] patch 3: with lots of bug fixes and refinements.
Mark Lam
Comment 8 2014-01-08 17:10:39 PST
Patch 3 also fixes a build failure in WebCore due to previous patches that I committed. I've run the complete layout test on this patch and so far have only seen 2 failures that were unexpected: inspector-protocol/debugger/setPauseOnExceptions-all.html [ Crash Timeout Pass ] inspector-protocol/debugger/setPauseOnExceptions-uncaught.html [ Crash Timeout Pass ] In both of these cases, the crash is due to the stack frame failing to align properly. In both cases, the misalignment was introduced in the sp in a JS function that called operationVMHandleException(). That tells me the caller is supposed to be a JITted function. However, a dump of the CallFrame of that function shows that it is an interpreted frame: (gdb) p /x $rbp $17 = 0x7fff5fbfdca0 (gdb) p debugPrintCallFrame(0x7fff5fbfdca0) frame 0x7fff5fbfdca0 { name 'exceptionDOM' sourceURL 'file://.../LayoutTests/inspector-protocol/debugger/resources/exception.js' isVMEntrySentinel 0 isInlinedFrame 0 callee 0x10a7ddc30 returnPC 0x1005f6bb2 callerFrame 0x7fff5fbfdcf0 rawLocationBits 47 0x2f codeBlock 0x10f85d5e0 bytecodeOffset 47 0x2f / 61 line 8 column 30 jitType 2 <InterpreterThunk> isOptimizingJIT 0 hasCodeOrigins 0 } Regardless, the alignment issue is unrelated to this patch.
Mark Lam
Comment 9 2014-01-08 17:43:09 PST
(In reply to comment #8) > Regardless, the alignment issue is unrelated to this patch. The bug for the misaligned stack frame is at https://bugs.webkit.org/show_bug.cgi?id=126673.
Mark Lam
Comment 10 2014-01-09 09:27:46 PST
Created attachment 220745 [details] patch 4: fixed some C loop bit rot from patch 3.
Michael Saboff
Comment 11 2014-01-09 10:42:06 PST
Comment on attachment 220745 [details] patch 4: fixed some C loop bit rot from patch 3. View in context: https://bugs.webkit.org/attachment.cgi?id=220745&action=review > Source/JavaScriptCore/ChangeLog:69 > + of the very first one will restire the stack limit with the non-error Typo "restore" > Source/JavaScriptCore/ChangeLog:97 > + a. LLINT and JIT code performing stack checks where the expect the Should "where the expect" be "where they expect"? > Source/JavaScriptCore/ChangeLog:497 > Reviewed by Michael Saboff. > + Update: this is supplanted by the 2014-01-08 patch for https://bugs.webkit.org/show_bug.cgi?id=126320. I don't think you should retroactively update Reviewed patches. Your new change log entry explains that the current patch replaces prior patches. > Source/JavaScriptCore/ChangeLog:816 > - Not yet reviewed. > + Rejected by Geoffrey Garen. > + Update: this is supplanted by the 2014-01-08 patch for https://bugs.webkit.org/show_bug.cgi?id=126320. In this case where the patch was rejected, if the code has been completely removed, I support that the change log entry is removed. If there is any vestige of the code corresponding to this entry, I'm fine with this update. > Source/JavaScriptCore/ChangeLog:902 > - Not yet reviewed. > + Rejected by Geoffrey Garen. > + Update: this is supplanted by the 2014-01-08 patch for https://bugs.webkit.org/show_bug.cgi?id=126320. Same as above. > Source/JavaScriptCore/ChangeLog:979 > + Update: this is supplanted by the 2014-01-08 patch for https://bugs.webkit.org/show_bug.cgi?id=126320. Not needed as described above. > Source/JavaScriptCore/ChangeLog:1011 > - Not yet reviewed. > + Rejected by Geoffrey Garen. > + Update: this is supplanted by the 2014-01-08 patch for https://bugs.webkit.org/show_bug.cgi?id=126320. Same as above.
Mark Lam
Comment 12 2014-01-09 10:56:00 PST
Thanks. (In reply to comment #11) > > Source/JavaScriptCore/ChangeLog:69 > > + of the very first one will restire the stack limit with the non-error > > Typo "restore" Will fix. > > Source/JavaScriptCore/ChangeLog:97 > > + a. LLINT and JIT code performing stack checks where the expect the > > Should "where the expect" be "where they expect"? Will fix. > > Source/JavaScriptCore/ChangeLog:497 > > Reviewed by Michael Saboff. > > + Update: this is supplanted by the 2014-01-08 patch for https://bugs.webkit.org/show_bug.cgi?id=126320. > > I don't think you should retroactively update Reviewed patches. Your new change log entry explains that the current patch replaces prior patches. This is my attempt to help make the merging effort later simpler by providing the relevant information at the change that was made. I expect that the merge of this group of patches will be summed up as a single patch, and the ChangeLog on ToT will not show all the old changes that we supplanted. This will also make reading the diff for this patch on trunk a lot more consistent since on trunk where we don't expect to commit without a review (unlike this branch). I can remove these update comments if the team thinks that this does more harm than good.
Geoffrey Garen
Comment 13 2014-01-09 11:00:42 PST
Comment on attachment 220745 [details] patch 4: fixed some C loop bit rot from patch 3. View in context: https://bugs.webkit.org/attachment.cgi?id=220745&action=review > Source/JavaScriptCore/heap/ConservativeRoots.cpp:102 > + ASSERT(static_cast<size_t>(static_cast<char*>(end) - static_cast<char*>(begin)) < wtfThreadData().stack().size()); This is wrong because it will fire if Thread A garbage collects and marks Thread B's stack, and Thread B's stack is larger than Thread A's stack. > Source/JavaScriptCore/interpreter/JSStack.cpp:57 > + size_t capacity = Options::maxPerThreadRecursionStackUsage(); Let's take "Recursion" out of this name. The stack usage limit is independent of whether recursion happened. > Source/JavaScriptCore/interpreter/JSStack.cpp:152 > +void JSStack::setHostZoneSize(size_t hostZoneSize) It doesn't make sense for the JSStack class to have a concept of a "host zone size". The whole point of the JSStack class is that, when it is in use, it is segregated from the host stack, and does not contain any host stack data. > Source/JavaScriptCore/interpreter/JSStackInlines.h:163 > + ptrdiff_t maxExcessInRegisters = maxExcessCapacity; If the only point of this value is to work around a compilation issue, you should not give it a new name, which implies, incorrectly, that it has some different meaning than its right hand side. Instead, you should say "ptrdiff_t maxExcessCapacity = JSStack::maxExcessCapacity;".
Mark Lam
Comment 14 2014-01-09 11:09:33 PST
(In reply to comment #13) > (From update of attachment 220745 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=220745&action=review > > > Source/JavaScriptCore/heap/ConservativeRoots.cpp:102 > > + ASSERT(static_cast<size_t>(static_cast<char*>(end) - static_cast<char*>(begin)) < wtfThreadData().stack().size()); > > This is wrong because it will fire if Thread A garbage collects and marks Thread B's stack, and Thread B's stack is larger than Thread A's stack. Good point. But that makes it hard to reinstate this assertion. One option is to have the caller of gatherConservativeRoots() pass the stack bounds in and pipe it through. Or do you have a better idea for how to do this assertion? > > Source/JavaScriptCore/interpreter/JSStack.cpp:57 > > + size_t capacity = Options::maxPerThreadRecursionStackUsage(); > > Let's take "Recursion" out of this name. The stack usage limit is independent of whether recursion happened. OK. Will do. > > Source/JavaScriptCore/interpreter/JSStack.cpp:152 > > +void JSStack::setHostZoneSize(size_t hostZoneSize) > > It doesn't make sense for the JSStack class to have a concept of a "host zone size". The whole point of the JSStack class is that, when it is in use, it is segregated from the host stack, and does not contain any host stack data. This is not true. Some time back, you told me to remove the JSStack::entryChecks() at all the Interpreter::execute*() entry points. And instead, we’ll have a host zone even in the C loop JSStack so that doCallJavaScript() can push the VMEntrySentinel frame without requiring a stack check first. That’s why we have this host zone. > > Source/JavaScriptCore/interpreter/JSStackInlines.h:163 > > + ptrdiff_t maxExcessInRegisters = maxExcessCapacity; > > If the only point of this value is to work around a compilation issue, you should not give it a new name, which implies, incorrectly, that it has some different meaning than its right hand side. Instead, you should say "ptrdiff_t maxExcessCapacity = JSStack::maxExcessCapacity;". OK, will do.
Geoffrey Garen
Comment 15 2014-01-09 12:32:47 PST
Comment on attachment 220745 [details] patch 4: fixed some C loop bit rot from patch 3. View in context: https://bugs.webkit.org/attachment.cgi?id=220745&action=review >>> Source/JavaScriptCore/interpreter/JSStack.cpp:152 >>> +void JSStack::setHostZoneSize(size_t hostZoneSize) >> >> It doesn't make sense for the JSStack class to have a concept of a "host zone size". The whole point of the JSStack class is that, when it is in use, it is segregated from the host stack, and does not contain any host stack data. > > This is not true. Some time back, you told me to remove the JSStack::entryChecks() at all the Interpreter::execute*() entry points. And instead, we’ll have a host zone even in the C loop JSStack so that doCallJavaScript() can push the VMEntrySentinel frame without requiring a stack check first. That’s why we have this host zone. It sounds like you're using this zone for a few purposes: host code, bounded JS usage, and error recovery. So, "host zone" is probably not a great name anymore. How about "reserved zone". r=me, with a follow-up patch for that.
Mark Lam
Comment 16 2014-01-09 13:01:22 PST
Created attachment 220762 [details] patch 5: updated patch for landing. Per our offline discussion, I did not re-instate the stack span assertion in ConservativeRoots::genericAddSpan() (removed it from the patch) because we don't have a meaningful the stack bounds value to assert against there. I'll leave https://bugs.webkit.org/show_bug.cgi?id=126335 as the bug that removed the assertion and update its comment to indicate the reason for the removal of the assertion. I also left the "update" comments in the ChangeLog (for prior commits that point to this change) as Filip and I both thought that it is helpful, with the understanding that these comments will be squashed when we merge back to trunk.
Mark Lam
Comment 17 2014-01-09 13:08:06 PST
Thanks for all the reviews. Landed on the jsCStack branch in r161575: <http://trac.webkit.org/r161575>.
Mark Lam
Comment 18 2014-01-09 13:22:39 PST
(In reply to comment #15) > r=me, with a follow-up patch for that. Bug to rename "host zone" to "reserved zone" is at https://bugs.webkit.org/show_bug.cgi?id=126716.
Note You need to log in before you can comment on or make changes to this bug.