Summary: | BytecodeGenerator ".call" and ".apply" is exponential in nesting depth | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Mike Fikes <mike> | ||||||
Component: | JavaScriptCore | Assignee: | Saam Barati <saam> | ||||||
Status: | RESOLVED FIXED | ||||||||
Severity: | Normal | CC: | benapfiel, buildbot, chris, commit-queue, dieter, fpizlo, ggaren, jfbastien, kapouer, keith_miller, mark.lam, mcatanzaro, Meanjeanaco, mrowe, msaboff, nekohayo, oliver, ryanhaddad, saam, sherbondye, webkit-bug-importer | ||||||
Priority: | P2 | Keywords: | InRadar | ||||||
Version: | 528+ (Nightly build) | ||||||||
Hardware: | Mac (Intel) | ||||||||
OS: | OS X 10.10 | ||||||||
Attachments: |
|
Description
Mike Fikes
2014-12-19 16:51:41 PST
Confirmed reproduced on Linux debian jessie using webkit2gtk 2.6.2 on epiphany browser. WebKitWebProcess goes 100% CPU and consumes many gigabytes as well. I've been able to reproduce using run-jsc from the command line (OS X 10.10.2). The "delay" happens BEFORE any of the functions ever gets to be called. To see this, you can modify the inc function: inc = function(x) { debug(x); return x + 1; }; The observed behavior is: long delay with NO OUTPUT (suggesting the function calls aren't actually 'happening'), then all output is emitted quickly as expected. Tried using -p with jsc to get a profile, but jsc kept running and it started eating huge amounts of memory (> 16GBs just for the test case running with -p). I'll investigate the issue some more. I was able to reproduce this on an older PowerPC Mac running OS X 10.4.11 and Safari 4.1.3 (circa 2010). This still reproduces with Safari Technology Preview Release 23 (Safari 10.2, WebKit 12604.1.5) My guess is we're recursively emitting bytecode in a way that has exponential blowup. (In reply to comment #6) > My guess is we're recursively emitting bytecode in a way that has > exponential blowup. Yeah that's totally what's happening. That's hilarious. We should just have a back-off on that optimization. Like, maintain a count of how deep in the "doubling" due to .call, .apply, or other jneq_ptr-based opts. If more than K deep then don't do the optimization. Probably the most optimal way to do it would be upside down: don't do the optimization if there are more than K people below you in the tree who want to do it, so that the optimization kicks in for the leaves of that gross call tree. (In reply to comment #6) > My guess is we're recursively emitting bytecode in a way that has > exponential blowup. Yup. This is indeed what's happening. (In reply to comment #8) > (In reply to comment #6) > > My guess is we're recursively emitting bytecode in a way that has > > exponential blowup. > > Yup. This is indeed what's happening. We generated 134217673 bytecodes for this function! (In reply to comment #9) > (In reply to comment #8) > > (In reply to comment #6) > > > My guess is we're recursively emitting bytecode in a way that has > > > exponential blowup. > > > > Yup. This is indeed what's happening. > > We generated 134217673 bytecodes for this function! That's epic! (In reply to comment #7) > (In reply to comment #6) > > My guess is we're recursively emitting bytecode in a way that has > > exponential blowup. > > Yeah that's totally what's happening. That's hilarious. > > We should just have a back-off on that optimization. Like, maintain a count > of how deep in the "doubling" due to .call, .apply, or other jneq_ptr-based > opts. If more than K deep then don't do the optimization. > > Probably the most optimal way to do it would be upside down: don't do the > optimization if there are more than K people below you in the tree who want > to do it, so that the optimization kicks in for the leaves of that gross > call tree. Yeah this sounds reasonable to me. For some context, this issue was found here: http://dev.clojure.org/jira/browse/CLJS-910 patch forthcoming. Created attachment 307321 [details]
WIP
needs a changelog
Created attachment 307331 [details]
patch
Hmmm, something looks wrong. I ran tests locally where we always force the "slow" path, and some tests are failing. I think this means that our bytecode is already slightly broken, since we emit the slow path alongside a bytecode branch. I'm gonna try to figure out why tests are failing. (In reply to Saam Barati from comment #16) > Hmmm, something looks wrong. I ran tests locally where we always force the > "slow" path, and some tests are failing. I think this means that our > bytecode is already slightly broken, since we emit the slow path alongside a > bytecode branch. I'm gonna try to figure out why tests are failing. Actually, it's fine. The slow path just emits a different error message. Comment on attachment 307331 [details]
patch
lgtm
Comment on attachment 307331 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=307331&action=review LGTM too with some suggestions and comments. > Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:1206 > + if (m_callOrApplyChildDepth > 4 && emitCallCheck) { The ChangeLog said 6, but you're using a limit of 4 here. Also, should this be >= instead of > because distance of the innermost child from itself is 0, and you wanted to allow up to k levels of calls out from the innermost child, right? Or did I misunderstand your intent? Please also use a symbolic constant here instead of a literal? Maybe define the constant at the top of this file? Alternatively, if this you think this may make a difference for performance tuning, use a new JSC option instead. > Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:1286 > + if (m_callOrApplyChildDepth > 4 && emitCallCheck) { Ditto. > Source/JavaScriptCore/parser/Parser.h:890 > + struct CallOrApplyDepth { I suggest calling this CallOrApplyDepthScope since it is used like a scope. It'll also make the above "callOrApplyDepth ? callOrApplyDepth->maxChildDepth() : 0" line read better as "callOrApplyDepthScope ? callOrApplyDepthScope->maxChildDepth() : 0". I was confused at first when I read "callOrApplyDepth->maxChildDepth()". > Source/JavaScriptCore/parser/Parser.h:895 > + , m_childDepth(m_depth) I suggest renaming this to m_depthOfInnermostChild. The name m_childDepth is a bit ambiguous and can be confused with the depth of my immediate child when this value is actually for tracking the depth of the innermost child. > Source/JavaScriptCore/parser/Parser.h:900 > + size_t maxChildDepth() const I suggest naming this distanceFromInnermostChild(). The name maxChildDepth() is a bit confusing to me because "m_childDepth" (or my preferred "m_depthOfInnermostChild") is an absolute value set from the depth value that each CallOrApplyDepthScope is at, while the value returned by this function is actually a relative value of distance from the innermost child. I suggested "distanceFrom..." which it implies a relative value with the innermost child as the reference. Similarly, I suggest renaming the argument / parameter values to match m_depthOfInnermostChild or distanceFromInnermostChild as appropriate. Comment on attachment 307331 [details] patch Clearing flags on attachment: 307331 Committed r215453: <http://trac.webkit.org/changeset/215453> All reviewed patches have been landed. Closing bug. The test for this change is failing on debug bots: https://build.webkit.org/builders/Apple%20Sierra%20Debug%20JSC%20%28Tests%29/builds/191 stress/call-apply-exponential-bytecode-size.js.default: Exception: RangeError: Maximum call stack size exceeded. stress/call-apply-exponential-bytecode-size.js.default: eval@[native code] stress/call-apply-exponential-bytecode-size.js.default: randomApplyOrCall@call-apply-exponential-bytecode-size.js:40:16 stress/call-apply-exponential-bytecode-size.js.default: global code@call-apply-exponential-bytecode-size.js:44:25 stress/call-apply-exponential-bytecode-size.js.default: ERROR: Unexpected exit code: 3 (In reply to Ryan Haddad from comment #22) > The test for this change is failing on debug bots: > > https://build.webkit.org/builders/Apple%20Sierra%20Debug%20JSC%20%28Tests%29/ > builds/191 > > stress/call-apply-exponential-bytecode-size.js.default: Exception: > RangeError: Maximum call stack size exceeded. > stress/call-apply-exponential-bytecode-size.js.default: eval@[native code] > stress/call-apply-exponential-bytecode-size.js.default: > randomApplyOrCall@call-apply-exponential-bytecode-size.js:40:16 > stress/call-apply-exponential-bytecode-size.js.default: global > code@call-apply-exponential-bytecode-size.js:44:25 > stress/call-apply-exponential-bytecode-size.js.default: ERROR: Unexpected > exit code: 3 Will fix shortly as I address Mark's comments. Addressed Mark's comments and hopefully the debug stack overflow in: https://trac.webkit.org/changeset/215474/webkit |