WebKit Bugzilla
New
Browse
Search+
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
12535
Stack-optimizing compilers can trick GC into freeing in-use objects
https://bugs.webkit.org/show_bug.cgi?id=12535
Summary
Stack-optimizing compilers can trick GC into freeing in-use objects
Huan Ren
Reported
2007-02-01 13:23:01 PST
on windows platform, I observed quite a few crashes at kjs_pcre_exec(). the cause has been traced back to previous call at KJS::FunctionCallDotNode::evaluate -> KJS::StringImp::toObject -> KJS::StringInstance::StringInstance -> KJS::jsString where the garbage collector frees an object in use. This incorrect free casues memory corruption and leads to later crash at kjs_pcre_exec()
Attachments
trace pointers into bodies
(2.10 KB, patch)
2007-02-01 16:53 PST
,
Feng Qian
no flags
Details
Formatted Diff
Diff
a patch for preformance test
(3.36 KB, patch)
2007-02-05 14:16 PST
,
Feng Qian
mjs
: review-
Details
Formatted Diff
Diff
Proposing fix
(1.38 KB, patch)
2007-03-06 22:38 PST
,
Huan Ren
mjs
: review-
Details
Formatted Diff
Diff
Fix
(1.32 KB, patch)
2007-03-07 11:33 PST
,
Huan Ren
mjs
: review+
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
David Kilzer (:ddkilzer)
Comment 1
2007-02-01 13:37:23 PST
Please post an example URL or a reduced test case to this bug. Thanks! Also, what are you using to test WebKit on Windows? Spinneret or another application?
Huan Ren
Comment 2
2007-02-01 13:49:51 PST
the bug repros randomly as it depends on when GC is invoked. my proposed fix is in kjs::FunctionCallDotNode::evaluate JSValue *FunctionCallDotNode::evaluate(ExecState *exec) { JSValue *baseVal = base->evaluate(exec); // // In the optimized release build, the compiler may discard the reference // to baseObj immediately after it resolves the address of // baseVal->toObject(). This makes baseVal subject to garbage collection // inside baseVal->toObject(). // If the runtime type of baseVal is StringImp, StringImp::toObject() will // invoke a chain of function calls that allocate new objects while passing // its member variable val (type UString) as a parameter. // If Garbage Collector is triggered during memory allocation, baseVal is // deleted and thus its member val, which is passed as a parameter, becomes // invalid. When newly created object references val, we will have memory // corruption. // Collector::protect(baseVal); JSObject *baseObj = baseVal->toObject(exec); Collector::unprotect(baseVal); ...
Feng Qian
Comment 3
2007-02-01 14:55:38 PST
The is caused by a general unsafe practice in KJS code: take the address of a field in a JSCell object, and passed it around. For example: in KJS internal.h class StringImp : public JSCell { private: UString val; } In StringImp::toObject(ExecState* exec);, val is passed as a reference to the constructor StringInstance::StringInstance. Unfortunately, the compiler was smart enough that it thought this pointer is not used anymore, and put it in a volatile register. The constructor triggers GC, and this object is freed. Now the address of 'val' points to an invalid address. The problem is that the garbage collector only compares pointers to the start address of JSCell's. If there is a pointer to the middle of a JSCell, it won't be traced. One partial fix I'd suggest is to treat a JSCell as live if there is a root pointer pointing to its body. This will allow taking addresses of a field in a JSCell and passing it to stack/local variables. BUT this cannot solve the problem when such an address is assigned to somewhere in the heap. Comments?
Geoffrey Garen
Comment 4
2007-02-01 15:12:47 PST
(In reply to
comment #2
) Anrong, did you mean to say that the compiler may discard the reference to 'baseObj' or to 'baseVal'? Your reasoning points to 'baseVal', not 'baseObj'. Would this bug persist if toObject() made use of 'this', prohibiting the compiler from optimizing out baseVal? I don't think 'Collector::protect(baseVal);' is a very good solution. It's inefficient, and it doesn't seem to address the root cause of the problem, which may affect lots of different parts of the code.
Geoffrey Garen
Comment 5
2007-02-01 15:23:28 PST
(In reply to
comment #3
) I don't think pointers in the (non-GC) heap are a consideration. You already have to protect them explicitly. Honoring pointers to object bodies as pointers to the objects themselves seems like it would solve the problem. I worry that it might increase marking overhead substantially because we would start honoring more 'fake' pointers. Another solution would be for objects to create an explicit, volatile stack reference to 'this' before handing off pointers to their internals. Yet another solution (in this particular case) would be to store a temporary UString on the stack, and pass that UString by reference.
Feng Qian
Comment 6
2007-02-01 16:07:37 PST
(In reply to
comment #5
)
> (In reply to
comment #3
) > > I don't think pointers in the (non-GC) heap are a consideration. You already > have to protect them explicitly. > > Honoring pointers to object bodies as pointers to the objects themselves seems > like it would solve the problem. I worry that it might increase marking > overhead substantially because we would start honoring more 'fake' pointers.
I think the overhead might no much. Instead of checking if (offset % sizeof(CollectorCell) == 0) It does some arithmatic calculation: x = cellStart + (offset / sizeof(CollectorCell)) * sizeof(CollectorCell); We need some additional work for oversized cells. Size infomation is needed. We can malloc an additional int before the requested size, and store the cell size into it. This works like a Java array with length in its header. BTW, I experimented this change, and it works properly. Overall, these only cost a few instruction cycles when scanning root pointers. I think it is not a big issue. I will post a patch later.
> > Another solution would be for objects to create an explicit, volatile stack > reference to 'this' before handing off pointers to their internals. Yet another > solution (in this particular case) would be to store a temporary UString on the > stack, and pass that UString by reference. >
(In reply to
comment #5
)
> (In reply to
comment #3
) > > I don't think pointers in the (non-GC) heap are a consideration. You already > have to protect them explicitly. > > Honoring pointers to object bodies as pointers to the objects themselves seems > like it would solve the problem. I worry that it might increase marking > overhead substantially because we would start honoring more 'fake' pointers. > > Another solution would be for objects to create an explicit, volatile stack > reference to 'this' before handing off pointers to their internals. Yet another > solution (in this particular case) would be to store a temporary UString on the > stack, and pass that UString by reference. >
Feng Qian
Comment 7
2007-02-01 16:53:43 PST
Created
attachment 12864
[details]
trace pointers into bodies The patch doesn't following the format required by webkit.org, I put it here for discussion.
Maciej Stachowiak
Comment 8
2007-02-01 17:23:39 PST
(In reply to
comment #6
) >
> > Honoring pointers to object bodies as pointers to the objects themselves seems > > like it would solve the problem. I worry that it might increase marking > > overhead substantially because we would start honoring more 'fake' pointers. > > I think the overhead might no much. Instead of checking > if (offset % sizeof(CollectorCell) == 0) > It does some arithmatic calculation: > x = cellStart + (offset / sizeof(CollectorCell)) * sizeof(CollectorCell); > > We need some additional work for oversized cells. Size infomation is needed. We > can malloc an additional int before the requested size, and store the cell size > into it. This works like a Java array with length in its header. BTW, I > experimented this change, and it works properly. > > Overall, these only cost a few instruction cycles when scanning root pointers. > I think it is not a big issue.
The problem isn't the extra instructions to do the math. It's the cost of extra false positives. Note that right now before scanning the blocks we check: if (IS_CELL_ALIGNED(x) && x) { IS_CELL_ALIGNED checks for 8-byte alignment, which is not valid if we want to honor arbitrary pointers into object bodies. If we checked every nonzero value on the stack, we would do the block loop a lot more, which would be a significant performance regression.
Huan Ren
Comment 9
2007-02-01 18:07:28 PST
(In reply to
comment #4
)
> (In reply to
comment #2
) > > Anrong, did you mean to say that the compiler may discard the reference to > 'baseObj' or to 'baseVal'? Your reasoning points to 'baseVal', not 'baseObj'. > Would this bug persist if toObject() made use of 'this', prohibiting the > compiler from optimizing out baseVal? > > I don't think 'Collector::protect(baseVal);' is a very good solution. It's > inefficient, and it doesn't seem to address the root cause of the problem, > which may affect lots of different parts of the code. >
Geoffrey, I meant baseVal. The bug goes away if toObject() made use of 'this' thus effectively lock baseVal. Collector::protect(baseVal) fixes this code path. if this is a common pattern then other solutions being discussed here may be better if they have no global impact.
Huan Ren
Comment 10
2007-02-01 22:07:47 PST
(In reply to
comment #9
)
> (In reply to
comment #4
) > > (In reply to
comment #2
) > > > > Anrong, did you mean to say that the compiler may discard the reference to > > 'baseObj' or to 'baseVal'? Your reasoning points to 'baseVal', not 'baseObj'. > > Would this bug persist if toObject() made use of 'this', prohibiting the > > compiler from optimizing out baseVal? > > > > I don't think 'Collector::protect(baseVal);' is a very good solution. It's > > inefficient, and it doesn't seem to address the root cause of the problem, > > which may affect lots of different parts of the code. > > > > Geoffrey, > > I meant baseVal. The bug goes away if toObject() made use of 'this' thus > effectively lock baseVal. > Collector::protect(baseVal) fixes this code path. if this is a common pattern > then other solutions being discussed here may be better if they have no global > impact. >
The following fix has also been verified: #if COMPILER(MSVC) #pragma optimize("", off) #endif JSObject *StringImp::toObject(ExecState *exec) const { const StringImp *ptr = this; return new StringInstance(exec->lexicalInterpreter()->builtinStringPrototype(), ptr->val); } #if COMPILER(MSVC) #pragma optimize("", on) #endif I agree this fix is better than Collector::protect as it addresses all calling path. Turning off optimization has little cost here since the function is simple enough. (It turns out the compiler will optimize the variable away even we declare the local to be volatile) I could provide the patch.
Geoffrey Garen
Comment 11
2007-02-01 22:50:36 PST
Is that really the only function hit by this bug?
Feng Qian
Comment 12
2007-02-02 11:14:58 PST
(In reply to
comment #8
)
> (In reply to
comment #6
) > > > > > Honoring pointers to object bodies as pointers to the objects themselves seems > > > like it would solve the problem. I worry that it might increase marking > > > overhead substantially because we would start honoring more 'fake' pointers. > > > > I think the overhead might no much. Instead of checking > > if (offset % sizeof(CollectorCell) == 0) > > It does some arithmatic calculation: > > x = cellStart + (offset / sizeof(CollectorCell)) * sizeof(CollectorCell); > > > > We need some additional work for oversized cells. Size infomation is needed. We > > can malloc an additional int before the requested size, and store the cell size > > into it. This works like a Java array with length in its header. BTW, I > > experimented this change, and it works properly. > > > > Overall, these only cost a few instruction cycles when scanning root pointers. > > I think it is not a big issue. > > The problem isn't the extra instructions to do the math. It's the cost of extra > false positives. Note that right now before scanning the blocks we check: > > if (IS_CELL_ALIGNED(x) && x) { > > IS_CELL_ALIGNED checks for 8-byte alignment, which is not valid if we want to > honor arbitrary pointers into object bodies. > > If we checked every nonzero value on the stack, we would do the block loop a > lot more, which would be a significant performance regression. >
Maciej, is there a performance test suite to measure how much overhead it would be? Maybe we can think about using a interval tree instead of the block loop to speed up the range test. Another solution is to fix these classes returning field addresses. It may not be hard to do, but it cannot prevent violations in the future.
Huan Ren
Comment 13
2007-02-02 11:18:25 PST
(In reply to
comment #11
)
> Is that really the only function hit by this bug? >
StringImp::toObject is the only one I discovered during debugging. Is passing member variables of JSCell in function call a common pattern in kjs?. I did a quick search on other toObject() functions under JavaScriptCore/kjs and did not find this issue.
Alexey Proskuryakov
Comment 14
2007-02-03 03:50:18 PST
> #if COMPILER(MSVC) > #pragma optimize("", off) > #endif
Is there a more cross-platform way to defeat the stack re-use? Perhaps coping the value into memory allocated by alloca() could help.
Maciej Stachowiak
Comment 15
2007-02-04 11:36:33 PST
We use JavaScript iBench (not publicly available) and the 24fun BenchJs benchmark:
http://www.24fun.com/downloadcenter/benchjs/benchjs.html
I'm happy to performance test any proposed patch.
Maciej Stachowiak
Comment 16
2007-02-04 11:49:26 PST
<
rdar://problem/4975127
>
Huan Ren
Comment 17
2007-02-05 10:53:21 PST
(In reply to
comment #14
)
> > #if COMPILER(MSVC) > > #pragma optimize("", off) > > #endif > > Is there a more cross-platform way to defeat the stack re-use? Perhaps coping > the value into memory allocated by alloca() could help. >
If we use alloca(), there is still a possibility that the compiler may optimize it away, just like what it does to baseVal described earlier. There seems no gruantee to prevent all compilers from doing that. For compilers used on other platform, we can consider also using pragma to turn off optimization, probably with different syntax, for this funciton if needed.
Feng Qian
Comment 18
2007-02-05 11:30:12 PST
(In reply to
comment #13
)
> (In reply to
comment #11
) > > Is that really the only function hit by this bug? > > > > StringImp::toObject is the only one I discovered during debugging. Is > passing member variables of JSCell in function call a common pattern in kjs?. I > did a quick search on other toObject() functions under JavaScriptCore/kjs and > did not find this issue. >
I sent my reply to Mark. Here is it again: Please correct me if I am wrong. A member function returning the address of/a reference to a field could also be a violator. I found a few cases: class InternalFunctionImp const Identifier& functionName() const { return m_name; } private: Identifier m_name; class Identifier { const UString& ustring() const { return _ustring; } private: UString _ustring; } One can write some code like: InternalFunctionImpl* func = new InternalFunctionImpl(); Identifier* id = func->functionName(); UString* v = id->ustring(); func = NULL; // trigger GC here // now id and v points to invalid memory Other similar cases: class FunctionImpl: const ScopeChain& scope() const { return _scope; } private: ScopeChain _scope; I went through these files so far.
Feng Qian
Comment 19
2007-02-05 14:16:38 PST
Created
attachment 12947
[details]
a patch for preformance test Maciej, I created this patch by using svn-create-patch script. Please use this patch to measure the performance on benchmarks you have. New code can be turned on/off by changing the value of ALLOW_INNER_POINTERS to 1/0.
Maciej Stachowiak
Comment 20
2007-02-07 00:48:02 PST
Comment on
attachment 12947
[details]
a patch for preformance test Flagging review? for someone at Apple (probably me) to perf-test this.
Maciej Stachowiak
Comment 21
2007-02-09 18:10:32 PST
I measured and this patch causes a 2% performance regression on the iBench, at least on Mac OS X (where it is not even needed since the compiler won't optimize out object pointers like that). I think this needs to be addressed more along the lines of Anrong Hu's approach, where we keep the compiler from dropping the pointer. I think copying to alloca()'d memory, or even just keeping it in a struct on the stack might do it, and something that works for current MSVC seems good enough.
Maciej Stachowiak
Comment 22
2007-02-09 22:46:12 PST
Comment on
attachment 12947
[details]
a patch for preformance test r- for reasons stated above
Feng Qian
Comment 23
2007-02-10 22:19:24 PST
(In reply to
comment #21
)
> I measured and this patch causes a 2% performance regression on the iBench, at > least on Mac OS X (where it is not even needed since the compiler won't > optimize out object pointers like that). I think this needs to be addressed > more along the lines of Anrong Hu's approach, where we keep the compiler from > dropping the pointer. I think copying to alloca()'d memory, or even just > keeping it in a struct on the stack might do it, and something that works for > current MSVC seems good enough. >
I am fine with Anrong's approach, but that won't cure the problem. What about another solution: fixing classes that return the address/a reference to a field? Like I mentioned in
Comment #18
, a method can return the value instead of a reference/address of the field.
Huan Ren
Comment 24
2007-02-15 12:41:43 PST
(In reply to
comment #23
)
> (In reply to
comment #21
) > > I measured and this patch causes a 2% performance regression on the iBench, at > > least on Mac OS X (where it is not even needed since the compiler won't > > optimize out object pointers like that). I think this needs to be addressed > > more along the lines of Anrong Hu's approach, where we keep the compiler from > > dropping the pointer. I think copying to alloca()'d memory, or even just > > keeping it in a struct on the stack might do it, and something that works for > > current MSVC seems good enough. > > > > I am fine with Anrong's approach, but that won't cure the problem. What about > another solution: fixing classes that return the address/a reference to a > field? > Like I mentioned in
Comment #18
, a method can return the value instead of a > reference/address of the field. >
I will submit a fix.
Huan Ren
Comment 25
2007-02-15 13:15:26 PST
(In reply to
comment #24
)
> (In reply to
comment #23
) > > (In reply to
comment #21
) > > > I measured and this patch causes a 2% performance regression on the iBench, at > > > least on Mac OS X (where it is not even needed since the compiler won't > > > optimize out object pointers like that). I think this needs to be addressed > > > more along the lines of Anrong Hu's approach, where we keep the compiler from > > > dropping the pointer. I think copying to alloca()'d memory, or even just > > > keeping it in a struct on the stack might do it, and something that works for > > > current MSVC seems good enough. > > > > > > > I am fine with Anrong's approach, but that won't cure the problem. What about > > another solution: fixing classes that return the address/a reference to a > > field? > > Like I mentioned in
Comment #18
, a method can return the value instead of a > > reference/address of the field. > > > > > I will submit a fix. >
I'd like to assign this bug to me. There are also other cross-platform p1 bugs I want to investigate. Is there a way to assign bugs to myself through Bugzilla?
Geoffrey Garen
Comment 26
2007-02-15 14:17:24 PST
I think you can just use the "Reassign" radio button.
Maciej Stachowiak
Comment 27
2007-02-20 00:36:46 PST
Returning a value instead of a pointer into the object seems fine. We should probably document that as a rule.
Ed Rowe
Comment 28
2007-02-20 01:17:58 PST
Honoring pointers to object bodies as pointers to the objects themselves would solve THIS problem but not the general class of issues. Imagine if rather than passing a reference to a member UString to jsString() we were passing a file handle, or a raw pointer (that needs to be deleted) or basically any sort of state whose lifetime is bound to the JSValue and which is not a pointer into the JSValue's memory. In any of these cases, the compiler would still optimize away any pointer to the object or its body, and we'd still have the problem. (In reply to
comment #5
)
> (In reply to
comment #3
) > I don't think pointers in the (non-GC) heap are a consideration. You already > have to protect them explicitly. > Honoring pointers to object bodies as pointers to the objects themselves seems > like it would solve the problem. I worry that it might increase marking > overhead substantially because we would start honoring more 'fake' pointers. > Another solution would be for objects to create an explicit, volatile stack > reference to 'this' before handing off pointers to their internals. Yet another > solution (in this particular case) would be to store a temporary UString on the > stack, and pass that UString by reference.
Huan Ren
Comment 29
2007-03-06 22:38:40 PST
Created
attachment 13507
[details]
Proposing fix A more cross-platform fix as suggested: using alloca() and putting the object reference onto stack.
Maciej Stachowiak
Comment 30
2007-03-07 00:36:31 PST
Comment on
attachment 13507
[details]
Proposing fix 1) This will break the build on mac and linux since they call it alloca() not _alloca() and prototype it in alloca.h, not malloc.h. Please fix that. 2) Is this definitely the only place affected? It looks like further up in the comments, Ian Eng identified more potential trouble spots. 3) To fix just this one particular case, wouldn't it be simpler to just copy val to a local UString variable on the stack? That avoids relying on something unportable like alloca(). UString valCopy = val; return new StringInstance(exec->lexicalInterpreter()->builtinStringPrototype(), valCopy); r- to at least address
comment 1
, since this will break the build otherwise.
Huan Ren
Comment 31
2007-03-07 11:33:41 PST
Created
attachment 13519
[details]
Fix
Huan Ren
Comment 32
2007-03-07 11:34:38 PST
(In reply to
comment #31
)
> Created an attachment (id=13519) [edit] > Fix >
Re: (2) As a recap, the calling pattern that makes this happen is: A->B->C. (a) In A, the reference to B is optimized away. (b) In B, its data member is passed as parameter to C. (c) C does sth that may trigger gc, and then accesses B's member. Condition (b) is the culprit, and exists in other places in the code base. Ian identified some of them along the discussion. A crash needs all of (a), (b), (c) and this place is the only one we discovered so far. I'd think we just fix this crash now while looking for long term solution to address (b) in general. The step towards general solution, which I am looking into, is finding all instances of (b) so we have a better understanding of the scope of the problem. Re: (1), (3) So alloca is not portable either. (3) works as I verified it although future version of compiler might still optimize the reference away. Using MSVC macro works with all versions of VC compiler but it is platform dependent. (3) seems to be the best for now. New patch attached.
Maciej Stachowiak
Comment 33
2007-03-07 11:36:52 PST
Comment on
attachment 13519
[details]
Fix r=me (although the ChangeLog comment is wrong now in mentioning alloca).
Mark Rowe (bdash)
Comment 34
2007-03-07 17:57:55 PST
Landed in
r20043
. Can you please ensure that your ChangeLog entries use spaces rather than tabs for indentation in future.
Feng Qian
Comment 35
2007-03-30 15:45:20 PDT
(In reply to
comment #27
)
> Returning a value instead of a pointer into the object seems fine. We should > probably document that as a rule. >
Saw this bug in another platform using a stocked version of webkit. The compiler in use is GCC for Linux. Exact the same place. I missed the case in
comment #18
when a member field address of 'this' is passed to a callee. 'This' pointer is optimized away and 'this' object is considered as dead by GC. GC is triggered in the callee. Solution I mentioned in
comment #18
won't fix this problem.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug