Bug 52719

Summary: ScriptExecutionContext::stopActiveDOMObjects iterates a hash map that can change during iteration (for multiple reasons, including GC)
Product: WebKit Reporter: Jeremy Orlow <jorlow>
Component: WebCore JavaScriptAssignee: Darin Adler <darin>
Status: RESOLVED FIXED    
Severity: Normal CC: commit-queue, darin, esprehn+autocc, ggaren, kangil.han
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
Patch
none
Patch ap: review+

Description Jeremy Orlow 2011-01-19 06:43:16 PST
XHR::stop should not call reportExtraMemoryCost() (or should do it more carefully) because it can cause a GC (in at least JSC) "..because XHR::stop ends up calling reportExtraMemoryCost() for JSC, and that can collect the unprotected wrapper of another XHR (the current one is still protected at the moment)"

(Sorry if the bug report is less than ideal, but I don't fully understand what's going on here...)

Context:
[6:54pm] ap: jorlow: ayt?
[6:54pm] jorlow: ap: what's up?
[6:55pm] ap: jorlow: I'm finally looking at your patch again, and I'm wondering why we even need to stop or abort transactions
[6:55pm] jorlow: so that when you leave a page stuff doesn't keep running
[6:55pm] ap: jorlow: it seems like requests need to be stopped, not transactions
[6:55pm] jorlow: no
[6:55pm] • ap listens
[6:55pm] jorlow: if you leave a page, a transaction can't ever be committed
[6:56pm] jorlow: requests can't be aborted
[6:56pm] jorlow: only transactions
[6:57pm] ap: jorlow: ok. then looking at it from a different angle, why are requests ActiveDOMObjects?
[6:57pm] jorlow: they need to be...it might be a fair question why IDBTransactions are though
[6:57pm] jorlow: requests need to be because if someone is listening to onsuccess or onerror, then it's possible they'll abort the transaciton or schedule another operation in that handler
[6:58pm] jorlow: and that the transaction committed at that point would be a logic error
[6:59pm] jorlow: transactions, on the other hand, only fire their event listeners after abort or complete happens
[6:59pm] jorlow: hm.....ok.....so by my earlier logic, maybe we should be aborting transactions with ::stop either way....  
[6:59pm] jorlow: i.e. if they navigate away, that's a signal they want the app to stop doing work
[6:59pm] jorlow: do event targets need to be active dom objects?
[7:00pm] ap: jorlow: only if nothing else can guarantee that they live long enough
[7:00pm] ap: jorlow: e.g. XMLHttpRequest is an ActiveDOMObject, but XMLHttpRequest.upload is not
[7:01pm] ap: jorlow: because XMLHttpRequest keeps XMLHttpRequestUpload alive
[7:01pm] jorlow: hmm
[7:01pm] ap: jorlow: we just need to pick a central object that will outlive others, and manage their lifetime
[7:01pm] jorlow: they may not need to be active dom objects then
[7:01pm] jorlow: let me think on this some
[7:02pm] jorlow: (and grab dinner...brb)
[7:02pm] ap: jorlow: bon appetit 
[7:05pm] ap: jorlow: (well, manage lifetime and suspend/stop/resume), of course. Since requests belong to transactions, the latter should be able to centralize everything
[7:20pm] jorlow: ap: so the one reason left why we need these to be active dom objects is that we need the page not to go into cache
[7:20pm] jorlow: any suggestions on how else to do that?
[7:20pm] jorlow: ap: er....page go into page cache
[7:21pm] ap: jorlow: either transaction or request probably needs to be an active dom object, right?
[7:21pm] ap: jorlow: otherwise, what would protect them from being destroyed if there are no JS references to them
[7:22pm] jorlow: ap: yeah...make sense
[7:23pm] jorlow: but i think then we're back to a situation where ::stop() can cause the active dom object to be killed
[7:23pm] jorlow: ap: the IDBTransaction needs to be active...not IDBRequest, btw
[7:23pm] jorlow: since active transactions are the reason it can't go to cache
[7:24pm] ap: jorlow: sounds reasonable
[7:24pm] jorlow: but the IDBTransactionBackendImpl needs to have a ref to the IDBTransaction.  And when abort happens, that link is broken
[7:24pm] jorlow: and that can cause IDBTransaction to go to 0 refs
[7:25pm] jorlow: ap^^
[7:28pm] ap: jorlow: hmm... ok, so what happens in XMLHttpRequest case? ScriptExecutionContext::stopActiveDOMObjects() calls XMLHTTPRequest::stop() - so far, the same as with IDB, right?
[7:28pm] ap: jorlow: XMLHTTPRequest::stop() supposedly releases the last ref, and we should be getting into ScriptExecutionContext::destroyedActiveDOMObject()
[7:28pm] jorlow: hmmm
[7:28pm] ap: jorlow: so either we have the same problem with XHR, or there's some witty trick
[7:33pm] ap: jorlow: ah, I know what the trick is
[7:34pm] jorlow: ap: yeah?
[7:34pm] ap: jorlow: XMLHttpRequest always has a JS wrapper, which keeps a reference to it
[7:34pm] jorlow: why?
[7:35pm] ap: jorlow: there is no way to create an XMLHttpRequest without creating a wrapper
[7:35pm] jorlow: But couldn't the wrapper get GCed?
[7:36pm] ap: jorlow: not while there's an ongoing request (that's what ActiveDOMObject is primarily about)
[7:36pm] ap: jorlow: and if there is no request, and the wrapper gets GC'ed, then XMLHttpREquest itself is destroyed, too
[7:37pm] ap: jorlow: now, that doesn't quite work even for XHR
[7:38pm] ap: jorlow: because XHR::stop ends up calling reportExtraMemoryCost() for JSC, and that can collect the unprotected wrapper of another XHR (the current one is still protected at the moment)
[7:39pm] jorlow: ap: ouch..that's subtle
[7:39pm] jorlow: or any other active dom object, for that matter
[7:39pm] ap: jorlow: if there we really no JS code running from stop(), then the extra reference we're looking for would have been guaranteed by the wrapper
[7:40pm] ap: jorlow: but even XHR violates this requirement by forcing GC sometimes
[7:40pm] jorlow: ap: to be honest, i still don't fully understand why...I think maybe I just need to look at the code with these comments in mind (can't at the moment)
[7:40pm] jorlow: but it also seems as though maybe it doesn't matter
[7:40pm] MX80 left the chat room. (Remote host closed the connection)
[7:41pm] jorlow: and that we need a more general solution to the problem
[7:48pm] jorlow: is Helder Correia on irc?
[7:49pm] jorlow: ap would you agree with my last comment?
[7:49pm] jorlow: my thinking is that the subtlety of these 2 issues points to the need for a more general solution....maybe the one I originally proposed
[7:50pm] ap: jorlow: I don't know. Seems easy enough to fix XMLHttpRequest to not force GC from stop(), but it's indeed subtle
[7:50pm] jorlow: I think removing active dom objects during iteration is probably much more likely and less determinisitc than adding
[7:50pm] jorlow: and it kind of seems like it's only a matter of time before we see another one of these
[7:51pm] jorlow: personally, I'd rather go with the general solution...I'm not super happy with the one I presented, but I can't think of a better one
[7:51pm] ap: jorlow: if we go with your original solution, I'd prefer that we don't do the contains() check though
[7:51pm] jorlow: ap: how can we avoid it?
[7:52pm] jorlow: I guess if the vector we iterate over was a member variable we could
[7:52pm] ap: jorlow: having a no-longer-active inactive object be destroyed while stopping further objects is kind of normal indeed
[7:52pm] jorlow: by moving the last element in to replace the removed element
[7:53pm] ap: jorlow: but the fact that stopping one object can indirectly destroy others seems evil
[7:53pm] jorlow: ap: aren't all objects stopped or none?
[7:53pm] jorlow: and why is it evil?
[7:53pm] jorlow: i feel like the IDB case of doing this is pretty legit
[7:53pm] jorlow: even if the XHR one is kinda shady
[7:54pm] ap: jorlow: my thinking is that an active object is fairly independent, and ScriptExecutionContext manages these
[7:54pm] ap: jorlow: that's why there are so few guarantees in this iteration code
[7:55pm] ap: jorlow: having a "sub-manager" inside IDB sounds complicated
[7:55pm] ap: jorlow: and this complicated situation is what we have now - ScriptExecutionContext can stop an active object, but also one IDB active object can stop another one
[7:55pm] jorlow: ap: actually, i was just going to suggest that maybe we should have one active dom object within IDB that they all share
[7:56pm] ap: jorlow: right, and if we do that, we wont need a contains() check, correct?
[7:56pm] jorlow: we wouldn't need anything in my patch actually
[7:56pm] jorlow: if we fix the XHR thing as well
[7:56pm] jorlow: (though we should put in the ASSERTs you suggested)
[7:56pm] ap: jorlow: ok, so how about this plan:
[7:57pm] ap: 1. Fix XHR
[7:57pm] ap: 2. Fix IDB to have one active dom object
[7:57pm] ap: 3. add an assertion
[7:57pm] jorlow: ap: can I file a bug on you for #1?
[7:57pm] ap: 4. consider making a copy in a vector to make this less subtle
[7:57pm] ap: jorlow: sure, that would be great
[7:57pm] jorlow: k...will do
[7:57pm] jorlow: not sure about 4 tho
[7:58pm] jorlow: I feel like the rest of this is a replacement for it
[7:58pm] jorlow: I'll put a nice change log comment in I guess
[7:58pm] ap: jorlow: it says "consider", not "do" 
[7:58pm] jorlow: k
[7:58pm] jorlow: I'll try to summarize these in the bugs
[7:58pm] jorlow: might not get aroudn to filing until tomorrow though
Comment 1 Alexey Proskuryakov 2011-01-23 15:04:53 PST
So, here is the bug. ScriptExecutionContext::stopActiveDOMObjects() iterates over a HashMap of ActiveDOMObjects, naturally expecting that the HashMap doesn't change.

However, stop() can reduce the reference count of the object, and there is no strong guarantee that it won't be deleted. For XMLHttpRequest in particular, it's possible that the only remaining reference is kept by its JS wrapper - and since the wrapper is now unprotected, garbage collection can destroy it! And then, the only protection is that JavaScriptCore GC doesn't actually delete objects until something is allocated in their place.

So, calling reportExtraMemoryCost() from XMLHttpRequest::stop() is particularly dangerous, because it can start garbage collection, which will delete previously unprotected active DOM objects. I'm less sure now that we should actually fix XMLHttpRequest, and not unsafe iteration.

> [7:56pm] jorlow: (though we should put in the ASSERTs you suggested)

In fact, HashMap already has assertions for invalidated iterators, which should fail if an iterator is used after its map has changed.
Comment 2 Alexey Proskuryakov 2011-01-23 17:57:53 PST
We should try to add stronger assertions though (that JS code doesn't run, or that GC isn't triggered).
Comment 3 Geoffrey Garen 2011-01-24 09:34:40 PST
> I'm less sure now that we should actually fix XMLHttpRequest, and not unsafe iteration.

I think the right solution is to fix unsafe iteration. The simplest solution is probably to copy the hash map to a vector of RefPtr before iterating.

It's very difficult to establish a "nothing must happen" point in WebCore -- no JS execution, no GC, etc. And even if you did establish such a thing, it would be very difficult to test all the edge cases and make sure it worked.

Plus, I don't think we want to lose the memory footprint optimization of reporting extra cost once an XHR becomes orphaned.
Comment 4 Alexey Proskuryakov 2011-01-24 10:07:08 PST
The problem is that we still need to require no JS execution - because JS could add more active objects. So, guarding only against object destruction but not creation would be inconsistent.

Also, there's a technical issue with ActiveDOMObjects not being refcounted themselves, so you can't just make a Vector<RefPtr<ActiveDOMObject> >.
Comment 5 Geoffrey Garen 2011-01-24 10:59:38 PST
(In reply to comment #4)
How about this, for solving the iteration problem:
while (x = take(begin()))
    x->stop();
Comment 6 Alexey Proskuryakov 2011-01-24 11:07:35 PST
Then we'd have a freezing problem with code that actually runs JS and adds new ActiveDOMObjects.

I don't think there is any safe way to iterate without imposing any restrictions on code that runs during iteration - the question is what requirements we can practically enforce in this particular case.
Comment 7 Jeremy Orlow 2011-01-24 11:15:36 PST
(In reply to comment #1)
> In fact, HashMap already has assertions for invalidated iterators, which should fail if an iterator is used after its map has changed.

Sure, but we should also assert that we're not in the middle of iterating when we add/remove an object.  Asserting later keeps us from doing something silly, but doesn't make it easy to see what the problem is.

(In reply to comment #4)
> The problem is that we still need to require no JS execution - because JS could add more active objects. So, guarding only against object destruction but not creation would be inconsistent.

FWIW, I'm not sure this is such a big deal.


(In reply to comment #5)
> (In reply to comment #4)
> How about this, for solving the iteration problem:
> while (x = take(begin()))
>     x->stop();

That won't work because often need to iterate without destroying the list.

In https://bugs.webkit.org/show_bug.cgi?id=52366 I had a solution that copied to a vector and double checked each element was still in the map before calling stop.  I wasn't super happy with it, but it might be the best option if we want to support ActiveDOMObjects being destroyed during iteration.

I tried hacking around the ActiveDOMObject not being ref counted problem in several different ways, but none of them were even close to clean enough that I think they'd be acceptable.
Comment 8 Darin Adler 2014-04-19 22:56:22 PDT
We don’t need to allow arbitrary modification of the hash map, but we do need to allow removal from the map during the stopActiveDOMObject process. That’s the only practical problem.

Maybe later we can make an even better model, but this will shore things up for now.
Comment 9 Darin Adler 2014-04-19 22:59:58 PDT
(In reply to comment #7)
> In https://bugs.webkit.org/show_bug.cgi?id=52366 I had a solution that copied to a vector and double checked each element was still in the map before calling stop.  I wasn't super happy with it, but it might be the best option if we want to support ActiveDOMObjects being destroyed during iteration.

Yes, I think that is indeed what we want.
Comment 10 Darin Adler 2014-04-19 23:22:54 PDT
Created attachment 229757 [details]
Patch
Comment 11 Alexey Proskuryakov 2014-04-19 23:50:37 PDT
Comment on attachment 229757 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=229757&action=review

> Source/WebCore/ChangeLog:61
> +        (WebCore::ScriptExecutionContext::didCreateActiveDOMObject): Ditto. Got rid of the
> +        CRASH because there's no compelling value to this being a release assert instead of a
> +        normal assert.

Actually, I think that this release assert is critically important. It is very very bad to have unstopped active dom objects after stopActiveDOMObjects().

The idea of this patch looks right to me, but Id like to have another look when it's not so late at night.

It's also worth documenting the restrictions on ActiveDOMObjects somewhere more persistent - the fact that stop() is allowed to delete the object, but is not allowed to execute arbitrary JS is not obvious. Also, what is allowed in suspend() or resume()?

> Source/WebCore/dom/ScriptExecutionContext.cpp:279
> +    for (auto* object : objects) {

So the usual argument for why auto is good is that variable name has all the information a type name has.

It's hardly possible to say less than "object" says!
Comment 12 Darin Adler 2014-04-20 10:28:35 PDT
Created attachment 229767 [details]
Patch
Comment 13 Alexey Proskuryakov 2014-04-20 11:59:53 PDT
Comment on attachment 229767 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=229767&action=review

r=me. Nice patch!

> Source/WebCore/ChangeLog:61
> +        (WebCore::ScriptExecutionContext::didCreateActiveDOMObject): Ditto. Got rid of the
> +        CRASH because there's no compelling value to this being a release assert instead of a
> +        normal assert.

This comment is now obsolete.

> Source/WebCore/ChangeLog:84
> +        * workers/WorkerGlobalScope.h: Make hasPendingActivity function from the base class
> +        public instead of declaring it in this class.

Longer term, I'd like to come up with a better name for it, it's quite unclear which "pending activity" we are talking about. The workers spec has changed a lot, it might have a better name and/or design for this concept now.

> Source/WebCore/dom/ScriptExecutionContext.cpp:188
> +    // No protection against m_activeDOMObjects changing during iteration: canSuspend()
> +    // functions should not add new active DOM objects, nor execute arbitrary JavaScript.

The reworded comment doesn't seem quite accurate, as removing active DOM objects isn't OK either (and there is a debug only assertion for that).

I thought that we had a way to assert that JS was not executed, but I can't find it now. I only found ScriptExecutionContext::isJSExecutionForbidden(), which is a different thing, and seems like a misnomer.

It would be very useful to catch mistakes with unexpected JS execution sooner, not just when the script does something bad.

> Source/WebCore/dom/ScriptExecutionContext.cpp:221
> +    // No protection against m_activeDOMObjects changing during iteration: suspend()
> +    // functions should not add new active DOM objects, nor execute arbitrary JavaScript.

Ditto.

> Source/WebCore/dom/ScriptExecutionContext.cpp:249
> +    // No protection against m_activeDOMObjects changing during iteration: resume()
> +    // functions should not add new active DOM objects, nor execute arbitrary JavaScript.

Ditto.

> Source/WebCore/dom/ScriptExecutionContext.cpp:277
> +    for (auto* object : objects) {

...

> Source/WebCore/dom/ScriptExecutionContext.cpp:313
> +    // The m_activeDOMObjectAdditionForbidden check is a RELEASE_ASSERT because of the
> +    // consequences of having an ActiveDOMObject that is not correctly reflected in the set.
> +    // If we do have one of those, it can possibly be a security vulnerability. So we'd
> +    // rather have a crash than continue running with the set possibly compromised.

suspendActiveDOMObjectIfNeeded() mechanism makes this almost unnecessary, because even objects that are created during iteration will be created suspended or stopped.

But suspendActiveDOMObjectIfNeeded() itself is very fragile - every constructor has to call suspendIfNeeded() for it to work - so it's best to keep this protection I think.
Comment 14 Geoffrey Garen 2014-04-20 12:09:22 PDT
Comment on attachment 229767 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=229767&action=review

> Source/WebCore/dom/ScriptExecutionContext.cpp:109
> +// FIXME: Should we make this a member function of HashSet?
> +// Which name is best: takeOne, takeFirst, or takeRandom?

"takeAny"?

I like "takeOne" too.

"Random" is not so good because the results are not guaranteed to be random, so you'd be mistaken to use this function in a context like shuffling or ASLR.

"First" implies some kind of order, which is not quite true of a hash table.

Member function or not, I think it would be nice to put this function in HashSet.h, so readers of that header will see that you can do this sort of thing.
Comment 15 Darin Adler 2014-04-20 20:09:29 PDT
(In reply to comment #13)
> I thought that we had a way to assert that JS was not executed, but I can't find it now.

I remember that too, but I also can’t find it.

> It would be very useful to catch mistakes with unexpected JS execution sooner, not just when the script does something bad.

I agree. I wonder what happened to it? Lets add it.

I considered your comments and made a few additions to the patch, including comments in ActiveDOMObject.h.
Comment 16 Darin Adler 2014-04-20 21:39:36 PDT
Committed r167579: <http://trac.webkit.org/changeset/167579>