Summary: | [chromium] Creation of dedicated workers (isolates) leaks reference to HTMLDocument | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Brandon Jones <bajones> | ||||||||||||||
Component: | WebCore JavaScript | Assignee: | Kenneth Russell <kbr> | ||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||
Severity: | Normal | CC: | abarth, danno, dimich, gman, haraken, igor.oliveira, kbr, levin, mhahnenberg, pfeldman, simonjam, tkent, tonikitoo, tonyg, tzik, ulan, webkit.review.bot, yurys, zmo | ||||||||||||||
Priority: | P1 | ||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||
Hardware: | All | ||||||||||||||||
OS: | All | ||||||||||||||||
Bug Depends on: | 106016 | ||||||||||||||||
Bug Blocks: | |||||||||||||||||
Attachments: |
|
Description
Brandon Jones
2012-12-18 16:32:04 PST
Created attachment 180056 [details]
Test pages to provoke the described error
+some people who worked on the DOM bindings, isolates, and GC recently Initially we thought this was a bug in V8's GC, but because the WebGLRenderingContext objects aren't alive according to the heap profiler, I think it's a bug in the DOM bindings related to isolates. I have a bad feeling that if a dedicated worker is created, it may be preventing C++ objects which are referred to by JavaScript objects (whether those objects are in the DOM or not -- the WebGLRenderingContext isn't) from being properly unreferenced and deleted. I'm raising this to P1 until the cause is at least understood, as this bug could be causing major memory leaks for applications using workers. I don't think it's specific to WebGL. Please CC: anyone who may have worked in this area. I've been investigating this and have some updates which I'll post shortly. Assigning to myself. CC'ing a few more people who may be interested. Created attachment 180817 [details]
Simplified test case
Created attachment 180818 [details]
Patch used to debug problem
The problem is that the WebWorkerClientImpl instance (Source/WebKit/chromium/src/WebWorkerClientImpl.cpp) is never deleted, leaking a reference to the HTMLDocument via m_scriptExecutionContext. This prevents the destruction of all attached DOM elements, including the Canvas, so that when the JavaScript reference to the WebGLRenderingContext is GC'd, it doesn't ultimately cause the Canvas to be deleted. Since the Canvas owns its m_context via OwnPtr, the WebGL context and all associated resources are never cleaned up. Basically, any page using Workers in Chromium leaks the entire DOM when the page is reloaded. I found that the same leak didn't exist in Safari, despite a strong hunch that the problem wasn't a leak of a reference outside Chromium's WebKit API boundary. That should have been a hint about where to look. This bug was pretty tricky to track down. I'd appreciate any suggestions about how it could have been done better or more efficiently. Basically, per the attached patch, I instrumented TreeShared::ref() and deref() to fire when operating upon HTMLDocuments, since I found from early debugging that the Canvas wasn't being deleted and that was because the Document had a nonzero refcount at the last call to unref(). However, ref/unref were called too often against the document to debug directly. I added a mechanism to suppress the firing inside ref/unref and added those suppressions throughout WebKit. That got pretty close to the bug, but it turned out I suppressed the key place (within a call to ScriptElement::executeScript) where the unmatched call to HTMLDocument::ref was being made. I only got lucky because I saw a call to unref in WorkerMessagingProxy::~WorkerMessagingProxy and didn't see the construction, so made the suppression in executeScript more precise, which exposed the unbalanced call from WebWorkerClientImpl::WebWorkerClientImpl. I don't have a fix yet but am going to try to find one, as well as add a test to ensure this leak doesn't reappear. Created attachment 180978 [details]
Patch
Interesting! Who is the best person to review this patch? (In reply to comment #9) > Interesting! Who is the best person to review this patch? I'd guess dimich@ Comment on attachment 180978 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=180978&action=review r- due to "void*" comment. > Source/WebCore/workers/WorkerLoaderProxy.h:62 > + virtual void* toWebWorkerBase() = 0; Lets try to forward-declare the return type here. In general, it's unfortunate to introduce the 'embedder context' into WorkerLoaderProxy. Is there a better solution? Created attachment 181115 [details]
Patch
Comment on attachment 180978 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=180978&action=review >> Source/WebCore/workers/WorkerLoaderProxy.h:62 >> + virtual void* toWebWorkerBase() = 0; > > Lets try to forward-declare the return type here. > > In general, it's unfortunate to introduce the 'embedder context' into WorkerLoaderProxy. Is there a better solution? New patch uploaded which forward declares the return type, and avoids static_casts in the implementations. I gave this a lot of thought but can't come up with a better solution. The basic problem is that the class hierarchies of Chromium's dedicated and shared worker implementations now diverge significantly. It used to be the case in Chromium that every implementation of WorkerLoaderProxy that hung off the WorkerThread derived from WebWorkerBase. That is now no longer the case. For dedicated workers, WebCore's WorkerMessagingProxy is now used as the base class. WebSharedWorkerImpl continues to derive from WebWorkerBase. This means that it is no longer legal to cast between WebWorkerBase* and WorkerLoaderProxy*. It would be possible to add a static method "WebWorkerBase::fromWorkerLoaderProxy", and avoid touching WorkerLoaderProxy.h, but it would be very fragile. If any new implementation of WorkerLoaderProxy were added, the implementation of WebWorkerBase::fromWorkerLoaderProxy would have to be updated -- and there is no way to force a compilation failure to give the developer a hint that that code needs to be updated. The presence of the pure virtual in WorkerLoaderProxy provides this essential information. In sum, as poor as it is to modify WorkerLoaderProxy with a Chromium-specific forward declaration and method, I think it is the best technical solution given the constraints. Comment on attachment 181115 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=181115&action=review r+ with a note. > Source/WebKit/chromium/src/WebWorkerClientImpl.cpp:73 > // Delegates implementation of Worker{Loader,Context,Object}Proxy to WorkerMessagingProxy. This comment seems old now... thsi class does not really delegate, it is a WMP... Remove? Comment on attachment 181115 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=181115&action=review >> Source/WebKit/chromium/src/WebWorkerClientImpl.cpp:73 >> // Delegates implementation of Worker{Loader,Context,Object}Proxy to WorkerMessagingProxy. > > This comment seems old now... thsi class does not really delegate, it is a WMP... Remove? You're right. I'll update the patch. Created attachment 181129 [details]
Patch for landing
Fixed comment and small typo in failure case of layout test
Comment on attachment 181129 [details] Patch for landing Clearing flags on attachment: 181129 Committed r138693: <http://trac.webkit.org/changeset/138693> All reviewed patches have been landed. Closing bug. |