Bug 28058

Summary: SharedWorkers should be shared
Product: WebKit Reporter: Andrew Wilson <atwilson>
Component: WebCore JavaScriptAssignee: Andrew Wilson <atwilson>
Status: RESOLVED FIXED    
Severity: Normal    
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Attachments:
Description Flags
proposed patch
levin: review-
Patch addressing Levin's comments levin: review+, levin: commit-queue-

Description Andrew Wilson 2009-08-06 17:13:59 PDT
We need to cache SharedWorkerProxy objects in a single repository, so they can be shared across pages.
Comment 1 Andrew Wilson 2009-08-06 17:59:54 PDT
Created attachment 34238 [details]
proposed patch
Comment 2 David Levin 2009-08-07 01:49:16 PDT
Comment on attachment 34238 [details]
proposed patch

Thanks this is looking good.  Just a few things to consider/address.



> diff --git a/LayoutTests/fast/workers/shared-worker-shared.html-disabled b/LayoutTests/fast/workers/shared-worker-shared.html-disabled
> +<body>
> +<p>Test simple shared worker sharing cases.</p>

It is nice to tell people what to expect here.  When this runs you should see a bunch of PASSes followed by a DONE.

> +
> +function testNewWorker()
> +{
> +    // New name, so should be a distinct worker from the previous one

Please add ".".

> +function testAlreadyLoaded()
> +{
> +    // Make sure that referencing a worker that is already loaded yields the same instance

Please add ".".


> +function done()
> +{

Please print done here.


> diff --git a/WebCore/workers/DefaultSharedWorkerRepository.cpp b/WebCore/workers/DefaultSharedWorkerRepository.cpp

>  namespace WebCore {
>  
> -class SharedWorkerProxy : public WorkerLoaderProxy {
> +class SharedWorkerProxy : public ThreadSafeShared<SharedWorkerProxy>, public WorkerLoaderProxy {
>  public:
> +    static PassRefPtr<SharedWorkerProxy> create(const String& name, const KURL& url) { return adoptRef(new SharedWorkerProxy(name.copy(), url)); }

No need to do name.copy() here. It is done in the constructor as it should be.

> +    bool closing() { return m_closing; }
Make it const.

> +    KURL url() { return m_url; }
Since SharedWorkerProxy can be used on different threads, I think this should return m_url.copy() (and make the method const).


> +    String name() { return m_name; }
Since SharedWorkerProxy can be used on different threads, I think this should return m_name.copy() (and make the method const).

> +    SharedWorkerProxy(const String&, const KURL&);

Nice to add param name for the const String&.

> +void SharedWorkerProxy::addToDocumentSet(ScriptExecutionContext* context)

I didn't know what should be in a DocumentSet.  I do know what a referring document is.

So why not make the name addReferringDocument()?

> +    // FIXME: track referring documents so we can shutdown the thread when the last one exits.

Also nice to mention that this needs to remove the SharedWorkerProxy from the SharedWorkerProxyCache.

> +// Loads the script on behalf of a worker.
> +class SharedWorkerScriptLoader : public RefCounted<SharedWorkerScriptLoader>, public ActiveDOMObject, private WorkerScriptLoaderClient {

Why is the callback
   virtual void notifyFinished();
public? (Make it private if possible.)

I guess you did this before during the big code review but I'm noticing it now :)


> +void DefaultSharedWorkerRepository::connectToWorker(PassRefPtr<SharedWorker> worker, PassOwnPtr<MessagePortChannel> port, const KURL& url, const String& name, ExceptionCode& ec)
>  {
> +    MutexLocker lock(m_lock);
> +    ASSERT(worker->scriptExecutionContext()->securityOrigin()->canAccess(SecurityOrigin::create(url).get()));
> +    // Fetch a proxy corresponding to this SharedWorker.
> +    RefPtr<SharedWorkerProxy> proxy = getProxy(name, url);
> +    proxy->addToDocumentSet(worker->scriptExecutionContext());

This also needs to put the proxy in the worker.

> +    if (proxy->url() != url) {
> +        // Proxy already existed under alternate URL - return an error.
> +        ec = URL_MISMATCH_ERR;
> +        return;
> +    }

Up to here, it feels like generic code that should go in SharedWorkerRepository.  It would be nice to push as much as possible into common code.  Of course, doing that, means that getProxy() would be a method on SharedWorkerRepository that is implemented in here.  It would just call DefaultSharedWorkerRepository::getProxy which would do the lock.

> +PassRefPtr<SharedWorkerProxy> DefaultSharedWorkerRepository::getProxy(const String& name, const KURL& url)
> +{
> +    // Look for an existing worker, and create one if it doesn't exist.
> +    RefPtr<SecurityOrigin> origin = SecurityOrigin::create(url);

If SecurityOrigin actually stored the url, this would be broken because origin may be used on multiple threads, so it feels fragile.  At the same time it feels wasteful to do url.copy() when it isn't needed, but I favor less fragile so consider:

   RefPtr<SecurityOrigin> origin = SecurityOrigin::create(url.copy());

or you could do the copy only when doing the set below.

> +    SharedWorkerNameMap* nameMap = m_cache.get(origin);
> +    if (!nameMap) {
> +        nameMap = new SharedWorkerNameMap();
> +        m_cache.set(origin, nameMap);
> +    }


> diff --git a/WebCore/workers/DefaultSharedWorkerRepository.h b/WebCore/workers/DefaultSharedWorkerRepository.h
>  #include "SharedWorkerRepository.h"
>  
Why have a blank line here?

> +#include "StringHash.h"

> +        PassRefPtr<SharedWorkerProxy> getProxy(const String&, const KURL&);

Nice to have param name for the string.

> +        typedef HashMap<String, RefPtr<SharedWorkerProxy> > SharedWorkerNameMap;
> +        typedef HashMap<RefPtr<SecurityOrigin>, SharedWorkerNameMap*, SecurityOriginHash> SharedWorkerProxyCache;

This was pretty tricky to review because there are several objects in these HashMaps that don't like to be ref'ed on multiple threads at once (StringImpl, SecurityOrigin).
However, it is safe because you create things from scratch to put in them and they are only accessed/used within a mutex.

It would be easy to mess this up by putting something in that wasn't created fresh or by passing one of these items out without copying it.

I think it would be good to be a nice warning comment about the tight restrictions that one must follow when using this structure.


> +        SharedWorkerProxyCache m_cache;
>      };
Comment 3 Andrew Wilson 2009-08-07 10:22:29 PDT
(In reply to comment #2)

> 
> So why not make the name addReferringDocument()?
> 

The terminology used in the spec is "list of worker's documents" so I've updated this API to more precisely reflect that (document set is the old terminology that I invented and which Ian ignored :), and also added a reference to the spec.


> 
> > +void DefaultSharedWorkerRepository::connectToWorker(PassRefPtr<SharedWorker> worker, PassOwnPtr<MessagePortChannel> port, const KURL& url, const String& name, ExceptionCode& ec)
> >  {
> > +    MutexLocker lock(m_lock);
> > +    ASSERT(worker->scriptExecutionContext()->securityOrigin()->canAccess(SecurityOrigin::create(url).get()));
> > +    // Fetch a proxy corresponding to this SharedWorker.
> > +    RefPtr<SharedWorkerProxy> proxy = getProxy(name, url);
> > +    proxy->addToDocumentSet(worker->scriptExecutionContext());
> 
> This also needs to put the proxy in the worker.

Shouldn't be needed. The SharedWorker object is basically just a container for exposing the onerror attribute (for load errors) and the .port attribute. There's no connection between the SharedWorker object and a running shared worker thread, so we don't need to keep a reference to the proxy in the worker.

The proxy is only required for the Repository to interact with the worker.

> 
> > +    if (proxy->url() != url) {
> > +        // Proxy already existed under alternate URL - return an error.
> > +        ec = URL_MISMATCH_ERR;
> > +        return;
> > +    }
> 
> Up to here, it feels like generic code that should go in
> SharedWorkerRepository.  It would be nice to push as much as possible into
> common code. 

I'm not certain about that. For chrome, this stuff is all getting proxied off to the browser process with different params passed along to identify things like the parent context. So I'd like to avoid making assumptions about common code until I've started on the Chromium port - I'll make sure to do appropriate refactoring when I tackle that port. Basically, I don't want to do premature refactoring.

> Of course, doing that, means that getProxy() would be a method on
> SharedWorkerRepository that is implemented in here.  It would just call
> DefaultSharedWorkerRepository::getProxy which would do the lock.
> 

Can't do that, I suspect. You need to hold the lock through the entire method, as otherwise the proxy object might get evicted from the cache early if some other document exits before this one gets a chance to add itself to the worker documents.

> > +PassRefPtr<SharedWorkerProxy> DefaultSharedWorkerRepository::getProxy(const String& name, const KURL& url)
> > +{
> > +    // Look for an existing worker, and create one if it doesn't exist.
> > +    RefPtr<SecurityOrigin> origin = SecurityOrigin::create(url);
> 
> If SecurityOrigin actually stored the url, this would be broken because origin
> may be used on multiple threads, so it feels fragile.  At the same time it
> feels wasteful to do url.copy() when it isn't needed, but I favor less fragile
> so consider:
> 

Good catch. Yeah, looks like SecurityOrigin shares a bunch of references down into string fragments from the URL.



> It would be easy to mess this up by putting something in that wasn't created
> fresh or by passing one of these items out without copying it.
> 
> I think it would be good to be a nice warning comment about the tight
> restrictions that one must follow when using this structure.
> 

Agreed. I thought I had all the synchronization issues covered (using the proxy's copy of the name, for example), but I didn't think about the fact that we have hidden references in things like the SecurityOrigin. This is easy to screw up, so I've added a warning in the code and in the header.
Comment 4 Andrew Wilson 2009-08-07 13:49:32 PDT
Created attachment 34324 [details]
Patch addressing Levin's comments
Comment 5 David Levin 2009-08-07 15:10:53 PDT
Comment on attachment 34324 [details]
Patch addressing Levin's comments

Thanks for all the fixes. This looks good. I don't think cache/tracking reflects what it does though...  I think tracking/tracker works better, so it would be nice to do some name changes around that on landing.
Comment 6 Andrew Wilson 2009-08-07 16:25:13 PDT
Committed as r46925.