Bug 94170 - Allow blocking of Web SQL databases in third-party web workers
Summary: Allow blocking of Web SQL databases in third-party web workers
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Vicki Pfau
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-08-15 19:17 PDT by Vicki Pfau
Modified: 2012-08-22 15:59 PDT (History)
5 users (show)

See Also:


Attachments
Patch (37.07 KB, patch)
2012-08-17 16:43 PDT, Vicki Pfau
no flags Details | Formatted Diff | Diff
Patch (39.31 KB, patch)
2012-08-17 18:58 PDT, Vicki Pfau
abarth: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Vicki Pfau 2012-08-15 19:17:41 PDT
Following up from bug #94057, web workers still need work to be able to block third-party Web SQL databases.
Comment 1 Vicki Pfau 2012-08-17 16:43:06 PDT
Created attachment 159228 [details]
Patch
Comment 2 Vicki Pfau 2012-08-17 16:45:16 PDT
This patch only allows blocking in dedicated workers, as there isn't a well-defined top document origin in a shared worker, as shared workers by definition are shared between documents. Since each document might have a distinct parent document, shared workers aren't covered by this.
Comment 3 WebKit Review Bot 2012-08-17 16:59:37 PDT
Comment on attachment 159228 [details]
Patch

Attachment 159228 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13531186
Comment 4 Peter Beverloo (cr-android ews) 2012-08-17 17:07:24 PDT
Comment on attachment 159228 [details]
Patch

Attachment 159228 [details] did not pass cr-android-ews (chromium-android):
Output: http://queues.webkit.org/results/13531188
Comment 5 Andrew Wilson 2012-08-17 17:29:48 PDT
Shared workers are indeed shared between documents, but I thought that the spec guaranteed that all parent documents must be from the same origin?
Comment 6 Vicki Pfau 2012-08-17 17:32:34 PDT
(In reply to comment #5)
> Shared workers are indeed shared between documents, but I thought that the spec guaranteed that all parent documents must be from the same origin?

The documents that attach to the documents must have the same origin, true. But those documents might be child documents of other documents via an iframe, and it's the top-most document that is what is important here. That's the one the user visited, and what we care about is third-party origins relative to the top origin.
Comment 7 Vicki Pfau 2012-08-17 18:58:31 PDT
Created attachment 159256 [details]
Patch
Comment 8 Vicki Pfau 2012-08-17 19:00:07 PDT
(In reply to comment #6)
> (In reply to comment #5)
> > Shared workers are indeed shared between documents, but I thought that the spec guaranteed that all parent documents must be from the same origin?
> 
> The documents that attach to the documents must have the same origin, true. But those documents might be child documents of other documents via an iframe, and it's the top-most document that is what is important here. That's the one the user visited, and what we care about is third-party origins relative to the top origin.

Typo here, I meant that the documents that attach to the workers must have the same origin.
Comment 9 Adam Barth 2012-08-18 22:55:46 PDT
Comment on attachment 159256 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=159256&action=review

> Source/WebCore/workers/DedicatedWorkerContext.cpp:45
> -PassRefPtr<DedicatedWorkerContext> DedicatedWorkerContext::create(const KURL& url, const String& userAgent, PassOwnPtr<GroupSettings> settings, DedicatedWorkerThread* thread, const String& contentSecurityPolicy, ContentSecurityPolicy::HeaderType contentSecurityPolicyType)
> +PassRefPtr<DedicatedWorkerContext> DedicatedWorkerContext::create(const KURL& url, const String& userAgent, PassOwnPtr<GroupSettings> settings, DedicatedWorkerThread* thread, const String& contentSecurityPolicy, ContentSecurityPolicy::HeaderType contentSecurityPolicyType, PassRefPtr<SecurityOrigin> topOrigin)

The parameters on this function are getting somewhat out of control.  Should we create a struct for them?

> Source/WebCore/workers/DedicatedWorkerThread.cpp:42
> -PassRefPtr<DedicatedWorkerThread> DedicatedWorkerThread::create(const KURL& scriptURL, const String& userAgent, const GroupSettings* settings, const String& sourceCode, WorkerLoaderProxy& workerLoaderProxy, WorkerObjectProxy& workerObjectProxy, WorkerThreadStartMode startMode, const String& contentSecurityPolicy, ContentSecurityPolicy::HeaderType contentSecurityPolicyType)
> +PassRefPtr<DedicatedWorkerThread> DedicatedWorkerThread::create(const KURL& scriptURL, const String& userAgent, const GroupSettings* settings, const String& sourceCode, WorkerLoaderProxy& workerLoaderProxy, WorkerObjectProxy& workerObjectProxy, WorkerThreadStartMode startMode, const String& contentSecurityPolicy, ContentSecurityPolicy::HeaderType contentSecurityPolicyType, const SecurityOrigin* topOrigin)

const SecurityOrigin <--- I'd drop the const.

> Source/WebCore/workers/SharedWorkerContext.cpp:62
> -    : WorkerContext(url, userAgent, settings, thread)
> +    : WorkerContext(url, userAgent, settings, thread, 0)

Should there be a FIXME about this 0?

> Source/WebCore/workers/SharedWorkerThread.cpp:47
> -    : WorkerThread(url, userAgent, settings, sourceCode, workerLoaderProxy, workerReportingProxy, startMode, contentSecurityPolicy, contentSecurityPolicyType)
> +    : WorkerThread(url, userAgent, settings, sourceCode, workerLoaderProxy, workerReportingProxy, startMode, contentSecurityPolicy, contentSecurityPolicyType, 0)

ditto
Comment 10 Adam Barth 2012-08-18 22:59:38 PDT
I'm not really in love with this patch, but it does seem like the right approach.  Should we have a workers expert look at the patch as well?  I'm pretty sure it's right, but another pair of eyes never hurts.

The comments above are just nits because I'm unhappy that so many lines of code need to be involved in ferrying this information around.  However, that's not really the fault of this patch, and I don't see a better way of doing it.
Comment 11 Vicki Pfau 2012-08-20 11:00:04 PDT
(In reply to comment #9)
> (From update of attachment 159256 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=159256&action=review
> 
> > Source/WebCore/workers/DedicatedWorkerContext.cpp:45
> > -PassRefPtr<DedicatedWorkerContext> DedicatedWorkerContext::create(const KURL& url, const String& userAgent, PassOwnPtr<GroupSettings> settings, DedicatedWorkerThread* thread, const String& contentSecurityPolicy, ContentSecurityPolicy::HeaderType contentSecurityPolicyType)
> > +PassRefPtr<DedicatedWorkerContext> DedicatedWorkerContext::create(const KURL& url, const String& userAgent, PassOwnPtr<GroupSettings> settings, DedicatedWorkerThread* thread, const String& contentSecurityPolicy, ContentSecurityPolicy::HeaderType contentSecurityPolicyType, PassRefPtr<SecurityOrigin> topOrigin)
> 
> The parameters on this function are getting somewhat out of control.  Should we create a struct for them?

These almost mirror the members of WorkerThreadStartupData, which is a struct that only appears in WorkerThread.cpp. Maybe it makes sense to expose it?

> 
> > Source/WebCore/workers/DedicatedWorkerThread.cpp:42
> > -PassRefPtr<DedicatedWorkerThread> DedicatedWorkerThread::create(const KURL& scriptURL, const String& userAgent, const GroupSettings* settings, const String& sourceCode, WorkerLoaderProxy& workerLoaderProxy, WorkerObjectProxy& workerObjectProxy, WorkerThreadStartMode startMode, const String& contentSecurityPolicy, ContentSecurityPolicy::HeaderType contentSecurityPolicyType)
> > +PassRefPtr<DedicatedWorkerThread> DedicatedWorkerThread::create(const KURL& scriptURL, const String& userAgent, const GroupSettings* settings, const String& sourceCode, WorkerLoaderProxy& workerLoaderProxy, WorkerObjectProxy& workerObjectProxy, WorkerThreadStartMode startMode, const String& contentSecurityPolicy, ContentSecurityPolicy::HeaderType contentSecurityPolicyType, const SecurityOrigin* topOrigin)
> 
> const SecurityOrigin <--- I'd drop the const.

Any particular reason? We don't need to do any non-const operations on it, apart from isolatedCopy which should have been const in the first place (and is made const in this patch).

> 
> > Source/WebCore/workers/SharedWorkerContext.cpp:62
> > -    : WorkerContext(url, userAgent, settings, thread)
> > +    : WorkerContext(url, userAgent, settings, thread, 0)
> 
> Should there be a FIXME about this 0?

There isn't really anything sensible to pass for shared workers, as there is no conceivable top origin. The only potential thing we could pass is the origin of the proxy, which amounts to the same origin as the worker. That would give us nothing, except for eliminating the 0.

(In reply to comment #10)
> I'm not really in love with this patch, but it does seem like the right approach.  Should we have a workers expert look at the patch as well?  I'm pretty sure it's right, but another pair of eyes never hurts.
> 

I cc'd atwilson, but I don't know who else is an "expert" in this subject. I would like an an expert to look at it though.

> The comments above are just nits because I'm unhappy that so many lines of code need to be involved in ferrying this information around.  However, that's not really the fault of this patch, and I don't see a better way of doing it.

I agree here. I would have liked a better way, but I didn't think there was one.
Comment 12 Adam Barth 2012-08-20 12:15:23 PDT
> > const SecurityOrigin <--- I'd drop the const.
> 
> Any particular reason? We don't need to do any non-const operations on it, apart from isolatedCopy which should have been const in the first place (and is made const in this patch).

We don't tend to use const pointers for these sorts of composite objects.  It's not really that meaningful a restriction in C++.  Maybe it's more useful in threaded code?  In any case, it's not a big deal.

> > > Source/WebCore/workers/SharedWorkerContext.cpp:62
> > > -    : WorkerContext(url, userAgent, settings, thread)
> > > +    : WorkerContext(url, userAgent, settings, thread, 0)
> > 
> > Should there be a FIXME about this 0?
> 
> There isn't really anything sensible to pass for shared workers, as there is no conceivable top origin. The only potential thing we could pass is the origin of the proxy, which amounts to the same origin as the worker. That would give us nothing, except for eliminating the 0.

Does that mean SharedWorkers will always be a loophole for gaining access to storage in a third-party context?
Comment 13 Andrew Wilson 2012-08-20 13:25:35 PDT
I'm a bit rusty on the Worker code and I don't know if there's a better way to accomplish this through something like ContentSecurityPolicy or something like that.

That said, I think a better approach would be:

a) Instead of exposing WorkerContext.topOrigin(), instead define a new canAccessDatabase() API on WorkerContext.
b) DedicatedWorkerContext can implement this by calling securityOrigin->canAccessDatabase(topOrigin) - SharedWorkerContext would not have a topOrigin member and would not take an origin in the constructor.
c) SharedWorkerContext could then have its own implementation of canAccessDatabase() which interacts with SharedWorkerRepository to determine what the right thing is to do (e.g. allow if all parent documents allow database access). Alternatively, take a look at how GroupSettings is handled in DefaultSharedWorkerRepository.cpp - we currently just grab the groupSettings from the very first document that creates the worker, and that might be good enough for this case too, at least for now.

The current patch assumes that there's always just a single topOrigin which is not correct for all workers (as you noted) - moving this logic into WorkerContext at least creates a framework for doing it correctly even if you don't have a perfect SharedWorker implementation in this patch.
Comment 14 Vicki Pfau 2012-08-20 14:04:48 PDT
(In reply to comment #13)
> I'm a bit rusty on the Worker code and I don't know if there's a better way to accomplish this through something like ContentSecurityPolicy or something like that.
> 
> That said, I think a better approach would be:
> 
> a) Instead of exposing WorkerContext.topOrigin(), instead define a new canAccessDatabase() API on WorkerContext.
> b) DedicatedWorkerContext can implement this by calling securityOrigin->canAccessDatabase(topOrigin) - SharedWorkerContext would not have a topOrigin member and would not take an origin in the constructor.
> c) SharedWorkerContext could then have its own implementation of canAccessDatabase() which interacts with SharedWorkerRepository to determine what the right thing is to do (e.g. allow if all parent documents allow database access). Alternatively, take a look at how GroupSettings is handled in DefaultSharedWorkerRepository.cpp - we currently just grab the groupSettings from the very first document that creates the worker, and that might be good enough for this case too, at least for now.
> 
> The current patch assumes that there's always just a single topOrigin which is not correct for all workers (as you noted) - moving this logic into WorkerContext at least creates a framework for doing it correctly even if you don't have a perfect SharedWorker implementation in this patch.

This looks better at first glance, but upon starting to implement it, it requires almost the exact same amount of plumbing the arguments through arbitrary constructors and function calls. At the end of the day, it doesn't really change much of anything. It just shuffles a few things around. The only potential advantage is to allow the behavior in SharedWorker to be more specialized--which might be desirable.
Comment 15 Andrew Wilson 2012-08-20 14:14:17 PDT
(In reply to comment #14)

> This looks better at first glance, but upon starting to implement it, it requires almost the exact same amount of plumbing the arguments through arbitrary constructors and function calls. At the end of the day, it doesn't really change much of anything. It just shuffles a few things around. The only potential advantage is to allow the behavior in SharedWorker to be more specialized--which might be desirable.

Yeah, I'm not particularly concerned about having to plumb stuff through the constructors. I'm more concerned with properly allowing SharedWorker to specialize this code.

If you really want to bake-in the assumption that all workers only have a single origin (i.e. you really want to keep your new WorkerContext->topOrigin() API instead of moving this logic into the WorkerContext subclass) then you should go all the way and plumb a top origin into the SharedWorkerThread constructor that you pull out of the first Document that creates the SharedWorker (see DefaultSharedWorkerRepository::workerScriptLoaded()).
Comment 16 Vicki Pfau 2012-08-20 14:31:16 PDT
(In reply to comment #15)
> (In reply to comment #14)
> 
> > This looks better at first glance, but upon starting to implement it, it requires almost the exact same amount of plumbing the arguments through arbitrary constructors and function calls. At the end of the day, it doesn't really change much of anything. It just shuffles a few things around. The only potential advantage is to allow the behavior in SharedWorker to be more specialized--which might be desirable.
> 
> Yeah, I'm not particularly concerned about having to plumb stuff through the constructors. I'm more concerned with properly allowing SharedWorker to specialize this code.
> 
> If you really want to bake-in the assumption that all workers only have a single origin (i.e. you really want to keep your new WorkerContext->topOrigin() API instead of moving this logic into the WorkerContext subclass) then you should go all the way and plumb a top origin into the SharedWorkerThread constructor that you pull out of the first Document that creates the SharedWorker (see DefaultSharedWorkerRepository::workerScriptLoaded()).

I have two ideas for how to block access within a shared worker:
1) Every time a document tries to connect to a worker, check the top origin and "taint" the worker if it's a third party. This will require additional engineering to invalidate existing databases that have been opened already in the worker.
2) Prevent connecting to shared workers when the origin is a third party.

I'm inclined to think 2 is the better option because it can prevent other information from leaking across pages with different top origins even though the shared worker always talks between pages within the same origin.
Comment 17 Andrew Wilson 2012-08-20 14:44:55 PDT
(In reply to comment #16)
> I have two ideas for how to block access within a shared worker:
> 1) Every time a document tries to connect to a worker, check the top origin and "taint" the worker if it's a third party. This will require additional engineering to invalidate existing databases that have been opened already in the worker.

Yeah, that's why I think we ought to just use the origin of the very first document to open a given worker for now. If you go visit foo.com in a top-level browser window and it creates a shared worker, it seems like it's OK if that worker is able to access storage for the remainder of its lifetime.

Also note that just tainting the worker isn't sufficient, since workers could in theory contact other, non-tainted workers via MessagePort messaging. So they could just proxy any storage requests to a non-tainted worker.

> 2) Prevent connecting to shared workers when the origin is a third party.
> 
> I'm inclined to think 2 is the better option because it can prevent other information from leaking across pages with different top origins even though the shared worker always talks between pages within the same origin.

I think you would break websites with this kind of change, as this technique is the only way to create a cross-domain shared worker. Doing this would prevent top level pages that wish to communicate (e.g. calendar.google.com and mail.google.com) from communicating with one another via shared workers hosted in google.com iframes.
Comment 18 Andrew Wilson 2012-08-20 14:50:26 PDT
(In reply to comment #17)
> (In reply to comment #16)
> > I have two ideas for how to block access within a shared worker:
> > 1) Every time a document tries to connect to a worker, check the top origin and "taint" the worker if it's a third party. This will require additional engineering to invalidate existing databases that have been opened already in the worker.
> 
Thinking about this some more, what do you think about doing the reverse of what you describe: basically, a worker is tainted until a document connects that will allow storage. Once a document connects that allows storage, we untaint the worker and all future storage calls will work.

I think ideally, you'd like:

a) foo.com top window opens shared worker
b) bar.com top window has foo.com iframe that opens shared worker

...to result in a SharedWorker with a consistent state (either tainted or untainted) no matter what order a) and b) happen in. If the transition from tainted->untainted is easier (because we don't have to abort/delete existing databases) then we should make the tainted->untainted transition sticky as I describe above.
Comment 19 Vicki Pfau 2012-08-20 18:30:28 PDT
Given that, when third-party storage blocking is enabled, shared workers have problems beyond just Web SQL databases, I think it might be best to get this patch in without provisions for shared workers and worry about whatever might be involved with shared workers in a followup patch. As such, I have filed bug #94559 for this followup work.
Comment 20 Adam Barth 2012-08-22 13:55:09 PDT
Yeah, the shared worker case requires more thought than the dedicated worker case.  Making progress in the dedicated worker case seems valuable.
Comment 21 Adam Barth 2012-08-22 13:58:09 PDT
Comment on attachment 159256 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=159256&action=review

I still think this patch is ugly, but I don't see a better way of solving this problem.

> Source/WebCore/workers/WorkerMessagingProxy.cpp:285
>                                                                           m_scriptExecutionContext->contentSecurityPolicy()->deprecatedHeader(),
> -                                                                         m_scriptExecutionContext->contentSecurityPolicy()->deprecatedHeaderType());
> +                                                                         m_scriptExecutionContext->contentSecurityPolicy()->deprecatedHeaderType(),
> +                                                                         document->topDocument()->securityOrigin());

I would have changed these all to use |document| to be clear that we're getting all this information from the same place.
Comment 22 Andrew Wilson 2012-08-22 13:59:05 PDT
FWIW, I still think it's worth adding *some* support for SharedWorkers (such as my suggestion to just use the policy from the very first document that opens a given SharedWorker - this would at least handle the basic, most common use case of a third-party iframe trying to open a shared worker). If you don't do this, then it doesn't seem useful to do the work to block dedicated workers because developers can just trivially switch to shared workers.
Comment 23 Adam Barth 2012-08-22 14:09:33 PDT
Sure, but we can do that in a future patch.  Currently, there are lots of avenues for persisting data in third-party contexts.  Surely we shouldn't wait for a patch that blocks all of them before making progress on any of them.
Comment 24 Vicki Pfau 2012-08-22 15:59:07 PDT
Committed r126365: <http://trac.webkit.org/changeset/126365>