Bug 183602 - imported/w3c/web-platform-tests/WebCryptoAPI/wrapKey_unwrapKey/wrapKey_unwrapKey.worker.html is crashing
Summary: imported/w3c/web-platform-tests/WebCryptoAPI/wrapKey_unwrapKey/wrapKey_unwrap...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: youenn fablet
URL:
Keywords: InRadar
Depends on:
Blocks: 183619
  Show dependency treegraph
 
Reported: 2018-03-13 11:23 PDT by youenn fablet
Modified: 2018-03-14 15:53 PDT (History)
10 users (show)

See Also:


Attachments
Patch (21.58 KB, patch)
2018-03-13 11:35 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (21.77 KB, patch)
2018-03-13 12:47 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (21.61 KB, patch)
2018-03-13 13:03 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (21.61 KB, patch)
2018-03-13 13:59 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (22.56 KB, patch)
2018-03-13 17:49 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (22.64 KB, patch)
2018-03-14 08:50 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (23.36 KB, patch)
2018-03-14 13:00 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch for landing (23.35 KB, patch)
2018-03-14 14:27 PDT, youenn fablet
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description youenn fablet 2018-03-13 11:23:03 PDT
Crypto API should not ref the context.
Comment 1 youenn fablet 2018-03-13 11:30:12 PDT
Looking at the Crypto code, I also noted two issues:
1. CryptoKeyRSA::generatePair can leak memory
2. encrypt (decrypt probably as well) are passing strings through different threads without doing any cross-thread copying.

I added a FIXME for 1.
I'll do a follow-up patch for (at least partially 2).
Comment 2 youenn fablet 2018-03-13 11:35:48 PDT
Created attachment 335705 [details]
Patch
Comment 3 youenn fablet 2018-03-13 12:47:49 PDT
Created attachment 335714 [details]
Patch
Comment 4 youenn fablet 2018-03-13 13:03:06 PDT
Created attachment 335718 [details]
Patch
Comment 5 youenn fablet 2018-03-13 13:59:44 PDT
Created attachment 335725 [details]
Patch
Comment 6 Jiewen Tan 2018-03-13 15:20:43 PDT
Comment on attachment 335725 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=335725&action=review

Hi Youenn, thanks for making this patch. However, I don't think a lock is necessary. A context can be referred to Document or WorkerGlobalScope, as far as I understood.

In Document, the postTask method invokes callOnMainThread() which has a lock already.
In WorkerGlobalScope, the postTask method ends up with MessageQueue::append() which has a lock as well.

Therefore, I am not sure if adding a lock is useful here. I propose we should just get rid of the context.ref() and context.deref() to see if that works. r- because of the locking mechanism. Will talk to Youenn offline.

> Source/WebCore/ChangeLog:10
> +        Use that method in Crypto instead of refing/unrefing  the context.

Extra space.

> Source/WebCore/dom/Document.cpp:571
> +    allDocumentsMap().remove(m_identifier);

Is this relevant to this patch?
Comment 7 Chris Dumez 2018-03-13 15:31:06 PDT
Comment on attachment 335725 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=335725&action=review

> Source/WebCore/crypto/CryptoAlgorithm.cpp:99
> +        [operation = WTFMove(operation), callback = WTFMove(callback), exceptionCallback = WTFMove(exceptionCallback), identifier = context.contextIdentifier()]() mutable {

I think contextIdentifier would be a better name than identifier here.

> Source/WebCore/crypto/algorithms/CryptoAlgorithmECDH.cpp:120
> +        [baseKey = WTFMove(baseKey), publicKey = ecParameters.publicKey.releaseNonNull(), length, unifiedCallback = WTFMove(unifiedCallback), identifier = context.contextIdentifier()]() mutable {

ditto.

> Source/WebCore/crypto/algorithms/CryptoAlgorithmSHA1.cpp:54
> +    workQueue.dispatch([digest = WTFMove(digest), message = WTFMove(message), callback = WTFMove(callback), identifier = context.contextIdentifier()]() mutable {

ditto.

> Source/WebCore/crypto/algorithms/CryptoAlgorithmSHA224.cpp:54
> +    workQueue.dispatch([digest = WTFMove(digest), message = WTFMove(message), callback = WTFMove(callback), identifier = context.contextIdentifier()]() mutable {

ditto.

> Source/WebCore/crypto/algorithms/CryptoAlgorithmSHA256.cpp:54
> +    workQueue.dispatch([digest = WTFMove(digest), message = WTFMove(message), callback = WTFMove(callback), identifier = context.contextIdentifier()]() mutable {

ditto.

> Source/WebCore/crypto/algorithms/CryptoAlgorithmSHA384.cpp:54
> +    workQueue.dispatch([digest = WTFMove(digest), message = WTFMove(message), callback = WTFMove(callback), identifier = context.contextIdentifier()]() mutable {

ditto.

> Source/WebCore/crypto/algorithms/CryptoAlgorithmSHA512.cpp:54
> +    workQueue.dispatch([digest = WTFMove(digest), message = WTFMove(message), callback = WTFMove(callback), identifier = context.contextIdentifier()]() mutable {

ditto.

> Source/WebCore/crypto/mac/CryptoKeyRSAMac.cpp:296
> +    auto identifier = context->contextIdentifier();

ditto.
Comment 8 Chris Dumez 2018-03-13 15:32:03 PDT
(In reply to Jiewen Tan from comment #6)
> Comment on attachment 335725 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=335725&action=review
> 
> Hi Youenn, thanks for making this patch. However, I don't think a lock is
> necessary. A context can be referred to Document or WorkerGlobalScope, as
> far as I understood.
> 
> In Document, the postTask method invokes callOnMainThread() which has a lock
> already.
> In WorkerGlobalScope, the postTask method ends up with
> MessageQueue::append() which has a lock as well.

We obviously need the lock since the map of scriptExecutionContexts gets modified / queried from various threads.
Comment 9 youenn fablet 2018-03-13 15:36:07 PDT
> In Document, the postTask method invokes callOnMainThread() which has a lock
> already.
> In WorkerGlobalScope, the postTask method ends up with
> MessageQueue::append() which has a lock as well.
> 
> Therefore, I am not sure if adding a lock is useful here. I propose we
> should just get rid of the context.ref() and context.deref() to see if that
> works. r- because of the locking mechanism. Will talk to Youenn offline.

As Chris said, we are reading and modifying from a non main thread.
We in particular need to ensure that the context is stopped early in its destructor while actually posting the task.

> > Source/WebCore/ChangeLog:10
> > +        Use that method in Crypto instead of refing/unrefing  the context.
> 
> Extra space.

OK

> > Source/WebCore/dom/Document.cpp:571
> > +    allDocumentsMap().remove(m_identifier);
> 
> Is this relevant to this patch?

Not really, it seems more readable.

(In reply to Chris Dumez from comment #7)
> Comment on attachment 335725 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=335725&action=review
> 
> > Source/WebCore/crypto/CryptoAlgorithm.cpp:99
> > +        [operation = WTFMove(operation), callback = WTFMove(callback), exceptionCallback = WTFMove(exceptionCallback), identifier = context.contextIdentifier()]() mutable {
> 
> I think contextIdentifier would be a better name than identifier here.

OK
Comment 10 Jiewen Tan 2018-03-13 15:46:33 PDT
(In reply to Chris Dumez from comment #8)
> (In reply to Jiewen Tan from comment #6)
> > Comment on attachment 335725 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=335725&action=review
> > 
> > Hi Youenn, thanks for making this patch. However, I don't think a lock is
> > necessary. A context can be referred to Document or WorkerGlobalScope, as
> > far as I understood.
> > 
> > In Document, the postTask method invokes callOnMainThread() which has a lock
> > already.
> > In WorkerGlobalScope, the postTask method ends up with
> > MessageQueue::append() which has a lock as well.
> 
> We obviously need the lock since the map of scriptExecutionContexts gets
> modified / queried from various threads.

Let me rephrase. I don't understand what difference this extra layer of protection makes. Will talk to Youenn in person.
Comment 11 Jiewen Tan 2018-03-13 16:01:43 PDT
(In reply to Jiewen Tan from comment #10)
> (In reply to Chris Dumez from comment #8)
> > (In reply to Jiewen Tan from comment #6)
> > > Comment on attachment 335725 [details]
> > > Patch
> > > 
> > > View in context:
> > > https://bugs.webkit.org/attachment.cgi?id=335725&action=review
> > > 
> > > Hi Youenn, thanks for making this patch. However, I don't think a lock is
> > > necessary. A context can be referred to Document or WorkerGlobalScope, as
> > > far as I understood.
> > > 
> > > In Document, the postTask method invokes callOnMainThread() which has a lock
> > > already.
> > > In WorkerGlobalScope, the postTask method ends up with
> > > MessageQueue::append() which has a lock as well.
> > 
> > We obviously need the lock since the map of scriptExecutionContexts gets
> > modified / queried from various threads.
> 
> Let me rephrase. I don't understand what difference this extra layer of
> protection makes. Will talk to Youenn in person.

Youenn explains that this patch tries to provide a more general way to postTask than just solving the bug. I am okay with that. I think renaming the title is necessary then.
Comment 12 Chris Dumez 2018-03-13 16:23:44 PDT
(In reply to Jiewen Tan from comment #11)
> (In reply to Jiewen Tan from comment #10)
> > (In reply to Chris Dumez from comment #8)
> > > (In reply to Jiewen Tan from comment #6)
> > > > Comment on attachment 335725 [details]
> > > > Patch
> > > > 
> > > > View in context:
> > > > https://bugs.webkit.org/attachment.cgi?id=335725&action=review
> > > > 
> > > > Hi Youenn, thanks for making this patch. However, I don't think a lock is
> > > > necessary. A context can be referred to Document or WorkerGlobalScope, as
> > > > far as I understood.
> > > > 
> > > > In Document, the postTask method invokes callOnMainThread() which has a lock
> > > > already.
> > > > In WorkerGlobalScope, the postTask method ends up with
> > > > MessageQueue::append() which has a lock as well.
> > > 
> > > We obviously need the lock since the map of scriptExecutionContexts gets
> > > modified / queried from various threads.
> > 
> > Let me rephrase. I don't understand what difference this extra layer of
> > protection makes. Will talk to Youenn in person.
> 
> Youenn explains that this patch tries to provide a more general way to
> postTask than just solving the bug. I am okay with that. I think renaming
> the title is necessary then.

I do not think it needs renaming. This is a way to fix flakiness in this crash.

If you believe you have a better way, let us know.

What you've proposed so far is:
"""
I propose we should just get rid of the context.ref() and context.deref() to see if that works.
"""

I do not think this is OK. You cannot call postTask() on the context from your workQueue then because the ScriptExecutionContext may get destroyed on its thread while your workQueue is running.
Comment 13 youenn fablet 2018-03-13 16:32:07 PDT
Ryosuke has some concern of locking when creating ScriptExecutionContext.
Plan is to lazily add the context to the map when needed, at identifier getter time.
Comment 14 Jiewen Tan 2018-03-13 16:40:18 PDT
(In reply to Chris Dumez from comment #12)
> (In reply to Jiewen Tan from comment #11)
> > (In reply to Jiewen Tan from comment #10)
> > > (In reply to Chris Dumez from comment #8)
> > > > (In reply to Jiewen Tan from comment #6)
> > > > > Comment on attachment 335725 [details]
> > > > > Patch
> > > > > 
> > > > > View in context:
> > > > > https://bugs.webkit.org/attachment.cgi?id=335725&action=review
> > > > > 
> > > > > Hi Youenn, thanks for making this patch. However, I don't think a lock is
> > > > > necessary. A context can be referred to Document or WorkerGlobalScope, as
> > > > > far as I understood.
> > > > > 
> > > > > In Document, the postTask method invokes callOnMainThread() which has a lock
> > > > > already.
> > > > > In WorkerGlobalScope, the postTask method ends up with
> > > > > MessageQueue::append() which has a lock as well.
> > > > 
> > > > We obviously need the lock since the map of scriptExecutionContexts gets
> > > > modified / queried from various threads.
> > > 
> > > Let me rephrase. I don't understand what difference this extra layer of
> > > protection makes. Will talk to Youenn in person.
> > 
> > Youenn explains that this patch tries to provide a more general way to
> > postTask than just solving the bug. I am okay with that. I think renaming
> > the title is necessary then.
> 
> I do not think it needs renaming. This is a way to fix flakiness in this
> crash.
> 
> If you believe you have a better way, let us know.
> 
> What you've proposed so far is:
> """
> I propose we should just get rid of the context.ref() and context.deref() to
> see if that works.
> """
> 
> I do not think this is OK. You cannot call postTask() on the context from
> your workQueue then because the ScriptExecutionContext may get destroyed on
> its thread while your workQueue is running.

You are right. I have some misunderstanding of the patch before. Now I walk myself through it.
Comment 15 youenn fablet 2018-03-13 17:49:44 PDT
Created attachment 335749 [details]
Patch
Comment 16 youenn fablet 2018-03-14 08:50:49 PDT
Created attachment 335770 [details]
Patch
Comment 17 Chris Dumez 2018-03-14 09:19:37 PDT
Comment on attachment 335770 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=335770&action=review

> Source/WebCore/crypto/CryptoAlgorithm.cpp:98
> +    // There is no guarantee that callback and exceptionCallback will be destroyed in the thread in which they were created.

Shouldn't this be a FIXME comment?

> Source/WebCore/crypto/mac/CryptoKeyRSAMac.cpp:299
> +    // Fix this by using unique pointers and move them from one lambda to the other.

But then it won't necessarily be destroyed on the right thread, right?

> Source/WebCore/dom/Document.cpp:572
> +    // We need to remove from the contexts map very early in the destructor so that posting task to that Document from another thread is safe.

s/so that posting task to that Document/so that calling postTask() on this document/

> Source/WebCore/dom/ScriptExecutionContext.cpp:70
> +    static NeverDestroyed<HashMap<ScriptExecutionContextIdentifier, ScriptExecutionContext*>> contexts;

Please add an ASSERT(allScriptExecutionContextsMapLock().isLocked());

> Source/WebCore/dom/ScriptExecutionContext.cpp:74
> +static Lock& allScriptExecutionContextsMapLock()

Should probably be moved up so you can add the assertion above.

> Source/WebCore/dom/ScriptExecutionContext.cpp:104
> +    if (!m_contextIdentifier.toUInt64()) {

Could we add a isValid() method to ObjectIdentifier ? I initially did not add it because I wasn't planning on adding a default constructor. However, now that we have a default constructor, I think we need such method.

> Source/WebCore/dom/ScriptExecutionContext.cpp:105
> +        Locker<Lock> locker(allScriptExecutionContextsMapLock());

I am not convinced this is thread safe, assuming this method can be called from several threads in parallel, because the check for an invalid identifier is outside the lock. 2 threads may end up thinking the identifier is invalid, then both generate a new identifier for the same document and add the context to the map twice with different identifiers. One way to fix this efficiently would be to add a second if (!m_contextIdentifier.isValid()) check *after* the locking.

Maybe we just need an assertion to make sure this method is always called from the scriptExecutionContext's context thread?

> Source/WebCore/dom/ScriptExecutionContext.cpp:117
> +    if (m_contextIdentifier.toUInt64()) {

I'd suggest an early return. Note importantly, I am not convinced this is thread safe if contextIdentifier() can get called on another thread while removeFromContextsMap() is running, and after this valid identifier check. You might not remove yourself from the map, and then the other thread may add you to the map by calling contextIdentifier().

> Source/WebCore/dom/ScriptExecutionContext.cpp:155
> +        ASSERT(!allScriptExecutionContextsMap().contains(m_contextIdentifier));

I'd suggest an ASSERT_WITH_MESSAGE() explaining that the subclass implementing postTask() should have removed us from the map already.
Comment 18 Chris Dumez 2018-03-14 09:58:34 PDT
Comment on attachment 335770 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=335770&action=review

>> Source/WebCore/dom/ScriptExecutionContext.cpp:105
>> +        Locker<Lock> locker(allScriptExecutionContextsMapLock());
> 
> I am not convinced this is thread safe, assuming this method can be called from several threads in parallel, because the check for an invalid identifier is outside the lock. 2 threads may end up thinking the identifier is invalid, then both generate a new identifier for the same document and add the context to the map twice with different identifiers. One way to fix this efficiently would be to add a second if (!m_contextIdentifier.isValid()) check *after* the locking.
> 
> Maybe we just need an assertion to make sure this method is always called from the scriptExecutionContext's context thread?

e.g. ASSERT(isContextThread());
Comment 19 youenn fablet 2018-03-14 10:24:40 PDT
Thanks for the feedback, I'll add the assertions accordingly.
As of the thread safety potential issue related to contextIdentifier() getter, the model is to call it from context thread only. The pattern is usually to start posting a task from context thread, go to another thread and then go back to context thread.

(In reply to Chris Dumez from comment #17)
> Comment on attachment 335770 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=335770&action=review
> 
> > Source/WebCore/crypto/CryptoAlgorithm.cpp:98
> > +    // There is no guarantee that callback and exceptionCallback will be destroyed in the thread in which they were created.
> 
> Shouldn't this be a FIXME comment?

No, it is just a note so that someone calling this method should make sure that destroying lambdas on another thread is ok.
I can remove the comment if not really necessary.

If we really need to destroy these in the worker thread, we should change the implementation.
It is true that, we have a false sense of security by creating lambdas and passing them across threads. Maybe we should add a pattern that would allow these kind of lambdas-being-destroyed-in-a-specific-thread for simplicity, but they do not seem required, at least in WebCrypto implementation.

> > Source/WebCore/crypto/mac/CryptoKeyRSAMac.cpp:299
> > +    // Fix this by using unique pointers and move them from one lambda to the other.
> 
> But then it won't necessarily be destroyed on the right thread, right?

The callbacks should be made so that it is fine destroying them from another thread.

> > Source/WebCore/dom/Document.cpp:572
> > +    // We need to remove from the contexts map very early in the destructor so that posting task to that Document from another thread is safe.
> 
> s/so that posting task to that Document/so that calling postTask() on this
> document/

OK

> > Source/WebCore/dom/ScriptExecutionContext.cpp:70
> > +    static NeverDestroyed<HashMap<ScriptExecutionContextIdentifier, ScriptExecutionContext*>> contexts;
> 
> Please add an ASSERT(allScriptExecutionContextsMapLock().isLocked());

OK

> > Source/WebCore/dom/ScriptExecutionContext.cpp:104
> > +    if (!m_contextIdentifier.toUInt64()) {
> 
> Could we add a isValid() method to ObjectIdentifier ? I initially did not
> add it because I wasn't planning on adding a default constructor. However,
> now that we have a default constructor, I think we need such method.

isValid or operator bool maybe?

> > Source/WebCore/dom/ScriptExecutionContext.cpp:105
> > +        Locker<Lock> locker(allScriptExecutionContextsMapLock());
> 
> I am not convinced this is thread safe, assuming this method can be called
> from several threads in parallel, because the check for an invalid
> identifier is outside the lock. 2 threads may end up thinking the identifier
> is invalid, then both generate a new identifier for the same document and
> add the context to the map twice with different identifiers. One way to fix
> this efficiently would be to add a second if
> (!m_contextIdentifier.isValid()) check *after* the locking.
> 
> Maybe we just need an assertion to make sure this method is always called
> from the scriptExecutionContext's context thread?

Right, the getter should only be called from the context thread.
I haven't seen any use case for calling this getter from another thread.
I'll add the assertion for now.

> > Source/WebCore/dom/ScriptExecutionContext.cpp:155
> > +        ASSERT(!allScriptExecutionContextsMap().contains(m_contextIdentifier));
> 
> I'd suggest an ASSERT_WITH_MESSAGE() explaining that the subclass
> implementing postTask() should have removed us from the map already.

OK
Comment 20 youenn fablet 2018-03-14 13:00:25 PDT
Created attachment 335790 [details]
Patch
Comment 21 Chris Dumez 2018-03-14 13:48:41 PDT
Comment on attachment 335790 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=335790&action=review

r=me with changes.

> Source/WebCore/dom/ScriptExecutionContext.cpp:119
> +    if (m_contextIdentifier.toUInt64()) {

if (m_contextIdentifier) {

> Source/WebCore/dom/ScriptExecutionContext.cpp:122
> +            "ScriptExecution subclass implementing postTask should have already removed itself from the map");

This assertion message does not make sense here.

> Source/WebCore/dom/ScriptExecutionContext.cpp:156
> +    if (m_contextIdentifier.toUInt64()) {

if (m_contextIdentifier) {

> Source/WebCore/dom/ScriptExecutionContext.cpp:158
> +        ASSERT(!allScriptExecutionContextsMap().contains(m_contextIdentifier));

This is the assertion that should have been an ASSERT_WITH_MESSAGE() and which should have used the message above.
Comment 22 youenn fablet 2018-03-14 14:27:12 PDT
Created attachment 335801 [details]
Patch for landing
Comment 23 WebKit Commit Bot 2018-03-14 15:52:14 PDT
Comment on attachment 335801 [details]
Patch for landing

Clearing flags on attachment: 335801

Committed r229616: <https://trac.webkit.org/changeset/229616>
Comment 24 WebKit Commit Bot 2018-03-14 15:52:16 PDT
All reviewed patches have been landed.  Closing bug.
Comment 25 Radar WebKit Bug Importer 2018-03-14 15:53:21 PDT
<rdar://problem/38477604>