Bug 192229 - [GStreamer][EME][ClearKey] Request keys from CDMInstance rather than passing via bus messages
Summary: [GStreamer][EME][ClearKey] Request keys from CDMInstance rather than passing ...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WPE WebKit (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Charlie Turner
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-11-30 09:25 PST by Charlie Turner
Modified: 2019-02-05 08:04 PST (History)
5 users (show)

See Also:


Attachments
Patch (81.42 KB, patch)
2018-12-03 04:10 PST, Charlie Turner
no flags Details | Formatted Diff | Diff
Patch (86.17 KB, patch)
2018-12-03 10:21 PST, Charlie Turner
no flags Details | Formatted Diff | Diff
Patch (31.93 KB, patch)
2018-12-13 06:42 PST, Charlie Turner
no flags Details | Formatted Diff | Diff
Patch for landing (30.40 KB, patch)
2019-01-18 07:18 PST, Charlie Turner
no flags Details | Formatted Diff | Diff
Patch for landing (30.49 KB, patch)
2019-01-18 07:20 PST, Charlie Turner
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Charlie Turner 2018-11-30 09:25:10 PST
[EME] Create and move to a decryption API implemented by CDMInstances
Comment 1 Charlie Turner 2018-12-03 04:10:07 PST
Created attachment 356374 [details]
Patch
Comment 2 Charlie Turner 2018-12-03 04:15:51 PST
I will have to rebase over the GstContext patch, but would appreciate review anyway.
Comment 3 Xabier Rodríguez Calvar 2018-12-03 10:01:00 PST
Even when this shouldn't impact the Apple ports, let's add Jer here to see if he has any comments as well.
Comment 4 Charlie Turner 2018-12-03 10:21:29 PST
Created attachment 356384 [details]
Patch
Comment 5 Charlie Turner 2018-12-03 10:26:26 PST
(In reply to Xabier Rodríguez Calvar from comment #3)
> Even when this shouldn't impact the Apple ports, let's add Jer here to see
> if he has any comments as well.

Yes, I forgot that. Jer, for your attention, the new classes/structs in CDMInstance.h/CDMInstanceSession.h could use your review. TIA.
Comment 6 Jer Noble 2018-12-03 11:02:59 PST
I don't really understand why the new types and interfaces need to be added to CDMInstance if they're being used exclusively by CDMInstanceClearKey to talk to CDMInstanceSessionClearKey?  These new types should be added to CDMInstanceClearKey and use the existing type-checking system from GTK to cast directly from CDMInstance (or better yet, from CDMInstanceSession) to CDMInstanceClearKey.
Comment 7 Jer Noble 2018-12-03 11:16:22 PST
Comment on attachment 356384 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=356384&action=review

Generally, I'm not in favor of taking GStreamer-Specific code and moving it up into platform-independent code. So I don't really understand why this patch is necessary in the first place. Other questions:

> Source/WebCore/platform/encryptedmedia/CDMInstance.h:68
> +    enum class EncryptionMode {
> +        Unencrypted,
> +        Cenc,
> +        Cbcs,
> +    };

This is just a std::optional<CDMEncryptionScheme>.

> Source/WebCore/platform/encryptedmedia/CDMInstance.h:77
> +    struct SubSample {
> +        SubSample()
> +            : clearBytes(0), cipherBytes(0) { }
> +        SubSample(uint32_t clearBytes, uint32_t cipherBytes)
> +            : clearBytes(clearBytes), cipherBytes(cipherBytes) { }
> +        uint32_t clearBytes;
> +        uint32_t cipherBytes;
> +    };

Isn't this just a uint64_t? Or a std::pair<uint32_t>?  If it stays a struct, then you don't need a default or specialized constructor; it'd be written:

struct SubSample {
  uint32_t clearBytes { 0 };
  uint32_t cipherBytes { 0 };
};

> Source/WebCore/platform/encryptedmedia/clearkey/CDMClearKey.cpp:39
> +#include <gcrypt.h>

GStreamer specific code doesn't belong in this class.  A GStreamer-Specific CDM for ClearKey encryption should be used instead, or at the very least this class and file should be renamed.

> Source/WebCore/platform/encryptedmedia/clearkey/CDMClearKey.cpp:486
> +    ASSERT(!isMainThread());
> +    LOG(EME, "EME ClearKey - decrypting, looking up session for key ID [%s]", keyIDToHexString(config->keyID()).utf8().data());
>  
> -    for (auto& key : ClearKeyState::singleton().keys().values())
> -        allKeys.appendVector(key);
> +    auto* session = sessionForKeyID(config->keyID());
> +    if (!session) {
> +        auto locker = holdLock(m_waitingForKeyLock);
> +        m_waitingForKeyCondition.waitFor(m_waitingForKeyLock, config->waitForKeyTimeout(), [this, &config, &session] {
> +            return session = sessionForKeyID(config->keyID());
> +        });
> +        if (!session) {
> +            LOG(EME, "Failed to find matching session");
> +            return false;
> +        }
> +    }

This code seems weird. sessionForKeyID() isn't thread safe. What happens when the list of sessions is being mutated while sessionForKeyID() is called on a background thread? Why is it mandatory that this function is called off the main thread? Why isn't the client talking to the session directly? Why can't the decode fail if a key isn't available and retry when the client is notified that there's a new key available?

> Source/WebCore/platform/encryptedmedia/clearkey/CDMClearKey.cpp:609
> +
> +        LOG(EME, "EME ClearKey session - new keys are available, notifying any waiter");
> +        m_instance->signalNewKeysAvailable();

In addition to above, why can't this use the existing attemptToResumePlaybackOnClients() callback?
Comment 8 Charlie Turner 2018-12-03 11:29:56 PST
(In reply to Jer Noble from comment #6)
> I don't really understand why the new types and interfaces need to be added
> to CDMInstance if they're being used exclusively by CDMInstanceClearKey to
> talk to CDMInstanceSessionClearKey?

It seems like only CDMInstanceClearKey is using them, but we have more CDM implementations that we cannot upstream due to licensing (Widevine/PlayReady).

> These new types should be added to
> CDMInstanceClearKey and use the existing type-checking system from GTK to
> cast directly from CDMInstance (or better yet, from CDMInstanceSession) to
> CDMInstanceClearKey.

Given the above, that would result in a lot of duplication in our downstream situation.
Comment 9 Charlie Turner 2018-12-03 11:40:09 PST
(In reply to Jer Noble from comment #7)
> Comment on attachment 356384 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=356384&action=review
> 
> Generally, I'm not in favor of taking GStreamer-Specific code and moving it
> up into platform-independent code. So I don't really understand why this
> patch is necessary in the first place.

I want to move away from the ClearKey code being GStreamer specific, so I am trying to factor it into a cross-platform state. The classes/structs moved into CDMInstance are useful from other derived classes we have downstream for licensing reasons.

> 
> > Source/WebCore/platform/encryptedmedia/CDMInstance.h:68
> > +    enum class EncryptionMode {
> > +        Unencrypted,
> > +        Cenc,
> > +        Cbcs,
> > +    };
> 
> This is just a std::optional<CDMEncryptionScheme>.

Thanks, missed that.

> 
> > Source/WebCore/platform/encryptedmedia/CDMInstance.h:77
> > +    struct SubSample {
> > +        SubSample()
> > +            : clearBytes(0), cipherBytes(0) { }
> > +        SubSample(uint32_t clearBytes, uint32_t cipherBytes)
> > +            : clearBytes(clearBytes), cipherBytes(cipherBytes) { }
> > +        uint32_t clearBytes;
> > +        uint32_t cipherBytes;
> > +    };
> 
> Isn't this just a uint64_t? Or a std::pair<uint32_t>?  If it stays a struct,
> then you don't need a default or specialized constructor; it'd be written:
> 
> struct SubSample {
>   uint32_t clearBytes { 0 };
>   uint32_t cipherBytes { 0 };
> };

Will change.

> 
> > Source/WebCore/platform/encryptedmedia/clearkey/CDMClearKey.cpp:39
> > +#include <gcrypt.h>
> 
> GStreamer specific code doesn't belong in this class.  A GStreamer-Specific
> CDM for ClearKey encryption should be used instead, or at the very least
> this class and file should be renamed.

My goal here is to have cross-platform ClearKey implementation. I did mention in a REVIEW: comment in this patch about whether there are better ways of performing AES-128 decryption in WK, the stuff under WebCore/crypto looked promising but it's just for window.crypto.subtle, and not used elsewhere in WebCore. Do you have any objections to moving in this direction in general, i.e. having cross-platform ClearKey CDM?

> 
> > Source/WebCore/platform/encryptedmedia/clearkey/CDMClearKey.cpp:486
> > +    ASSERT(!isMainThread());
> > +    LOG(EME, "EME ClearKey - decrypting, looking up session for key ID [%s]", keyIDToHexString(config->keyID()).utf8().data());
> >  
> > -    for (auto& key : ClearKeyState::singleton().keys().values())
> > -        allKeys.appendVector(key);
> > +    auto* session = sessionForKeyID(config->keyID());
> > +    if (!session) {
> > +        auto locker = holdLock(m_waitingForKeyLock);
> > +        m_waitingForKeyCondition.waitFor(m_waitingForKeyLock, config->waitForKeyTimeout(), [this, &config, &session] {
> > +            return session = sessionForKeyID(config->keyID());
> > +        });
> > +        if (!session) {
> > +            LOG(EME, "Failed to find matching session");
> > +            return false;
> > +        }
> > +    }
> 
> This code seems weird. sessionForKeyID() isn't thread safe. What happens
> when the list of sessions is being mutated while sessionForKeyID() is called
> on a background thread?

That's a bug.

> Why is it mandatory that this function is called off
> the main thread?

Because of how the condition variable is woken up on the main thread when update() comes in, we need to wait on it off the main thread.

> Why isn't the client talking to the session directly?

That would be nice, and initially I wanted to do this, but when a CDMInstance is attached, we dispatch it to our GStreamer decryptor element. At the point it is dispatched, there's no way to know what session to send to the GStreamer pipeline. When GStreamer starts trying to process encrypted frames, only then does it have a key ID with which to look up an appropriate session. Hence, the CDMInstance is shipped, and extra methods added to lookup a session on demand. I agree this is not ideal, but there seemed no alternative.

> Why
> can't the decode fail if a key isn't available and retry when the client is
> notified that there's a new key available?

It would require sending more messages to the GStreamer pipeline, having the waiting for an available key in the decryptor class itself simplified the logic needed in the client in my use-case.

> 
> > Source/WebCore/platform/encryptedmedia/clearkey/CDMClearKey.cpp:609
> > +
> > +        LOG(EME, "EME ClearKey session - new keys are available, notifying any waiter");
> > +        m_instance->signalNewKeysAvailable();
> 
> In addition to above, why can't this use the existing
> attemptToResumePlaybackOnClients() callback?

Didn't know about that callback, will investigate.

Thank you for your review.
Comment 10 Jer Noble 2018-12-03 12:46:23 PST
(In reply to Charlie Turner from comment #9)
> (In reply to Jer Noble from comment #7)
> > Comment on attachment 356384 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=356384&action=review
> > 
> > Generally, I'm not in favor of taking GStreamer-Specific code and moving it
> > up into platform-independent code. So I don't really understand why this
> > patch is necessary in the first place.
> 
> I want to move away from the ClearKey code being GStreamer specific, so I am
> trying to factor it into a cross-platform state. The classes/structs moved
> into CDMInstance are useful from other derived classes we have downstream
> for licensing reasons.

But are they useful for any non-ClearKey CDMs?  So far, at least two of the added types can be replaced with existing types.

> > 
> > > Source/WebCore/platform/encryptedmedia/CDMInstance.h:68
> > > +    enum class EncryptionMode {
> > > +        Unencrypted,
> > > +        Cenc,
> > > +        Cbcs,
> > > +    };
> > 
> > This is just a std::optional<CDMEncryptionScheme>.
> 
> Thanks, missed that.

Additionally, since this is a property of the MediaKeySystemConfiguration, it doesn't make sense to provide this scheme as an input to decrypt(), since the CDM will have been already configured to be either 'cenc' or 'cbcs' by the client. The instance will only ever be either 'cenc' or 'cbcs' but won't switch between the two.  The 'keyID' property and I suspect that the 'iv' property are the same on a session basis.

> > 
> > > Source/WebCore/platform/encryptedmedia/CDMInstance.h:77
> > > +    struct SubSample {
> > > +        SubSample()
> > > +            : clearBytes(0), cipherBytes(0) { }
> > > +        SubSample(uint32_t clearBytes, uint32_t cipherBytes)
> > > +            : clearBytes(clearBytes), cipherBytes(cipherBytes) { }
> > > +        uint32_t clearBytes;
> > > +        uint32_t cipherBytes;
> > > +    };
> > 
> > Isn't this just a uint64_t? Or a std::pair<uint32_t>?  If it stays a struct,
> > then you don't need a default or specialized constructor; it'd be written:
> > 
> > struct SubSample {
> >   uint32_t clearBytes { 0 };
> >   uint32_t cipherBytes { 0 };
> > };
> 
> Will change.

Okay. I have other qualms about this struct; I don't understand why a SubSample is always going to be 4-bytes long, and whether this couldn't just be implemented with two Uint32Arrays, rather than a Vector<SubSample>.

So what's left is the 'timeout' property.  This has nothing to do with configuring the CDM, per se.

> > > Source/WebCore/platform/encryptedmedia/clearkey/CDMClearKey.cpp:39
> > > +#include <gcrypt.h>
> > 
> > GStreamer specific code doesn't belong in this class.  A GStreamer-Specific
> > CDM for ClearKey encryption should be used instead, or at the very least
> > this class and file should be renamed.
> 
> My goal here is to have cross-platform ClearKey implementation. I did
> mention in a REVIEW: comment in this patch about whether there are better
> ways of performing AES-128 decryption in WK, the stuff under WebCore/crypto
> looked promising but it's just for window.crypto.subtle, and not used
> elsewhere in WebCore. Do you have any objections to moving in this direction
> in general, i.e. having cross-platform ClearKey CDM?

No I don't, but so far this patch isn't cross-platform.

> > 
> > > Source/WebCore/platform/encryptedmedia/clearkey/CDMClearKey.cpp:486
> > > +    ASSERT(!isMainThread());
> > > +    LOG(EME, "EME ClearKey - decrypting, looking up session for key ID [%s]", keyIDToHexString(config->keyID()).utf8().data());
> > >  
> > > -    for (auto& key : ClearKeyState::singleton().keys().values())
> > > -        allKeys.appendVector(key);
> > > +    auto* session = sessionForKeyID(config->keyID());
> > > +    if (!session) {
> > > +        auto locker = holdLock(m_waitingForKeyLock);
> > > +        m_waitingForKeyCondition.waitFor(m_waitingForKeyLock, config->waitForKeyTimeout(), [this, &config, &session] {
> > > +            return session = sessionForKeyID(config->keyID());
> > > +        });
> > > +        if (!session) {
> > > +            LOG(EME, "Failed to find matching session");
> > > +            return false;
> > > +        }
> > > +    }
> > 
> > This code seems weird. sessionForKeyID() isn't thread safe. What happens
> > when the list of sessions is being mutated while sessionForKeyID() is called
> > on a background thread?
> 
> That's a bug.
> 
> > Why is it mandatory that this function is called off
> > the main thread?
> 
> Because of how the condition variable is woken up on the main thread when
> update() comes in, we need to wait on it off the main thread.

That doesn't really make sense. If the decrypt() function was non-blocking, the condition variable wouldn't be necessary in the first place.  For example, I don't see any place where the Condition is signaled if the CDMInstanceClearKey is destroyed. So if the instance is destroyed, the decrypt() thread will hang until the timeout is reached, and at that point, will reference deleted memory.

On a separate note, rather than using a Lock + Condition, WTF has a Semaphore object that should do both. You may want to look into using that directly. But I'd recommend just removing the Condition entirely.

If this really needs to be something that must be called on a background thread, please consider abstracting out "decode()" into something that is just a free-function that doesn't depend on any ivars.
Comment 11 Charlie Turner 2018-12-03 13:26:06 PST
(In reply to Jer Noble from comment #10)
> (In reply to Charlie Turner from comment #9)
> > (In reply to Jer Noble from comment #7)
> > > Comment on attachment 356384 [details]
> > > Patch
> > > 
> > > View in context:
> > > https://bugs.webkit.org/attachment.cgi?id=356384&action=review
> > > 
> > > Generally, I'm not in favor of taking GStreamer-Specific code and moving it
> > > up into platform-independent code. So I don't really understand why this
> > > patch is necessary in the first place.
> > 
> > I want to move away from the ClearKey code being GStreamer specific, so I am
> > trying to factor it into a cross-platform state. The classes/structs moved
> > into CDMInstance are useful from other derived classes we have downstream
> > for licensing reasons.
> 
> But are they useful for any non-ClearKey CDMs?  So far, at least two of the
> added types can be replaced with existing types.

The decrypt() method on CDMInstance is useful, the implementation details clearly haven't taken advantage of the existing types in WK.

> 
> > > 
> > > > Source/WebCore/platform/encryptedmedia/CDMInstance.h:68
> > > > +    enum class EncryptionMode {
> > > > +        Unencrypted,
> > > > +        Cenc,
> > > > +        Cbcs,
> > > > +    };
> > > 
> > > This is just a std::optional<CDMEncryptionScheme>.
> > 
> > Thanks, missed that.
> 
> Additionally, since this is a property of the MediaKeySystemConfiguration,
> it doesn't make sense to provide this scheme as an input to decrypt(), since
> the CDM will have been already configured to be either 'cenc' or 'cbcs' by
> the client.

It wasn't clear how that information can be retrieved from CDMInstance, but I will see if it can be exposed in the CDMInstance.

> The instance will only ever be either 'cenc' or 'cbcs' but won't
> switch between the two.  The 'keyID' property and I suspect that the 'iv'
> property are the same on a session basis.

I'm not sure how the keyID and IV are the same on a session basis, in fact they can change for each sample in the fundamental stream.

> 
> > > 
> > > > Source/WebCore/platform/encryptedmedia/CDMInstance.h:77
> > > > +    struct SubSample {
> > > > +        SubSample()
> > > > +            : clearBytes(0), cipherBytes(0) { }
> > > > +        SubSample(uint32_t clearBytes, uint32_t cipherBytes)
> > > > +            : clearBytes(clearBytes), cipherBytes(cipherBytes) { }
> > > > +        uint32_t clearBytes;
> > > > +        uint32_t cipherBytes;
> > > > +    };
> > > 
> > > Isn't this just a uint64_t? Or a std::pair<uint32_t>?  If it stays a struct,
> > > then you don't need a default or specialized constructor; it'd be written:
> > > 
> > > struct SubSample {
> > >   uint32_t clearBytes { 0 };
> > >   uint32_t cipherBytes { 0 };
> > > };
> > 
> > Will change.
> 
> Okay. I have other qualms about this struct; I don't understand why a
> SubSample is always going to be 4-bytes long

That's specified in the common encryption spec, again perhaps not generic enough, but I've not seen subsample formats that violate this.

> and whether this couldn't just
> be implemented with two Uint32Arrays, rather than a Vector<SubSample>.

Could be done either AoS of SoA, agree.

> So what's left is the 'timeout' property.  This has nothing to do with
> configuring the CDM, per se.

The root reason for the timeout is that we can be ready to decrypt in GStreamer before update() has happened. So there's a fundamental need for synchronisation somewhere. I chose the CDMInstance rather than the client to avoid needing more bus messages in our GStreamer implementation.

> 
> > > > Source/WebCore/platform/encryptedmedia/clearkey/CDMClearKey.cpp:39
> > > > +#include <gcrypt.h>
> > > 
> > > GStreamer specific code doesn't belong in this class.  A GStreamer-Specific
> > > CDM for ClearKey encryption should be used instead, or at the very least
> > > this class and file should be renamed.
> > 
> > My goal here is to have cross-platform ClearKey implementation. I did
> > mention in a REVIEW: comment in this patch about whether there are better
> > ways of performing AES-128 decryption in WK, the stuff under WebCore/crypto
> > looked promising but it's just for window.crypto.subtle, and not used
> > elsewhere in WebCore. Do you have any objections to moving in this direction
> > in general, i.e. having cross-platform ClearKey CDM?
> 
> No I don't, but so far this patch isn't cross-platform.

I'm looking to make it so, gcrypt is a cross-platform library, but I would have preferred to use WebCore classes for the decryption, but am unsure how to proceed along that path.

> 
> > > 
> > > > Source/WebCore/platform/encryptedmedia/clearkey/CDMClearKey.cpp:486
> > > > +    ASSERT(!isMainThread());
> > > > +    LOG(EME, "EME ClearKey - decrypting, looking up session for key ID [%s]", keyIDToHexString(config->keyID()).utf8().data());
> > > >  
> > > > -    for (auto& key : ClearKeyState::singleton().keys().values())
> > > > -        allKeys.appendVector(key);
> > > > +    auto* session = sessionForKeyID(config->keyID());
> > > > +    if (!session) {
> > > > +        auto locker = holdLock(m_waitingForKeyLock);
> > > > +        m_waitingForKeyCondition.waitFor(m_waitingForKeyLock, config->waitForKeyTimeout(), [this, &config, &session] {
> > > > +            return session = sessionForKeyID(config->keyID());
> > > > +        });
> > > > +        if (!session) {
> > > > +            LOG(EME, "Failed to find matching session");
> > > > +            return false;
> > > > +        }
> > > > +    }
> > > 
> > > This code seems weird. sessionForKeyID() isn't thread safe. What happens
> > > when the list of sessions is being mutated while sessionForKeyID() is called
> > > on a background thread?
> > 
> > That's a bug.
> > 
> > > Why is it mandatory that this function is called off
> > > the main thread?
> > 
> > Because of how the condition variable is woken up on the main thread when
> > update() comes in, we need to wait on it off the main thread.
> 
> That doesn't really make sense. If the decrypt() function was non-blocking,
> the condition variable wouldn't be necessary in the first place.  For
> example, I don't see any place where the Condition is signaled if the
> CDMInstanceClearKey is destroyed. So if the instance is destroyed, the
> decrypt() thread will hang until the timeout is reached, and at that point,
> will reference deleted memory.

Agree I missed a place where it needs signalling, the session decryption should probably be posted onto the main thread with a weakThis.

> On a separate note, rather than using a Lock + Condition, WTF has a
> Semaphore object that should do both. You may want to look into using that
> directly. But I'd recommend just removing the Condition entirely.

Thanks, will take a look.

> If this really needs to be something that must be called on a background
> thread, please consider abstracting out "decode()" into something that is
> just a free-function that doesn't depend on any ivars.

I will consider it, my gut feel is it will need the created session ivars at the very least to be able to find the one created with key ids requested from the client. It also seemed like in principle, a CDMInstance's primarly purpose was to manage keys and use them decrypt, so not associating decryption with the CDMInstance seems a bit odd, but I'll think more about it.

Thanks again for the review!
Comment 12 Jer Noble 2018-12-03 14:02:20 PST
(In reply to Charlie Turner from comment #11)
> (In reply to Jer Noble from comment #10)> 
> > So what's left is the 'timeout' property.  This has nothing to do with
> > configuring the CDM, per se.
> 
> The root reason for the timeout is that we can be ready to decrypt in
> GStreamer before update() has happened. 

Well, the root reason for the timeout is that you need to block, but can't trust that your object will ever be signaled.  We do something similar in Cocoa code, where we block forever in a background thread waiting for update(): <https://trac.webkit.org/browser/webkit/trunk/Source/WebCore/platform/graphics/avfoundation/objc/SourceBufferPrivateAVFObjC.mm#L198>, but use a separate semaphore to signal that the operation has been aborted. So I totally get /why/ you're doing what you're doing here. I just don't think this is the right place to /do/ what you're doing.

> So there's a fundamental need for
> synchronisation somewhere. I chose the CDMInstance rather than the client to
> avoid needing more bus messages in our GStreamer implementation.

Yes, but this is supposed to be platform-agnostic code, and it's implemented in a GStreamer-specific way.

Instead, consider this flow:

1) The GStreamer parser encounters an encrypted segment or sample.
2) The spec requires that parsing stops until a valid key has been delivered, so GStreamer blocks the background parser thread with a Semaphore
3) The client gets an 'encrypted' message, and initializes a CDM instance with some parameters, including keyID, encryption scheme, and possibly iv data.
4) GStreamer receives a "attempt to resume playback if necessary" message (by way of the attemptToDecryptWithInstance() MediaPlayerPrivate call)
5) GStreamer pulls the configuration parameters out of the CDMInstance, or fails if the required keyID is not present
6) GStreamer unblocks the decoding thread by signaling the semaphore.

None of this requires the CDMInstanceSessionClearKey object itself to be callable on a background thread.  The decode() method itself could be a static method on CDMInstanceClearKey or CDMInstanceSessionClearKey, and accept all the parameters needed to decode the vector of subsamples. There would be no Semaphore in the CDMInstance class itself, no timeout parameter, and no additional GStreamer bus messages.

> > > > > Source/WebCore/platform/encryptedmedia/clearkey/CDMClearKey.cpp:39
> > > > > +#include <gcrypt.h>
> > > > 
> > > > GStreamer specific code doesn't belong in this class.  A GStreamer-Specific
> > > > CDM for ClearKey encryption should be used instead, or at the very least
> > > > this class and file should be renamed.
> > > 
> > > My goal here is to have cross-platform ClearKey implementation. I did
> > > mention in a REVIEW: comment in this patch about whether there are better
> > > ways of performing AES-128 decryption in WK, the stuff under WebCore/crypto
> > > looked promising but it's just for window.crypto.subtle, and not used
> > > elsewhere in WebCore. Do you have any objections to moving in this direction
> > > in general, i.e. having cross-platform ClearKey CDM?
> > 
> > No I don't, but so far this patch isn't cross-platform.
> 
> I'm looking to make it so, gcrypt is a cross-platform library, but I would
> have preferred to use WebCore classes for the decryption, but am unsure how
> to proceed along that path.
> 
> > 
> > > > 
> > > > > Source/WebCore/platform/encryptedmedia/clearkey/CDMClearKey.cpp:486
> > > > > +    ASSERT(!isMainThread());
> > > > > +    LOG(EME, "EME ClearKey - decrypting, looking up session for key ID [%s]", keyIDToHexString(config->keyID()).utf8().data());
> > > > >  
> > > > > -    for (auto& key : ClearKeyState::singleton().keys().values())
> > > > > -        allKeys.appendVector(key);
> > > > > +    auto* session = sessionForKeyID(config->keyID());
> > > > > +    if (!session) {
> > > > > +        auto locker = holdLock(m_waitingForKeyLock);
> > > > > +        m_waitingForKeyCondition.waitFor(m_waitingForKeyLock, config->waitForKeyTimeout(), [this, &config, &session] {
> > > > > +            return session = sessionForKeyID(config->keyID());
> > > > > +        });
> > > > > +        if (!session) {
> > > > > +            LOG(EME, "Failed to find matching session");
> > > > > +            return false;
> > > > > +        }
> > > > > +    }
> > > > 
> > > > This code seems weird. sessionForKeyID() isn't thread safe. What happens
> > > > when the list of sessions is being mutated while sessionForKeyID() is called
> > > > on a background thread?
> > > 
> > > That's a bug.
> > > 
> > > > Why is it mandatory that this function is called off
> > > > the main thread?
> > > 
> > > Because of how the condition variable is woken up on the main thread when
> > > update() comes in, we need to wait on it off the main thread.
> > 
> > That doesn't really make sense. If the decrypt() function was non-blocking,
> > the condition variable wouldn't be necessary in the first place.  For
> > example, I don't see any place where the Condition is signaled if the
> > CDMInstanceClearKey is destroyed. So if the instance is destroyed, the
> > decrypt() thread will hang until the timeout is reached, and at that point,
> > will reference deleted memory.
> 
> Agree I missed a place where it needs signalling, the session decryption
> should probably be posted onto the main thread with a weakThis.
> 
> > On a separate note, rather than using a Lock + Condition, WTF has a
> > Semaphore object that should do both. You may want to look into using that
> > directly. But I'd recommend just removing the Condition entirely.
> 
> Thanks, will take a look.
> 
> > If this really needs to be something that must be called on a background
> > thread, please consider abstracting out "decode()" into something that is
> > just a free-function that doesn't depend on any ivars.
> 
> I will consider it, my gut feel is it will need the created session ivars at
> the very least to be able to find the one created with key ids requested
> from the client. It also seemed like in principle, a CDMInstance's primarly
> purpose was to manage keys and use them decrypt, so not associating
> decryption with the CDMInstance seems a bit odd, but I'll think more about
> it.
> 
> Thanks again for the review!
Comment 13 Charlie Turner 2018-12-07 04:23:08 PST
(In reply to Jer Noble from comment #12)
> (In reply to Charlie Turner from comment #11)
> > (In reply to Jer Noble from comment #10)> 
> > > So what's left is the 'timeout' property.  This has nothing to do with
> > > configuring the CDM, per se.
> > 
> > The root reason for the timeout is that we can be ready to decrypt in
> > GStreamer before update() has happened. 
> 
> Well, the root reason for the timeout is that you need to block, but can't
> trust that your object will ever be signaled.  We do something similar in
> Cocoa code, where we block forever in a background thread waiting for
> update():
> <https://trac.webkit.org/browser/webkit/trunk/Source/WebCore/platform/
> graphics/avfoundation/objc/SourceBufferPrivateAVFObjC.mm#L198>, but use a
> separate semaphore to signal that the operation has been aborted. So I
> totally get /why/ you're doing what you're doing here. I just don't think
> this is the right place to /do/ what you're doing.

Thanks for the pointer, it was instructive to see the FPS implementation in MSE.

> 
> > So there's a fundamental need for
> > synchronisation somewhere. I chose the CDMInstance rather than the client to
> > avoid needing more bus messages in our GStreamer implementation.
> 
> Yes, but this is supposed to be platform-agnostic code, and it's implemented
> in a GStreamer-specific way.
> 
> Instead, consider this flow:
> 
> 1) The GStreamer parser encounters an encrypted segment or sample.
> 2) The spec requires that parsing stops until a valid key has been
> delivered, so GStreamer blocks the background parser thread with a Semaphore
> 3) The client gets an 'encrypted' message, and initializes a CDM instance
> with some parameters, including keyID, encryption scheme, and possibly iv
> data.

This is where it breaks down for the "real" CDMs (not ClearKey) I need to support. With FPS, you seem to always be able to populate m_keys with with the result of parsing the SINF box in the init data, when it is first encountered. For Widevine, this is not the case. We normally detect an encrypted segment via the PSSH box. This box has two versions. Version 0 specifies no key IDs as present in the box (version 1 does). The EME spec says only that content creators SHOULD use version 1, but they do not in a lot of cases. What Widevine encoded content tends to do is stuff a proprietary formatted init data block in the Data field of the PSSH box, and you pass that blindly into the Widevine CDM. Then you need to wait for Widevine to call you back about the availability of a license server messages, from which point you can call an API to ask what keys the CDM has available (additionally, even if you did mirror the protobuf parsing and hack out the key IDs in the init data, Widevine has key generation, so it will generate more key IDs than are even available in the init data...). But this is all async, and can happen after WebCore thinks we can attemptToDecrypt(), since a CDM is created before update(). This was the reason why yet another rendezvous was adding in decode();.

So I can't maintain an m_keys member like you do in your SourceBufferPrivateAVFObjC class to support session querying in general. You have the option of issuing attemptToDecrypt() when you get setCDMInstance, I can not do this in general without further blocking on not only update(); but also the actual CDM issuing callbacks about available keys, after the JS update() has completed...


> 4) GStreamer receives a "attempt to resume playback if necessary" message
> (by way of the attemptToDecryptWithInstance() MediaPlayerPrivate call)

As above, we can't in general use the "attempt to resume playback if necessary" message to unblock our streaming thread. This is the reason for the further blocking in decode() in this patch. While for ClearKey it is pointless, for Widevine it is required due to the key availability issue. PlayReady can be even worse in that certain versions, that CDM actually give you no way to know what key IDs it has available! There should be a nicer way to handle this point, but perhaps we'll just have to continue with some extra blocking points in our downstream decryptors to workaround these CDM behaviours, ClearKey and FPS are much nicer to integrate into the current WebCore APIs that Widevine and Playready.

> None of this requires the CDMInstanceSessionClearKey object itself to be
> callable on a background thread.  The decode() method itself could be a
> static method on CDMInstanceClearKey or CDMInstanceSessionClearKey, and
> accept all the parameters needed to decode the vector of subsamples. There
> would be no Semaphore in the CDMInstance class itself, no timeout parameter,
> and no additional GStreamer bus messages.

My conclusion here is that I will have to maintain CDM specific workarounds in our downstream decryptors that can not assume the availability of a CDM instance implies they can continue decrypting. Rather, the derived CDMInstance's will have extra methods that answer whether the underlying CDM has keys available yet, independent of "attempt to decrypt", which doesn't take these cases into account.

For ClearKey, I can mimic more closely what you're doing with FPS, in terms of extracting and storing key IDs on init data encountered, and move the decode() logic back to something GStreamer/gcrypt specific, to avoid touching the CDMInstance base classes. For Widevine/PlayReady, we have problems described above getting the synchronisation to play nicely with the attemptToDecrypt() algorithm trigger points.
Comment 14 Charlie Turner 2018-12-13 04:09:38 PST
Hi Jer,

I have two proposals to sort out our issue about not having key IDs available in advance in the general case.

In our off-main-thread decryptor, we extract key IDs from the encoded samples and have the issue that lack of key IDs in init data / key rotation can force us to block midway through the content, so we need some kind of blocking during decode(), for example some YouTube protected content can change key IDs every 5 minutes during playback, so even if you could extract key IDs from the init data, you're still out of luck.

So the proposals from the POV of a platform implementation of "attempt to decrypt with instance",

Option 1. CDMInstance as a key manager and decoder.

gstreamerDecode(newSample) {
   nextKeyID = newSample->keyID;
   iv = newSample->iv;
   if (cached key ids does not contain the nextKeyID) {
      // waiting for key = true
      // block waiting for next attempt to decrypt with instance.
      m_session = CDMInstance::sessionForKeyID(newKeyID);
      if (!m_session) { error out; }
      // update cached key ids
      // waiting for key = false
   }
   session->decrypt(nextKeyID, iv, encryptedBuffer); // now non-blocking, but called off main-thread
}

sessionForKeyID could be reused in your implementation as well, since I notice you have a similar use-case there for looking up sessions with key IDs. I haven't figured out how decode() works in the FPS stuff however.

Option 2. CDMInstance has a key manager only, rather than also a decoder

Same flow as Option 1., however, once the next "attempt to decrypt with instance" comes through, we instead perform decode() in our decryptor, asking the CDMInstance just for the new key data,

gstreamerDecode(newSample) {
   nextKeyID = newSample->keyID;
   iv = newSample->iv;
   if (cached key ids does not contain the nextKeyID) {
      // waiting for key = true
      // block waiting for next attempt to decrypt with instance.
      RefPtr<SharedBuffer> keyData = cdmInstance->getKey(nextKeyID);
      if (!keyData) { error out; }
      // update cached key ids
   }
   // perform decryptor specific decode() using the key data
}

Where getKey would need to query the available sessions and ask a matching one, if any, for its key data.

Either way, the CDMInstance API is not adequate to support our CDMs in general, since we have no way of extract available key IDs in advance of decode() in all cases. Even if we could, key rotation is problem that would require this anyway. I'm not sure how FPS handles key rotation, or whether this is a case you need to deal with there.

Of the two, I feel option 2. is the least invasive. However conceptually it seems clearer for me that CDMInstance would be responsible for key management and decode.
Comment 15 Charlie Turner 2018-12-13 06:42:08 PST
Created attachment 357229 [details]
Patch

Move back to multi-decryptor architecture, and use CDM downcasting to avoid making any changes to CDMInstance. This avoids needing to touch the base class, and instead moves us to relying on the CDMInstance for key management, rather than key IDs extracted from init data that is not possible in general. Still think it would be a good idea for CDMInstance to have an availableKeys() method or something similar as explained above, but to unblock here, this alternative is now propsed
Comment 16 Jer Noble 2018-12-13 11:02:09 PST
(In reply to Charlie Turner from comment #14)
> Hi Jer,
> 
> I have two proposals to sort out our issue about not having key IDs
> available in advance in the general case.
> 
> In our off-main-thread decryptor, we extract key IDs from the encoded
> samples and have the issue that lack of key IDs in init data / key rotation
> can force us to block midway through the content, so we need some kind of
> blocking during decode(), for example some YouTube protected content can
> change key IDs every 5 minutes during playback, so even if you could extract
> key IDs from the init data, you're still out of luck.

Is it really the case that YouTube will rotate it's keyIDs outside of a new initialization segment? That seems weird, but maybe it's a Widevine vs. FPS thing.

> I haven't figured out how decode() works in the FPS stuff however.

The "decode()" step happens internal to the media engine. They only call up to the WebKit layer when they find they don't have the keys necessary to decrypt.

> Either way, the CDMInstance API is not adequate to support our CDMs in
> general, since we have no way of extract available key IDs in advance of
> decode() in all cases. Even if we could, key rotation is problem that would
> require this anyway. I'm not sure how FPS handles key rotation, or whether
> this is a case you need to deal with there.
> 
> Of the two, I feel option 2. is the least invasive. However conceptually it
> seems clearer for me that CDMInstance would be responsible for key
> management and decode.

I'm not sure I agree with your "however".  The intent of these classes was to be a communication mechanism to the actual CDM, not to become CDMs in and of themselves.  Maybe we named them incorrectly.

All that said, I don't think either of your two options are incredibly problematic. I would prefer that, rather than trying to call into either the CDMInstance or CDMInstanceSession to retrieve key data on the decode() thread, that instead the decode() thread blocks, dispatches to the main thread, and signals the decode() thread to continue once key data is fetched.  If not, we have to be very, very careful to make all the accessors in every CDMInstance subclass thread-safe.

(Alternatively, if we decide that the main thread is too crazy a place to do this work, we can decide as a project to move all CDM access to a well known thread, but we'd still want to block/dispatch to get new keys.)
Comment 17 Xabier Rodríguez Calvar 2018-12-14 07:50:45 PST
Comment on attachment 357229 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=357229&action=review

> Source/WebCore/platform/graphics/gstreamer/GStreamerCommon.h:106
> +        if (m_isValid && GST_IS_BUFFER(m_buffer))

I haven't checked the implications of this happening. From what I understand the buffer can be unreffed out while we hold a map on it, right?. Checking this is a "poor-man's" fail-safe. If the buffer was valid before and it is not now, it means that the memory is gone and even it will most probably will answer false, there's the small possibility of it crashing.

Something makes me worry even more and it is what happens if we access the memory with the data() accessors.

Considering this, I think we should ref the buffer inside the GstMappedBuffer object by holding a GRefPtr instead of a plain pointer. I'd also ASSERT(GST_IS_BUFFER(buffer)) in the constructor.

> Source/WebCore/platform/graphics/gstreamer/GStreamerCommon.h:114
> +    Ref<SharedBuffer> sharedBuffer() const { return SharedBuffer::create(data(), size()); }

Might it be interesting to cache the shared buffer and return a const &?

> Source/WebCore/platform/graphics/gstreamer/eme/WebKitClearKeyDecryptorGStreamer.cpp:121
> +static bool handleKeyResponse(WebKitMediaCommonEncryptionDecrypt* self, RefPtr<CDMInstance> cdmInstance)

could we use const & here?

> Source/WebCore/platform/graphics/gstreamer/eme/WebKitClearKeyDecryptorGStreamer.cpp:128
> +static bool findAndSetKey(WebKitMediaClearKeyDecryptPrivate* priv, const Ref<SharedBuffer> keyID)

const &?

> Source/WebCore/platform/graphics/gstreamer/eme/WebKitCommonEncryptionDecryptorGStreamer.cpp:324
> -            priv->cdmInstance = value ? reinterpret_cast<CDMInstance*>(g_value_get_pointer(value)) : nullptr;
> +            // Capture the CDMInstance into a separate variable to avoid missing a refcount.
> +            CDMInstance* instance = value ? reinterpret_cast<CDMInstance*>(g_value_get_pointer(value)) : nullptr;
> +            // ... And force a refcount bump using operator=.
> +            priv->cdmInstance = instance;

I don't think this change is needed. It's the same thing but with more stack variables and comments. I think we don't need them.

> Source/WebCore/platform/graphics/gstreamer/eme/WebKitCommonEncryptionDecryptorGStreamer.cpp:326
> +                GST_DEBUG_OBJECT(self, "received a new CDM instance %p, refcount %u", priv->cdmInstance.get(), priv->cdmInstance->refCount());

ref count was probably useful while debugging the ref counting issues we had discussed, but I don't think it is needed here.

> Source/WebCore/platform/graphics/gstreamer/eme/WebKitCommonEncryptionDecryptorGStreamer.cpp:360
> +        if (klass->handleKeyResponse(self, priv->cdmInstance)) {
>              GST_DEBUG_OBJECT(self, "key received");
>              priv->keyReceived = true;

Considering what we have discussed internally. This does not make sense anymore. Do you have plans to remove the keyReceived attribute and rework this part?
Comment 18 Charlie Turner 2018-12-18 02:32:33 PST
(In reply to Jer Noble from comment #16)
> (In reply to Charlie Turner from comment #14)
> > Hi Jer,
> > 
> > I have two proposals to sort out our issue about not having key IDs
> > available in advance in the general case.
> > 
> > In our off-main-thread decryptor, we extract key IDs from the encoded
> > samples and have the issue that lack of key IDs in init data / key rotation
> > can force us to block midway through the content, so we need some kind of
> > blocking during decode(), for example some YouTube protected content can
> > change key IDs every 5 minutes during playback, so even if you could extract
> > key IDs from the init data, you're still out of luck.
> 
> Is it really the case that YouTube will rotate it's keyIDs outside of a new
> initialization segment? That seems weird, but maybe it's a Widevine vs. FPS
> thing.

Widevine sends us a license-renewal (https://www.w3.org/TR/encrypted-media/#dom-mediakeymessagetype-license-renewal) every minute, YouTube protected content's licenses will expire every 5 minutes if it's not renewed. In theory, this could change the available keys.


> > Of the two, I feel option 2. is the least invasive. However conceptually it
> > seems clearer for me that CDMInstance would be responsible for key
> > management and decode.
> 
> I'm not sure I agree with your "however".  The intent of these classes was
> to be a communication mechanism to the actual CDM, not to become CDMs in and
> of themselves.  Maybe we named them incorrectly.

The Widevine CDM API has a decrypt() method, so part of communication mechanisms with this CDM is decryption. Same for Playready. I guess the APIs for FPS / (Playready/Widevine) are sufficiently different that the confusion came about as to what communication needs to be made available to clients of a CDMInstance/CDMInstanceSession.

Currently for Widevine, the CDMInstanceWidevine registers callbacks on the CDM handle for things like onMessage, onKeyStatusUpdate, etc, etc. We then have a decryptor element in GStreamer that needs to call the CDM handle's decrypt() method, after the CDMInstanceWidevine has done all the necessary setup / updating on this handle. It makes no sense to me that we can't use a method on CDMInstance to tell it to decrypt. That's not decrypting *in* the CDMInstance class, it's allowing communicating the "decrypt intent" to the underlying CDM.

> 
> All that said, I don't think either of your two options are incredibly
> problematic. I would prefer that, rather than trying to call into either the
> CDMInstance or CDMInstanceSession to retrieve key data on the decode()
> thread, that instead the decode() thread blocks, dispatches to the main
> thread, and signals the decode() thread to continue once key data is
> fetched.  If not, we have to be very, very careful to make all the accessors
> in every CDMInstance subclass thread-safe.

Dispatching to the main thread and waiting for key updates would block the main thread for too long, we can be waiting upto 6 seconds for key data to be returned in YouTube tests for example.

> 
> (Alternatively, if we decide that the main thread is too crazy a place to do
> this work, we can decide as a project to move all CDM access to a well known
> thread, but we'd still want to block/dispatch to get new keys.)
Comment 19 Jer Noble 2018-12-18 06:41:37 PST
(In reply to Charlie Turner from comment #18)
> (In reply to Jer Noble from comment #16)
> > > Of the two, I feel option 2. is the least invasive. However conceptually it
> > > seems clearer for me that CDMInstance would be responsible for key
> > > management and decode.
> > 
> > I'm not sure I agree with your "however".  The intent of these classes was
> > to be a communication mechanism to the actual CDM, not to become CDMs in and
> > of themselves.  Maybe we named them incorrectly.
> 
> The Widevine CDM API has a decrypt() method, so part of communication
> mechanisms with this CDM is decryption. Same for Playready. I guess the APIs
> for FPS / (Playready/Widevine) are sufficiently different that the confusion
> came about as to what communication needs to be made available to clients of
> a CDMInstance/CDMInstanceSession.
> 
> Currently for Widevine, the CDMInstanceWidevine registers callbacks on the
> CDM handle for things like onMessage, onKeyStatusUpdate, etc, etc. We then
> have a decryptor element in GStreamer that needs to call the CDM handle's
> decrypt() method, after the CDMInstanceWidevine has done all the necessary
> setup / updating on this handle. It makes no sense to me that we can't use a
> method on CDMInstance to tell it to decrypt. That's not decrypting *in* the
> CDMInstance class, it's allowing communicating the "decrypt intent" to the
> underlying CDM.

Gotcha. I was still thinking about the CDMClearKeyInstance case.

> Dispatching to the main thread and waiting for key updates would block the
> main thread for too long, we can be waiting upto 6 seconds for key data to
> be returned in YouTube tests for example.

Oh I wasn't suggesting that the work done on the main thread be synchronous!  I was imagining something like (pseudocode):

void DecoderThread::needKey(auto keyID)
{
    WTF::Semaphore sem;
    WTF::CompletionHandler<void(auto keyData)> callback = [&] (auto keyData) {
        m_keyData = keyData;
        sem.signa();
    };

    callOnMainThread([callback = WTFMove(callback)] () mutable {
        m_cdmInstance.getKeyForKeyID(keyID, WTFMove(callback));
    });
    sem.wait();

    if (hasValidKey())
        decode();
}

void CDMInstanceWidevine::getKeyForKeyID(auto keyID, auto callback)
{
    m_keyNeeded = keyID;
    m_getKeyCallback = WTFMove(callback);
    startAskingForKey(keyID);
}

void CDMInstanceWidevine::keyWasAdded(auto keyID)
{
    if (keyID == m_keyNeeded && m_getKeyCallback)
        m_getKeyCallback(keyForKeyID(keyID));
}
Comment 20 Jer Noble 2018-12-18 06:43:42 PST
BTW, please don't let this side conversation interrupt reviewing this patch; it looks like this new patch is mostly confined to GStreamer code, so all my previous nits have been thoroughly picked.
Comment 21 Xabier Rodríguez Calvar 2018-12-18 08:07:45 PST
(In reply to Jer Noble from comment #19)
> > Dispatching to the main thread and waiting for key updates would block the
> > main thread for too long, we can be waiting upto 6 seconds for key data to
> > be returned in YouTube tests for example.
> 
> Oh I wasn't suggesting that the work done on the main thread be synchronous!
> I was imagining something like (pseudocode):
> 
> void DecoderThread::needKey(auto keyID)
> {
>     WTF::Semaphore sem;
>     WTF::CompletionHandler<void(auto keyData)> callback = [&] (auto keyData)
> {
>         m_keyData = keyData;
>         sem.signa();
>     };
> 
>     callOnMainThread([callback = WTFMove(callback)] () mutable {
>         m_cdmInstance.getKeyForKeyID(keyID, WTFMove(callback));
>     });
>     sem.wait();
> 
>     if (hasValidKey())
>         decode();
> }
> 
> void CDMInstanceWidevine::getKeyForKeyID(auto keyID, auto callback)
> {
>     m_keyNeeded = keyID;
>     m_getKeyCallback = WTFMove(callback);
>     startAskingForKey(keyID);
> }
> 
> void CDMInstanceWidevine::keyWasAdded(auto keyID)
> {
>     if (keyID == m_keyNeeded && m_getKeyCallback)
>         m_getKeyCallback(keyForKeyID(keyID));
> }

Yes, we understood that, but in this case it has the same result, which is that the decryptor streaming thread always needs to call things on the main thread FOR EVERY FRAME, which is where the problem is. We can't do that for every frame, it's too much overhead. Actually we though of caching the key depending on the key id so that we don't have to defer to the main thread for every frame but then we have the problem of key renewal and the spec says very clearly that whenever we are about to decrypt, we need to check the key's validity which makes caching not spec compliant and of course very bug prone.

The only working solution is making the CDM instance thread safe regarding key retrieval. This would be enough for us.

And considering that we are adding key retrieval, having the decryption inside the CDM instance wouldn't be such an "offense", specially because the spec says at:

https://www.w3.org/TR/encrypted-media/#attempt-to-decrypt

[...]
3.2 Let cdm be the CDM instance represented by media keys's cdm instance value.
[...]
4.5.1 Use the cdm to decrypt block using block key.
[...]

I think Charlie's impl gets everybody closer to the spec. I think we should agree on going this way and try to find a way that does not break you people.
Comment 22 Jer Noble 2018-12-18 10:30:28 PST
(In reply to Xabier Rodríguez Calvar from comment #21)
> And considering that we are adding key retrieval, having the decryption
> inside the CDM instance wouldn't be such an "offense", specially because the
> spec says at:
> 
> https://www.w3.org/TR/encrypted-media/#attempt-to-decrypt
> 
> [...]
> 3.2 Let cdm be the CDM instance represented by media keys's cdm instance
> value.

Yes, this is why I said that the WebCore class "CDMInstance" may be mis-named, since it implies that the class is the "CDM instance" the spec refers to, when it's really just the platform-specific part of the MediaKeysInstance object which communicates with the _actual_ CDM.

Additionally, the point of the MediaKeys and CDMInstance classes is for the page to communicate with the (actual) CDM; it's not necessary for the media player to use the same interface to communicate with the CDM.

The only reason I'm pushing back on making broad changes to the CDMInstance.h abstract base class itself is that I don't want to encode platform-specific behavior in the interfaces that all platforms have to implement.  I'm totally fine with GStreamer-specific interfaces between GStreamer platform objects.  And since Charlie's current patch is limited to only GStreamer-specific code, there's not much left to argue about. :)
Comment 23 Charlie Turner 2018-12-18 13:11:47 PST
(In reply to Jer Noble from comment #19)
> (In reply to Charlie Turner from comment #18)
> > (In reply to Jer Noble from comment #16)
> > > > Of the two, I feel option 2. is the least invasive. However conceptually it
> > > > seems clearer for me that CDMInstance would be responsible for key
> > > > management and decode.
> > > 
> > > I'm not sure I agree with your "however".  The intent of these classes was
> > > to be a communication mechanism to the actual CDM, not to become CDMs in and
> > > of themselves.  Maybe we named them incorrectly.
> > 
> > The Widevine CDM API has a decrypt() method, so part of communication
> > mechanisms with this CDM is decryption. Same for Playready. I guess the APIs
> > for FPS / (Playready/Widevine) are sufficiently different that the confusion
> > came about as to what communication needs to be made available to clients of
> > a CDMInstance/CDMInstanceSession.
> > 
> > Currently for Widevine, the CDMInstanceWidevine registers callbacks on the
> > CDM handle for things like onMessage, onKeyStatusUpdate, etc, etc. We then
> > have a decryptor element in GStreamer that needs to call the CDM handle's
> > decrypt() method, after the CDMInstanceWidevine has done all the necessary
> > setup / updating on this handle. It makes no sense to me that we can't use a
> > method on CDMInstance to tell it to decrypt. That's not decrypting *in* the
> > CDMInstance class, it's allowing communicating the "decrypt intent" to the
> > underlying CDM.
> 
> Gotcha. I was still thinking about the CDMClearKeyInstance case.

Yeah, it's awkward discussing designs for things we can't upload, I'm sorry it's like this :)

> > Dispatching to the main thread and waiting for key updates would block the
> > main thread for too long, we can be waiting upto 6 seconds for key data to
> > be returned in YouTube tests for example.
> 
> Oh I wasn't suggesting that the work done on the main thread be synchronous!
> I was imagining something like (pseudocode):

Ah, gotcha. Your next comment makes things much clearer (in terms of the original push-back to the changes on CDMInstance).

> Additionally, the point of the MediaKeys and CDMInstance classes is for the page to communicate with the (actual) CDM; it's not necessary for the media player to use the same interface to communicate with the CDM.

I understand now why we had a mismatch on wanting decode() to be part of the CDMInstance API, of course the JS page has no interest in that. However, there a few curiosities now,

 - Already CDMInstance implementors like FairPlay have a number of extra methods to allow the MediaPlayer to tell the CDM things the page doesn't need to, including methods that appear to get some sort of FPS platform handle (like     AVContentKeySession* contentKeySession())
 - Why MediaPlayer::cdmInstanceAttached(CDMInstance&) is defined as so. It seemed like that was there to give the MediaPlayer a handle to talk with the (actual) CDM.
 - Given the above, perhaps some class ProxyCDM (actual CDM) should be available from the CDMInstance, it might be worthwhile to define a common interface to talk with real CDMs, since that's what this patch is about.

At the moment, the way this patch is going, we'd need on our CDMInstanceWidevine some sort of getKeyForKeyID() as you outlined and a getWidevineHandle() with which to decrypt. If you're not on-board with that, then we can go the route of having instance specific APIs as we've discussed, and not touch the base class. This may be the best way forward, since it's not clear we can define a common interface that is useful to both FPS and (Widevine/Playready). We definitely have a naming problem at least! :)
Comment 24 Charlie Turner 2018-12-18 13:13:19 PST
(In reply to Charlie Turner from comment #23)
> If you're not on-board with that,

By this I mean, not on-board with a ProxyCDM idea.
Comment 25 Jer Noble 2018-12-18 15:09:28 PST
(In reply to Charlie Turner from comment #23)
> I understand now why we had a mismatch on wanting decode() to be part of the
> CDMInstance API, of course the JS page has no interest in that. However,
> there a few curiosities now,
> 
>  - Already CDMInstance implementors like FairPlay have a number of extra
> methods to allow the MediaPlayer to tell the CDM things the page doesn't
> need to, including methods that appear to get some sort of FPS platform
> handle (like     AVContentKeySession* contentKeySession())
>  - Why MediaPlayer::cdmInstanceAttached(CDMInstance&) is defined as so. It
> seemed like that was there to give the MediaPlayer a handle to talk with the
> (actual) CDM.
>  - Given the above, perhaps some class ProxyCDM (actual CDM) should be
> available from the CDMInstance, it might be worthwhile to define a common
> interface to talk with real CDMs, since that's what this patch is about.

That sounds like a really promising refactoring! I'm totally on board.
Comment 26 Xabier Rodríguez Calvar 2018-12-19 03:47:13 PST
Comment on attachment 357229 [details]
Patch

There are some changes that you can do in follow ups, though some others like the cdminstance ref counting and maybe the consts that IMHO should be done here.
Comment 27 Charlie Turner 2019-01-18 07:18:46 PST
Created attachment 359479 [details]
Patch for landing
Comment 28 Charlie Turner 2019-01-18 07:20:26 PST
Created attachment 359480 [details]
Patch for landing
Comment 29 WebKit Commit Bot 2019-01-18 07:59:27 PST
Comment on attachment 359480 [details]
Patch for landing

Clearing flags on attachment: 359480

Committed r240148: <https://trac.webkit.org/changeset/240148>
Comment 30 WebKit Commit Bot 2019-01-18 07:59:30 PST
All reviewed patches have been landed.  Closing bug.