Bug 216696 - Add support for HTMLMediaElement.setSinkId
Summary: Add support for HTMLMediaElement.setSinkId
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebRTC (show other bugs)
Version: WebKit Local Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: youenn fablet
URL:
Keywords: InRadar
Depends on: 216693
Blocks:
  Show dependency treegraph
 
Reported: 2020-09-18 08:10 PDT by youenn fablet
Modified: 2021-08-24 10:20 PDT (History)
24 users (show)

See Also:


Attachments
Patch (35.03 KB, patch)
2020-09-18 08:28 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (44.66 KB, patch)
2020-09-18 10:57 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (49.80 KB, patch)
2020-09-20 09:29 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (46.17 KB, patch)
2020-09-20 10:29 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (48.83 KB, patch)
2020-09-21 00:37 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (52.59 KB, patch)
2020-09-21 06:00 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (58.92 KB, patch)
2020-09-21 12:07 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch for landing (57.42 KB, patch)
2020-09-23 03:09 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch for landing (57.38 KB, patch)
2020-09-23 04:17 PDT, youenn fablet
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description youenn fablet 2020-09-18 08:10:58 PDT
Add support for HTMLMediaElement.setSinkId
Comment 1 youenn fablet 2020-09-18 08:28:03 PDT
Created attachment 409130 [details]
Patch
Comment 2 Eric Carlson 2020-09-18 10:47:37 PDT
Comment on attachment 409130 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=409130&action=review

> Source/WebCore/ChangeLog:8
> +        Implement setSinkId and sinkId as per https://w3c.github.io/mediacapture-output/#htmlmediaelement-extensions,

s/,/./

> LayoutTests/imported/w3c/ChangeLog:15
> +        * web-platform-tests/audio-output/META.yml: Added.
> +        * web-platform-tests/audio-output/idlharness.https.window-expected.txt: Added.
> +        * web-platform-tests/audio-output/idlharness.https.window.html: Added.
> +        * web-platform-tests/audio-output/idlharness.https.window.js: Added.
> +        * web-platform-tests/audio-output/setSinkId.https-expected.txt: Added.
> +        * web-platform-tests/audio-output/setSinkId.https.html: Added.
> +        * web-platform-tests/audio-output/w3c-import.log: Added.

It looks like you forgot to add these.
Comment 3 youenn fablet 2020-09-18 10:57:25 PDT
Created attachment 409146 [details]
Patch
Comment 4 youenn fablet 2020-09-18 10:57:57 PDT
> It looks like you forgot to add these.

The danger of git reset HEAD~1...
Here they are!
Comment 5 youenn fablet 2020-09-20 09:29:26 PDT
Created attachment 409234 [details]
Patch
Comment 6 youenn fablet 2020-09-20 10:29:30 PDT
Created attachment 409237 [details]
Patch
Comment 7 Sam Weinig 2020-09-20 13:54:32 PDT
Comment on attachment 409237 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=409237&action=review

> Source/WebCore/html/HTMLMediaElement.cpp:2863
> +    if (!document().processingUserGestureForMedia() && document().topDocument().settings().speakerSelectionRequiresUserGesture()) {

What is the goal of accessing settings from topDocument here?

> Source/WebCore/html/HTMLMediaElement.idl:92
> +    [Conditional=MEDIA_STREAM, EnabledBySetting=ExposeSpeakers&PerElementSpeakerSelection] readonly attribute DOMString audioOutputDevice;
> +    [Conditional=MEDIA_STREAM, EnabledBySetting=ExposeSpeakers&PerElementSpeakerSelection, ImplementedAs=audioOutputHashedDeviceId] readonly attribute DOMString sinkId;
> +    [Conditional=MEDIA_STREAM, EnabledBySetting=ExposeSpeakers&PerElementSpeakerSelection] Promise<undefined> setAudioOutputDevice(DOMString deviceId);
> +    [Conditional=MEDIA_STREAM, EnabledBySetting=ExposeSpeakers&PerElementSpeakerSelection, ImplementedAs=setAudioOutputDevice] Promise<undefined> setSinkId(DOMString deviceId);

From the spec you linked in the ChangeLog, I only saw sinkId and setSinkId. Where do audioOutputDevice and setAudioOutputDevice come from? Regardless, these can and should now be defined from their own IDL file using a partial interface (partial interfaces no longer require a separate implementation class with static functions, so the implementation can stay in HTMLMediaElement.h/cpp if that is the right place for it to live.

> Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.mm:956
> +#if PLATFORM(MAC)

Rather than using PLATFORM(MAC) here, I have been trying to encourage people to use more fine grained defines. Perhaps ENABLE(MEDIA_STREAM_OUTPUT_DEVICE_ID) or something like that?

> Source/WebKit/Shared/WebPreferencesExperimental.yaml:146
> +  category: experimental

it is no longer needed (or desired) to set the category as it it implied by the file.

> Source/WebKit/Shared/WebPreferencesExperimental.yaml:147
> +  condition: ENABLE(WEB_RTC)

Is this the right condition? In the rest of the patch, you have mostly been using ENABLE(MEDIA_STREAM).
Comment 8 youenn fablet 2020-09-21 00:37:49 PDT
Created attachment 409263 [details]
Patch
Comment 9 youenn fablet 2020-09-21 00:40:45 PDT
(In reply to Sam Weinig from comment #7)
> Comment on attachment 409237 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=409237&action=review
> 
> > Source/WebCore/html/HTMLMediaElement.cpp:2863
> > +    if (!document().processingUserGestureForMedia() && document().topDocument().settings().speakerSelectionRequiresUserGesture()) {
> 
> What is the goal of accessing settings from topDocument here?

This is mirroring what is done for audio playing restriction.
I changed it to use the current document for now.

> > Source/WebCore/html/HTMLMediaElement.idl:92
> > +    [Conditional=MEDIA_STREAM, EnabledBySetting=ExposeSpeakers&PerElementSpeakerSelection] readonly attribute DOMString audioOutputDevice;
> > +    [Conditional=MEDIA_STREAM, EnabledBySetting=ExposeSpeakers&PerElementSpeakerSelection, ImplementedAs=audioOutputHashedDeviceId] readonly attribute DOMString sinkId;
> > +    [Conditional=MEDIA_STREAM, EnabledBySetting=ExposeSpeakers&PerElementSpeakerSelection] Promise<undefined> setAudioOutputDevice(DOMString deviceId);
> > +    [Conditional=MEDIA_STREAM, EnabledBySetting=ExposeSpeakers&PerElementSpeakerSelection, ImplementedAs=setAudioOutputDevice] Promise<undefined> setSinkId(DOMString deviceId);
> 
> From the spec you linked in the ChangeLog, I only saw sinkId and setSinkId.
> Where do audioOutputDevice and setAudioOutputDevice come from? Regardless,
> these can and should now be defined from their own IDL file using a partial
> interface (partial interfaces no longer require a separate implementation
> class with static functions, so the implementation can stay in
> HTMLMediaElement.h/cpp if that is the right place for it to live.

WG agreed to introduce navigator.mediaDevices.setAudioOutputDevice to set the audio output for a whole page.
I filed an issue to rename SetSinkId to setAudioOutputDevice for consistency.
I removed it for now and will introduce it later.

> > Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.mm:956
> > +#if PLATFORM(MAC)
> 
> Rather than using PLATFORM(MAC) here, I have been trying to encourage people
> to use more fine grained defines. Perhaps
> ENABLE(MEDIA_STREAM_OUTPUT_DEVICE_ID) or something like that?

OK, let's use HAVE() instead.
Comment 10 youenn fablet 2020-09-21 06:00:11 PDT
Created attachment 409274 [details]
Patch
Comment 11 Sam Weinig 2020-09-21 08:27:16 PDT
Comment on attachment 409274 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=409274&action=review

> Source/WTF/ChangeLog:8
> +
> +        * wtf/PlatformHave.h:

Please fill in the changelog.

> Source/WebKit/ChangeLog:14
> +        Add internal flag to enable/disable user gesture requirement for setting audio output.
> +
> +        * Shared/WebPreferencesExperimental.yaml:
> +        * UIProcess/API/C/WKPreferences.cpp:
> +        (WKPreferencesSetSpeakerSelectionRequiresUserGesture):
> +        (WKPreferencesGetSpeakerSelectionRequiresUserGesture):
> +        * UIProcess/API/C/WKPreferencesRefPrivate.h:

This ChangeLog doesn't look accurate. You changed both WebPreferencesInternal.yaml and WebPreferencesExperimental.yaml.

> Source/WebCore/html/HTMLMediaElement.idl:90
> +    [Conditional=MEDIA_STREAM, EnabledBySetting=ExposeSpeakers&PerElementSpeakerSelection, ImplementedAs=audioOutputHashedDeviceId] readonly attribute DOMString sinkId;
> +    [Conditional=MEDIA_STREAM, EnabledBySetting=ExposeSpeakers&PerElementSpeakerSelection, ImplementedAs=setAudioOutputDevice] Promise<undefined> setSinkId(DOMString deviceId);

Please move these out into a partial interface in their own IDL file. Also, why are they implemented under another name? Why not just call them sinkId/setSinkId in the implementation?

> Source/WebKit/UIProcess/API/C/WKPreferencesRefPrivate.h:172
> +// Defaults to false.

This currently seems to default to true.

> Source/WebKit/UIProcess/API/C/WKPreferencesRefPrivate.h:174
> +WK_EXPORT void WKPreferencesSetSpeakerSelectionRequiresUserGesture(WKPreferencesRef preferencesRef, bool flag);
> +WK_EXPORT bool WKPreferencesGetSpeakerSelectionRequiresUserGesture(WKPreferencesRef preferencesRef);

Are these needed beyond use by WebKitTestRunner? If it's only going to be used by WebKitTestRunner, please use WKPreferencesSetInternalDebugFeatureForKey(preferences, false, WKStringCreateWithUTF8CString("SpeakerSelectionRequiresUserGesture"); instead of adding this new SPI (you can find others doing this around line 845 of TestController.cpp in WebKitTestRunner.

> LayoutTests/ChangeLog:8
> +        Skip new tests on WK1.

Out of interest, what is WebKit2 specific about these?
Comment 12 Sam Weinig 2020-09-21 08:28:20 PDT
Comment on attachment 409274 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=409274&action=review

> Source/WebCore/ChangeLog:9
> +        Introduce a setting to expose these methods and to enable/disable user gesture requirement.

How can we test the user gesture requirement? It seems like that's something we wouldn't want to break.
Comment 13 youenn fablet 2020-09-21 10:44:33 PDT
> > Source/WebCore/html/HTMLMediaElement.idl:90
> > +    [Conditional=MEDIA_STREAM, EnabledBySetting=ExposeSpeakers&PerElementSpeakerSelection, ImplementedAs=audioOutputHashedDeviceId] readonly attribute DOMString sinkId;
> > +    [Conditional=MEDIA_STREAM, EnabledBySetting=ExposeSpeakers&PerElementSpeakerSelection, ImplementedAs=setAudioOutputDevice] Promise<undefined> setSinkId(DOMString deviceId);
> 
> Please move these out into a partial interface in their own IDL file. Also,
> why are they implemented under another name? Why not just call them
> sinkId/setSinkId in the implementation?

sinkId/setSinkId is not very meaningful and we want to introduce navigator.mediaDevices.setAudioOutput which is the same but for a whole page.
The idea would be to deprecate setSinkId in favour of setAudioOutput.

> > LayoutTests/ChangeLog:8
> > +        Skip new tests on WK1.
> 
> Out of interest, what is WebKit2 specific about these?

navigator.mediaDevices
Comment 14 youenn fablet 2020-09-21 12:07:31 PDT
Created attachment 409293 [details]
Patch
Comment 15 Sam Weinig 2020-09-22 09:26:11 PDT
Comment on attachment 409293 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=409293&action=review

> Source/WebCore/platform/mediastream/mac/AudioMediaStreamTrackRendererUnit.cpp:42
> +#if PLATFORM(COCOA)
> +#include "CoreAudioCaptureDevice.h"
> +#include "CoreAudioCaptureDeviceManager.h"
> +#endif

Since this is a Mac only file, I don't think the platform check is needed here.

> Source/WebCore/platform/mediastream/mac/AudioMediaStreamTrackRendererUnit.cpp:61
> +#if PLATFORM(MAC)

Given the path of the file, I assume it is Mac only. Is this check really needed?

> Source/WebCore/platform/mediastream/mac/AudioMediaStreamTrackRendererUnit.cpp:196
> +#if PLATFORM(MAC)

Same here.

> Source/WebCore/platform/mediastream/mac/AudioMediaStreamTrackRendererUnit.h:73
> +#if PLATFORM(MAC)

And here.

> Source/WebCore/testing/Internals.cpp:570
> +    page.settings().setSpeakerSelectionRequiresUserGesture(false);

It seems like this is duplicated here and in TestController.cpp.

> Source/WebKit/UIProcess/API/C/WKPreferencesRefPrivate.h:174
> +// Defaults to true.
> +WK_EXPORT void WKPreferencesSetSpeakerSelectionRequiresUserGesture(WKPreferencesRef preferencesRef, bool flag);
> +WK_EXPORT bool WKPreferencesGetSpeakerSelectionRequiresUserGesture(WKPreferencesRef preferencesRef);

Why is new SPI needed here?
Comment 16 youenn fablet 2020-09-22 10:12:24 PDT
(In reply to Sam Weinig from comment #15)
> Comment on attachment 409293 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=409293&action=review
> 
> > Source/WebCore/platform/mediastream/mac/AudioMediaStreamTrackRendererUnit.cpp:42
> > +#if PLATFORM(COCOA)
> > +#include "CoreAudioCaptureDevice.h"
> > +#include "CoreAudioCaptureDeviceManager.h"
> > +#endif
> 
> Since this is a Mac only file, I don't think the platform check is needed
> here.

Right.

> > Source/WebCore/platform/mediastream/mac/AudioMediaStreamTrackRendererUnit.cpp:61
> > +#if PLATFORM(MAC)
> 
> Given the path of the file, I assume it is Mac only. Is this check really
> needed?

It is needed.
Source/WebCore/platform/mediastream/mac should really be renamed Source/WebCore/platform/mediastream/cocoa.

> > Source/WebCore/testing/Internals.cpp:570
> > +    page.settings().setSpeakerSelectionRequiresUserGesture(false);
> 
> It seems like this is duplicated here and in TestController.cpp.

If Internals sets it to true, it might stick for longer than expected.
 
> > Source/WebKit/UIProcess/API/C/WKPreferencesRefPrivate.h:174
> > +// Defaults to true.
> > +WK_EXPORT void WKPreferencesSetSpeakerSelectionRequiresUserGesture(WKPreferencesRef preferencesRef, bool flag);
> > +WK_EXPORT bool WKPreferencesGetSpeakerSelectionRequiresUserGesture(WKPreferencesRef preferencesRef);
> 
> Why is new SPI needed here?

Oh right, it should be removed now that TestController does not use it anymore.
Comment 17 Sam Weinig 2020-09-22 10:49:23 PDT
(In reply to youenn fablet from comment #16)
> > > Source/WebCore/platform/mediastream/mac/AudioMediaStreamTrackRendererUnit.cpp:61
> > > +#if PLATFORM(MAC)
> > 
> > Given the path of the file, I assume it is Mac only. Is this check really
> > needed?
> 
> It is needed.
> Source/WebCore/platform/mediastream/mac should really be renamed
> Source/WebCore/platform/mediastream/cocoa.

I see. Very confusing.

> 
> > > Source/WebCore/testing/Internals.cpp:570
> > > +    page.settings().setSpeakerSelectionRequiresUserGesture(false);
> > 
> > It seems like this is duplicated here and in TestController.cpp.
> 
> If Internals sets it to true, it might stick for longer than expected.

Really? TestController calls this in a function called resetPreferencesToConsistentValues() which seems to be called after every invocation.
Comment 18 Eric Carlson 2020-09-22 11:08:22 PDT
Comment on attachment 409293 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=409293&action=review

> Source/WebCore/html/HTMLMediaElement.h:615
> +    String audioOutputDeviceId() const final;
> +    String audioOutputDeviceIdOverride() const final { return m_audioOutputPersistentDeviceId; }

Why are these `final`?

Why is one implemented in here and one in the .cpp when they both just return a member variable?
Comment 19 youenn fablet 2020-09-23 03:05:57 PDT
(In reply to Eric Carlson from comment #18)
> Comment on attachment 409293 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=409293&action=review
> 
> > Source/WebCore/html/HTMLMediaElement.h:615
> > +    String audioOutputDeviceId() const final;
> > +    String audioOutputDeviceIdOverride() const final { return m_audioOutputPersistentDeviceId; }
> 
> Why are these `final`?

These are MediaPlayerClient API.

> Why is one implemented in here and one in the .cpp when they both just
> return a member variable?

audioOutputDeviceId() will later on either return the override or the page audioOutputDeviceId(), hence why it was in cpp in the main patch.
I moved all of them to header file for now.
Comment 20 youenn fablet 2020-09-23 03:06:50 PDT
> > If Internals sets it to true, it might stick for longer than expected.
> 
> Really? TestController calls this in a function called
> resetPreferencesToConsistentValues() which seems to be called after every
> invocation.

Did some test and you are right it does not seem necessary so I'll remove it.
Comment 21 youenn fablet 2020-09-23 03:09:18 PDT
Created attachment 409457 [details]
Patch for landing
Comment 22 youenn fablet 2020-09-23 04:17:54 PDT
Created attachment 409459 [details]
Patch for landing
Comment 23 EWS 2020-09-23 07:04:24 PDT
Committed r267472: <https://trac.webkit.org/changeset/267472>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 409459 [details].
Comment 24 Radar WebKit Bug Importer 2020-09-23 07:05:22 PDT
<rdar://problem/69433680>
Comment 25 Philippe Normand 2020-11-19 04:32:52 PST
Sorry, only the validHandler needs to be immutable for us. If I make it mutable here's the error:

../../Source/WebKit/UIProcess/glib/UserMediaPermissionRequestManagerProxyGLib.cpp:70:17: error: no matching function for call to object of type 'const (lambda at ../../Source/WebKit/UIProcess/glib/UserMediaPermissionRequestManagerProxyGLib.cpp:54:29)'
                validHandler(std::move<WTF::CheckMoveParameter>(audioDevices), std::move<WTF::CheckMoveParameter>(videoDevices), std::move<WTF::CheckMoveParameter>(*deviceIdentifierHashSalt));
                ^~~~~~~~~~~~
../../Source/WebKit/UIProcess/glib/UserMediaPermissionRequestManagerProxyGLib.cpp:54:29: note: candidate function not viable: 'this' argument has type 'const (lambda at ../../Source/WebKit/UIProcess/glib/UserMediaPermissionRequestManagerProxyGLib.cpp:54:29)', but method is not marked const
        auto validHandler = [this, request](Vector<CaptureDevice>&& audioDevices, Vector<CaptureDevice>&& videoDevices, String&& deviceIdentifierHashSalt) mutable {
                            ^
1 error generated.
Compiler killed by signal 1
Comment 26 Philippe Normand 2020-11-19 04:33:10 PST
oops wrong bug, sorry
Comment 27 haimomesi 2021-08-23 11:19:41 PDT
On Safari Technology Release 115 it says:

Added support for HTMLMediaElement.setSinkId

Changeset 267472 in webkit shows all the changes made but I'm using Release 127 (Safari 15.0, WebKit 16612.1.18.11.3) and don't have the setSinkId functionality at all.

Moreover, when using WKWebView I also don't have any "audiooutput" devices when enumerating devices.

The goal is to use getUserMedia and toggle the outputs from earpiece to speaker on iOS.

Any help will be much appreciated
Comment 28 Eric Carlson 2021-08-23 13:02:38 PDT
(In reply to haimomesi from comment #27)
> On Safari Technology Release 115 it says:
> 
> Added support for HTMLMediaElement.setSinkId
> 
> Changeset 267472 in webkit shows all the changes made but I'm using Release
> 127 (Safari 15.0, WebKit 16612.1.18.11.3) and don't have the setSinkId
> functionality at all.
> 
> Moreover, when using WKWebView I also don't have any "audiooutput" devices
> when enumerating devices.
> 
> The goal is to use getUserMedia and toggle the outputs from earpiece to
> speaker on iOS.
> 
> Any help will be much appreciated

Exposing audio output devices and setSinkId are still experimental features and both are still disabled by default. 

You should be able to enable these features for testing with Safari's Develop menu:

    Develop > Experimental Features > Allow per media element speaker device selection
    Develop > Experimental Features > Allow speaker device selection
Comment 29 haimomesi 2021-08-24 02:41:55 PDT
Can we enabled this with WKWebView in some manner?
Or maybe run a js script that will allow this behavior?

Thank you for the quick reply
Comment 30 Eric Carlson 2021-08-24 10:20:23 PDT
(In reply to haimomesi from comment #29)
> Can we enabled this with WKWebView in some manner?
> Or maybe run a js script that will allow this behavior?
> 

No, WKWebView doesn't have API to enable experimental features.