WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 219371
Implement audio capture for SpeechRecognition on iOS
https://bugs.webkit.org/show_bug.cgi?id=219371
Summary
Implement audio capture for SpeechRecognition on iOS
Sihui Liu
Reported
2020-11-30 14:42:44 PST
...
Attachments
Patch
(75.37 KB, patch)
2020-11-30 14:46 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(76.00 KB, patch)
2020-11-30 18:04 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(76.69 KB, patch)
2020-11-30 18:22 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(77.14 KB, patch)
2020-11-30 20:29 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(77.18 KB, patch)
2020-11-30 20:43 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(82.26 KB, patch)
2020-12-01 08:57 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(81.24 KB, patch)
2020-12-01 09:06 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(81.27 KB, patch)
2020-12-01 09:17 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(81.53 KB, patch)
2020-12-01 10:00 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(84.39 KB, patch)
2020-12-01 10:39 PST
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(85.96 KB, patch)
2020-12-01 14:28 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(85.96 KB, patch)
2020-12-01 14:33 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(84.72 KB, patch)
2020-12-01 23:23 PST
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(85.76 KB, patch)
2020-12-01 23:35 PST
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(146.59 KB, patch)
2020-12-03 16:12 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(147.32 KB, patch)
2020-12-03 17:53 PST
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(147.08 KB, patch)
2020-12-03 18:03 PST
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(148.80 KB, patch)
2020-12-03 21:55 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(150.29 KB, patch)
2020-12-04 02:25 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(146.29 KB, patch)
2020-12-04 10:05 PST
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(146.29 KB, patch)
2020-12-04 10:14 PST
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(146.30 KB, patch)
2020-12-04 10:36 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Fix MediaStream tests
(123.04 KB, patch)
2020-12-04 12:07 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(82.79 KB, patch)
2020-12-04 17:04 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(82.76 KB, patch)
2020-12-04 20:39 PST
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(84.05 KB, patch)
2020-12-04 21:51 PST
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(82.90 KB, patch)
2020-12-04 22:19 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(89.15 KB, patch)
2020-12-07 14:09 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(89.15 KB, patch)
2020-12-07 14:10 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(88.63 KB, patch)
2020-12-08 17:21 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch for landing
(88.59 KB, patch)
2020-12-08 22:47 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch for landing
(88.46 KB, patch)
2020-12-08 23:16 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch for landing
(88.40 KB, patch)
2020-12-08 23:55 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Show Obsolete
(32)
View All
Add attachment
proposed patch, testcase, etc.
Radar WebKit Bug Importer
Comment 1
2020-11-30 14:42:54 PST
<
rdar://problem/71824049
>
Sihui Liu
Comment 2
2020-11-30 14:46:27 PST
Comment hidden (obsolete)
Created
attachment 415061
[details]
Patch
Sihui Liu
Comment 3
2020-11-30 18:04:15 PST
Comment hidden (obsolete)
Created
attachment 415086
[details]
Patch
Sihui Liu
Comment 4
2020-11-30 18:22:42 PST
Comment hidden (obsolete)
Created
attachment 415088
[details]
Patch
Sihui Liu
Comment 5
2020-11-30 20:29:32 PST
Comment hidden (obsolete)
Created
attachment 415092
[details]
Patch
Sihui Liu
Comment 6
2020-11-30 20:43:42 PST
Comment hidden (obsolete)
Created
attachment 415093
[details]
Patch
Sihui Liu
Comment 7
2020-12-01 08:57:18 PST
Comment hidden (obsolete)
Created
attachment 415138
[details]
Patch
Sihui Liu
Comment 8
2020-12-01 09:06:35 PST
Comment hidden (obsolete)
Created
attachment 415139
[details]
Patch
Sihui Liu
Comment 9
2020-12-01 09:17:05 PST
Comment hidden (obsolete)
Created
attachment 415141
[details]
Patch
Sihui Liu
Comment 10
2020-12-01 10:00:55 PST
Comment hidden (obsolete)
Created
attachment 415144
[details]
Patch
Sihui Liu
Comment 11
2020-12-01 10:39:47 PST
Comment hidden (obsolete)
Created
attachment 415147
[details]
Patch
Sihui Liu
Comment 12
2020-12-01 14:28:38 PST
Comment hidden (obsolete)
Created
attachment 415164
[details]
Patch
Sihui Liu
Comment 13
2020-12-01 14:33:00 PST
Comment hidden (obsolete)
Created
attachment 415166
[details]
Patch
Sihui Liu
Comment 14
2020-12-01 23:23:49 PST
Comment hidden (obsolete)
Created
attachment 415194
[details]
Patch
Sihui Liu
Comment 15
2020-12-01 23:35:01 PST
Created
attachment 415196
[details]
Patch
Sihui Liu
Comment 16
2020-12-02 09:06:02 PST
Win bot failure should be irrelevant
youenn fablet
Comment 17
2020-12-02 14:06:13 PST
Comment on
attachment 415196
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=415196&action=review
> Source/WebCore/Modules/speech/SpeechRecognitionCaptureSource.cpp:60 > + return RealtimeMediaSourceCenter::singleton().audioCaptureFactory().createAudioCaptureSource(captureDevice, { }, { });
In case we do capture in GPU process, I think it will work out as follows: - GPU process proxy capture audio - Web process receives audio - Web process forwards audio to UIProcess. In practice though, this might not work well due to the sandbox extensions that GPU process proxy will need to have. I think we will need to do direct UIProcess<->UIProcess communication. It seems overall that the current classes are IPC::Connection based so they should be adaptable for that purpose. It is worth keeping that in mind in that patch though.
> Source/WebCore/Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp:91 > + if (isMainThread()) {
We should not be in main thread there? I guess we can add a FIXME somewhere. If the issue is IPC, we should add a message thread receiver and dispatch IPC messages to a specific dispatch queue. We can leave that to a followup though. By the way, why is it needed for the state update callback to be called synchronously here.
> Source/WebCore/Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp:103 > + if (isMainThread()) {
Ditto
> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:115 > + m_source = makeUnique<SpeechRecognitionCaptureSource>(*m_clientIdentifier, WTFMove(dataCallback), WTFMove(stateUpdateCallback), result.source());
It seems simpler to just always have a m_realtimeMediaSourceCreateFunction. Or we can simply pass a Ref<RealtimeMediaSource> directly in start method or in SpeechRecognizer constructor.
> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSource.h:39 > +class SpeechRecognitionRemoteRealtimeMediaSource : public WebCore::RealtimeMediaSource {
This code is probably very similar to UserMediaCaptureManagerProxy::SourceProxy and its counter part RemoteCaptureSampleManager. Could we try to share code here? Maybe we could have RemoteCaptureSampleManager handle capture failed/stopped to unify both classes.
> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSourceManager.h:51 > +class SpeechRecognitionRemoteRealtimeMediaSourceManager : public CanMakeWeakPtr<SpeechRecognitionRemoteRealtimeMediaSourceManager>, public IPC::MessageReceiver, private IPC::MessageSender {
Can we try to see whether we can reuse code from RemoteCaptureSampleManager?
> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSourceManager.h:54 > + SpeechRecognitionRemoteRealtimeMediaSourceManager(Ref<IPC::Connection>&&);
explicit
> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSourceManager.h:64 > + void remoteSourceStopped(WebCore::RealtimeMediaSourceIdentifier);
These methods can probably be private.
> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSourceManager.h:77 > + HashMap<WebCore::RealtimeMediaSourceIdentifier, SpeechRecognitionRemoteRealtimeMediaSource*> m_sources;
This seems safe given SpeechRecognitionRemoteRealtimeMediaSource makes sure to unregister itself in destructor. I would still go with a HashMap of WeakPtr though for extra safety.
> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:97 > + m_recognizer->setRealtimeMediaSourceCreateFunction([this](auto& device) {
It seems this should go in constructor of recogniser.
> Source/WebKit/UIProcess/SpeechRecognitionServer.h:61 > + void setRealtimeMediaSourceCreateFunction(WebCore::SpeechRecognizer::RealtimeMediaSourceCreateFunction&& function) { m_realtimeMediaSourceCreateFunction = WTFMove(function); }
It seems this could go in constructor since it will not change over time.
> Source/WebKit/WebProcess/Speech/SpeechRecognitionRealtimeMediaSourceManager.cpp:52 > +class SpeechRecognitionRealtimeMediaSourceManager::RealtimeMediaSource
It seems we should try to share code with UserMediaCaptureManagerProxy::SourceProxy
> Source/WebKit/WebProcess/Speech/SpeechRecognitionRealtimeMediaSourceManager.messages.in:34 > + RevokeSandboxExtensions()
Maybe we can do it another patch, but it would be good to have one place where we handle sandbox extension granting and revoking for both getUserMedia and recogniser.
Sihui Liu
Comment 18
2020-12-02 16:54:20 PST
Comment on
attachment 415196
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=415196&action=review
>> Source/WebCore/Modules/speech/SpeechRecognitionCaptureSource.cpp:60 >> + return RealtimeMediaSourceCenter::singleton().audioCaptureFactory().createAudioCaptureSource(captureDevice, { }, { }); > > In case we do capture in GPU process, I think it will work out as follows: > - GPU process proxy capture audio > - Web process receives audio > - Web process forwards audio to UIProcess. > In practice though, this might not work well due to the sandbox extensions that GPU process proxy will need to have. > I think we will need to do direct UIProcess<->UIProcess communication. > It seems overall that the current classes are IPC::Connection based so they should be adaptable for that purpose. > It is worth keeping that in mind in that patch though.
Yes, I think direct UI - GPU process connection is the way to go. We probably just need to set the correct connection in SpeechRecognitionRemoteRealtimeMediaSourceManager. Current implementation will return error if capture in GPU process is enabled on iOS.
>> Source/WebCore/Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp:91 >> + if (isMainThread()) { > > We should not be in main thread there? > I guess we can add a FIXME somewhere. If the issue is IPC, we should add a message thread receiver and dispatch IPC messages to a specific dispatch queue. > We can leave that to a followup though. > > By the way, why is it needed for the state update callback to be called synchronously here.
Yes, this is because we handle the remoteAudioSamplesAvailable on main thread. It does not need to be called synchronously I guess (did not break existing tests); just seems unnecessary to schedule a future task if we already on the main thread.
>> Source/WebCore/Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp:103 >> + if (isMainThread()) { > > Ditto
Ditto
>> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:115 >> + m_source = makeUnique<SpeechRecognitionCaptureSource>(*m_clientIdentifier, WTFMove(dataCallback), WTFMove(stateUpdateCallback), result.source()); > > It seems simpler to just always have a m_realtimeMediaSourceCreateFunction. > Or we can simply pass a Ref<RealtimeMediaSource> directly in start method or in SpeechRecognizer constructor.
Okay, will change.
>> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSource.h:39 >> +class SpeechRecognitionRemoteRealtimeMediaSource : public WebCore::RealtimeMediaSource { > > This code is probably very similar to UserMediaCaptureManagerProxy::SourceProxy and its counter part RemoteCaptureSampleManager. > Could we try to share code here? > Maybe we could have RemoteCaptureSampleManager handle capture failed/stopped to unify both classes.
Okay, will try. Seems we can move the code from different processes to Shared/.
>> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSourceManager.h:51 >> +class SpeechRecognitionRemoteRealtimeMediaSourceManager : public CanMakeWeakPtr<SpeechRecognitionRemoteRealtimeMediaSourceManager>, public IPC::MessageReceiver, private IPC::MessageSender { > > Can we try to see whether we can reuse code from RemoteCaptureSampleManager?
Okay.
>> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSourceManager.h:64 >> + void remoteSourceStopped(WebCore::RealtimeMediaSourceIdentifier); > > These methods can probably be private.
Sure.
>> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSourceManager.h:77 >> + HashMap<WebCore::RealtimeMediaSourceIdentifier, SpeechRecognitionRemoteRealtimeMediaSource*> m_sources; > > This seems safe given SpeechRecognitionRemoteRealtimeMediaSource makes sure to unregister itself in destructor. > I would still go with a HashMap of WeakPtr though for extra safety.
Will change to use WeakPtr.
>> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:97 >> + m_recognizer->setRealtimeMediaSourceCreateFunction([this](auto& device) { > > It seems this should go in constructor of recogniser.
Sure, made it separate for the macro.
>> Source/WebKit/UIProcess/SpeechRecognitionServer.h:61 >> + void setRealtimeMediaSourceCreateFunction(WebCore::SpeechRecognizer::RealtimeMediaSourceCreateFunction&& function) { m_realtimeMediaSourceCreateFunction = WTFMove(function); } > > It seems this could go in constructor since it will not change over time.
Okay.
>> Source/WebKit/WebProcess/Speech/SpeechRecognitionRealtimeMediaSourceManager.cpp:52 >> +class SpeechRecognitionRealtimeMediaSourceManager::RealtimeMediaSource > > It seems we should try to share code with UserMediaCaptureManagerProxy::SourceProxy
Will try.
>> Source/WebKit/WebProcess/Speech/SpeechRecognitionRealtimeMediaSourceManager.messages.in:34 >> + RevokeSandboxExtensions() > > Maybe we can do it another patch, but it would be good to have one place where we handle sandbox extension granting and revoking for both getUserMedia and recogniser.
Yes, getUserMedia currently has two places (UserMediaProcessManager and UserMediaPermissionRequestManagerProxy) to grant sandbox extensions and may also be simplified by then.
Sihui Liu
Comment 19
2020-12-03 16:12:55 PST
Comment hidden (obsolete)
Created
attachment 415364
[details]
Patch
Sihui Liu
Comment 20
2020-12-03 17:53:14 PST
Comment hidden (obsolete)
Created
attachment 415371
[details]
Patch
Sihui Liu
Comment 21
2020-12-03 18:03:18 PST
Comment hidden (obsolete)
Created
attachment 415372
[details]
Patch
Sihui Liu
Comment 22
2020-12-03 21:55:49 PST
Comment hidden (obsolete)
Created
attachment 415386
[details]
Patch
Sihui Liu
Comment 23
2020-12-04 02:25:39 PST
Created
attachment 415397
[details]
Patch
youenn fablet
Comment 24
2020-12-04 07:18:02 PST
Comment on
attachment 415397
[details]
Patch Some getDisplayMedia tests seem to be failing. getDisplayMedia capture in MacOS happens in UIProcess, WebProcess receiving the video samples over IPC. View in context:
https://bugs.webkit.org/attachment.cgi?id=415397&action=review
> Source/WebKit/ChangeLog:149 > +2020-12-03 Sihui Liu <
sihui_liu@apple.com
>
double entry.
> Source/WebCore/Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp:91 > + if (isMainThread()) {
Since they are not needed, let's just not add these switches. Ditto below.
> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:106 > + if (m_realtimeMediaSourceCreateFunction) {
I would make this callback mandatory so that we simplify this code path to only handle error cases or have a source. In the future, we will always capture in GPU process anyway.
> Source/WebKit/UIProcess/SpeechRecognitionServer.h:61 > + using RealtimeMediaSourceCreateFunction = Function<RefPtr<WebCore::RealtimeMediaSource>(const WebCore::CaptureDevice&)>;
We could probably use something like SpeechRecognitionErrorOr<Ref<WebCore::RealtimeMediaSource>>
> Source/WebKit/WebProcess/cocoa/UserMediaCaptureManager.cpp:48 > + : RemoteRealtimeMediaSourceManager(nullptr, false)
These values nullptr and false are not very clear.
Sihui Liu
Comment 25
2020-12-04 10:05:29 PST
Comment hidden (obsolete)
Created
attachment 415424
[details]
Patch
Sihui Liu
Comment 26
2020-12-04 10:14:03 PST
Created
attachment 415425
[details]
Patch
Sihui Liu
Comment 27
2020-12-04 10:20:53 PST
(In reply to youenn fablet from
comment #24
)
> Comment on
attachment 415397
[details]
> Patch > > Some getDisplayMedia tests seem to be failing. > getDisplayMedia capture in MacOS happens in UIProcess, WebProcess receiving > the video samples over IPC. > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=415397&action=review
> > > Source/WebKit/ChangeLog:149 > > +2020-12-03 Sihui Liu <
sihui_liu@apple.com
> > > double entry.
Removed!
> > > Source/WebCore/Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp:91 > > + if (isMainThread()) { > > Since they are not needed, let's just not add these switches. Ditto below.
Sure.
> > > Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:106 > > + if (m_realtimeMediaSourceCreateFunction) { > > I would make this callback mandatory so that we simplify this code path to > only handle error cases or have a source. > In the future, we will always capture in GPU process anyway.
Made m_realtimeMediaSourceCreateFunction mandatory.
> > > Source/WebKit/UIProcess/SpeechRecognitionServer.h:61 > > + using RealtimeMediaSourceCreateFunction = Function<RefPtr<WebCore::RealtimeMediaSource>(const WebCore::CaptureDevice&)>; > > We could probably use something like > SpeechRecognitionErrorOr<Ref<WebCore::RealtimeMediaSource>>
I used WebCore::CaptureSourceOrError.
> > > Source/WebKit/WebProcess/cocoa/UserMediaCaptureManager.cpp:48 > > + : RemoteRealtimeMediaSourceManager(nullptr, false) > > These values nullptr and false are not very clear.
I agree, I've changed bool to enum class ShouldHandleSandboxExtensions. I also removed RemoteRealtimeMediaSourceManager from UserMediaCaptureManager, as this patch is getting bigger and harder to debug with specific code change for UserMedia. Will do replace/change existing classes UserMediaCaptureManager and UserMediaCaptureManagerProxy with shared RemoteRemoteRealtimeMediaSourceManager and RemoteRealtimeMediaSourceManager in a separate patch. What do you think?
Sihui Liu
Comment 28
2020-12-04 10:36:08 PST
Created
attachment 415428
[details]
Patch
Sihui Liu
Comment 29
2020-12-04 12:07:06 PST
Created
attachment 415441
[details]
Fix MediaStream tests
youenn fablet
Comment 30
2020-12-04 13:40:34 PST
Comment on
attachment 415441
[details]
Fix MediaStream tests View in context:
https://bugs.webkit.org/attachment.cgi?id=415441&action=review
> Source/WTF/ChangeLog:9 > + Turn on WebSpeech on iOS simulator by default for testing.
Not very clear why we need this. We do not need them for WTR and for regular testing, we have the UI for that, or we can compile switched on locally.
> Source/WebCore/Modules/speech/SpeechRecognitionCaptureSource.cpp:65 > + m_impl = makeUnique<SpeechRecognitionCaptureSourceImpl>(clientIdentifier, WTFMove(dataCallback), WTFMove(stateUpdateCallback), WTFMove(source));
Could be written as: : m_impl(makeUnique<>(...))
> Source/WebKit/ChangeLog:17 > + Original WebKit::RemoteRealtimeMediaSource is renamed to WebKit::WebRemoteRealtimeMediaSource. We should
What is missing from replacing WebRemoteRealtimeMediaSource by the new RemoteRealtimeMediaSource? Maybe we should first do this refactoring as preliminary patches? That would make sure we have code that is clear at any point in time. Also, it is difficult to see how the new code is close to replacing the old one. If that is too much work or too difficult, maybe we should revert to having specific speech classes for now. If we want to continue in that direction, ObsoleteRemoteRealtimeMediaSource might be a better name since that is the intent I think.
> Source/WebKit/Shared/media/RealtimeMediaSourceManager.cpp:53 > + : private WebCore::RealtimeMediaSource::Observer
There are probably a lot of WebCore:: We could be doing using namespace WebCore.
> Source/WebKit/Shared/media/RealtimeMediaSourceManager.cpp:184 > + int64_t m_numberOfFrames { 0 };
Should be uint64_t probably
> Source/WebKit/Shared/media/RealtimeMediaSourceManager.cpp:237 > + case WebCore::CaptureDevice::DeviceType::Microphone:
This code is mostly dealing with audio (not processing video samples or camera sandbox extensions). Let's ASSERT_NOT_REACHED for the other cases. We could also remove orientation code as well for now.
> Source/WebKit/Shared/media/RealtimeMediaSourceManager.h:49 > + RealtimeMediaSourceManager(IPC::Connection&);
explicit, could take a Ref<IPC::Connection>&&.
> Source/WebKit/Shared/media/RealtimeMediaSourceManager.h:78 > + HashMap<WebCore::RealtimeMediaSourceIdentifier, std::unique_ptr<RealtimeMediaSource>> m_sources;
RealtimeMediaSource is not a great name since it makes think this is some kind of WebCore::RealtimeMediaSource which are ref counted. Let's use a shorter name like Source.
> Source/WebKit/UIProcess/SpeechRecognitionServer.h:62 > + void setRealtimeMediaSourceCreateFunction(RealtimeMediaSourceCreateFunction&& function) { m_realtimeMediaSourceCreateFunction = WTFMove(function); }
Could be in SpeechRecognitionServer constructor.
> Source/WebKit/UIProcess/WebProcessProxy.cpp:1732 > + speechRecognitionServer->setRealtimeMediaSourceCreateFunction([weakThis = makeWeakPtr(this), weakPage = makeWeakPtr(targetPage)]() {
If we move it to constructor, we could also make the lambda a method of WebPageProxy to keep the lambda small enough.
> Source/WebKit/UIProcess/WebProcessProxy.cpp:1736 > + if (!weakPage || weakPage->preferences().captureAudioInGPUProcessEnabled())
I would move the weakPage check with weakThis check. And add a "not implemented for GPU process" specific message. Also, maybe only the weakPage is actually necessary since the page proxy keeps a ref to its process proxy.
> Source/WebKit/UIProcess/WebProcessProxy.cpp:1744 > + auto& sourceManager = weakThis->ensureRemoteRealtimeMediaSourceManager();
sourceManager is only used once
> Source/WebKit/UIProcess/WebProcessProxy.cpp:1768 > + addMessageReceiver(Messages::RemoteRealtimeMediaSourceManager::messageReceiverName(), *m_remoteRealtimeMediaSourceManager);
It is probably safer to remove the message receiver at destruction time.
Sihui Liu
Comment 31
2020-12-04 14:16:48 PST
(In reply to youenn fablet from
comment #30
)
> Comment on
attachment 415441
[details]
> Fix MediaStream tests > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=415441&action=review
> > > Source/WTF/ChangeLog:9 > > + Turn on WebSpeech on iOS simulator by default for testing. > > Not very clear why we need this. > We do not need them for WTR and for regular testing, we have the UI for > that, or we can compile switched on locally.
Okay, will revert.
> > > Source/WebCore/Modules/speech/SpeechRecognitionCaptureSource.cpp:65 > > + m_impl = makeUnique<SpeechRecognitionCaptureSourceImpl>(clientIdentifier, WTFMove(dataCallback), WTFMove(stateUpdateCallback), WTFMove(source)); > > Could be written as: > : m_impl(makeUnique<>(...)) > > > Source/WebKit/ChangeLog:17 > > + Original WebKit::RemoteRealtimeMediaSource is renamed to WebKit::WebRemoteRealtimeMediaSource. We should > > What is missing from replacing WebRemoteRealtimeMediaSource by the new > RemoteRealtimeMediaSource?
Replacing WebRemoteRealtimeMediaSource requires change in UserMediaCaptureManager and RemoteRealtimeMediaSourceManager; and possibly some behavior changes that need to verify by testing. Then I found the refactoring work overweighs what I want to implement in this patch.
> Maybe we should first do this refactoring as preliminary patches? That would > make sure we have code that is clear at any point in time. > Also, it is difficult to see how the new code is close to replacing the old > one.
I was thinking about replacing UserMediaCaptureManagerProxy and UserMediaCaptureManager functionalities with RealtimeMediaSourceManager and RemoteRealtimeMediaSourceManager, not only the RemoteRealtimeMediaSource.
> If that is too much work or too difficult, maybe we should revert to having > specific speech classes for now.
Will do!
> > If we want to continue in that direction, ObsoleteRemoteRealtimeMediaSource > might be a better name since that is the intent I think. > > > Source/WebKit/Shared/media/RealtimeMediaSourceManager.cpp:53 > > + : private WebCore::RealtimeMediaSource::Observer > > There are probably a lot of WebCore:: > We could be doing using namespace WebCore.
Okay.
> > > Source/WebKit/Shared/media/RealtimeMediaSourceManager.cpp:184 > > + int64_t m_numberOfFrames { 0 }; > > Should be uint64_t probably
Will change.
> > > Source/WebKit/Shared/media/RealtimeMediaSourceManager.cpp:237 > > + case WebCore::CaptureDevice::DeviceType::Microphone: > > This code is mostly dealing with audio (not processing video samples or > camera sandbox extensions). > Let's ASSERT_NOT_REACHED for the other cases. > We could also remove orientation code as well for now.
Will revert to audio-only and speechrecognition-only version.
> > > Source/WebKit/Shared/media/RealtimeMediaSourceManager.h:49 > > + RealtimeMediaSourceManager(IPC::Connection&); > > explicit, could take a Ref<IPC::Connection>&&.
Sure.
> > > Source/WebKit/Shared/media/RealtimeMediaSourceManager.h:78 > > + HashMap<WebCore::RealtimeMediaSourceIdentifier, std::unique_ptr<RealtimeMediaSource>> m_sources; > > RealtimeMediaSource is not a great name since it makes think this is some > kind of WebCore::RealtimeMediaSource which are ref counted. > Let's use a shorter name like Source.
Okay.
> > > Source/WebKit/UIProcess/SpeechRecognitionServer.h:62 > > + void setRealtimeMediaSourceCreateFunction(RealtimeMediaSourceCreateFunction&& function) { m_realtimeMediaSourceCreateFunction = WTFMove(function); } > > Could be in SpeechRecognitionServer constructor.
Will move.
> > > Source/WebKit/UIProcess/WebProcessProxy.cpp:1732 > > + speechRecognitionServer->setRealtimeMediaSourceCreateFunction([weakThis = makeWeakPtr(this), weakPage = makeWeakPtr(targetPage)]() { > > If we move it to constructor, we could also make the lambda a method of > WebPageProxy to keep the lambda small enough.
I guess you mean WebProcessProxy, sure!
> > > Source/WebKit/UIProcess/WebProcessProxy.cpp:1736 > > + if (!weakPage || weakPage->preferences().captureAudioInGPUProcessEnabled()) > > I would move the weakPage check with weakThis check. > And add a "not implemented for GPU process" specific message. > Also, maybe only the weakPage is actually necessary since the page proxy > keeps a ref to its process proxy.
Will change.
> > > Source/WebKit/UIProcess/WebProcessProxy.cpp:1744 > > + auto& sourceManager = weakThis->ensureRemoteRealtimeMediaSourceManager(); > > sourceManager is only used once
Will remove.
> > > Source/WebKit/UIProcess/WebProcessProxy.cpp:1768 > > + addMessageReceiver(Messages::RemoteRealtimeMediaSourceManager::messageReceiverName(), *m_remoteRealtimeMediaSourceManager); > > It is probably safer to remove the message receiver at destruction time.
Sure.
Sihui Liu
Comment 32
2020-12-04 17:04:27 PST
Comment hidden (obsolete)
Created
attachment 415477
[details]
Patch
Sihui Liu
Comment 33
2020-12-04 20:39:36 PST
Created
attachment 415483
[details]
Patch
Sihui Liu
Comment 34
2020-12-04 21:51:20 PST
Comment hidden (obsolete)
Created
attachment 415485
[details]
Patch
Sihui Liu
Comment 35
2020-12-04 22:19:22 PST
Comment hidden (obsolete)
Created
attachment 415486
[details]
Patch
youenn fablet
Comment 36
2020-12-07 06:16:57 PST
Comment on
attachment 415486
[details]
Patch Some nits below, some tests would be good. View in context:
https://bugs.webkit.org/attachment.cgi?id=415486&action=review
> Source/WebCore/ChangeLog:9 > + Add a way to set RealtimeMediaSource member of SpeechRecognitionCaptureSourceImpl.
We should be able to add some tests with mock audio source. Can we add/enable some tests?
> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSourceManager.cpp:57 > + send(Messages::SpeechRecognitionRealtimeMediaSourceManager::AddSource(identifier, captureDevice));
I would go with createSource/deleteSource instead of addSource/removeSource (for both methods and IPC messages).
> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSourceManager.h:54 > + SpeechRecognitionRemoteRealtimeMediaSourceManager(Ref<IPC::Connection>&&);
explicit
> Source/WebKit/UIProcess/WebProcessProxy.cpp:1754 > + return createRealtimeMediaSourceForSpeechRecognition();
If we make ensureSpeechRecognitionRemoteRealtimeMediaSourceManager public, then we could move createRealtimeMediaSourceForSpeechRecognition in WebPageProxy and write the lambda as: if (!weakPage) return error; return weakPage? weakPage->createRealtimeMediaSourceForSpeechRecognition() : CaptureSourceOrError { "Page is invalid" }; That will allow removing capturing 'this' in the lambda.
> Source/WebKit/WebProcess/Speech/SpeechRecognitionRealtimeMediaSourceManager.h:41 > +class SpeechRecognitionRealtimeMediaSourceManager : public CanMakeWeakPtr<SpeechRecognitionRealtimeMediaSourceManager>, public IPC::MessageReceiver, private IPC::MessageSender {
final?
> Source/WebKit/WebProcess/Speech/SpeechRecognitionRealtimeMediaSourceManager.h:44 > + SpeechRecognitionRealtimeMediaSourceManager(Ref<IPC::Connection>&&);
explicit
> Source/WebKit/WebProcess/Speech/SpeechRecognitionRealtimeMediaSourceManager.h:59 > + void didReceiveMessage(IPC::Connection&, IPC::Decoder&) override;
final
Sihui Liu
Comment 37
2020-12-07 11:43:18 PST
Comment on
attachment 415486
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=415486&action=review
>> Source/WebCore/ChangeLog:9 >> + Add a way to set RealtimeMediaSource member of SpeechRecognitionCaptureSourceImpl. > > We should be able to add some tests with mock audio source. > Can we add/enable some tests?
Hmm it seems using remote source or local source have no difference in generating events for SpeechRecognition, so this change does not affect the behavior of existing tests. What tests do you think should be added for this?
>> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSourceManager.cpp:57 >> + send(Messages::SpeechRecognitionRealtimeMediaSourceManager::AddSource(identifier, captureDevice)); > > I would go with createSource/deleteSource instead of addSource/removeSource (for both methods and IPC messages).
Will change.
>> Source/WebKit/UIProcess/SpeechRecognitionRemoteRealtimeMediaSourceManager.h:54 >> + SpeechRecognitionRemoteRealtimeMediaSourceManager(Ref<IPC::Connection>&&); > > explicit
Okay.
>> Source/WebKit/UIProcess/WebProcessProxy.cpp:1754 >> + return createRealtimeMediaSourceForSpeechRecognition(); > > If we make ensureSpeechRecognitionRemoteRealtimeMediaSourceManager public, then we could move createRealtimeMediaSourceForSpeechRecognition in WebPageProxy and write the lambda as: > if (!weakPage) > return error; > return weakPage? weakPage->createRealtimeMediaSourceForSpeechRecognition() : CaptureSourceOrError { "Page is invalid" }; > > That will allow removing capturing 'this' in the lambda.
Sure, will move it to WebPageProxy.
>> Source/WebKit/WebProcess/Speech/SpeechRecognitionRealtimeMediaSourceManager.h:41 >> +class SpeechRecognitionRealtimeMediaSourceManager : public CanMakeWeakPtr<SpeechRecognitionRealtimeMediaSourceManager>, public IPC::MessageReceiver, private IPC::MessageSender { > > final?
Okay.
>> Source/WebKit/WebProcess/Speech/SpeechRecognitionRealtimeMediaSourceManager.h:44 >> + SpeechRecognitionRealtimeMediaSourceManager(Ref<IPC::Connection>&&); > > explicit
Will add.
>> Source/WebKit/WebProcess/Speech/SpeechRecognitionRealtimeMediaSourceManager.h:59 >> + void didReceiveMessage(IPC::Connection&, IPC::Decoder&) override; > > final
Will change.
youenn fablet
Comment 38
2020-12-07 11:47:33 PST
(In reply to Sihui Liu from
comment #37
)
> Comment on
attachment 415486
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=415486&action=review
> > >> Source/WebCore/ChangeLog:9 > >> + Add a way to set RealtimeMediaSource member of SpeechRecognitionCaptureSourceImpl. > > > > We should be able to add some tests with mock audio source. > > Can we add/enable some tests?
The audio session category is changed to playrecord. After this patch, this will be done in WebProcess, so a layout test could check this actually happens from internals API.
Sihui Liu
Comment 39
2020-12-07 11:49:22 PST
(In reply to youenn fablet from
comment #38
)
> (In reply to Sihui Liu from
comment #37
) > > Comment on
attachment 415486
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=415486&action=review
> > > > >> Source/WebCore/ChangeLog:9 > > >> + Add a way to set RealtimeMediaSource member of SpeechRecognitionCaptureSourceImpl. > > > > > > We should be able to add some tests with mock audio source. > > > Can we add/enable some tests? > > The audio session category is changed to playrecord. > After this patch, this will be done in WebProcess, so a layout test could > check this actually happens from internals API.
I see, will add.
Sihui Liu
Comment 40
2020-12-07 14:09:25 PST
Created
attachment 415582
[details]
Patch
Sihui Liu
Comment 41
2020-12-07 14:10:15 PST
Created
attachment 415583
[details]
Patch
youenn fablet
Comment 42
2020-12-08 01:02:04 PST
Comment on
attachment 415583
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=415583&action=review
> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:119 > m_recognizer->start(clientIdentifier);
This is a bit weird to have the return here and some code below. I think we should rewrite it a bit: - Remove SpeechRecognizer::start(SpeechRecognitionConnectionClientIdentifier identifier) - Add a #else here that, in case ENABLE(MEDIA_STREAM) is not defined will send back a SpeechRecognitionErrorType::AudioCapture error. We should probably at some point, either change compilation flags to enable RealtimeMediaSource if ENABLE(MEDIA_STREAM) && ENABLE(SPEECH_API), or just disable speech api if MEDIA_STREAM is not on. That part should be left to a follow-up patch.
> Source/WebKit/UIProcess/WebProcessProxy.cpp:1718 > return;
We should probably update m_speechRecognitionServerMap after getting the targetPage.
> LayoutTests/fast/speechrecognition/ios/audio-capture.html:20 > + shouldBeEqualToString("internals.audioSessionCategory()", "PlayAndRecord");
Can we add a negative check before, for instance in start event handler?
youenn fablet
Comment 43
2020-12-08 01:02:42 PST
Please check api-mac before landing.
Sihui Liu
Comment 44
2020-12-08 17:21:38 PST
Created
attachment 415693
[details]
Patch
Sihui Liu
Comment 45
2020-12-08 22:47:58 PST
Created
attachment 415719
[details]
Patch for landing
Sihui Liu
Comment 46
2020-12-08 23:16:51 PST
Created
attachment 415722
[details]
Patch for landing
EWS
Comment 47
2020-12-08 23:51:29 PST
commit-queue failed to commit
attachment 415722
[details]
to WebKit repository.
Sihui Liu
Comment 48
2020-12-08 23:55:59 PST
Created
attachment 415723
[details]
Patch for landing
EWS
Comment 49
2020-12-09 00:28:44 PST
Committed
r270574
: <
https://trac.webkit.org/changeset/270574
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 415723
[details]
.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug