WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 221082
Use user media permission prompt for speech recognition
https://bugs.webkit.org/show_bug.cgi?id=221082
Summary
Use user media permission prompt for speech recognition
Sihui Liu
Reported
2021-01-28 00:42:00 PST
...
Attachments
Patch
(43.05 KB, patch)
2021-01-28 01:08 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(44.26 KB, patch)
2021-01-28 12:33 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(36.76 KB, patch)
2021-01-28 14:10 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(40.10 KB, patch)
2021-01-29 11:44 PST
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(40.33 KB, patch)
2021-01-29 12:22 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(44.98 KB, patch)
2021-01-29 16:23 PST
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(44.92 KB, patch)
2021-01-29 17:40 PST
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(45.14 KB, patch)
2021-01-29 19:07 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch for landing
(45.36 KB, patch)
2021-02-01 09:59 PST
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Show Obsolete
(8)
View All
Add attachment
proposed patch, testcase, etc.
Sihui Liu
Comment 1
2021-01-28 01:08:47 PST
Created
attachment 418629
[details]
Patch
Sihui Liu
Comment 2
2021-01-28 12:33:35 PST
Created
attachment 418662
[details]
Patch
Sihui Liu
Comment 3
2021-01-28 14:10:53 PST
Created
attachment 418667
[details]
Patch
Sihui Liu
Comment 4
2021-01-28 23:45:36 PST
rdar://problem/73372499
youenn fablet
Comment 5
2021-01-29 00:26:29 PST
Comment on
attachment 418667
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=418667&action=review
> Source/WebCore/Modules/speech/SpeechRecognition.cpp:76 > + return { };
We should probably throw an exception there. And add a test covering this case.
> Source/WebCore/Modules/speech/SpeechRecognition.cpp:78 > + m_connection->start(identifier(), m_lang, m_continuous, m_interimResults, m_maxAlternatives, ClientOrigin { document.topOrigin().data(), document.securityOrigin().data() }, *frame);
Do we need to pass the frame or just the frame identifier? Can we use Document::frameID()?
> Source/WebCore/Modules/speech/SpeechRecognitionConnection.h:42 > + virtual void start(SpeechRecognitionConnectionClientIdentifier, const String& lang, bool continuous, bool interimResults, uint64_t maxAlternatives, ClientOrigin&&, Frame&) = 0;
const Frame&, or FrameIdentifier?
> Source/WebKit/UIProcess/SpeechRecognitionPermissionRequest.h:37 > + static Ref<SpeechRecognitionPermissionRequest> create(const String& lang, const WebCore::ClientOrigin& origin, const WebCore::FrameIdentifier frameIdentifier, CompletionHandler<void(Optional<WebCore::SpeechRecognitionError>&&)>&& completionHandler)
I would tend to give a name to the completion handler, something like: using PermissionCallback = CompletionHandler<void(Optional<WebCore::SpeechRecognitionError>&&)>;
> Source/WebKit/UIProcess/SpeechRecognitionPermissionRequest.h:50 > + const WebCore::FrameIdentifier frameIdentifier() const { return m_frameIdentifier; }
s/const WebCore::FrameIdentifier/WebCore::FrameIdentifier/
> Source/WebKit/UIProcess/SpeechRecognitionPermissionRequest.h:53 > + SpeechRecognitionPermissionRequest(const String& lang, const WebCore::ClientOrigin& origin, const WebCore::FrameIdentifier frameIdentifier, CompletionHandler<void(Optional<WebCore::SpeechRecognitionError>&&)>&& completionHandler)
s/const WebCore::FrameIdentifier/WebCore::FrameIdentifier/
> Source/WebKit/UIProcess/WebPageProxy.cpp:10407 > + userMediaPermissionRequestManager().checkUserMediaPermissionForSpeechRecognition(frameIdentifier, requestingOrigin, topOrigin, *captureDevice, WTFMove(completionHandler));
Why not using UserMediaPermissionRequestManagerProxy::requestUserMediaPermissionForFrame and passing it an optional completion handler? Completion handler could be stored in UserMediaPermissionRequestProxy to override the current behavior. To limit the code changes, we could initially only use the completion handler for speech recognition code path. Then use the completion handler in gum code path as well as this would allow to move back WebPage code from UserMediaPermissionRequestManagerProxy.cpp back to WebPage.cpp. Or maybe we can do everything in this patch? Callback could be called in UserMediaPermissionRequestManagerProxy::finishGrantingRequest and UserMediaPermissionRequestManagerProxy::denyRequest.
Sihui Liu
Comment 6
2021-01-29 10:28:01 PST
Comment on
attachment 418667
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=418667&action=review
>> Source/WebCore/Modules/speech/SpeechRecognition.cpp:76 >> + return { }; > > We should probably throw an exception there. And add a test covering this case.
Okay, we have a layout test that covers this exception, was not sure about changing the behavior. Will update.
>> Source/WebCore/Modules/speech/SpeechRecognition.cpp:78 >> + m_connection->start(identifier(), m_lang, m_continuous, m_interimResults, m_maxAlternatives, ClientOrigin { document.topOrigin().data(), document.securityOrigin().data() }, *frame); > > Do we need to pass the frame or just the frame identifier? > Can we use Document::frameID()?
Just the identifier. Didn't know there is a Document::frameID(), will update.
>> Source/WebCore/Modules/speech/SpeechRecognitionConnection.h:42 >> + virtual void start(SpeechRecognitionConnectionClientIdentifier, const String& lang, bool continuous, bool interimResults, uint64_t maxAlternatives, ClientOrigin&&, Frame&) = 0; > > const Frame&, or FrameIdentifier?
FrameIdentifier
>> Source/WebKit/UIProcess/SpeechRecognitionPermissionRequest.h:37 >> + static Ref<SpeechRecognitionPermissionRequest> create(const String& lang, const WebCore::ClientOrigin& origin, const WebCore::FrameIdentifier frameIdentifier, CompletionHandler<void(Optional<WebCore::SpeechRecognitionError>&&)>&& completionHandler) > > I would tend to give a name to the completion handler, something like: > using PermissionCallback = CompletionHandler<void(Optional<WebCore::SpeechRecognitionError>&&)>;
Okay.
>> Source/WebKit/UIProcess/SpeechRecognitionPermissionRequest.h:50 >> + const WebCore::FrameIdentifier frameIdentifier() const { return m_frameIdentifier; } > > s/const WebCore::FrameIdentifier/WebCore::FrameIdentifier/
Okay.
>> Source/WebKit/UIProcess/SpeechRecognitionPermissionRequest.h:53 >> + SpeechRecognitionPermissionRequest(const String& lang, const WebCore::ClientOrigin& origin, const WebCore::FrameIdentifier frameIdentifier, CompletionHandler<void(Optional<WebCore::SpeechRecognitionError>&&)>&& completionHandler) > > s/const WebCore::FrameIdentifier/WebCore::FrameIdentifier/
Okay.
>> Source/WebKit/UIProcess/WebPageProxy.cpp:10407 >> + userMediaPermissionRequestManager().checkUserMediaPermissionForSpeechRecognition(frameIdentifier, requestingOrigin, topOrigin, *captureDevice, WTFMove(completionHandler)); > > Why not using UserMediaPermissionRequestManagerProxy::requestUserMediaPermissionForFrame and passing it an optional completion handler? > Completion handler could be stored in UserMediaPermissionRequestProxy to override the current behavior. > > To limit the code changes, we could initially only use the completion handler for speech recognition code path. > Then use the completion handler in gum code path as well as this would allow to move back WebPage code from UserMediaPermissionRequestManagerProxy.cpp back to WebPage.cpp. > Or maybe we can do everything in this patch? > Callback could be called in UserMediaPermissionRequestManagerProxy::finishGrantingRequest and UserMediaPermissionRequestManagerProxy::denyRequest.
I did not use requestUserMediaPermissionForFrame to limit the changes. This patch basically only changes to use a different UIDelegate callback and invoke a different prompt. requestUserMediaPermissionForFrame seems to go through the whole process including finding a device(getUserMediaPermissionInfo, invoking delegate callback from checkUserMediaPermissionForOrigin), checking deviceIdHashSalt, validating user media request constraints, checking a set of settings (like mockCaptureDevicesEnabled, which is checked by SpeechRecognitionPermissionManager), requesting system validation(checking TCC permission, which is done by SpeechRecognitionPermissionManager). I am not sure all steps are necessary for speech recognition request. To use requestUserMediaPermissionForFrame, it seems better to merge some functionalities of SpeechRecognitionPermissionManager and userMediaPermissionRequestManager. How about switching to the same delegate callback first, and refactoring to merge in a follow up patch?
Sihui Liu
Comment 7
2021-01-29 11:44:29 PST
Created
attachment 418751
[details]
Patch
Sihui Liu
Comment 8
2021-01-29 12:22:31 PST
Created
attachment 418754
[details]
Patch
youenn fablet
Comment 9
2021-01-29 12:47:21 PST
Comment on
attachment 418754
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=418754&action=review
> Source/WebKit/UIProcess/UserMediaPermissionRequestManagerProxy.h:167 > + HashMap<RefPtr<UserMediaPermissionRequestProxy>, CompletionHandler<void(bool)>> m_speechRecognitionRequestCallbacks;
I think it would be best to have UserMediaPermissionRequestProxy owns the CompletionHandler and only set it for speech requests. It is fine to start with a dedicated checkUserMediaPermissionForSpeechRecognition, when having the request decision, we can check whether it has a callback. If the request has a callback, we can keep calling it and returning early for now, and in a later patch, make full use of it.
Sihui Liu
Comment 10
2021-01-29 13:09:17 PST
(In reply to youenn fablet from
comment #9
)
> Comment on
attachment 418754
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=418754&action=review
> > > Source/WebKit/UIProcess/UserMediaPermissionRequestManagerProxy.h:167 > > + HashMap<RefPtr<UserMediaPermissionRequestProxy>, CompletionHandler<void(bool)>> m_speechRecognitionRequestCallbacks; > > I think it would be best to have UserMediaPermissionRequestProxy owns the > CompletionHandler and only set it for speech requests. > It is fine to start with a dedicated > checkUserMediaPermissionForSpeechRecognition, when having the request > decision, we can check whether it has a callback. > If the request has a callback, we can keep calling it and returning early > for now, and in a later patch, make full use of it.
I see, will make the change.
Sihui Liu
Comment 11
2021-01-29 16:23:36 PST
Created
attachment 418785
[details]
Patch
Sihui Liu
Comment 12
2021-01-29 17:40:07 PST
Created
attachment 418794
[details]
Patch
Sihui Liu
Comment 13
2021-01-29 19:07:48 PST
Created
attachment 418799
[details]
Patch
youenn fablet
Comment 14
2021-02-01 00:23:44 PST
Comment on
attachment 418799
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=418799&action=review
> Source/WebCore/Modules/speech/SpeechRecognition.cpp:79 > + auto frameIdentifier = optionalFrameIdentifier ? *optionalFrameIdentifier : FrameIdentifier { };
We probably do not need to check frame. We could do: auto frameID = document.frameID(); if (!frameID) return Exception { InvalidStateError, "Recognition is not in a valid frame"_s }; ...
> Source/WebKit/UIProcess/SpeechRecognitionPermissionManager.cpp:78 > +void SpeechRecognitionPermissionManager::request(const String& lang, const WebCore::ClientOrigin& origin, const WebCore::FrameIdentifier frameIdentifier, SpeechRecognitionPermissionRequestCallback&& completiontHandler)
s/const WebCore::FrameIdentifier/WebCore::FrameIdentifier/ Ditto below. Or is it to make sure it stays const?
> Source/WebKit/UIProcess/SpeechRecognitionPermissionRequest.h:65 > + const WebCore::FrameIdentifier m_frameIdentifier;
Ditto here.
> Source/WebKit/UIProcess/UserMediaPermissionRequestManagerProxy.cpp:244 > + callback(true);
We will miss device ID exposure which is done in finishGrantingRequest. I guess we should fix this in a refactoring.
> Source/WebKit/UIProcess/UserMediaPermissionRequestManagerProxy.cpp:616 > +void UserMediaPermissionRequestManagerProxy::checkUserMediaPermissionForSpeechRecognition(const WebCore::FrameIdentifier frameIdentifier, const WebCore::SecurityOrigin& requestingOrigin, const WebCore::SecurityOrigin& topOrigin, const WebCore::CaptureDevice& device, CompletionHandler<void(bool)>&& completionHandler)
ditto for const WebCore::FrameIdentifier
> Source/WebKit/UIProcess/UserMediaPermissionRequestManagerProxy.cpp:632 > + }
Can we replace wasRequestDenied and searchForGrantedRequest with getRequestAction(). We will miss the ability of being able to reprompt user if speech recognition is made as part of a user gesture.
> Source/WebKit/UIProcess/UserMediaPermissionRequestProxy.cpp:44 > + , m_decisionCompletionHandler(WTFMove(decisionCompletionHandler))
We probably need to call this completion handler in UserMediaPermissionRequestManagerProxy::invalidatePendingRequests() or UserMediaPermissionRequestProxy::invalidate().
> Source/WebKit/UIProcess/WebPageProxy.cpp:10418 > +void WebPageProxy::requestUserMediaPermissionForSpeechRecognition(const WebCore::FrameIdentifier frameIdentifier, const WebCore::SecurityOrigin& requestingOrigin, const WebCore::SecurityOrigin& topOrigin, CompletionHandler<void(bool)>&& completionHandler)
const WebCore::FrameIdentifier?
Sihui Liu
Comment 15
2021-02-01 09:51:38 PST
Comment on
attachment 418799
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=418799&action=review
>> Source/WebCore/Modules/speech/SpeechRecognition.cpp:79 >> + auto frameIdentifier = optionalFrameIdentifier ? *optionalFrameIdentifier : FrameIdentifier { }; > > We probably do not need to check frame. > We could do: > auto frameID = document.frameID(); > if (!frameID) > return Exception { InvalidStateError, "Recognition is not in a valid frame"_s }; > ...
I found WebKitLegacy does not have frameIdentifier (WebFrameLoaderClient::frameID()). If we throw exception here, we will have different behavior in WebKit and WebKitLegacy and break some layout test.
>> Source/WebKit/UIProcess/SpeechRecognitionPermissionManager.cpp:78 >> +void SpeechRecognitionPermissionManager::request(const String& lang, const WebCore::ClientOrigin& origin, const WebCore::FrameIdentifier frameIdentifier, SpeechRecognitionPermissionRequestCallback&& completiontHandler) > > s/const WebCore::FrameIdentifier/WebCore::FrameIdentifier/ > Ditto below. > Or is it to make sure it stays const?
No, will remove const.
>> Source/WebKit/UIProcess/SpeechRecognitionPermissionRequest.h:65 >> + const WebCore::FrameIdentifier m_frameIdentifier; > > Ditto here.
Okay.
>> Source/WebKit/UIProcess/UserMediaPermissionRequestManagerProxy.cpp:244 >> + callback(true); > > We will miss device ID exposure which is done in finishGrantingRequest. > I guess we should fix this in a refactoring.
Not sure what is device ID exposure. I leave a comment in
https://bugs.webkit.org/show_bug.cgi?id=221213
to remind myself about this.
>> Source/WebKit/UIProcess/UserMediaPermissionRequestManagerProxy.cpp:616 >> +void UserMediaPermissionRequestManagerProxy::checkUserMediaPermissionForSpeechRecognition(const WebCore::FrameIdentifier frameIdentifier, const WebCore::SecurityOrigin& requestingOrigin, const WebCore::SecurityOrigin& topOrigin, const WebCore::CaptureDevice& device, CompletionHandler<void(bool)>&& completionHandler) > > ditto for const WebCore::FrameIdentifier
Okay.
>> Source/WebKit/UIProcess/UserMediaPermissionRequestManagerProxy.cpp:632 >> + } > > Can we replace wasRequestDenied and searchForGrantedRequest with getRequestAction(). > We will miss the ability of being able to reprompt user if speech recognition is made as part of a user gesture.
Sure.
>> Source/WebKit/UIProcess/UserMediaPermissionRequestProxy.cpp:44 >> + , m_decisionCompletionHandler(WTFMove(decisionCompletionHandler)) > > We probably need to call this completion handler in UserMediaPermissionRequestManagerProxy::invalidatePendingRequests() or UserMediaPermissionRequestProxy::invalidate().
Okay.
>> Source/WebKit/UIProcess/WebPageProxy.cpp:10418 >> +void WebPageProxy::requestUserMediaPermissionForSpeechRecognition(const WebCore::FrameIdentifier frameIdentifier, const WebCore::SecurityOrigin& requestingOrigin, const WebCore::SecurityOrigin& topOrigin, CompletionHandler<void(bool)>&& completionHandler) > > const WebCore::FrameIdentifier?
Sure.
Sihui Liu
Comment 16
2021-02-01 09:59:45 PST
Created
attachment 418890
[details]
Patch for landing
EWS
Comment 17
2021-02-01 12:30:50 PST
Committed
r272165
: <
https://trac.webkit.org/changeset/272165
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 418890
[details]
.
Simon Fraser (smfr)
Comment 18
2021-08-19 13:50:48 PDT
Comment on
attachment 418890
[details]
Patch for landing View in context:
https://bugs.webkit.org/attachment.cgi?id=418890&action=review
> Source/WebKit/UIProcess/UserMediaPermissionRequestManagerProxy.cpp:624 > + auto request = UserMediaPermissionRequestProxy::create(*this, 0, frameIdentifier, frameIdentifier, requestingOrigin.isolatedCopy(), topOrigin.isolatedCopy(), Vector<WebCore::CaptureDevice> { device }, { }, { }, WTFMove(completionHandler));
Why is it OK to pass 0 as the userRequestID here?
Sihui Liu
Comment 19
2021-08-19 14:18:21 PDT
(In reply to Simon Fraser (smfr) from
comment #18
)
> Comment on
attachment 418890
[details]
> Patch for landing > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=418890&action=review
> > > Source/WebKit/UIProcess/UserMediaPermissionRequestManagerProxy.cpp:624 > > + auto request = UserMediaPermissionRequestProxy::create(*this, 0, frameIdentifier, frameIdentifier, requestingOrigin.isolatedCopy(), topOrigin.isolatedCopy(), Vector<WebCore::CaptureDevice> { device }, { }, { }, WTFMove(completionHandler)); > > Why is it OK to pass 0 as the userRequestID here?
This does not correspond to a UserMediaPermissionRequest in web process. We create the RequestProxy only to check the media permission for speech.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug