Bug 219459 - Implement recognizer for SpeechRecognition
Summary: Implement recognizer for SpeechRecognition
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Accessibility
Version: WebKit Nightly Build
Hardware: Unspecified
OS: Unspecified
Importance: P2 Normal
Assignee: Sihui Liu
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-12-02 15:56 PST by Sihui Liu
Modified: 2020-12-14 09:48 PST
CC List: 3 users

See Also:


Attachments
Patch (28.66 KB, patch): 2020-12-07 11:09 PST, Sihui Liu, no flags
Patch (29.73 KB, patch): 2020-12-07 14:19 PST, Sihui Liu, ews-feeder: commit-queue-
Patch (29.73 KB, patch): 2020-12-07 14:28 PST, Sihui Liu, no flags
Patch (29.77 KB, patch): 2020-12-07 14:34 PST, Sihui Liu, no flags
Patch (37.22 KB, patch): 2020-12-08 00:54 PST, Sihui Liu, no flags
Patch (36.23 KB, patch): 2020-12-08 07:51 PST, Sihui Liu, ews-feeder: commit-queue-
Patch (36.24 KB, patch): 2020-12-08 08:06 PST, Sihui Liu, darin: review-
Patch (60.56 KB, patch): 2020-12-10 02:18 PST, Sihui Liu, no flags
Patch (62.09 KB, patch): 2020-12-10 08:13 PST, Sihui Liu, ews-feeder: commit-queue-
Patch (62.07 KB, patch): 2020-12-10 08:20 PST, Sihui Liu, ews-feeder: commit-queue-
Patch (62.39 KB, patch): 2020-12-10 09:05 PST, Sihui Liu, ews-feeder: commit-queue-
Patch (62.85 KB, patch): 2020-12-10 11:05 PST, Sihui Liu, no flags
Patch (63.86 KB, patch): 2020-12-10 14:57 PST, Sihui Liu, no flags
Patch (68.60 KB, patch): 2020-12-11 11:26 PST, Sihui Liu, ews-feeder: commit-queue-
Patch (68.31 KB, patch): 2020-12-11 12:07 PST, Sihui Liu, no flags
Patch (70.84 KB, patch): 2020-12-14 08:46 PST, Sihui Liu, no flags

Description Sihui Liu 2020-12-02 15:56:53 PST
...
Comment 1 Radar WebKit Bug Importer 2020-12-02 15:57:01 PST
<rdar://problem/71914465>
Comment 2 Sihui Liu 2020-12-07 11:09:43 PST
Created attachment 415566 [details]
Patch
Comment 3 Sihui Liu 2020-12-07 14:19:20 PST
Created attachment 415585 [details]
Patch
Comment 4 Sihui Liu 2020-12-07 14:28:29 PST
Created attachment 415586 [details]
Patch
Comment 5 Sihui Liu 2020-12-07 14:34:51 PST
Created attachment 415587 [details]
Patch
Comment 6 Sihui Liu 2020-12-08 00:54:34 PST
Created attachment 415617 [details]
Patch
Comment 7 Sihui Liu 2020-12-08 07:51:17 PST
Created attachment 415644 [details]
Patch
Comment 8 Sihui Liu 2020-12-08 08:06:28 PST
Created attachment 415646 [details]
Patch
Comment 9 Darin Adler 2020-12-08 11:36:35 PST
Comment on attachment 415646 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=415646&action=review

> Source/WebCore/Configurations/WebCore.xcconfig:128
> +WK_SPEECH_LDFLAGS = $(WK_SPEECH_LDFLAGS_$(WK_PLATFORM_NAME));
> +WK_SPEECH_LDFLAGS_macosx = -framework Speech;
> +WK_SPEECH_LDFLAGS_iphoneos = -framework Speech;
> +WK_SPEECH_LDFLAGS_iphonesimulator = -framework Speech;
> +WK_SPEECH_LDFLAGS_maccatalyst = -framework Speech;

Is this OK for the base system on macOS? Is the Speech framework included in that reduced system?

Does this affect launch time? Adding another framework that mostly isn’t used can slow down launch if that framework is not part of the shared cache, I believe.

> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:46
> +        auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::ServiceNotAllowed, "Failed to start recognition" };

Should use _s on the string constant here since it’s going to be used to make a WTF::String.
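For illustration, the suggested change (a minimal sketch using WTF's "_s" literal suffix):

    auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::ServiceNotAllowed, "Failed to start recognition"_s };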

> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:48
> +        auto update = WebCore::SpeechRecognitionUpdate::createError(identifier, error);
> +        invokeDelegateCallback(update);

Seems like this could be collapsed into a single line like the other similar cases below.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:89
> +        return nil;

This is missing a call to either "[self release]" or "[self dealloc]".
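For illustration, a minimal sketch of the fix being described (non-ARC; the enclosing init method and the failing check are assumed):

    if (!(self = [super init]))
        return nil;

    if (![_recognizer isAvailable]) {
        [self release]; // without this, the half-initialized object leaks under manual retain/release
        return nil;
    }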

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:91
> +    if (![_recognizer.get() isAvailable])

The "get()" here is not needed.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:92
> +        return nil;

This is missing a call to either "[self release]" or "[self dealloc]".

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:104
> +    [_request _setMaximumRecognitionDuration:60 * 60];

Where does this magic constant come from? Maybe a named constant at the top of the file would be better. We could also comment how we chose that value.
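For reference, the later revision adopts exactly this shape (see comment 23); the trailing comment here is illustrative:

    static constexpr size_t maximumRecognitionDuration = 60 * 60; // seconds; cap a single recognition task at one hour
    [_request _setMaximumRecognitionDuration:maximumRecognitionDuration];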

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:107
> +    _task = [_recognizer.get() recognitionTaskWithRequest:_request.get() delegate:self];

The first "get()" here isn't needed.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:119
> +        data.alternatives.append(WebCore::SpeechRecognitionAlternativeData { [[segment substring] UTF8String], [segment confidence] });

There’s no reason to call UTF8String here. We’re going to store this in a WTF::String, and it can do that directly from an NSString. Calling UTF8String makes the operation less efficient, and I don’t think it provides any value.
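For illustration, the suggested form, letting WTF::String convert from the NSString directly:

    data.alternatives.append(WebCore::SpeechRecognitionAlternativeData { [segment substring], [segment confidence] });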

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:124
> +            data.alternatives.append(WebCore::SpeechRecognitionAlternativeData { [segmentAlternative UTF8String], 0.0 });

There’s no reason to call UTF8String here. We’re going to store this in a WTF::String, and it can do that directly from an NSString. Calling UTF8String makes the operation less efficient, and I don’t think it provides any value.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:193
> +    // FIXME: This is never called.

This is a confusing comment. Why is this never called? Since we say FIXME, what should someone be fixing?

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:214
> +    // FIXME: This is never called.

Ditto.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:231
> +    auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::Aborted, [task.error.localizedDescription UTF8String] };

There’s no reason to call UTF8String here. We’re going to store this in a WTF::String, and it can do that directly from an NSString. Calling UTF8String makes the operation less efficient, and I don’t think it provides any value.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:233
> +    auto update = WebCore::SpeechRecognitionUpdate::createError(_identifier, error);
> +    _delegateCallback(update);

I think we could collapse these into a single line.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:285
> +    [m_task.get() audioSamplesAvailable:buffer.get()];

The first "get()" here isn’t needed.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:304
> +    [m_task.get() stop];

I don’t think this get() is needed.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:310
> +    [m_task.get() abort];

Ditto.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:319
> +    [task.get() abort];

Ditto.

> Source/WebKit/UIProcess/WebProcessProxy.cpp:1728
> +        if (!weakPage || !weakPage->preferences().mockCaptureDevicesEnabled())
> +            return false;
> +        return true;

I would have written this:

    return weakPage && weakPage->preferences().mockCaptureDevicesEnabled();
Comment 10 youenn fablet 2020-12-09 01:27:57 PST
Comment on attachment 415646 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=415646&action=review

> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:123
> +    invokeDelegateCallback(SpeechRecognitionUpdate::create(*m_clientIdentifier, SpeechRecognitionUpdateType::AudioEnd));

Are we sure m_clientIdentifier is never null here?
Maybe, as a clean-up patch, we should change the model to create a SpeechRecognizer for every new recognition task.
Then we would pass the client identifier to the SpeechRecognizer constructor and it would no longer be optional.
We could keep a separate boolean to track whether it has been stopped/aborted.
The recognizer could also be made mock or not at creation time.

> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:128
> +void SpeechRecognizer::dataCaptured(const WTF::MediaTime&, const PlatformAudioData&, const AudioStreamDescription&, size_t)

We can probably remove WTF:: here.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:114
> +    auto segments = [transcription segments];

Can we use reserveInitialCapacity?
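A minimal sketch of the suggestion, sizing the vector to the segment count (as the next revision does):

    auto segments = [transcription segments];
    Vector<WebCore::SpeechRecognitionResultData> datas;
    datas.reserveInitialCapacity(segments.count);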

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:133
> +    [_request appendAudioSampleBuffer:sampleBuffer];

We tend to push audio samples on background threads only.
If this method is only supposed to be called on the main thread, we could add an isMainThread ASSERT.
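A minimal sketch of the suggested assertion, using WTF's isMainThread():

    ASSERT(isMainThread());
    [_request appendAudioSampleBuffer:sampleBuffer];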

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:200
> +    [self callbackWithResult:transcription isFinal:NO];

Is the delegate guaranteed to be called on the main thread?

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:239
> +@interface WebSpeechRecognizerTaskMock : WebSpeechRecognizerTask

I would tend to move WebSpeechRecognizerTaskMock to its own file.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:290
> +    auto taskClass = mockCaptureDevicesEnabled ? [WebSpeechRecognizerTaskMock class] : [WebSpeechRecognizerTaskMock class];

In both cases we are creating a WebSpeechRecognizerTaskMock; the second one should probably be WebSpeechRecognizerTask.
Comment 11 Sihui Liu 2020-12-09 11:35:42 PST
Comment on attachment 415646 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=415646&action=review

>> Source/WebCore/Configurations/WebCore.xcconfig:128
>> +WK_SPEECH_LDFLAGS_maccatalyst = -framework Speech;
> 
> Is this OK for the base system on macOS? Is the Speech framework included in that reduced system?
> 
> Does this affect launch time? Adding another framework that mostly isn’t used can slow down launch if that framework is not part of the shared cache, I believe.

Thanks for pointing this out! The framework is not in the base system; we should add -weak.

It may affect launch time, but the framework seems too small to cause a noticeable change.

>> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:46
>> +        auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::ServiceNotAllowed, "Failed to start recognition" };
> 
> Should use _s on the string constant here since it’s going to be used to make a WTF::String.

Sure.

>> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:48
>> +        invokeDelegateCallback(update);
> 
> Seems like this could be collapsed into a single line like the other similar cases below.

Will collapse this.

>> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:123
>> +    invokeDelegateCallback(SpeechRecognitionUpdate::create(*m_clientIdentifier, SpeechRecognitionUpdateType::AudioEnd));
> 
> Are we sure m_clientIdentifier is never null here?
> Maybe, as a clean-up patch, we should change the model to create a SpeechRecognizer for every new recognition task.
> Then we would pass the client identifier to the SpeechRecognizer constructor and it would no longer be optional.
> We could keep a separate boolean to track whether it has been stopped/aborted.
> The recognizer could also be made mock or not at creation time.

Sounds good. Filed https://bugs.webkit.org/show_bug.cgi?id=219699.

>> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:128
>> +void SpeechRecognizer::dataCaptured(const WTF::MediaTime&, const PlatformAudioData&, const AudioStreamDescription&, size_t)
> 
> We can probably remove WTF:: here.

Sure.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:89
>> +        return nil;
> 
> This is missing a call to either "[self release]" or "[self dealloc]".

Ah, will add!

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:91
>> +    if (![_recognizer.get() isAvailable])
> 
> The "get()" here is not needed.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:92
>> +        return nil;
> 
> This is missing a call to either "[self release]" or "[self dealloc]".

Will add!

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:104
>> +    [_request _setMaximumRecognitionDuration:60 * 60];
> 
> Where does this magic constant come from? Maybe a named constant at the top of the file would be better. We could also comment how we chose that value.

Will add!

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:107
>> +    _task = [_recognizer.get() recognitionTaskWithRequest:_request.get() delegate:self];
> 
> The first "get()" here isn't needed.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:114
>> +    auto segments = [transcription segments];
> 
> Can we use reserveInitialCapacity?

Sure.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:119
>> +        data.alternatives.append(WebCore::SpeechRecognitionAlternativeData { [[segment substring] UTF8String], [segment confidence] });
> 
> There’s no reason to call UTF8String here. We’re going to store this in a WTF::String, and it can do that directly from an NSString. Calling UTF8String makes the operation less efficient, and I don’t think it provides any value.

Will remove this.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:124
>> +            data.alternatives.append(WebCore::SpeechRecognitionAlternativeData { [segmentAlternative UTF8String], 0.0 });
> 
> There’s no reason to call UTF8String here. We’re going to store this in a WTF::String, and it can do that directly from an NSString. Calling UTF8String makes the operation less efficient, and I don’t think it provides any value.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:133
>> +    [_request appendAudioSampleBuffer:sampleBuffer];
> 
> We tend to push audio samples on background threads only.
> If this method is only supposed to be called on the main thread, we could add an isMainThread ASSERT.

Sure.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:193
>> +    // FIXME: This is never called.
> 
> This is a confusing comment. Why is this never called? Since we say FIXME, what should someone be fixing?

Hmm, I left this comment to remind myself that this delegate callback never got called during my testing. Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:200
>> +    [self callbackWithResult:transcription isFinal:NO];
> 
> Is the delegate guaranteed to be called on the main thread?

Good question! Will check this in testing, along with all the other callbacks.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:214
>> +    // FIXME: This is never called.
> 
> Ditto.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:231
>> +    auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::Aborted, [task.error.localizedDescription UTF8String] };
> 
> There’s no reason to call UTF8String here. We’re going to store this in a WTF::String, and it can do that directly from an NSString. Calling UTF8String makes the operation less efficient, and I don’t think it provides any value.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:233
>> +    _delegateCallback(update);
> 
> I think we could collapse these into a single line.

Okay.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:239
>> +@interface WebSpeechRecognizerTaskMock : WebSpeechRecognizerTask
> 
> I would tend to move WebSpeechRecognizerTaskMock to its own file.

Will move.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:285
>> +    [m_task.get() audioSamplesAvailable:buffer.get()];
> 
> The first "get()" here isn’t needed.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:290
>> +    auto taskClass = mockCaptureDevicesEnabled ? [WebSpeechRecognizerTaskMock class] : [WebSpeechRecognizerTaskMock class];
> 
> In both cases we are creating a WebSpeechRecognizerTaskMock; the second one should probably be WebSpeechRecognizerTask.

Oops!

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:304
>> +    [m_task.get() stop];
> 
> I don’t think this get() is needed.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:310
>> +    [m_task.get() abort];
> 
> Ditto.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:319
>> +    [task.get() abort];
> 
> Ditto.

Will remove.

>> Source/WebKit/UIProcess/WebProcessProxy.cpp:1728
>> +        return true;
> 
> I would have written this:
> 
>     return weakPage && weakPage->preferences().mockCaptureDevicesEnabled();

Nice.
Comment 12 Darin Adler 2020-12-09 12:20:50 PST
(In reply to Sihui Liu from comment #11)
> It may affect launch time, but the framework seems too small to cause a
> noticeable change.

Please double-check this with the people who worked so hard to make web processes launch quickly, so we can measure after the fact and be sure.
Comment 13 Sihui Liu 2020-12-10 02:18:29 PST
Created attachment 415848 [details]
Patch
Comment 14 Sihui Liu 2020-12-10 08:13:16 PST
Created attachment 415877 [details]
Patch
Comment 15 Sihui Liu 2020-12-10 08:20:08 PST
Created attachment 415878 [details]
Patch
Comment 16 Sihui Liu 2020-12-10 09:05:11 PST
Created attachment 415881 [details]
Patch
Comment 17 Sihui Liu 2020-12-10 11:05:55 PST
Created attachment 415903 [details]
Patch
Comment 18 Sihui Liu 2020-12-10 14:57:29 PST
Created attachment 415936 [details]
Patch
Comment 19 youenn fablet 2020-12-11 01:09:09 PST
Comment on attachment 415936 [details]
Patch

This looks almost ready to me.
It would be good to get some more tests.

View in context: https://bugs.webkit.org/attachment.cgi?id=415936&action=review

> Source/WebCore/ChangeLog:11
> +        Manually tested in MiniBrowser.

If we beef up the mock implementation to return some fake recognized words, we could probably add some tests.

> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:69
> +void SpeechRecognizer::start(SpeechRecognitionConnectionClientIdentifier clientIdentifier, Ref<RealtimeMediaSource>&& source, bool mockCaptureDevicesEnabled, const String& localeIdentifier, bool continuous, bool interimResults, uint64_t maxAlternatives)

s/mockCaptureDevicesEnabled/mockSpeechRecognitionEnabled/

> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:77
> +        auto update = WebCore::SpeechRecognitionUpdate::createError(clientIdentifier, error);

Use WTFMove() here, or inline the error creation on this line.
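For illustration, the WTFMove() variant:

    auto update = WebCore::SpeechRecognitionUpdate::createError(clientIdentifier, WTFMove(error));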

> Source/WebCore/Modules/speech/cocoa/SpeechRecognizerCocoa.mm:45
> +bool SpeechRecognizer::startRecognition(bool mockCaptureDevicesEnabled, SpeechRecognitionConnectionClientIdentifier identifier, const String& localeIdentifier, bool continuous, bool interimResults, uint64_t alternatives)

s/mockCaptureDevicesEnabled/mockSpeechDevicesEnabled/

> Source/WebCore/Modules/speech/cocoa/SpeechRecognizerCocoa.mm:70
> +void SpeechRecognizer::resetRecognition()

Should we merge the abort and reset methods?
If we abort, do we want m_task to stay non-null?

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:119
> +    Vector<WebCore::SpeechRecognitionResultData> datas;

No need for WebCore:: prefix in this file

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:120
> +    datas.reserveInitialCapacity(_maxAlternatives <= segments.count ? _maxAlternatives : segments.count);

This should probably be: datas.reserveInitialCapacity(segments.count);

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:127
> +            if (data.alternatives.size() == _maxAlternatives)

This check is for data.alternatives, so you could potentially also do reserveInitialCapacity(_maxAlternatives <= segments.alternativeSubstrings.count...) on it.

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:132
> +        datas.append(WTFMove(data));

Could do uncheckedAppend here.
Or you could write it as: datas.append(SpeechRecognitionResultData { isFinal, WTFMove(alternatives) });

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:134
> +    _delegateCallback(WebCore::SpeechRecognitionUpdate::createResult(_identifier, datas));

Might need a WTFMove here.

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:215
> +    auto update = WebCore::SpeechRecognitionUpdate::createError(_identifier, error);

WTFMove()

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:253
> +    if (successfully) {

We usually do if (!successfully)

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:258
> +    auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::Aborted, task.error.localizedDescription};

s/}/ }/

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:259
> +    _delegateCallback(WebCore::SpeechRecognitionUpdate::createError(_identifier, error));

WTFMove

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTaskMock.mm:50
> +    }

Do we not want to also exercise the WebCore::SpeechRecognitionUpdate::createResult code path?
Can we do something like adding a timer that outputs some bip bop words regularly, or as a one-shot?
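A hypothetical sketch of such a mock, reusing names visible elsewhere in the patch (the method shape, field names, and "bip bop" payload are illustrative, not the actual implementation):

    - (void)audioSamplesAvailable:(CMSampleBufferRef)sampleBuffer
    {
        UNUSED_PARAM(sampleBuffer);
        // Fabricate one final result per received buffer so the createResult path is exercised.
        WebCore::SpeechRecognitionResultData data;
        data.isFinal = true;
        data.alternatives.append(WebCore::SpeechRecognitionAlternativeData { "bip bop"_s, 1.0 });
        Vector<WebCore::SpeechRecognitionResultData> datas;
        datas.append(WTFMove(data));
        _delegateCallback(WebCore::SpeechRecognitionUpdate::createResult(_identifier, WTFMove(datas)));
    }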

> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:48
> +    , m_checkIfMockCaptureDevicesEnabled(WTFMove(checkIfEnabled))

I would rename it to checkIfMockSpeechEnabled.

> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:107
> +            m_recognizer->reset();

Are we using reset elsewhere?
If not, can we implement reset so that it does not send any update?
That way, we could remove m_isResetting.

> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:116
> +        m_recognizer->reset();

If we call reset, isn't there a chance that the callback above will send an update message when it should not?
reset no longer explicitly sends a message, so maybe we no longer need m_isResetting.

> Source/WebKit/UIProcess/WebProcessProxy.cpp:1729
> +        return !weakPage || weakPage->preferences().mockCaptureDevicesEnabled();

In theory it should be return weakPage && weakPage->preferences().mockCaptureDevicesEnabled();
I would name it checkIfMockSpeechDevicesEnabled.
Comment 20 Sihui Liu 2020-12-11 11:26:40 PST
Created attachment 416017 [details]
Patch
Comment 21 Sihui Liu 2020-12-11 12:07:16 PST
Created attachment 416028 [details]
Patch
Comment 22 Sihui Liu 2020-12-11 12:11:19 PST
Comment on attachment 415936 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=415936&action=review

>> Source/WebCore/Modules/speech/cocoa/SpeechRecognizerCocoa.mm:70
>> +void SpeechRecognizer::resetRecognition()
> 
> Should we merge the abort and reset methods?
> If we abort, do we want m_task to stay non-null?

abort() means abort the task but keep listening for async updates (error or end) and wait for the task to complete naturally.
reset() means abort the task and stop listening for any async updates by setting the task to null.
This is a bit confusing and could be simplified.
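An illustrative sketch of the distinction (method names from the discussion; bodies reconstructed from the description above, not the actual code):

    void SpeechRecognizer::abort()
    {
        // Abort the task but keep it, so the final error/end updates still arrive.
        [m_task abort];
    }

    void SpeechRecognizer::reset()
    {
        // Abort the task and drop it, so no further async updates are delivered.
        [m_task abort];
        m_task = nullptr;
    }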

>> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:119
>> +    Vector<WebCore::SpeechRecognitionResultData> datas;
> 
> No need for WebCore:: prefix in this file

Added "using namespace WebCore".

>> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTaskMock.mm:50
>> +    }
> 
> Do we not want to also exercise the WebCore::SpeechRecognitionUpdate::createResult code path?
> Can we do something like adding a timer that outputs some bip bop words regularly, or as a one-shot?

Added one fake final result for each received buffer.

>> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:107
>> +            m_recognizer->reset();
> 
> Are we using reset elsewhere?
> If not, can we implement reset so that it does not send any update?
> That way, we could remove m_isResetting.

reset() is called after an error/end update from the recognizer delegate callback (from audio capture/recognition), or after an error update when the server starts another request.

>> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:116
>> +        m_recognizer->reset();
> 
> If we call reset, isn't there a chance that the callback above will send an update message when it should not?
> reset no longer explicitly sends a message, so maybe we no longer need m_isResetting.

Yes; basically reset() means we stop waiting for async updates but still allow sending synchronous update messages.
So this event ordering is possible for a recognition: Error -> AudioEnd -> SpeechEnd -> End*
Comment 23 youenn fablet 2020-12-14 01:09:42 PST
Comment on attachment 416028 [details]
Patch

LGTM.
As a side question, do we have enough existing tests for abort and stop cases?

View in context: https://bugs.webkit.org/attachment.cgi?id=416028&action=review

> Source/WebCore/Modules/speech/cocoa/SpeechRecognizerCocoa.mm:48
> +    m_task = adoptNS([[taskClass alloc] initWithIdentifier:identifier locale:localeIdentifier doMultipleRecognitions:continuous reportInterimResults:interimResults maxAlternatives:alternatives delegateCallback:[this, weakThis = makeWeakPtr(this)](const SpeechRecognitionUpdate& update) {

auto& update

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:38
> +static constexpr size_t maximumRecognitionDuration = 60 * 60;

We should probably think about our strategy for a page that is doing speech recognition and goes into the background on iOS.
It seems to me it might be best to mute the audio capture or maybe stop the recognition.

There is also the case of a page that is doing recognition and another page starts audio capture.
Except on iPadOS, the above strategy would also handle this case.

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:85
> +    _maxAlternatives = alternatives;

Can be done as a one-liner:
_maxAlternatives = alternatives ? alternatives : 1;

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:136
> +    _delegateCallback(SpeechRecognitionUpdate::createResult(_identifier, datas));

WTFMove(datas)

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:214
> +        return;

Can be combined with the if above.

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTaskMock.mm:64
> +    _delegateCallback(SpeechRecognitionUpdate::createResult(_identifier, datas));

WTFMove()
Comment 24 Sihui Liu 2020-12-14 08:46:23 PST
Created attachment 416163 [details]
Patch
Comment 25 Sihui Liu 2020-12-14 09:30:00 PST
(In reply to youenn fablet from comment #23)
> Comment on attachment 416028 [details]
> Patch
> 
> LGTM.
> As a side question, do we have enough existing tests for abort and stop
> cases?

Added another test for restarting the same recognition after stop.
The two added tests should be the most basic cases for stop().
abort() is no different from stop() in our current test setup. We could probably make WebSpeechRecognizerTaskMock behave more like SFSpeechRecognizer, e.g. by calling the callback asynchronously.

> > Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:38
> > +static constexpr size_t maximumRecognitionDuration = 60 * 60;
> 
> We should probably think about our strategy for a page that is doing speech
> recognition and goes into the background on iOS.
> It seems to me it might be best to mute the audio capture or maybe stop the
> recognition.
> 
> There is also the case of a page that is doing recognition and another page
> starts audio capture.
> Except on iPadOS, the above strategy would also handle this case.

Ah, good point. We should abort recognition when the page becomes invisible (its process is backgrounded) or when audio capture is about to start in a different web page (or in a different process; it seems CoreAudioCaptureSource::startProducingData() can set the active source and mute inactive ones within the same process).

Comment 26 EWS 2020-12-14 09:48:12 PST
Committed r270772: <https://trac.webkit.org/changeset/270772>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 416163 [details].