WebKit Bugzilla
RESOLVED FIXED
219459
Implement recognizer for SpeechRecognition
https://bugs.webkit.org/show_bug.cgi?id=219459
Summary
Implement recognizer for SpeechRecognition
Sihui Liu
Reported
2020-12-02 15:56:53 PST
...
Attachments
Patch (28.66 KB, patch), 2020-12-07 11:09 PST, Sihui Liu, no flags
Patch (29.73 KB, patch), 2020-12-07 14:19 PST, Sihui Liu, ews-feeder: commit-queue-
Patch (29.73 KB, patch), 2020-12-07 14:28 PST, Sihui Liu, no flags
Patch (29.77 KB, patch), 2020-12-07 14:34 PST, Sihui Liu, no flags
Patch (37.22 KB, patch), 2020-12-08 00:54 PST, Sihui Liu, no flags
Patch (36.23 KB, patch), 2020-12-08 07:51 PST, Sihui Liu, ews-feeder: commit-queue-
Patch (36.24 KB, patch), 2020-12-08 08:06 PST, Sihui Liu, darin: review-
Patch (60.56 KB, patch), 2020-12-10 02:18 PST, Sihui Liu, no flags
Patch (62.09 KB, patch), 2020-12-10 08:13 PST, Sihui Liu, ews-feeder: commit-queue-
Patch (62.07 KB, patch), 2020-12-10 08:20 PST, Sihui Liu, ews-feeder: commit-queue-
Patch (62.39 KB, patch), 2020-12-10 09:05 PST, Sihui Liu, ews-feeder: commit-queue-
Patch (62.85 KB, patch), 2020-12-10 11:05 PST, Sihui Liu, no flags
Patch (63.86 KB, patch), 2020-12-10 14:57 PST, Sihui Liu, no flags
Patch (68.60 KB, patch), 2020-12-11 11:26 PST, Sihui Liu, ews-feeder: commit-queue-
Patch (68.31 KB, patch), 2020-12-11 12:07 PST, Sihui Liu, no flags
Patch (70.84 KB, patch), 2020-12-14 08:46 PST, Sihui Liu, no flags
Radar WebKit Bug Importer
Comment 1
2020-12-02 15:57:01 PST
<rdar://problem/71914465>
Sihui Liu
Comment 2
2020-12-07 11:09:43 PST
Created
attachment 415566
[details]
Patch
Sihui Liu
Comment 3
2020-12-07 14:19:20 PST
Created
attachment 415585
[details]
Patch
Sihui Liu
Comment 4
2020-12-07 14:28:29 PST
Created
attachment 415586
[details]
Patch
Sihui Liu
Comment 5
2020-12-07 14:34:51 PST
Created
attachment 415587
[details]
Patch
Sihui Liu
Comment 6
2020-12-08 00:54:34 PST
Created
attachment 415617
[details]
Patch
Sihui Liu
Comment 7
2020-12-08 07:51:17 PST
Created
attachment 415644
[details]
Patch
Sihui Liu
Comment 8
2020-12-08 08:06:28 PST
Created
attachment 415646
[details]
Patch
Darin Adler
Comment 9
2020-12-08 11:36:35 PST
Comment on
attachment 415646
[details]
Patch

View in context:
https://bugs.webkit.org/attachment.cgi?id=415646&action=review
> Source/WebCore/Configurations/WebCore.xcconfig:128
> +WK_SPEECH_LDFLAGS = $(WK_SPEECH_LDFLAGS_$(WK_PLATFORM_NAME));
> +WK_SPEECH_LDFLAGS_macosx = -framework Speech;
> +WK_SPEECH_LDFLAGS_iphoneos = -framework Speech;
> +WK_SPEECH_LDFLAGS_iphonesimulator = -framework Speech;
> +WK_SPEECH_LDFLAGS_maccatalyst = -framework Speech;

Is this OK for the base system on macOS? Is the Speech framework included in that reduced system? Does this affect launch time? Adding another framework that mostly isn’t used can slow down launch if that framework is not part of the shared cache, I believe.

> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:46
> + auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::ServiceNotAllowed, "Failed to start recognition" };

Should use _s on the string constant here since it’s going to be used to make a WTF::String.

> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:48
> + auto update = WebCore::SpeechRecognitionUpdate::createError(identifier, error);
> + invokeDelegateCallback(update);

Seems like this could be collapsed into a single line like the other similar cases below.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:89
> + return nil;

This is missing a call to either "[self release]" or "[self dealloc]".

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:91
> + if (![_recognizer.get() isAvailable])

The "get()" here is not needed.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:92
> + return nil;

This is missing a call to either "[self release]" or "[self dealloc]".

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:104
> + [_request _setMaximumRecognitionDuration:60 * 60];

Where does this magic constant come from? Maybe a named constant at the top of the file would be better. We could also comment on how we chose that value.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:107
> + _task = [_recognizer.get() recognitionTaskWithRequest:_request.get() delegate:self];

The first "get()" here isn't needed.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:119
> + data.alternatives.append(WebCore::SpeechRecognitionAlternativeData { [[segment substring] UTF8String], [segment confidence] });

There’s no reason to call UTF8String here. We’re going to store this in a WTF::String, and it can do that with an NSString. Calling UTF8String makes the operation less efficient, and I don’t think it provides any value.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:124
> + data.alternatives.append(WebCore::SpeechRecognitionAlternativeData { [segmentAlternative UTF8String], 0.0 });

There’s no reason to call UTF8String here. We’re going to store this in a WTF::String, and it can do that with an NSString. Calling UTF8String makes the operation less efficient, and I don’t think it provides any value.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:193
> + // FIXME: This is never called.

This is a confusing comment. Why is this never called? When we say FIXME, what should someone be fixing?

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:214
> + // FIXME: This is never called.

Ditto.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:231
> + auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::Aborted, [task.error.localizedDescription UTF8String] };

There’s no reason to call UTF8String here. We’re going to store this in a WTF::String, and it can do that with an NSString. Calling UTF8String makes the operation less efficient, and I don’t think it provides any value.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:233
> + auto update = WebCore::SpeechRecognitionUpdate::createError(_identifier, error);
> + _delegateCallback(update);

I think we could collapse these into a single line.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:285
> + [m_task.get() audioSamplesAvailable:buffer.get()];

The first "get()" here isn’t needed.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:304
> + [m_task.get() stop];

I don’t think this get() is needed.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:310
> + [m_task.get() abort];

Ditto.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:319
> + [task.get() abort];

Ditto.

> Source/WebKit/UIProcess/WebProcessProxy.cpp:1728
> + if (!weakPage || !weakPage->preferences().mockCaptureDevicesEnabled())
> +     return false;
> + return true;

I would have written this:

return weakPage && weakPage->preferences().mockCaptureDevicesEnabled();
youenn fablet
Comment 10
2020-12-09 01:27:57 PST
Comment on
attachment 415646
[details]
Patch

View in context:
https://bugs.webkit.org/attachment.cgi?id=415646&action=review
> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:123
> + invokeDelegateCallback(SpeechRecognitionUpdate::create(*m_clientIdentifier, SpeechRecognitionUpdateType::AudioEnd));

Are we sure m_clientIdentifier is always not null here? Maybe we should change the model in a clean-up patch by creating a SpeechRecognizer for every new recognition task. Then we would pass the client identifier as part of the SpeechRecognizer constructor and it would no longer be optional. We could keep another boolean internal state to track whether we have been stopped/aborted. The recognizer could also become mock or not mock at creation time as well.

> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:128
> +void SpeechRecognizer::dataCaptured(const WTF::MediaTime&, const PlatformAudioData&, const AudioStreamDescription&, size_t)

We can probably remove WTF:: here.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:114
> + auto segments = [transcription segments];

Can we use reserveInitialCapacity?

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:133
> + [_request appendAudioSampleBuffer:sampleBuffer];

We tend to push audio samples on background threads only. If this method is only supposed to be called on the main thread, we could add an isMainThread ASSERT.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:200
> + [self callbackWithResult:transcription isFinal:NO];

Is the delegate guaranteed to be called on the main thread?

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:239
> +@interface WebSpeechRecognizerTaskMock : WebSpeechRecognizerTask

I would tend to move WebSpeechRecognizerTaskMock to its own file.

> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:290
> + auto taskClass = mockCaptureDevicesEnabled ? [WebSpeechRecognizerTaskMock class] : [WebSpeechRecognizerTaskMock class];

In both cases we are creating a WebSpeechRecognizerTaskMock; the second one should probably be WebSpeechRecognizerTask.
Sihui Liu
Comment 11
2020-12-09 11:35:42 PST
Comment on
attachment 415646
[details]
Patch

View in context:
https://bugs.webkit.org/attachment.cgi?id=415646&action=review
>> Source/WebCore/Configurations/WebCore.xcconfig:128
>> +WK_SPEECH_LDFLAGS_maccatalyst = -framework Speech;
>
> Is this OK for the base system on macOS? Is the Speech framework included in that reduced system?
>
> Does this affect launch time? Adding another framework that mostly isn’t used can slow down launch if that framework is not part of the shared cache, I believe.
Thanks for pointing this out! The framework is not in the base system, so we should add -weak. It may affect launch time, but the framework seems too small to cause a noticeable change.
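Weak linking lets the web process launch even on systems where the framework is absent. The xcconfig change being discussed would plausibly look something like this (a sketch based on the discussion, not the landed patch):

```
WK_SPEECH_LDFLAGS = $(WK_SPEECH_LDFLAGS_$(WK_PLATFORM_NAME));
WK_SPEECH_LDFLAGS_macosx = -weak_framework Speech;
WK_SPEECH_LDFLAGS_iphoneos = -weak_framework Speech;
WK_SPEECH_LDFLAGS_iphonesimulator = -weak_framework Speech;
WK_SPEECH_LDFLAGS_maccatalyst = -weak_framework Speech;
```

With -weak_framework, symbols resolve to null at runtime when the framework is missing, so calling code must check availability (for example, that the SFSpeechRecognizer class is non-nil) before use.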
>> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:46
>> + auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::ServiceNotAllowed, "Failed to start recognition" };
>
> Should use _s on the string constant here since it’s going to be used to make a WTF::String.

Sure.

>> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:48
>> + invokeDelegateCallback(update);
>
> Seems like this could be collapsed into a single line like the other similar cases below.

Will collapse this.

>> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:123
>> + invokeDelegateCallback(SpeechRecognitionUpdate::create(*m_clientIdentifier, SpeechRecognitionUpdateType::AudioEnd));
>
> Are we sure m_clientIdentifier is always not null here?
> Maybe we should change the model in a clean-up patch by creating a SpeechRecognizer for every new recognition task.
> Then we would pass the client identifier as part of the SpeechRecognizer constructor and it would no longer be optional.
> We could keep another boolean internal state to track whether we have been stopped/aborted.
> The recognizer could also become mock or not mock at creation time as well.

Sounds good. Filed https://bugs.webkit.org/show_bug.cgi?id=219699.

>> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:128
>> +void SpeechRecognizer::dataCaptured(const WTF::MediaTime&, const PlatformAudioData&, const AudioStreamDescription&, size_t)
>
> We can probably remove WTF:: here.

Sure.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:89
>> + return nil;
>
> This is missing a call to either "[self release]" or "[self dealloc]".

Ah, will add!

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:91
>> + if (![_recognizer.get() isAvailable])
>
> The "get()" here is not needed.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:92
>> + return nil;
>
> This is missing a call to either "[self release]" or "[self dealloc]".

Will add!

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:104
>> + [_request _setMaximumRecognitionDuration:60 * 60];
>
> Where does this magic constant come from? Maybe a named constant at the top of the file would be better. We could also comment on how we chose that value.

Will add!

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:107
>> + _task = [_recognizer.get() recognitionTaskWithRequest:_request.get() delegate:self];
>
> The first "get()" here isn't needed.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:114
>> + auto segments = [transcription segments];
>
> Can we use reserveInitialCapacity?

Sure.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:119
>> + data.alternatives.append(WebCore::SpeechRecognitionAlternativeData { [[segment substring] UTF8String], [segment confidence] });
>
> There’s no reason to call UTF8String here. We’re going to store this in a WTF::String, and it can do that with an NSString. Calling UTF8String makes the operation less efficient, and I don’t think it provides any value.

Will remove this.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:124
>> + data.alternatives.append(WebCore::SpeechRecognitionAlternativeData { [segmentAlternative UTF8String], 0.0 });
>
> There’s no reason to call UTF8String here.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:133
>> + [_request appendAudioSampleBuffer:sampleBuffer];
>
> We tend to push audio samples on background threads only.
> If this method is only supposed to be called on the main thread, we could add an isMainThread ASSERT.

Sure.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:193
>> + // FIXME: This is never called.
>
> This is a confusing comment. Why is this never called? When we say FIXME, what should someone be fixing?

Hmm, I left this comment to remind myself that this delegate callback never got called during my testing. Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:200
>> + [self callbackWithResult:transcription isFinal:NO];
>
> Is the delegate guaranteed to be called on the main thread?

Good question! Will check this in testing, along with all the other callbacks.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:214
>> + // FIXME: This is never called.
>
> Ditto.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:231
>> + auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::Aborted, [task.error.localizedDescription UTF8String] };
>
> There’s no reason to call UTF8String here.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:233
>> + _delegateCallback(update);
>
> I think we could collapse these into a single line.

Okay.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:239
>> +@interface WebSpeechRecognizerTaskMock : WebSpeechRecognizerTask
>
> I would tend to move WebSpeechRecognizerTaskMock to its own file.

Will move.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:285
>> + [m_task.get() audioSamplesAvailable:buffer.get()];
>
> The first "get()" here isn’t needed.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:290
>> + auto taskClass = mockCaptureDevicesEnabled ? [WebSpeechRecognizerTaskMock class] : [WebSpeechRecognizerTaskMock class];
>
> In both cases we are creating a WebSpeechRecognizerTaskMock; the second one should probably be WebSpeechRecognizerTask.

Oops!

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:304
>> + [m_task.get() stop];
>
> I don’t think this get() is needed.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:310
>> + [m_task.get() abort];
>
> Ditto.

Will remove.

>> Source/WebCore/Modules/speech/SpeechRecognizerCocoa.mm:319
>> + [task.get() abort];
>
> Ditto.

Will remove.

>> Source/WebKit/UIProcess/WebProcessProxy.cpp:1728
>> + return true;
>
> I would have written this:
>
> return weakPage && weakPage->preferences().mockCaptureDevicesEnabled();

Nice.
Darin Adler
Comment 12
2020-12-09 12:20:50 PST
(In reply to Sihui Liu from comment #11)
> It may affect launch time, but it seems the framework is too small to cause noticeable change.

Please double-check this with the people who worked so hard to make web processes launch quickly, so they can make sure we measure after the fact.
Sihui Liu
Comment 13
2020-12-10 02:18:29 PST
Created
attachment 415848
[details]
Patch
Sihui Liu
Comment 14
2020-12-10 08:13:16 PST
Created
attachment 415877
[details]
Patch
Sihui Liu
Comment 15
2020-12-10 08:20:08 PST
Created
attachment 415878
[details]
Patch
Sihui Liu
Comment 16
2020-12-10 09:05:11 PST
Created
attachment 415881
[details]
Patch
Sihui Liu
Comment 17
2020-12-10 11:05:55 PST
Created
attachment 415903
[details]
Patch
Sihui Liu
Comment 18
2020-12-10 14:57:29 PST
Created
attachment 415936
[details]
Patch
youenn fablet
Comment 19
2020-12-11 01:09:09 PST
Comment on
attachment 415936
[details]
Patch

This looks almost ready to me. It would be good to get some more tests.

View in context:
https://bugs.webkit.org/attachment.cgi?id=415936&action=review
> Source/WebCore/ChangeLog:11
> + Manually tested in MiniBrowser.

If we beef up the mock implementation to return some fake recognized words, we could probably add some tests.

> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:69
> +void SpeechRecognizer::start(SpeechRecognitionConnectionClientIdentifier clientIdentifier, Ref<RealtimeMediaSource>&& source, bool mockCaptureDevicesEnabled, const String& localeIdentifier, bool continuous, bool interimResults, uint64_t maxAlternatives)

s/mockCaptureDevicesEnabled/mockSpeechRecognitionEnabled/

> Source/WebCore/Modules/speech/SpeechRecognizer.cpp:77
> + auto update = WebCore::SpeechRecognitionUpdate::createError(clientIdentifier, error);

WTFMove(), or inline the error creation on this line.

> Source/WebCore/Modules/speech/cocoa/SpeechRecognizerCocoa.mm:45
> +bool SpeechRecognizer::startRecognition(bool mockCaptureDevicesEnabled, SpeechRecognitionConnectionClientIdentifier identifier, const String& localeIdentifier, bool continuous, bool interimResults, uint64_t alternatives)

s/mockCaptureDevicesEnabled/mockSpeechDevicesEnabled/

> Source/WebCore/Modules/speech/cocoa/SpeechRecognizerCocoa.mm:70
> +void SpeechRecognizer::resetRecognition()

Should we merge the abort and reset methods? If we abort, do we want m_task to not stay null?

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:119
> + Vector<WebCore::SpeechRecognitionResultData> datas;

No need for the WebCore:: prefix in this file.

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:120
> + datas.reserveInitialCapacity(_maxAlternatives <= segments.count ? _maxAlternatives : segments.count);

This should probably be: datas.reserveInitialCapacity(segments.count);

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:127
> + if (data.alternatives.size() == _maxAlternatives)

This is for data.alternatives: you could potentially try doing reserveInitialCapacity(_maxAlternatives <= segments.alternativeSubstrings.count ...);

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:132
> + datas.append(WTFMove(data));

Could do uncheckedAppend here. Or you could write it as: datas.append(SpeechRecognitionResultData { isFinal, WTFMove(alternatives) });
> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:134
> + _delegateCallback(WebCore::SpeechRecognitionUpdate::createResult(_identifier, datas));

Might need a WTFMove here.

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:215
> + auto update = WebCore::SpeechRecognitionUpdate::createError(_identifier, error);

WTFMove()

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:253
> + if (successfully) {

We usually do if (!successfully).

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:258
> + auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::Aborted, task.error.localizedDescription};

s/}/ }/

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:259
> + _delegateCallback(WebCore::SpeechRecognitionUpdate::createError(_identifier, error));

WTFMove

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTaskMock.mm:50
> + }

Do we not want to also exercise the WebCore::SpeechRecognitionUpdate::createResult code path? Can we do something like adding a timer that will output some bip bop words regularly, or as a one-shot?

> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:48
> + , m_checkIfMockCaptureDevicesEnabled(WTFMove(checkIfEnabled))

I would rename it to checkIfMockSpeechEnabled.

> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:107
> + m_recognizer->reset();

Are we using reset elsewhere? If not, can we implement reset so that it does not send any update? That way, we could remove m_isResetting.

> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:116
> + m_recognizer->reset();

If we call reset, isn't there a chance that we will call its callback above, which will send an update message when it should not? reset no longer explicitly sends a message, so maybe we no longer need m_isResetting.

> Source/WebKit/UIProcess/WebProcessProxy.cpp:1729
> + return !weakPage || weakPage->preferences().mockCaptureDevicesEnabled();

In theory it should be: return weakPage && weakPage->preferences().mockCaptureDevicesEnabled();
I would name it checkIfMockSpeechDevicesEnabled.
Sihui Liu
Comment 20
2020-12-11 11:26:40 PST
Created
attachment 416017
[details]
Patch
Sihui Liu
Comment 21
2020-12-11 12:07:16 PST
Created
attachment 416028
[details]
Patch
Sihui Liu
Comment 22
2020-12-11 12:11:19 PST
Comment on
attachment 415936
[details]
Patch

View in context:
https://bugs.webkit.org/attachment.cgi?id=415936&action=review
>> Source/WebCore/Modules/speech/cocoa/SpeechRecognizerCocoa.mm:70
>> +void SpeechRecognizer::resetRecognition()
>
> Should we merge abort and reset methods?
> If we abort, do we want m_task to not stay null?

abort() means abort the task but still listen for async updates (error or end) and wait for the task to complete naturally. reset() means abort the task and stop listening for any async updates by setting the task to null. This is a bit confusing and could be simplified.

>> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:119
>> + Vector<WebCore::SpeechRecognitionResultData> datas;
>
> No need for WebCore:: prefix in this file

Added "using namespace WebCore".

>> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTaskMock.mm:50
>> + }
>
> Do we not want to also exercise the WebCore::SpeechRecognitionUpdate::createResult code path?
> Can we do something like adding a timer that will output some bip bop words regularly or as a one shot?

Added one fake final result for each received buffer.

>> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:107
>> + m_recognizer->reset();
>
> Are we using reset elsewhere?
> If not, can we implement reset so that it does not send any update?
> That way, we could remove m_isResetting.

reset() is called after an error/end update from the recognizer delegate callback (from audio capture/recognition), or after an error update from the server starting another request.

>> Source/WebKit/UIProcess/SpeechRecognitionServer.cpp:116
>> + m_recognizer->reset();
>
> If we call reset, isn't there a chance that we will call its callback above that will send update message while it should not?
> reset is no longer explicitly sending a message, so maybe we no longer need m_isResetting.

Yes; basically reset() means not waiting for async updates while still allowing sync update messages to be sent. So this event ordering is possible for some recognitions: Error -> AudioEnd -> SpeechEnd -> End
youenn fablet
Comment 23
2020-12-14 01:09:42 PST
Comment on
attachment 416028
[details]
Patch

LGTM. As a side question, do we have enough existing tests for the abort and stop cases?

View in context:
https://bugs.webkit.org/attachment.cgi?id=416028&action=review
> Source/WebCore/Modules/speech/cocoa/SpeechRecognizerCocoa.mm:48
> + m_task = adoptNS([[taskClass alloc] initWithIdentifier:identifier locale:localeIdentifier doMultipleRecognitions:continuous reportInterimResults:interimResults maxAlternatives:alternatives delegateCallback:[this, weakThis = makeWeakPtr(this)](const SpeechRecognitionUpdate& update) {

auto& update

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:38
> +static constexpr size_t maximumRecognitionDuration = 60 * 60;

We should probably think about our strategy for a page that is doing speech recognition and goes into the background on iOS. It seems to me it might be best to mute the audio capture or maybe stop the recognition. There is also the case of a page that is doing recognition while another page starts audio capture. Except on iPadOS, the above strategy would also handle this case.

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:85
> + _maxAlternatives = alternatives;

Can be done as a one-liner: _maxAlternatives = alternatives ? alternatives : 1
> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:136
> + _delegateCallback(SpeechRecognitionUpdate::createResult(_identifier, datas));

WTFMove(data)

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:214
> + return;

Can be added with the above if.

> Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTaskMock.mm:64
> + _delegateCallback(SpeechRecognitionUpdate::createResult(_identifier, datas));

WTFMove()
Sihui Liu
Comment 24
2020-12-14 08:46:23 PST
Created
attachment 416163
[details]
Patch
Sihui Liu
Comment 25
2020-12-14 09:30:00 PST
(In reply to youenn fablet from comment #23)
> Comment on attachment 416028 [details]
> Patch
>
> LGTM.
> As a side question, do we have enough existing tests for abort and stop cases?

Added another test for restarting the same recognition after stop. The two added tests should cover the most basic cases for stop(). abort() is no different from stop() in our current test setup. We can probably make WebSpeechRecognizerTaskMock behave more like SFSpeechRecognizer, e.g. by calling callbacks asynchronously.

> View in context:
> https://bugs.webkit.org/attachment.cgi?id=416028&action=review
>
> > Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm:38
> > +static constexpr size_t maximumRecognitionDuration = 60 * 60;
>
> We should probably think about our strategy for a page that is doing speech recognition and goes into the background on iOS.
> It seems to me it might be best to mute the audio capture or maybe stop the recognition.
>
> There is also the case of a page that is doing recognition and another page starts audio capture.
> Except on iPadOS, the above strategy would also handle this case.

Ah, good point. We should abort recognition when the page becomes invisible (i.e. the process is backgrounded) or when audio capture is about to start in a different web page (or in a different process, as it seems CoreAudioCaptureSource::startProducingData() can set the active source and mute inactive ones in the same process).
EWS
Comment 26
2020-12-14 09:48:12 PST
Committed r270772: <https://trac.webkit.org/changeset/270772>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 416163 [details].