RESOLVED FIXED 195645
[macOS] Broker access to Speech Synthesis
https://bugs.webkit.org/show_bug.cgi?id=195645
Per Arne Vollan
Reported 2019-03-12 15:21:58 PDT
Speech synthesis should be performed in the UI process.
Attachments
Patch (52.20 KB, patch)
2019-03-12 15:53 PDT, Per Arne Vollan
no flags
Patch (52.46 KB, patch)
2019-03-12 16:32 PDT, Per Arne Vollan
no flags
Patch (55.24 KB, patch)
2019-03-12 16:50 PDT, Per Arne Vollan
no flags
Patch (55.92 KB, patch)
2019-03-12 17:14 PDT, Per Arne Vollan
no flags
Archive of layout-test-results from ews106 for mac-highsierra-wk2 (3.00 MB, application/zip)
2019-03-12 18:43 PDT, EWS Watchlist
no flags
Patch (56.09 KB, patch)
2019-03-12 20:00 PDT, Per Arne Vollan
no flags
Patch (58.22 KB, patch)
2019-03-13 19:15 PDT, Per Arne Vollan
no flags
Patch (58.75 KB, patch)
2019-03-13 20:46 PDT, Per Arne Vollan
no flags
Patch (59.01 KB, patch)
2019-03-14 16:54 PDT, Per Arne Vollan
no flags
Per Arne Vollan
Comment 1 2019-03-12 15:22:20 PDT
Per Arne Vollan
Comment 2 2019-03-12 15:53:22 PDT
Per Arne Vollan
Comment 3 2019-03-12 16:32:40 PDT
Per Arne Vollan
Comment 4 2019-03-12 16:50:33 PDT
Per Arne Vollan
Comment 5 2019-03-12 17:14:41 PDT
EWS Watchlist
Comment 6 2019-03-12 18:43:34 PDT
Comment on attachment 364479 [details]
Patch

Attachment 364479 [details] did not pass mac-wk2-ews (mac-wk2):
Output: https://webkit-queues.webkit.org/results/11483397

New failing tests:
fast/speechsynthesis/speech-synthesis-elapsed-time.html
fast/speechsynthesis/speech-synthesis-boundary-events.html
fast/speechsynthesis/speech-synthesis-gc-utterance-crash.html
fast/speechsynthesis/speech-synthesis-voices.html
fast/speechsynthesis/speech-synthesis-cancel-crash.html
fast/speechsynthesis/speech-synthesis-speak.html
fast/speechsynthesis/speech-synthesis-pause-resume.html
EWS Watchlist
Comment 7 2019-03-12 18:43:35 PDT
Created attachment 364492 [details]
Archive of layout-test-results from ews106 for mac-highsierra-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews106  Port: mac-highsierra-wk2  Platform: Mac OS X 10.13.6
Per Arne Vollan
Comment 8 2019-03-12 20:00:11 PDT
Brent Fulgham
Comment 9 2019-03-12 20:40:25 PDT
Comment on attachment 364499 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=364499&action=review

Looks great! I think we could remove some of the boilerplate message-passing code by using the “sendWithAsyncReply” features of the IPC layer.

> Source/WebCore/Modules/speech/SpeechSynthesis.cpp:125
> + m_speechSynthesisClient->speak(*utterance.platformUtterance());

It seems like both of these ‘speak’ implementations should take an object (not a pointer), or both should take a pointer, but not this mix.

> Source/WebCore/Modules/speech/SpeechSynthesis.h:105
> + SpeechSynthesisClient* m_speechSynthesisClient { nullptr };

This should either be a WeakPtr or a RefPtr. We should always be leery of bare pointers.

> Source/WebCore/platform/PlatformSpeechSynthesizer.h:43
> +enum class SpeechBoundary : uint8_t {

Good!

> Source/WebKit/UIProcess/WebPageProxy.cpp:8808
> +void WebPageProxy::speechSynthesisVoiceList(Vector<WebSpeechSynthesisVoice>& result)

If these were implemented with sendWithAsyncReply, this would take a completion handler you could call when done to get the response back to the WebProcess.

> Source/WebKit/UIProcess/WebPageProxy.cpp:8811
> + m_synthesizer = std::make_unique<PlatformSpeechSynthesizer>(this);

Maybe this should be a helper “ensureSynthesizer()” that returns a PlatformSpeechSynthesizer& that you can use elsewhere.

> Source/WebKit/UIProcess/WebPageProxy.cpp:8832
> + m_synthesizer->speak(m_utterance.get());

You could pass the CompletionHandler to PlatformSpeechSynthesizer::speak.

> Source/WebKit/UIProcess/WebPageProxy.cpp:8837
> + m_synthesizer->cancel();

It seems like m_synthesizer can be nullptr in a lot of cases. Should these null-check? Or at least assert?

> Source/WebKit/UIProcess/WebPageProxy.cpp:8842
> + m_synthesizer->pause();

Ditto: ->pause(WTFMove(completionHandler))

> Source/WebKit/UIProcess/WebPageProxy.messages.in:555
> + SpeechSynthesisVoiceList() -> (Vector<WebKit::WebSpeechSynthesisVoice> voiceList) LegacySync

I think this should be Async (Vector<WebKit::WebSpeechSynthesisVoice> voiceList)

> Source/WebKit/UIProcess/WebPageProxy.messages.in:556
> + SpeechSynthesisSpeak(String text, String lang, float volume, float rate, float pitch, MonotonicTime startTime, String voiceURI, String voiceName, String voiceLang, bool localService, bool defaultVoice)

If these were Async() you could pass a lambda on the calling side and it would automatically be called...

> Source/WebKit/WebProcess/com.apple.WebProcess.sb.in:-606
> - (global-name "com.apple.speech.speechsynthesisd")

Hooray!

> Source/WebKit/WebProcess/WebCoreSupport/WebSpeechSynthesisClient.cpp:43
> + voices.append(WebCore::PlatformSpeechSynthesisVoice::create(voice.voiceURI, voice.name, voice.lang, voice.localService, voice.defaultLang));

Might be nicer to WTFMove() the parts of voice into the new object, since we discard it all when we are done.

> Source/WebKit/WebProcess/WebCoreSupport/WebSpeechSynthesisClient.cpp:52
> + m_page.send(Messages::WebPageProxy::SpeechSynthesisSpeak(utterance.text(), utterance.lang(), utterance.volume(), utterance.rate(), utterance.pitch(), utterance.startTime(), voice->voiceURI(), voice->name(), voice->lang(), voice->localService(), voice->isDefault()));

If you used sendWithAsyncReply you could pass a lambda to be called when the speak command completed and avoid writing a “didSpeak” response handler.

> Source/WebKit/WebProcess/WebPage/WebPage.cpp:6406
> + corePage()->speechSynthesisClient()->observer()->didStartSpeaking();

These could be done as lambdas in the speechSynthesisClient if you used sendWithAsyncReply on the sending side.

> Source/WebKit/WebProcess/WebPage/WebPage.messages.in:554
> + DidResumeSpeaking()

I don’t think you need these ‘DidXYZ’ implementations if they are the return call of a sendWithAsyncReply.
Sam Weinig
Comment 10 2019-03-12 22:16:57 PDT
Comment on attachment 364499 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=364499&action=review

>> Source/WebKit/UIProcess/WebPageProxy.messages.in:555
>> + SpeechSynthesisVoiceList() -> (Vector<WebKit::WebSpeechSynthesisVoice> voiceList) LegacySync
>
> I think this should be Async (Vector<WebKit::WebSpeechSynthesisVoice> voiceList)

Adding more LegacySync messages is really not ideal. Have you considered alternatives? Like sending the voice list over on process initialization?
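[Editor's note] For reference, the difference the reviewers are weighing looks roughly like this in WebPageProxy.messages.in (a sketch; the exact attribute spelling is defined by WebKit's IPC message generator). LegacySync blocks the sending WebProcess until the reply arrives; the async form instead generates a `sendWithAsyncReply` overload that takes a completion handler.

```
# Sync form from the patch under review (WebProcess blocks on the reply):
SpeechSynthesisVoiceList() -> (Vector<WebKit::WebSpeechSynthesisVoice> voiceList) LegacySync

# Suggested async form (caller passes a lambda; no blocking):
SpeechSynthesisVoiceList() -> (Vector<WebKit::WebSpeechSynthesisVoice> voiceList) Async
```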
Per Arne Vollan
Comment 11 2019-03-13 10:12:45 PDT
(In reply to Sam Weinig from comment #10)
> Comment on attachment 364499 [details]
> Patch
>
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=364499&action=review
>
> >> Source/WebKit/UIProcess/WebPageProxy.messages.in:555
> >> + SpeechSynthesisVoiceList() -> (Vector<WebKit::WebSpeechSynthesisVoice> voiceList) LegacySync
> >
> > I think this should be Async (Vector<WebKit::WebSpeechSynthesisVoice> voiceList)
>
> Adding more LegacySync messages is really not ideal. Have you considered
> alternatives? Like sending the voice list over on process initialization?

This is definitely a good idea, although I am a little concerned about the performance of process startup when doing this.
Per Arne Vollan
Comment 12 2019-03-13 16:27:00 PDT
(In reply to Per Arne Vollan from comment #11)
> (In reply to Sam Weinig from comment #10)
> > Comment on attachment 364499 [details]
> > Patch
> >
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=364499&action=review
> >
> > >> Source/WebKit/UIProcess/WebPageProxy.messages.in:555
> > >> + SpeechSynthesisVoiceList() -> (Vector<WebKit::WebSpeechSynthesisVoice> voiceList) LegacySync
> > >
> > > I think this should be Async (Vector<WebKit::WebSpeechSynthesisVoice> voiceList)
> >
> > Adding more LegacySync messages is really not ideal. Have you considered
> > alternatives? Like sending the voice list over on process initialization?
>
> This is definitely a good idea, although I am a little concerned about the
> performance of process startup when doing this.

It takes about 150ms to get the voice list the first time, so I don't think it would be good to add this to the startup time of the UI process.
Per Arne Vollan
Comment 13 2019-03-13 19:15:10 PDT
Per Arne Vollan
Comment 14 2019-03-13 19:21:27 PDT
(In reply to Brent Fulgham from comment #9)
> Comment on attachment 364499 [details]
> Patch
>
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=364499&action=review
>
> Looks great! I think we could remove some of the boilerplate message-passing
> code by using the “sendWithAsyncReply” features of the IPC layer.
>
> > Source/WebCore/Modules/speech/SpeechSynthesis.cpp:125
> > + m_speechSynthesisClient->speak(*utterance.platformUtterance());
>
> It seems like both of these ‘speak’ implementations should take an object
> (not a pointer), or both should take a pointer, but not this mix.
>
> > Source/WebCore/Modules/speech/SpeechSynthesis.h:105
> > + SpeechSynthesisClient* m_speechSynthesisClient { nullptr };
>
> This should either be a WeakPtr or a RefPtr. We should always be leery of
> bare pointers.
>
> > Source/WebCore/platform/PlatformSpeechSynthesizer.h:43
> > +enum class SpeechBoundary : uint8_t {
>
> Good!
>
> > Source/WebKit/UIProcess/WebPageProxy.cpp:8808
> > +void WebPageProxy::speechSynthesisVoiceList(Vector<WebSpeechSynthesisVoice>& result)
>
> If these were implemented with sendWithAsyncReply, this would take a
> completion handler you could call when done to get the response back to the
> WebProcess.
>
> > Source/WebKit/UIProcess/WebPageProxy.cpp:8811
> > + m_synthesizer = std::make_unique<PlatformSpeechSynthesizer>(this);
>
> Maybe this should be a helper “ensureSynthesizer()” that returns a
> PlatformSpeechSynthesizer& that you can use elsewhere.
>
> > Source/WebKit/UIProcess/WebPageProxy.cpp:8832
> > + m_synthesizer->speak(m_utterance.get());
>
> You could pass the CompletionHandler to PlatformSpeechSynthesizer::speak.
>
> > Source/WebKit/UIProcess/WebPageProxy.cpp:8837
> > + m_synthesizer->cancel();
>
> It seems like m_synthesizer can be nullptr in a lot of cases. Should these
> null-check? Or at least assert?
>
> > Source/WebKit/UIProcess/WebPageProxy.cpp:8842
> > + m_synthesizer->pause();
>
> Ditto: ->pause(WTFMove(completionHandler))
>
> > Source/WebKit/UIProcess/WebPageProxy.messages.in:555
> > + SpeechSynthesisVoiceList() -> (Vector<WebKit::WebSpeechSynthesisVoice> voiceList) LegacySync
>
> I think this should be Async (Vector<WebKit::WebSpeechSynthesisVoice>
> voiceList)

I created https://bugs.webkit.org/show_bug.cgi?id=195723 to follow up on this.

> > Source/WebKit/UIProcess/WebPageProxy.messages.in:556
> > + SpeechSynthesisSpeak(String text, String lang, float volume, float rate, float pitch, MonotonicTime startTime, String voiceURI, String voiceName, String voiceLang, bool localService, bool defaultVoice)
>
> If these were Async() you could pass a lambda on the calling side and it
> would automatically be called...
>
> > Source/WebKit/WebProcess/com.apple.WebProcess.sb.in:-606
> > - (global-name "com.apple.speech.speechsynthesisd")
>
> Hooray!
>
> > Source/WebKit/WebProcess/WebCoreSupport/WebSpeechSynthesisClient.cpp:43
> > + voices.append(WebCore::PlatformSpeechSynthesisVoice::create(voice.voiceURI, voice.name, voice.lang, voice.localService, voice.defaultLang));
>
> Might be nicer to WTFMove() the parts of voice into the new object, since
> we discard it all when we are done.
>
> > Source/WebKit/WebProcess/WebCoreSupport/WebSpeechSynthesisClient.cpp:52
> > + m_page.send(Messages::WebPageProxy::SpeechSynthesisSpeak(utterance.text(), utterance.lang(), utterance.volume(), utterance.rate(), utterance.pitch(), utterance.startTime(), voice->voiceURI(), voice->name(), voice->lang(), voice->localService(), voice->isDefault()));
>
> If you used sendWithAsyncReply you could pass a lambda to be called when the
> speak command completed and avoid writing a “didSpeak” response handler.
>
> > Source/WebKit/WebProcess/WebPage/WebPage.cpp:6406
> > + corePage()->speechSynthesisClient()->observer()->didStartSpeaking();
>
> These could be done as lambdas in the speechSynthesisClient if you used
> sendWithAsyncReply on the sending side.
>
> > Source/WebKit/WebProcess/WebPage/WebPage.messages.in:554
> > + DidResumeSpeaking()
>
> I don’t think you need these ‘DidXYZ’ implementations if they are the return
> call of a sendWithAsyncReply.

Thanks for reviewing! I have updated the patch.
Per Arne Vollan
Comment 15 2019-03-13 20:46:11 PDT
Brent Fulgham
Comment 16 2019-03-13 22:48:03 PDT
Comment on attachment 364622 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=364622&action=review

I think this is close, but we could do better by removing more boilerplate method implementations. This might mean touching other platform-specific files to thread the completion handler through.

> Source/WebCore/Modules/speech/SpeechSynthesis.cpp:75
> if (m_voiceList.size())

Nit: I really prefer !m_voiceList.empty() (or isEmpty?)

> Source/WebCore/Modules/speech/SpeechSynthesis.cpp:84
> + m_platformSpeechSynthesizer = std::make_unique<PlatformSpeechSynthesizer>(this);

You repeat this pattern in a few places. I suggest making an “ensureSpeechSynthesizer” method that does this and returns a reference.

> Source/WebCore/Modules/speech/SpeechSynthesis.cpp:88
> + m_voiceList.append(SpeechSynthesisVoice::create(*voice));

It bugs me that these two m_voiceList loops exist! Can they be shared somehow? Seems like we could just use a reference to the voiceList from either source.

> Source/WebCore/Modules/speech/SpeechSynthesis.h:45
> + static Ref<SpeechSynthesis> create(WeakPtr<SpeechSynthesisClient>);

Do we ever create a SpeechSynthesis object without a Client? I think we could pass a reference, and do the makeWeakPtr inside the constructor.

> Source/WebKit/UIProcess/WebPageProxy.cpp:8839
> + m_speechSynthesisData->speakingFinishedCompletionHandler = WTFMove(completionHandler);

It seems like we should be able to pass the completion handler to “speak”, below. The implementation of speak would take a completion handler as the final argument, and call it when complete, rather than calling “didFinishSpeaking”. (Ditto for the other methods.)

It might not be necessary to have the SpeechSynthesisData object at all; we could just thread the state through the call stack.

This also seems to be the only place where the Utterance is used. I wonder if we need to remember it elsewhere. Can we just WTFMove it into the “speak” method?

> Source/WebKit/UIProcess/WebPageProxy.h:2465
> + CompletionHandler<void()> speakingResumedCompletionHandler;

You might not need to store these three handlers, and perhaps not the utterance.

> Source/WebKit/UIProcess/Cocoa/WebPageProxyCocoa.mm:243
> +void WebPageProxy::didStartSpeaking(WebCore::PlatformSpeechSynthesisUtterance&)

I can't tell where these get called; I guess the SpeechSynthesis object calls them at some point?

I think these could be discarded if you just passed the completion handler into “speak”, “cancel”, etc., and invoked the handler instead of calling this method.

> Source/WebKit/UIProcess/Cocoa/WebPageProxyCocoa.mm:248
> +{

Do we need to remember the utterance at all, or even pass the argument? We never seem to use it...

> Source/WebKit/WebProcess/WebCoreSupport/WebSpeechSynthesisClient.cpp:75
> + m_page.corePage()->speechSynthesisClient()->observer()->didPauseSpeaking();

This looks like weird spacing (missing indent).
Per Arne Vollan
Comment 17 2019-03-14 16:02:08 PDT
(In reply to Brent Fulgham from comment #16)
> Comment on attachment 364622 [details]
> Patch
>
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=364622&action=review
>
> I think this is close, but we could do better by removing more boilerplate
> method implementations. This might mean touching other platform-specific
> files to thread the completion handler through.
>
> > Source/WebCore/Modules/speech/SpeechSynthesis.cpp:75
> > if (m_voiceList.size())
>
> Nit: I really prefer !m_voiceList.empty() (or isEmpty?)
>
> > Source/WebCore/Modules/speech/SpeechSynthesis.cpp:84
> > + m_platformSpeechSynthesizer = std::make_unique<PlatformSpeechSynthesizer>(this);
>
> You repeat this pattern in a few places. I suggest making an
> “ensureSpeechSynthesizer” method that does this and returns a reference.
>
> > Source/WebCore/Modules/speech/SpeechSynthesis.cpp:88
> > + m_voiceList.append(SpeechSynthesisVoice::create(*voice));
>
> It bugs me that these two m_voiceList loops exist! Can they be shared
> somehow? Seems like we could just use a reference to the voiceList from
> either source.
>
> > Source/WebCore/Modules/speech/SpeechSynthesis.h:45
> > + static Ref<SpeechSynthesis> create(WeakPtr<SpeechSynthesisClient>);
>
> Do we ever create a SpeechSynthesis object without a Client? I think we
> could pass a reference, and do the makeWeakPtr inside the constructor.

Yes, for WebKitLegacy there will be no client.

> > Source/WebKit/UIProcess/WebPageProxy.cpp:8839
> > + m_speechSynthesisData->speakingFinishedCompletionHandler = WTFMove(completionHandler);
>
> It seems like we should be able to pass the completion handler to “speak”,
> below. The implementation of speak would take a completion handler as the
> final argument, and call it when complete, rather than calling
> “didFinishSpeaking”. (Ditto for the other methods.)

I think that's an excellent approach. Since this means changing the existing PlatformSpeechSynthesizer, which will make the scope of the patch bigger, do you think we could do this in a follow-up patch?

> It might not be necessary to have the SpeechSynthesisData object at all; we
> could just thread the state through the call stack.
>
> This also seems to be the only place where the Utterance is used. I wonder
> if we need to remember it elsewhere. Can we just WTFMove it into the “speak”
> method?
>
> > Source/WebKit/UIProcess/WebPageProxy.h:2465
> > + CompletionHandler<void()> speakingResumedCompletionHandler;
>
> You might not need to store these three handlers, and perhaps not the
> utterance.
>
> > Source/WebKit/UIProcess/Cocoa/WebPageProxyCocoa.mm:243
> > +void WebPageProxy::didStartSpeaking(WebCore::PlatformSpeechSynthesisUtterance&)
>
> I can't tell where these get called; I guess the SpeechSynthesis object
> calls them at some point?
>
> I think these could be discarded if you just passed the completion handler
> into “speak”, “cancel”, etc., and invoked the handler instead of calling
> this method.
>
> > Source/WebKit/UIProcess/Cocoa/WebPageProxyCocoa.mm:248
> > +{
>
> Do we need to remember the utterance at all, or even pass the argument? We
> never seem to use it...
>
> > Source/WebKit/WebProcess/WebCoreSupport/WebSpeechSynthesisClient.cpp:75
> > + m_page.corePage()->speechSynthesisClient()->observer()->didPauseSpeaking();
>
> This looks like weird spacing (missing indent).

Thanks for reviewing! I will update the patch.
Per Arne Vollan
Comment 18 2019-03-14 16:54:27 PDT
Brent Fulgham
Comment 19 2019-03-14 18:33:50 PDT
Comment on attachment 364622 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=364622&action=review

>>> Source/WebKit/UIProcess/WebPageProxy.cpp:8839
>>> + m_speechSynthesisData->speakingFinishedCompletionHandler = WTFMove(completionHandler);
>>
>> It seems like we should be able to pass the completion handler to “speak”, below. The implementation of speak would take a completion handler as the final argument, and call it when complete, rather than calling “didFinishSpeaking”. (Ditto for the other methods.)
>>
>> It might not be necessary to have the SpeechSynthesisData object at all; we could just thread the state through the call stack.
>>
>> This also seems to be the only place where the Utterance is used. I wonder if we need to remember it elsewhere. Can we just WTFMove it into the “speak” method?
>
> I think that's an excellent approach. Since this means changing the existing PlatformSpeechSynthesizer, which will make the scope of the patch bigger, do you think we could do this in a follow-up patch?

Sure!
Brent Fulgham
Comment 20 2019-03-15 10:39:18 PDT
Comment on attachment 364719 [details]
Patch

r=me.
Per Arne Vollan
Comment 21 2019-03-15 10:52:39 PDT
Comment on attachment 364719 [details]
Patch

Thanks for reviewing!
WebKit Commit Bot
Comment 22 2019-03-15 11:18:42 PDT
Comment on attachment 364719 [details]
Patch

Clearing flags on attachment: 364719

Committed r243002: <https://trac.webkit.org/changeset/243002>
WebKit Commit Bot
Comment 23 2019-03-15 11:18:45 PDT
All reviewed patches have been landed. Closing bug.