Created attachment 413846 [details]
repro page

On Safari, speechSynthesis.speak invocations interrupt other active media.

To reproduce:
1) get a MediaStream from getUserMedia, add it to a video element, play the video element
2) call speechSynthesis.speak(new SpeechSynthesisUtterance('hello'))

This interrupts both the video element and the media stream independently.

1) (mobile safari) video elements are paused
Calling videoElement.play() resumes playback. If video playback is resumed before the SpeechSynthesisUtterance ends, the speech is sometimes interrupted (this part is really inconsistent).
It seems like, after resuming playback from a user-gesture handler, future calls to speak no longer pause this video. If, on the other hand, the video is resumed from a non-user-gesture (e.g. in response to the utterance's end event), future calls to speak will re-pause the video.

2) (mobile safari) audio tracks from getUserMedia are muted (MediaStreamTrack - mute event is raised, muted is true)
The track is never unmuted, and there is no API for user code to unmute a track. It seems the only option would be to stop the track and get a new one via getUserMedia.

3) (desktop safari) audio tracks from getUserMedia are silenced (all samples are 0) while the utterance is playing (mute event is not raised, volume is restored after the utterance ends)

This may not be a complete list - I haven't checked how Web Speech interacts with other APIs, e.g. Web Audio.

For Mobile Safari, I tested on iOS 14.1
For Desktop Safari, I tested on Safari 13.1.3 (OSX 10.15.7)

Other browsers (Chrome, Firefox) do not exhibit these issues.

It would be great if Safari behaved the same as Chrome or Firefox. At the bare minimum, I ask that media streams could be unmuted after speech finishes.

I have attached a page which demonstrates the behavior. The page includes WebRTC, which serves to demonstrate that only the getUserMedia-created stream is muted.
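For readers hitting item 2 before a fix reaches them, the "stop the track and get a new one" option mentioned in the report could look roughly like the sketch below. This is not code from the attached repro page. The helper name replaceMutedAudioTracks and the injected getUserMedia parameter are assumptions made for illustration; only getAudioTracks, removeTrack, addTrack, stop and muted are standard MediaStream / MediaStreamTrack members.

```javascript
// Sketch only: swap muted getUserMedia audio tracks for a freshly captured one.
// `getUserMedia` is passed in (hypothetical parameter) so the helper does not
// depend on a global; in a page you would pass
//   (constraints) => navigator.mediaDevices.getUserMedia(constraints)
async function replaceMutedAudioTracks(stream, getUserMedia) {
  const mutedTracks = stream.getAudioTracks().filter((track) => track.muted);
  if (mutedTracks.length === 0) {
    return 0; // nothing to do
  }

  // Stop and detach the muted tracks first, as described in the report.
  for (const track of mutedTracks) {
    track.stop();
    stream.removeTrack(track);
  }

  // Capture a new audio-only stream and move its first audio track over.
  const freshStream = await getUserMedia({ audio: true });
  const [freshTrack] = freshStream.getAudioTracks();
  if (freshTrack) {
    stream.addTrack(freshTrack);
  }
  return mutedTracks.length;
}

// Possible usage in a page (assumes `utterance` and `stream` already exist):
//   utterance.addEventListener('end', () =>
//     replaceMutedAudioTracks(stream, (c) => navigator.mediaDevices.getUserMedia(c)));
```

Running the replacement from the utterance's end event is a guess at a reasonable trigger; listening for the track's mute event instead would also fit the behavior described above.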
Tested on Safari 14.0, OSX 10.15.7 - same result, only #3 is an issue
not unexpected, but we may be able to do better
<rdar://problem/71355701>
Hi, how do I operate this repro page? I press send video and see my webcam. Then I press speak, but I don't see anything else happening. Am I supposed to hear or see a different audio/video track? My own camera video is not paused while the speech synthesizer works.
On iPad, the speech pauses the video element and mutes the input stream. There is a button to resume the video element. On OS X, the speech only causes the audio input stream to be silent, and only while speech is in progress - you should be able to see that reflected on the volume meter next to the video. I can attach a video of it happening if you’d like.
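To make the "silent input" symptom on OS X easier to check than by watching a meter, a level calculation along the following lines could be used. The thread does not show how the repro page's volume meter is implemented, so this is an assumed approach: it takes a block of floating-point samples in the range [-1, 1] (for example, a Float32Array filled from a Web Audio analyser) and reports the RMS level and whether every sample is exactly zero. The function names are hypothetical.

```javascript
// Sketch only: level helpers for a block of PCM samples in [-1, 1].

// Root-mean-square level of the block; 0 for an empty block.
function rmsLevel(samples) {
  if (samples.length === 0) {
    return 0;
  }
  let sumOfSquares = 0;
  for (const sample of samples) {
    sumOfSquares += sample * sample;
  }
  return Math.sqrt(sumOfSquares / samples.length);
}

// True when the block is non-empty and every sample is exactly zero,
// which matches the "all samples are 0" symptom reported for desktop Safari.
function isDigitalSilence(samples) {
  return samples.length > 0 && samples.every((sample) => sample === 0);
}
```

An ordinary quiet room still produces small non-zero samples, so exact zeros for the whole duration of an utterance point to the input being silenced rather than to a quiet speaker.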
You may need to edit the attachment, it looks like the string I had it speaking (a poop emoji - sorry) got mangled during upload.
(In reply to Andrew from comment #6)
> You may need to edit the attachment, it looks like the string I had it
> speaking (a poop emoji - sorry) got mangled during upload.

In my Safari, the remote RTP gives an error:
Unhandled Promise Rejection: AbortError: The operation was aborted.
and doesn't display (this works in Chrome, incidentally).
IIRC Safari is more strict: you need to load this kind of page over https:// and not file://

I’m not near a computer; later I will attach a version of the page without RTC.
(In reply to Andrew from comment #8)
> IIRC Safari is more strict, need to load this kind of page over https:// and
> not file://
>
> I’m not near a computer, later I will attach a version of the page without
> RTC.

Thanks a lot!
Created attachment 414526 [details]
repro page (sans rtc)

Revised version of the page:
* no longer contains non-ASCII text
* removed RTC, which wasn't required as part of the core issue
(In reply to Andrew from comment #10)
> Created attachment 414526 [details]
> repro page (sans rtc)
>
> Revised version of the page:
> * no longer non-ascii text
> * removed rtc, wasn't required as part of core issue

So on the Mac, once I start the Mic and Camera, I see my video and I see the volume bar go up and down as I speak.
When I play the speech synthesis and continue to talk, I still see the bar go up and down.

Should I see the bar at 0 while speech synthesis is playing while I'm talking?

Also I'm testing on 11.1
(In reply to chris fleizach from comment #11)
> Should I see the bar at 0 while speech synthesis is playing while I'm
> talking?

Yes, that is what I see.

(In reply to chris fleizach from comment #11)
> Also I'm testing on 11.1

I currently only have Safari 14.0 (macOS 10.15.7).
(In reply to Andrew from comment #12)
> (In reply to chris fleizach from comment #11)
> > Should I see the bar at 0 while speech synthesis is playing while I'm
> > talking?
>
> Yes, that is what I see.
>
> (In reply to chris fleizach from comment #11)
> > Also I'm testing on 11.1
>
> I currently only have Safari 14.0 (macOS 10.15.7).

Interesting -- wonder if this was resolved on 11.0. I don't have an easy way to go back to 10.15 without spending a day blowing away my computers.
(In reply to chris fleizach from comment #13)
> Interesting -- wonder if this was resolved on 11.0. I don't have an easy way
> to go back to 10.15 without spending a day blowing away my computers.

That might be good enough for us on macOS, I will try to sell it.

However, the issue is more severe on mobile Safari. We have much higher usage on iPad than on macOS (the issue also exists on iPhone, I think, but we don't support that platform).
Additionally, this bug carries over to WKWebView, now that the 14.3 beta has added getUserMedia.
(In reply to Andrew from comment #14)
> (In reply to chris fleizach from comment #13)
> > Interesting -- wonder if this was resolved on 11.0. I don't have an easy way
> > to go back to 10.15 without spending a day blowing away my computers.
>
> That might be good enough for us on macOS, I will try to sell it.
>
> However, the issue is more severe on mobile Safari. We have a much higher
> usage on iPad than macOS (issue also exists on iPhone I think, but we don't
> support that platform).
> Additionally, this bug carries over to wkWebView, now that in 14.3 beta
> added getUserMedia.

got it, looking at that next
Created attachment 414606 [details] patch
Created attachment 414607 [details] patch
Comment on attachment 414607 [details]
patch

Thanks for the review. I want to try to add a test for this before merging.
Hi, we are using speechSynthesis in combination with a video element and can confirm issue #1. Could see this issue on iOS 13.7 and 14. Do you have any ETA on this fix?
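For pages affected by issue #1 on releases without the fix, a stopgap along these lines might help: resume the video when the utterance ends. This is only a sketch under the behavior described in the original report, where a video resumed outside a user gesture is paused again by later speak calls, which is why the handler is attached to every utterance. The helper name speakAndResumeVideo is hypothetical, and the synthesizer and utterance constructor are passed in as parameters purely to keep the helper self-contained.

```javascript
// Sketch only: speak some text and resume `video` if it is paused when the
// utterance ends. In a page, pass window.speechSynthesis as `synth` and
// SpeechSynthesisUtterance as `Utterance`.
function speakAndResumeVideo(text, video, synth, Utterance) {
  const utterance = new Utterance(text);

  utterance.addEventListener('end', () => {
    if (video.paused) {
      // play() normally returns a promise that can reject (e.g. playback not
      // allowed); wrap it so a rejection is logged rather than left unhandled.
      Promise.resolve(video.play()).catch((error) => {
        console.warn('Could not resume video after speech:', error);
      });
    }
  });

  synth.speak(utterance);
  return utterance;
}

// Possible usage in a page:
//   speakAndResumeVideo('hello', videoElement, window.speechSynthesis,
//                       SpeechSynthesisUtterance);
```

This does not address the case in the report where speech is interrupted when the video resumes before the utterance has finished, since it only acts once the end event fires.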
I am also interested in this feature and the status.
(In reply to Anton Mo Eriksson from comment #20)
> I am also interested in this feature and the status.

It is fixed in iOS 14.4 already. Please give it a test there.
The fix for this issue was needed outside the WebKit project, therefore this is being resolved as 'Moved'. This should now be fixed in shipping software.
Marking as "RESOLVED MOVED" based on Brent's comment 22.