The audio playing over a peer connection becomes more latent when changing the audio output device (e.g. going from speakers to headphones or vice versa). The added latency is around half a second. The expectation is that the audio should not become more latent.

This behavior can be seen on the https://webrtc.github.io/samples/src/content/peerconnection/audio/ sample page. Steps to reproduce:

1. Disconnect all audio output devices and turn the volume to max.
2. Navigate to the page.
3. Click the "Call" button.
4. Observe audio latency (a sharp sound like a tap or pop is recommended).
5. Connect headphones.
6. Observe increased audio latency (BUG).
7. Click "Hang Up".
8. Click "Call".
9. Observe normal audio latency.
10. Disconnect the headphones.
11. Observe increased audio latency (BUG).

This issue does not reproduce with audio that is not played over a peer connection. This can be verified on the https://webrtc.github.io/samples/src/content/getusermedia/audio/ sample page, following the same steps as above.

Reproduces with:
Version/15.0 Safari/605.1.15 (iOS 15.0)
Version/15.0 Safari/605.1.15 (iPadOS 15.1)
We've learned that creating an AudioContext seems to mitigate this issue somewhat. Looking at mediaserverd logging, we can see that the number of samples per callback from CoreAudio changes when an AudioContext is active.

AudioElement only (matches Alec's repro steps above):

default 12:34:44.814148-0700 mediaserverd -CMVAEndptMgr- vaemSetDeviceBufferDuration: duration = 0.085, sampleRate = 48000.000, numPCMSamples = 4095, NearestPowerOf2 = 4096, UseQuietBufferSizeProperty = 'NO'
default 12:32:42.477241-0700 mediaserverd -CMVAEndptMgr- vaemConfigurePVMSettings: PVMSetCurrentState [MediaPlayback, Default, Speaker, Speaker, (null), NO]

AudioElement + AudioContext (add an AudioContext playing silence to Alec's repro steps above):

default 12:27:50.915450-0700 mediaserverd -CMVAEndptMgr- vaemSetDeviceBufferDuration: duration = 0.003, sampleRate = 48000.000, numPCMSamples = 127, NearestPowerOf2 = 128, UseQuietBufferSizeProperty = 'NO'
default 12:27:50.902493-0700 mediaserverd -CMVAEndptMgr- vaemConfigurePVMSettings: PVMSetCurrentState [MediaPlayback, Default, Speaker, Speaker, (null), NO]

There's also a third case, which is Audio + Video Element + getUserMedia:

default 12:35:50.823146-0700 mediaserverd -CMVAEndptMgr- vaemSetDeviceBufferDuration: duration = 0.020, sampleRate = 48000.000, numPCMSamples = 959, NearestPowerOf2 = 1024, UseQuietBufferSizeProperty = 'NO'
default 12:35:50.855153-0700 mediaserverd -CMVAEndptMgr- vaemConfigurePVMSettings: PVMSetCurrentState [PlayAndRecord, VideoChat, SpeakerAndMicrophone, Speaker, (null), NO]

These cases seem to correspond to the code here: https://github.com/WebKit/WebKit/blob/3b96406abb57ffebaa8923a1852637e354fd25ca/Source/WebCore/platform/audio/cocoa/MediaSessionManagerCocoa.mm#L142

I'm not sure I agree with the heuristic in that code, where AudioContext => lowest latency, mic enabled => medium latency, and AudioElement only => highest latency (lowest power). At a minimum, if the AudioElement's srcObject contains remote tracks, WebKit ought to choose a lower-latency preferred buffer size.

Further, the actual observed latency appears to be some (fixed?) multiple of the preferred buffer size. Even a buffer size of 4096 would probably be OK if the latency were just one buffer, but there seems to be a queue or buffer somewhere whose size is x * the audio buffer size. When the audio buffer size is small, x doesn't matter much; when the audio buffer size is large, x really starts to matter. (At 48 kHz a 4096-sample buffer is ~85 ms, so the observed ~500 ms of added latency works out to roughly six buffers of queued audio.)
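To make the three-tier policy concrete, here is a minimal, self-contained C++ sketch of how the buffer-size selection appears to behave based on the logs above. The names (`SessionKind`, `preferredBufferSize`) are hypothetical; this is not the actual MediaSessionManagerCocoa.mm code, just an illustration of the heuristic being questioned.

```cpp
// Sketch (hypothetical names) of the preferred-buffer-size heuristic the logs
// above suggest: ~128 samples with an active AudioContext, ~1024 with
// microphone capture, 4096 for playback-only. Not the actual WebKit code.
#include <cstddef>
#include <cstdio>

enum class SessionKind { WebAudio, MicrophoneCapture, MediaPlaybackOnly };

static size_t preferredBufferSize(SessionKind kind, double sampleRate)
{
    switch (kind) {
    case SessionKind::WebAudio:
        return 128; // WebAudio render quantum; ~2.7 ms at 48 kHz
    case SessionKind::MicrophoneCapture:
        return static_cast<size_t>(sampleRate / 50.); // ~20 ms; CoreAudio rounds to 1024
    case SessionKind::MediaPlaybackOnly:
        return 4096; // power-friendly, but ~85 ms at 48 kHz
    }
    return 4096;
}

int main()
{
    constexpr double sampleRate = 48000.;
    for (auto kind : { SessionKind::WebAudio, SessionKind::MicrophoneCapture, SessionKind::MediaPlaybackOnly }) {
        size_t samples = preferredBufferSize(kind, sampleRate);
        std::printf("%zu samples -> %.1f ms per buffer\n", samples, 1000. * samples / sampleRate);
    }
}
```

Running this prints per-buffer durations that line up with the three vaemSetDeviceBufferDuration log lines above (0.003, 0.020, and 0.085 seconds), which is what made us suspect this code path in the first place.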
<rdar://problem/84049005>
Created attachment 441863 [details]
Patch
Comment on attachment 441863 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441863&action=review

r=me once the bots are happy

> Source/WebCore/platform/audio/cocoa/AudioSampleDataSource.h:73
> +    void askToComputeOutputSampleOffer() { m_shouldComputeOutputSampleOffset = true; }

The method name is slightly confusing, maybe `recomputeSampleOffset` instead?

> Source/WebCore/platform/mediastream/cocoa/AudioMediaStreamTrackRendererInternalUnit.cpp:242
> +    if (m_sampleTime && m_sampleTime + 2 * sampleCount < sampleTime)

Should we reset if the sample time is too far ahead of *or* behind what we expect?
(In reply to Eric Carlson from comment #4)
> Comment on attachment 441863 [details]
> Patch
>
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=441863&action=review
>
> r=me once the bots are happy
>
> > Source/WebCore/platform/audio/cocoa/AudioSampleDataSource.h:73
> > +    void askToComputeOutputSampleOffer() { m_shouldComputeOutputSampleOffset = true; }
>
> The method name is slightly confusing, maybe `recomputeSampleOffset` instead?

OK.

> > Source/WebCore/platform/mediastream/cocoa/AudioMediaStreamTrackRendererInternalUnit.cpp:242
> > + if (m_sampleTime && m_sampleTime + 2 * sampleCount < sampleTime)
>
> Should we reset if the sample time is too far ahead of *or* behind what we
> expect?

Oh right, good point, will fix it!
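For reference, one symmetric version of the quoted check might look like the sketch below. The names loosely mirror the patch (`m_sampleTime`, `m_shouldComputeOutputSampleOffset`), but this is illustrative only, not the code that landed.

```cpp
// Sketch of a bidirectional discontinuity check, per the review suggestion:
// reset the output sample offset when the host timeline jumps more than one
// buffer away from the expected position, in either direction.
#include <cstdint>
#include <cstdio>

struct SampleTimeTracker {
    uint64_t m_sampleTime { 0 };
    bool m_shouldComputeOutputSampleOffset { false };

    void onRenderCallback(uint64_t sampleTime, uint64_t sampleCount)
    {
        if (m_sampleTime) {
            uint64_t expected = m_sampleTime + sampleCount;
            // Written with additions only so unsigned arithmetic can't underflow.
            // "tooFarAhead" is equivalent to the quoted one-sided check;
            // "tooFarBehind" covers the reviewer's "or behind" case.
            bool tooFarAhead = sampleTime > expected + sampleCount;
            bool tooFarBehind = sampleTime + sampleCount < expected;
            if (tooFarAhead || tooFarBehind)
                m_shouldComputeOutputSampleOffset = true;
        }
        m_sampleTime = sampleTime;
    }
};

int main()
{
    SampleTimeTracker tracker;
    tracker.onRenderCallback(1000, 128); // first callback: just records the time
    tracker.onRenderCallback(1128, 128); // contiguous: no reset
    std::printf("after contiguous callback: %d\n", tracker.m_shouldComputeOutputSampleOffset);
    tracker.onRenderCallback(9000, 128); // big jump, e.g. after a device switch
    std::printf("after timeline jump: %d\n", tracker.m_shouldComputeOutputSampleOffset);
}
```

This matters for the bug at hand because a device switch is exactly the kind of event that can move the render timeline in either direction, and a stale offset is what leaves the extra queued audio in place.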
Created attachment 442000 [details]
Patch for landing
Committed r284674 (243394@main): <https://commits.webkit.org/243394@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 442000 [details].
Which version will include this fix?
This change should be present in STP 139, iOS 15.4 Beta, and macOS 12.3 Beta.