Summary: | ThreadSanitizer: data race in WebCore::CARingBufferStorageVector::setCurrentFrameBounds() / getCurrentFrameBounds() | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | David Kilzer (:ddkilzer) <ddkilzer> | ||||||||||
Component: | Media | Assignee: | David Kilzer (:ddkilzer) <ddkilzer> | ||||||||||
Status: | RESOLVED FIXED | ||||||||||||
Severity: | Normal | CC: | cdumez, eric.carlson, ews-watchlist, ggaren, glenn, gsnedders, jer.noble, philipj, sergio, webkit-bug-importer, ysuzuki | ||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||
Version: | WebKit Nightly Build | ||||||||||||
Hardware: | Unspecified | ||||||||||||
OS: | Unspecified | ||||||||||||
Attachments: |
|
Description
David Kilzer (:ddkilzer)
2021-08-11 15:47:40 PDT
Seen in these layout tests: fast/mediastream/getUserMedia-webaudio.html fast/mediastream/mediastreamtrack-audio-clone.html imported/w3c/web-platform-tests/webrtc/RTCDTMFSender-insertDTMF.https.html imported/w3c/web-platform-tests/webrtc/RTCPeerConnection-iceConnectionState.https.html imported/w3c/web-platform-tests/webrtc/RTCPeerConnection-track-stats.https.html imported/w3c/web-platform-tests/webrtc/protocol/missing-fields.html webrtc/audio-peer-connection-g722.html webrtc/audio-peer-connection-webaudio.html webrtc/audio-replace-track.html webrtc/peer-connection-audio-mute.html webrtc/peer-connection-audio-mute2.html webrtc/peer-connection-createMediaStreamDestination.html webrtc/peer-connection-remote-audio-mute.html webrtc/peer-connection-remote-audio-mute2.html cc'ing Jer for visibility here. I remember there are some constraints here as we don't want to be locking on the audio thread. My understanding was that CARingBuffer had some code that was intentionally racy to avoid locking. Created attachment 435381 [details]
Patch v1
(In reply to Chris Dumez from comment #3) > cc'ing Jer for visibility here. I remember there are some constraints here > as we don't want to be locking on the audio thread. My understanding was > that CARingBuffer had some code that was intentionally racy to avoid locking. Good to know. I think getCurrentFrameBounds() definitely needs locking (otherwise you get inconsistent values for start/end), but I could see an argument for not locking currentStartFrame() and currentEndFrame(). We can also tell TSan to ignore individual methods via the `SUPPRESS_TSAN` attribute. (In reply to David Kilzer (:ddkilzer) from comment #5) > (In reply to Chris Dumez from comment #3) > > cc'ing Jer for visibility here. I remember there are some constraints here > > as we don't want to be locking on the audio thread. My understanding was > > that CARingBuffer had some code that was intentionally racy to avoid locking. > > Good to know. I think getCurrentFrameBounds() definitely needs locking > (otherwise you get inconsistent values for start/end), but I could see an > argument for not locking currentStartFrame() and currentEndFrame(). > > We can also tell TSan to ignore individual methods via the `SUPPRESS_TSAN` > attribute. I defer to Eric and Jer here. The locking may be fine in this particular instance. (In reply to David Kilzer (:ddkilzer) from comment #4) > Created attachment 435381 [details] > Patch v1 Hmmm...can't use WTF::Locker in a const method? (In reply to David Kilzer (:ddkilzer) from comment #7) > (In reply to David Kilzer (:ddkilzer) from comment #4) > > Created attachment 435381 [details] > > Patch v1 > > Hmmm...can't use WTF::Locker in a const method? Mark Lock as mutable? Created attachment 435385 [details]
Patch v2
Comment on attachment 435385 [details]
Patch v2
The entire point of the CARingBuffer is for it to be lockless; adding a lock to address this race would be very bad for high-priority audio thread performance. We should find another way to address this race.
I'm not sure this is worth fixing; CARingBuffer is inherently racy, and that's fine. Can we just disable this warning somehow? (In reply to Jer Noble from comment #11) > I'm not sure this is worth fixing; CARingBuffer is inherently racy, and > that's fine. Can we just disable this warning somehow? Yes, with TSAN_SUPPRESS. ``` volatile uint64_t m_startFrame; volatile uint64_t m_endFrame; volatile uint32_t m_updateCounter; ``` Should we mark these as atomic instead of volatile? I believe they are marked as volatile exactly to deal with this threading race. However, iirc, using volatile for such things is not entirely correct. cc'ing Geoff / Yusuke who probably know for sure. Created attachment 435389 [details]
Patch v3
(In reply to Chris Dumez from comment #13) > ``` > volatile uint64_t m_startFrame; > volatile uint64_t m_endFrame; > volatile uint32_t m_updateCounter; > ``` > > Should we mark these as atomic instead of volatile? I believe they are > marked as volatile exactly to deal with this threading race. However, iirc, > using volatile for such things is not entirely correct. cc'ing Geoff / > Yusuke who probably know for sure. volatile is unsafe here and can give us races (e.g. on 32-bit we could in principle read while only the half of m_startFrame has been written); in the majority of cases like this we should be using atomics with relaxed ordering which shouldn't have any perf penalty. Created attachment 435435 [details]
Patch v4
(In reply to David Kilzer (:ddkilzer) from comment #16) > Created attachment 435435 [details] > Patch v4 Use the correct macro (SUPPRESS_TSAN, not TSAN_SUPPRESS) and put it in a location to make GCC happy?! Also removed the ASSERT() macros as this patch is only focused on suppressing the false positive TSan reports now. Comment on attachment 435435 [details] Patch v4 Setting cq+ because this change has no impact on non-TSan builds, so the api-ios test failure is unrelated: <https://ews-build.webkit.org/#/builders/9/builds/52389> Committed r281001 (240494@main): <https://commits.webkit.org/240494@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 435435 [details]. |