Bug 308329
| Summary: | REGRESSION (306632@main): webrtc/candidate-stats.html is a flaky failure | ||
|---|---|---|---|
| Product: | WebKit | Reporter: | Diego De La Toba <d_delatoba> |
| Component: | WebRTC | Assignee: | youenn fablet <youennf> |
| Status: | RESOLVED FIXED | ||
| Severity: | Normal | CC: | annevk, sam, webkit-bot-watchers-bugzilla, webkit-bug-importer, youennf |
| Priority: | P2 | Keywords: | InRadar |
| Version: | WebKit Nightly Build | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
Diego De La Toba
webrtc/candidate-stats.html is a constant text failure on Tahoe debug
HISTORY:
https://results.webkit.org/?suite=layout-tests&test=webrtc%2Fcandidate-stats.html
DIFF:
--- /Volumes/Data/worker/Apple-Tahoe-Debug-AppleSilicon-WK2-Tests/build/layout-test-results/webrtc/candidate-stats-expected.txt
+++ /Volumes/Data/worker/Apple-Tahoe-Debug-AppleSilicon-WK2-Tests/build/layout-test-results/webrtc/candidate-stats-actual.txt
@@ -1,3 +1,3 @@
-PASS ICE candidate data channel stats
+FAIL ICE candidate data channel stats assert_array_equals: local lengths differ, expected array ["id", "timestamp", "type", "candidateType", "foundation", "port", "priority", "protocol", "transportId", "usernameFragment"] length 10, got ["id", "timestamp", "type", "candidateType", "foundation", "port", "priority", "protocol", "tcpType", "transportId", "usernameFragment"] length 11
DIFF URL:
https://build.webkit.org/results/Apple-Tahoe-Debug-AppleSilicon-WK2-Tests/307920%40main%20(947)/webrtc/candidate-stats-pretty-diff.html
Reproduction:
I was able to reproduce this failure on macOS Tahoe ToT with the following:
run-webkit-tests --no-build --no-retry --no-show-results --exit-after-n-failures=3 --expect-pass --iterations=1000 --force --clobber-old-results webrtc/candidate-stats.html --debug
I am going to mark expectations as fail while this pends investigation
| Attachments | ||
|---|---|---|
| Add attachment proposed patch, testcase, etc. |
Radar WebKit Bug Importer
<rdar://problem/170829395>
Alexey Proskuryakov
What I see in history is that the test became flaky in early February on all macOS queues, not just Tahoe or debug.
The first failure I see is with 306635@main, which makes Sam's 306632@main a bit suspicious.
Diego De La Toba
This is a constant failure when testing locally. However, per history it is flaky.
Sam Weinig
My change does indeed look suspicious.
But so far have not able to reproduce this on macOS 26.3, even after upping the number of iterations to 2000. Anything else you can think I would need to do to get this to happen?
Some initial code analysis:
--
The failure state is that the `RTCIceCandidateStats` dictionary sometimes has a `tcpType` member when it should not.
`tcpType` has the IDL type `RTCIceTcpCandidateType tcpType` (e.g. implicitly optional since it is not marked required) and the correct corresponding c++ type, `std::optional<RTCIceTcpCandidateType> tcpType` in the `IceCandidateStats` struct (this did not change in the
The only place that constructs the IceCandidateStats struct is IceCandidateStats::convert .
In that convert function, tcpType is initialized like this:
rtcStats.tcp_type ? parseEnumerationFromString<RTCIceTcpCandidateType>(fromStdString(*rtcStats.tcp_type)) : std::nullopt,
Meaning, the only way it can be non-nullopt is if webrtc::RTCIceCandidateStats has a non-nullopt rtcStats.tcp_type.
In the old code, this was the same, except instead of using a ternary operator, it used an if statement (it was also a constructor, so looks a bit different):
if (rtcStats.tcp_type) {
if (auto tcpType = parseEnumerationFromString<RTCIceTcpCandidateType>(fromStdString(*rtcStats.tcp_type)))
tcpType = *tcpType;
}
This is leading me to believe the only way this can be flakey is if webrtc::RTCIceCandidateStats sometimes includes tcp_type.
But I will keep looking.
Sam Weinig
If you are able to reproduce this locally, can you bisect it to the commit that broke it?
Diego De La Toba
locally test is now flaky and reproduces with:
run-webkit-tests --root <PATH-TO-SPADE> --no-retry --no-show-results --expect-pass --iterations=1000 --force webrtc/candidate-stats.html --debug -f
I'll attempt bisection
Anne van Kesteren
I could reproduce this morning with instructions from OP, but I can't reproduce right now. Not sure what commit might have fixed it though.
youenn fablet
An ice candidate can either be udp or tcp. UDP candidates are usually gathered first, then tcp candidates are.
It is possible that the stats generator is sometimes exposing TCP candidate stats before UDP, hence the flakiness.
I think we should fix the test itself by checking whether protocol stats field is "tcp" or "udp" and if tcp validate that there is a tcpType stats field.
I can do this next week when I am back from holidays.
Sam Weinig
I now think it is unlikely the commit mentioned in the title is the regression point, but since I can't reproduce, I can't prove that. It seems more likely there is something racy in the libWebRTC code as Youenn eluded to.
youenn fablet
Pull request: https://github.com/WebKit/WebKit/pull/59707
EWS
Committed 308527@main (fdb557e81e0f): <https://commits.webkit.org/308527@main>
Reviewed commits have been landed. Closing PR #59707 and removing active labels.