From talking to bot watchers, it seems that reporting only the number of failures of the worst type, rather than the number of all failures, is a source of confusion. Make the failure count include all failures, regardless of type. For example, a test run with 30 crashes, 4 timeouts and 13 ImageDiff failures would be assigned the color purple, but have a count of 47. With the current behavior, such a run is assigned the color purple and has a count of 30.
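To make the difference concrete, here is a minimal sketch of the two counting rules. It is not the actual results-database code: the `SEVERITY` ordering, the failure-type keys and the `summarize` helper are all hypothetical names chosen for illustration, and the only ordering the example above implies is that crashes outrank the other types.

```python
# Hypothetical severity order, worst first. Only "crashes are worst" is
# implied by the example in this bug; the rest is an assumption.
SEVERITY = ["crash", "timeout", "image_diff"]


def summarize(failures):
    """Return (worst_type, old_count, new_count) for one test run.

    failures maps a failure type to the number of tests that failed that way.
    """
    present = [kind for kind in SEVERITY if failures.get(kind, 0) > 0]
    worst = present[0] if present else None
    # Current behavior: count only the failures of the worst type.
    old_count = failures.get(worst, 0) if worst else 0
    # Proposed behavior: count every failure, regardless of type.
    new_count = sum(failures.values())
    return worst, old_count, new_count


run = {"crash": 30, "timeout": 4, "image_diff": 13}
worst, old_count, new_count = summarize(run)
print(worst, old_count, new_count)  # crash 30 47
```

The worst type still decides the color (purple for the run above), so only the displayed number changes.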
Created attachment 374893 [details] Patch
Agreed, this makes more sense.
rs=me
Created attachment 374927 [details] Patch
Comment on attachment 374927 [details] Patch
Clearing flags on attachment: 374927
Committed r247850: <https://trac.webkit.org/changeset/247850>
All reviewed patches have been landed. Closing bug.
<rdar://problem/53568693>
Reopening to attach new patch.
Created attachment 374962 [details] Patch
Comment on attachment 374962 [details] Patch
Clearing flags on attachment: 374962
Committed r247863: <https://trac.webkit.org/changeset/247863>