54855 – Let NRWT print a detailed summary of expected results

RESOLVED WONTFIX


https://bugs.webkit.org/show_bug.cgi?id=54855

Summary Let NRWT print a detailed summary of expected results

Xianzhu Wang

Reported 2011-02-20 23:30:50 PST

I've found that a detailed summary of expected results helps me understand the current state of the expectations. I propose an option '--print=detailed-expected' which behaves like '--print=expected' but gives a detailed summary of each combination of expectations of each type. An example output is as follows:

Found: 23149 tests
Expect: 13564 passes (13500 now,    64 wontfix)
Expect:   821 failures (  756 now,    65 wontfix)
            1 failures (    1 now,     0 wontfix) CRASH
          252 failures (  207 now,    45 wontfix) FAIL
           94 failures (   94 now,     0 wontfix) IMAGE
           71 failures (   61 now,    10 wontfix) IMAGE+TEXT
            1 failures (    1 now,     0 wontfix) MISSING
          390 failures (  381 now,     9 wontfix) TEXT
           12 failures (   11 now,     1 wontfix) TIMEOUT
Expect:   150 flaky (  149 now,     1 wontfix)
            4 flaky (    4 now,     0 wontfix) FAIL CRASH
            3 flaky (    3 now,     0 wontfix) FAIL TIMEOUT
            3 flaky (    3 now,     0 wontfix) IMAGE IMAGE+TEXT
            2 flaky (    2 now,     0 wontfix) IMAGE+TEXT CRASH
           17 flaky (   17 now,     0 wontfix) PASS CRASH
            6 flaky (    5 now,     1 wontfix) PASS FAIL
            1 flaky (    1 now,     0 wontfix) PASS FAIL CRASH
            1 flaky (    1 now,     0 wontfix) PASS FAIL TIMEOUT
            1 flaky (    1 now,     0 wontfix) PASS FAIL TIMEOUT CRASH
           21 flaky (   21 now,     0 wontfix) PASS IMAGE
            2 flaky (    2 now,     0 wontfix) PASS IMAGE IMAGE+TEXT
            5 flaky (    5 now,     0 wontfix) PASS IMAGE+TEXT
            1 flaky (    1 now,     0 wontfix) PASS MISSING
           38 flaky (   38 now,     0 wontfix) PASS TEXT
            2 flaky (    2 now,     0 wontfix) PASS TEXT CRASH
            7 flaky (    7 now,     0 wontfix) PASS TEXT TIMEOUT
            2 flaky (    2 now,     0 wontfix) PASS TEXT TIMEOUT CRASH
           21 flaky (   21 now,     0 wontfix) PASS TIMEOUT
            1 flaky (    1 now,     0 wontfix) PASS TIMEOUT CRASH
            3 flaky (    3 now,     0 wontfix) TEXT CRASH
            1 flaky (    1 now,     0 wontfix) TEXT IMAGE IMAGE+TEXT
            3 flaky (    3 now,     0 wontfix) TEXT IMAGE+TEXT
            2 flaky (    2 now,     0 wontfix) TEXT MISSING
            3 flaky (    3 now,     0 wontfix) TEXT TIMEOUT
Expect:  8614 skipped (  400 now,  8214 wontfix)
          175 skipped (   41 now,   134 wontfix) FAIL
            1 skipped (    0 now,     1 wontfix) FAIL IMAGE
           53 skipped (    1 now,    52 wontfix) FAIL TIMEOUT
            2 skipped (    0 now,     2 wontfix) IMAGE+TEXT
         1663 skipped (    2 now,  1661 wontfix) PASS
          395 skipped (  212 now,   183 wontfix) PASS FAIL
          267 skipped (    2 now,   265 wontfix) PASS FAIL TIMEOUT
            5 skipped (    5 now,     0 wontfix) PASS TIMEOUT
            1 skipped (    1 now,     0 wontfix) PASS TIMEOUT CRASH
         5740 skipped (   14 now,  5726 wontfix) TEXT
            2 skipped (    1 now,     1 wontfix) TEXT TIMEOUT
            1 skipped (    1 now,     0 wontfix) TEXT TIMEOUT CRASH
          308 skipped (  120 now,   188 wontfix) TIMEOUT
            1 skipped (    0 now,     1 wontfix) TIMEOUT CRASH
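
For illustration only, here is a minimal sketch of how a per-combination tally like the one proposed could be computed. It is not the attached patch. The names summarize_expectations, format_summary, expectations_by_test and now_tests are hypothetical, and the input shape (a dict from test name to expectation keywords, plus a set of tests that are not WONTFIX) is an assumption.

```python
from collections import defaultdict


def summarize_expectations(expectations_by_test, now_tests):
    """Count tests per expectation combination, split into now/wontfix.

    expectations_by_test: hypothetical dict of test name -> iterable of
        expectation keywords, e.g. {"fast/a.html": ["PASS", "TEXT"]}.
    now_tests: hypothetical set of test names that are not WONTFIX.
    """
    details = defaultdict(lambda: {"now": 0, "wontfix": 0})
    for test, expectations in expectations_by_test.items():
        # Sort so the same combination always produces the same key.
        combination = " ".join(sorted(expectations))
        bucket = "now" if test in now_tests else "wontfix"
        details[combination][bucket] += 1
    return dict(details)


def format_summary(details, label):
    """Render one line per combination, loosely in the style shown above."""
    lines = []
    for combination in sorted(details):
        counts = details[combination]
        total = counts["now"] + counts["wontfix"]
        lines.append("%6d %s (%5d now, %5d wontfix) %s" % (
            total, label, counts["now"], counts["wontfix"], combination))
    return lines
```

Note that this sketch orders the keywords inside a combination alphabetically, which differs from the ordering in the example output (for instance "PASS TEXT CRASH").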

Attachments
patch (4.13 KB, patch) 2011-02-21 00:04 PST, Xianzhu Wang, no flags
patch with unit test (5.11 KB, patch) 2011-02-21 00:58 PST, Xianzhu Wang, no flags

Xianzhu Wang

Comment 1 2011-02-21 00:04:15 PST

Created attachment 83134: patch

Xianzhu Wang

Comment 2 2011-02-21 00:58:24 PST

Created attachment 83136: patch with unit test

This patch contains a unit test for printing.py only, not for test_runner.py, because much work is needed to make the related part of test_runner.py testable. That can be done later.

Dirk Pranke

Comment 3 2011-02-21 18:46:24 PST

Comment on attachment 83136: patch with unit test

View in context: https://bugs.webkit.org/attachment.cgi?id=83136&action=review

> Tools/Scripts/webkitpy/layout_tests/layout_package/test_runner.py:897
> + details[expectation]['now' if test in now else 'wontfix'] += 1

Nit: We don't often use the ternary operator in our code, so you might be better off rewriting this as an if/else statement. But it's kind of a judgement call. I hate to ban a perfectly good Python construct for no real reason.

Technically, the patch looks fine. However, I wonder how useful this output will really be to anyone but you. I've increasingly found myself thinking that new-run-webkit-tests has too many knobs for displaying statistics in different ways, and that we log too many statistics and too much stuff in our verbose output. One could easily already argue that no one but me (and maybe one or two other people) knows what all the existing --print options do, let alone what another added one would do.

So, while not intending to directly derail this patch, I'd like to ask what you're using these numbers for. Is there some other way we could be getting these numbers that wouldn't require additional command line flags, or for this stuff to be computed and logged at runtime on every run?

For example, do you really need this to be printed on every run? Or is this something that you just want to track occasionally, as a report, or something like that? Or is it perhaps better to generate this as a dashboard report across revisions, without actually running NRWT, since this data can be generated statically (much like the LTTF dashboard I wrote that you've probably never seen does)? Or is it maybe better to have RWT output a full table of test results in some form for a run, which can then be post-processed into whatever format you want?

What do others think?
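
As a sketch of the rewrite suggested in the nit, the conditional expression on the quoted line could be expanded into an explicit if/else as below. The helper name tally and its parameter list are hypothetical. Only the expression on the quoted line comes from the patch, and the surrounding code in test_runner.py is not shown in this thread.

```python
def tally(details, expectation, test, now):
    """Increment the 'now' or 'wontfix' counter for one test.

    Equivalent to:
        details[expectation]['now' if test in now else 'wontfix'] += 1
    but written without a conditional expression.
    """
    if test in now:
        key = 'now'
    else:
        key = 'wontfix'
    details[expectation][key] += 1
```

Both forms behave the same, so the choice is purely one of house style.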

Dirk Pranke

Comment 4 2011-02-21 18:47:10 PST

(cc'ing a few other people who might care one way or another) ...

Ojan Vafai

Comment 5 2011-02-21 20:05:31 PST

I agree with Dirk. We spew out way too much already. How do you use this data in a way that is more helpful than just grepping the test_expectations file?

Xianzhu Wang

Comment 6 2011-02-21 20:16:47 PST

(In reply to comment #3)

Thanks Dirk and Ojan for your replies. I want to get reports from time to time, with or without actually running NRWT. Post-processing the output of '--print trace-everything' seems better than what this patch does. I'll close this bug.

Ojan Vafai

Comment 7 2011-02-21 20:22:18 PST

You can probably get most of the data you want from the JSON files that the bot generates (it's just a JSON form of test_expectations.txt). You should be able to generate all this data with that one file and a local checkout. http://test-results.appspot.com/testfile?name=expectations.json
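
A rough sketch of the kind of offline post-processing described above, counting expectation combinations from a JSON document instead of computing them inside NRWT. The schema used here is an assumption and is not confirmed anywhere in this thread: a mapping from test path to a list of entries, each with a "modifiers" string and an "expectations" string. The function name count_combinations is hypothetical as well, so adjust the parsing to whatever the real file contains.

```python
import json
from collections import Counter


def count_combinations(expectations_json):
    """Count (expectation combination, bucket) pairs from a JSON string.

    ASSUMED schema (hypothetical, not confirmed by this thread):
        {"path/to/test.html": [{"modifiers": "BUG1 WONTFIX",
                                "expectations": "TEXT TIMEOUT"}, ...]}
    """
    data = json.loads(expectations_json)
    counts = Counter()
    for test, entries in data.items():
        for entry in entries:
            modifiers = entry.get("modifiers", "").upper().split()
            # An entry counts as 'wontfix' only if it carries that modifier.
            bucket = "wontfix" if "WONTFIX" in modifiers else "now"
            combination = " ".join(entry.get("expectations", "").upper().split())
            counts[(combination, bucket)] += 1
    return counts
```

Because this runs against a saved file, it needs no new command line flag and adds no work to a normal test run.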


Status RESOLVED

Resolution WONTFIX

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS All

Product WebKit

Component Tools / Tests

Assignee Nobody

Reported 2011-02-20 23:30 PST

Modified 2011-02-21 20:22 PST

CC List 9 users
