Bug 105531 - [WTR] Wrong totals in WebKitTestRunner.
Summary: [WTR] Wrong totals in WebKitTestRunner.
Status: UNCONFIRMED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: NRWT
Depends on:
Blocks:
 
Reported: 2012-12-20 06:30 PST by Mateusz Leszko
Modified: 2013-01-14 17:46 PST
11 users

See Also:


Attachments
test results (43.17 KB, text/plain)
2012-12-20 06:31 PST, Mateusz Leszko

Description Mateusz Leszko 2012-12-20 06:30:40 PST
When using the --debug-rwt-logging parameter, the totals are not correct.
See the example in the attachment.

I see two problems there:
- 229 tests ran as expected, but the totals say 228.
- We got 2 missing results; a missing result shouldn't be counted as expected if it's not expected.

I think this parameter should report a ratio like "passed as expected" / "all tests run" in the totals.
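For illustration, the ratio being asked for could look like the sketch below. This is hypothetical code, not actual webkitpy output formatting; the numbers are taken from the attached log (232 tests found, 3 skipped, 229 run).

```python
# Hypothetical sketch of the totals line the reporter expects:
# tests that ran as expected, out of all tests actually run.
def expected_ratio(ran_as_expected, tests_run):
    return "%d/%d tests ran as expected" % (ran_as_expected, tests_run)

# Numbers from the attached log: 232 found, 3 skipped -> 229 run.
print(expected_ratio(229, 232 - 3))
```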
Comment 1 Mateusz Leszko 2012-12-20 06:31:27 PST
Created attachment 180333 [details]
test results
Comment 2 Mateusz Leszko 2012-12-31 02:29:30 PST
CC'ing owners of
 Tools/Scripts/new-run-webkit-tests
 Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py
Comment 3 Zan Dobersek 2012-12-31 03:01:06 PST
I don't think test result totals are related to the test runner (i.e. DRT or WTR), so this bug might be related to bug #105636.
Comment 4 Mateusz Leszko 2013-01-03 06:53:31 PST
Dirk, can you explain how to get a ratio like "passed as expected" / "all tests run"?
Comment 5 Dirk Pranke 2013-01-03 12:52:01 PST
I think there are probably a couple of bugs in NRWT. I've been away for a few days, so I haven't had a chance to look into it, but I will soon.
Comment 6 Dirk Pranke 2013-01-14 17:46:21 PST
Okay, sorry for the delay, I finally got time to look at this.

(In reply to comment #0)
> When using the --debug-rwt-logging parameter, the totals are not correct.
> Example in attachment.
> 
> I see there two problems:
> -    229 tests ran as expected, but the totals say 228.

Here we have the problem of data being sliced and diced in different ways.

The "229" number comes from finding 232 tests and skipping 3 of them.

The "228" number comes from finding 228 tests and having 4 of them fail. In this case tests that are skipped are counted as having passed.

> -    We got 2 missing results – this shouldn’t be counted as expected if it's not expected.
>

I don't know which version you ran this against (and so what the contents of your TestExpectations files were), but the lack of a "failed unexpectedly" line in the log probably means that the two tests were marked as Missing, and so they were expected.
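The bucketing described above boils down to a simple membership check; the following is a deliberately simplified, hypothetical version of the expected/unexpected logic, not the actual webkitpy implementation:

```python
# Simplified, hypothetical expected/unexpected bucketing: a result is
# "expected" if it is among the outcomes listed for that test in
# TestExpectations.
def is_expected(actual_result, expected_results):
    return actual_result in expected_results

# A test marked Missing in TestExpectations that then produces a
# missing result counts as expected, so no "failed unexpectedly"
# line appears in the log.
print(is_expected("MISSING", {"MISSING"}))  # expected
print(is_expected("MISSING", {"PASS"}))    # unexpected
```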

So, I don't see any actually incorrect computations here.

As I mentioned somewhere (email?), the problem with these routines is that there are lots of different ways one might want to slice and dice these statistics, and it's hard to come up with meaningful, clear buckets without being really, really verbose. That said, I'm definitely open to changes to the wording, or to how we are in fact bucketing things, if that will help someone.