37396 – new-run-webkit-tests should log the order tests are run in

RESOLVED FIXED


https://bugs.webkit.org/show_bug.cgi?id=37396

Summary new-run-webkit-tests should log the order tests are run in

Ojan Vafai

Reported 2010-04-10 17:12:02 PDT

It should record the order the tests are run in on each thread to a local file. This will help when encountering non-deterministic test failures. I'm picturing the final result being something like:

THREAD-1
foo/bar/baz1.html
foo/bar/baz2.html
foo/bar/baz4.html

THREAD-2
foo/bar/baz3.html
foo/bar/baz5.html
foo/bar/baz6.html

If we wanted to be really thorough, we may as well throw in the failure type there as well, i.e.

THREAD-1
foo/bar/baz1.html = TEXT
foo/bar/baz2.html = IMAGE
foo/bar/baz4.html = CRASH

THREAD-2
foo/bar/baz3.html = PASS
foo/bar/baz5.html = IMAGE+TEXT
foo/bar/baz6.html = TIMEOUT
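A minimal sketch of how such a per-thread log could be collected and rendered, in Python since NRWT is a Python tool. The class name TestOrderLog and its methods are hypothetical and are not part of new-run-webkit-tests; the output layout simply follows the second example above.

```python
import threading
from collections import defaultdict


class TestOrderLog:
    """Hypothetical helper: records (test, result) pairs per worker thread."""

    def __init__(self):
        self._lock = threading.Lock()
        # Maps a thread name to the tests it ran, in the order they ran.
        self._runs = defaultdict(list)

    def record(self, thread_name, test, result):
        # The lock lets several worker threads record concurrently.
        with self._lock:
            self._runs[thread_name].append((test, result))

    def format(self):
        # One block per thread: a header line, then "test = RESULT" lines.
        blocks = []
        for thread_name in sorted(self._runs):
            lines = [thread_name]
            for test, result in self._runs[thread_name]:
                lines.append(f"{test} = {result}")
            blocks.append("\n".join(lines))
        return "\n\n".join(blocks) + "\n"
```

Each worker would call record() as it finishes a test, and the runner would write format() to a file at the end of the run.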


Eric Seidel (no email)

Comment 1 2010-04-10 17:16:32 PDT

Yes. I totally agree. But I only care about this when there is a failure.

It should just poop out an extra file "testname-previous-tests.txt" and link to it next to the failure in the results.html page.

The driver could keep track of what tests were run since the last driver restart and every time there is a failure poop out that file.

Ojan Vafai

Comment 2 2010-04-10 17:26:13 PDT

(In reply to comment #1)
> Yes. I totally agree. But I only care about this when there is a failure.
>
> It should just poop out an extra file "testname-previous-tests.txt" and link to
> it next to the failure in the results.html page.
>
> The driver could keep track of what tests were run since the last driver
> restart and every time there is a failure poop out that file.

It's not sufficient to just know the previous test that was run. You really need the whole history. For example, there are some tests that depend on an image being in the cache. Those tests could pass or fail based on a test run many tests ago.

Maybe we want one file per thread though. Then the link next to the failure in the results.html file can be the link to that file. It could even scroll to that test in the file (obviously the file would then need to be html).

Eric Seidel (no email)

Comment 3 2010-04-10 17:40:19 PDT

Historically, the major source of flakiness has simply been test order.

Why I think that the list of previous tests is sufficient:

1. You only need to know the tests since the last restart. DRT currently restarts every 1000 tests in run-webkit-tests. You need to know when the last restart was because some state only gets cleared on restarts.

2. Each DRT is separate, including caches. So unless the problem is contention of httpd or disk access, a per-thread list should be sufficient.

Currently you can sorta get an order from "run-webkit-tests --verbose". The problem is that it doesn't tell you when DRT restarts, so it's hard to reconstruct the previous test list w/o knowing where it should start.
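The bookkeeping described in comments 1 and 3 can be sketched roughly as follows. This is only an illustration: DriverHistory and previous_tests_filename are hypothetical names, and the way the "-previous-tests.txt" file name is derived from the test path is an assumption based on the "testname-previous-tests.txt" suggestion.

```python
import os


def previous_tests_filename(test):
    # Assumed naming: strip the extension and append "-previous-tests.txt".
    return os.path.splitext(test)[0] + "-previous-tests.txt"


class DriverHistory:
    """Hypothetical per-driver record of tests run since the last restart."""

    def __init__(self):
        self.tests_since_restart = []

    def restarted(self):
        # State from before a restart is irrelevant, so forget it.
        self.tests_since_restart = []

    def ran(self, test, passed):
        """Record a test; on failure, return the history as file contents."""
        self.tests_since_restart.append(test)
        if passed:
            return None
        # Every test since the last restart, ending with the failing test.
        return "\n".join(self.tests_since_restart) + "\n"
```

On a failure, the runner would write the returned text to previous_tests_filename(test) and link to that file from results.html.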

Alexey Proskuryakov

Comment 4 2010-04-10 23:11:42 PDT

As proposed by Zoltan Herczeg on webkit-dev, one could just store the random number generator seed. I think it's an elegant solution. Knowing the seed, you could re-run all tests in the same order.
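A small sketch of the seed idea, assuming the test order is produced by an intentional shuffle. The function name shuffled_tests is hypothetical and does not correspond to anything in run-webkit-tests.

```python
import random


def shuffled_tests(tests, seed):
    # A dedicated Random instance keeps the order independent of any
    # other use of the global random module.
    rng = random.Random(seed)
    # Sort first so the result depends only on the seed and the set of
    # tests, not on the order in which the tests were discovered.
    order = sorted(tests)
    rng.shuffle(order)
    return order
```

Logging the seed next to the results would then be enough to regenerate the same list. This reproduces only the scheduled order, not the timing of tests running on other workers.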

Dirk Pranke

Comment 5 2012-12-01 17:45:16 PST

Note that we currently do do this in the tests_run*.txt files written into layout-test-results (one file per worker). However, the files do not contain DRT/WTR pids, so you can't tell when the workers crash or are otherwise restarted, and even --debug-rwt-logging doesn't give you enough information to fix that. They also don't contain any timestamp information to help you determine which tests were running concurrently.

More importantly, it's hard (though not impossible) to do something useful with the data in the tests_run*.txt files, since you can't easily feed it back in to NRWT or control how things are sharded (you can use --test-list to feed in a single list of tests, and at least now with --order=none it'll honor that, but that won't help across multiple workers).

I think ideally we'd merge all of the tests_run* files into a single file and add a --replay <path to file> option or something that would make this easier. I think there used to be a flag that would do a simpler version of this (--retry-last-failures or something?) but I'm not seeing it there now.

I'm closing this bug for now (since we do at least record the order) and going to file a new one for the --replay enhancement.

Regarding comment #4, I'm not sure that a random number seed would be needed or useful here. The nondeterminism comes from test timing and contention, not from using a random order. (You could of course specify the seed used when intentionally randomizing the tests, but that's a whole different thing.)
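As a rough illustration of the merge step, the sketch below combines per-worker test lists into one text blob and parses it back into shards. Everything here is an assumption: the bracketed worker headers, the one-test-per-line layout, and the function names are hypothetical, and no --replay option is implied to exist.

```python
def merge_worker_logs(worker_logs):
    """Combine {worker name: [tests in run order]} into one text blob."""
    lines = []
    for worker in sorted(worker_logs):
        lines.append(f"[{worker}]")  # assumed header marking a new shard
        lines.extend(worker_logs[worker])
    return "\n".join(lines) + "\n"


def parse_merged_log(text):
    """Recover {worker name: [tests in run order]} from a merged blob."""
    shards = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            shards[current] = []
        elif current is None:
            raise ValueError("test listed before any worker header")
        else:
            shards[current].append(line)
    return shards
```

A replay feature could hand each parsed shard to one worker and run it in the recorded order. Reproducing which tests overlapped in time would still need the timestamps mentioned above.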


Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS All

Product WebKit

Component Tools / Tests

Assignee

Nobody

Reported

2010-04-10 17:12 PDT

Modified

2012-12-01 17:45 PST

CC List

4 users

URL

Keywords NRWT

Depends on

Blocks