Bug 88442 - run-webkit-tests --reset-results prints out that it resets passing tests
Summary: run-webkit-tests --reset-results prints out that it resets passing tests
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: NRWT
Duplicates: 93354, 100993
Depends on:
Blocks:
 
Reported: 2012-06-06 12:20 PDT by Ojan Vafai
Modified: 2012-11-02 11:49 PDT (History)
5 users

See Also:


Description Ojan Vafai 2012-06-06 12:20:44 PDT
I ran run-webkit-tests css3/flexbox --reset-results and, for each file, got a line saying it was writing out a new expected result. I'd expect it to do so only for tests that fail.

While this isn't incorrect, it's not the most user-friendly. In practice, what I care about is which results actually changed; I'd much rather see only that list.
Comment 1 Dirk Pranke 2012-06-06 12:28:45 PDT
Seems reasonable to only reset the results for tests that fail, I agree.
Comment 2 Eric Seidel (no email) 2012-07-17 11:06:42 PDT
This is quite confusing (and noisy).
Comment 3 Eric Seidel (no email) 2012-07-17 11:08:50 PDT
Also, if it could compare the results and avoid writing new ones when they haven't changed (and thus avoid touching the mod date), I expect subsequent git/svn operations would be much faster.
Comment 4 Dirk Pranke 2012-11-01 15:11:08 PDT
*** Bug 100993 has been marked as a duplicate of this bug. ***
Comment 5 Dirk Pranke 2012-11-01 15:11:29 PDT
*** Bug 93354 has been marked as a duplicate of this bug. ***
Comment 6 Dirk Pranke 2012-11-01 15:14:02 PDT
Rolling a couple of other similar requests into this bug. Basically, when writing new baselines, we should only do so if they're different from the existing baselines. Also, we should optimize the result: if the new result now matches the next baseline in the search path, we should delete the existing baseline instead of writing a new one. In other words, if foo-expected.txt == "a" and platform/mac/foo-expected.txt == "b" and we write a new result that == "a", we should delete the platform/mac result instead.

(And the logging should be clear about what's going on.)
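The optimization described above could be sketched roughly as follows. This is a hypothetical helper, not actual NRWT code; the function name, parameter names, and log messages are all illustrative:

```python
import os

def write_optimized_baseline(new_result, baseline_path, fallback_path):
    """Write a platform-specific baseline only when it is actually needed.

    baseline_path: the platform-specific baseline (e.g. platform/mac/foo-expected.txt)
    fallback_path: the next baseline in the search path (e.g. foo-expected.txt)
    """
    def read(path):
        if not os.path.exists(path):
            return None
        with open(path) as f:
            return f.read()

    # If the new result matches the fallback baseline, the platform-specific
    # baseline is redundant: delete it instead of writing it.
    if read(fallback_path) == new_result:
        if os.path.exists(baseline_path):
            os.remove(baseline_path)
            print("Removed redundant baseline %s" % baseline_path)
        return

    # Only write (and log) when the content actually changed, so mod dates
    # stay stable and subsequent git/svn operations stay fast.
    if read(baseline_path) != new_result:
        with open(baseline_path, "w") as f:
            f.write(new_result)
        print("Wrote new baseline %s" % baseline_path)
```

With foo-expected.txt containing "a" and platform/mac/foo-expected.txt containing "b", writing a new result of "a" deletes the platform/mac baseline; writing any other result rewrites it only if the content differs.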
Comment 7 Simon Fraser (smfr) 2012-11-02 11:08:22 PDT
I think we need some way to land new baselines for all tests that are currently expected to pass.

For example, when enabling subpixel rendering on Mac, there is no option I can use with RWT to say "make new baselines for tests that are already passing".
Comment 8 Dirk Pranke 2012-11-02 11:26:11 PDT
Sorry, I should've phrased things differently. I agree with Simon that there does need to be a way to update the baselines for existing tests even if they're passing. It probably just shouldn't do that by default.

That said, to make sure I'm not missing something, Simon, doesn't RWT do what you want right now? I think it *always* updates the baselines with that flag, and never looks at the existing baselines to see whether they match.
Comment 9 Ojan Vafai 2012-11-02 11:29:59 PDT
I'm not understanding something. Whether the test is currently expected to pass or not is irrelevant. But if it actually did pass in the run whose results you are trying to reset, then overwriting the file should be a noop, no?
Comment 10 Dirk Pranke 2012-11-02 11:31:38 PDT
(In reply to comment #9)
> I'm not understanding something. Whether the test is currently expected to pass or not is irrelevant. But, if it actually did pass in the run that you are trying to reset results, then overwriting the file should be a noop, no?

On the chromium port, yes. However, if you're using fuzzy pixel diffing, things might be different.
Comment 11 Simon Fraser (smfr) 2012-11-02 11:33:03 PDT
(In reply to comment #9)
> I'm not understanding something. Whether the test is currently expected to pass or not is irrelevant.

Not if you're making a change that affects many tests in benign ways (like enabling subpixel layout).

I want a way to say "lay down new test results for all tests that are currently expected to pass" for this reason.

> But, if it actually did pass in the run that you are trying to reset results, then overwriting the file should be a noop, no?

It touches the mod date, but otherwise yes.
Comment 12 Ojan Vafai 2012-11-02 11:49:59 PDT
(In reply to comment #10)
> (In reply to comment #9)
> > I'm not understanding something. Whether the test is currently expected to pass or not is irrelevant. But, if it actually did pass in the run that you are trying to reset results, then overwriting the file should be a noop, no?
> 
> On the chromium port, yes. However, if you're using fuzzy pixel diffing, things might be different.

Ah right. Fuzzy diffing is what I wasn't thinking of. I think we *should* always rebaseline passing tests too. We just shouldn't write the file out or print anything to stdout if the file we're writing is the same as the existing one. Is there any downside to doing it this way?