| Summary: | run-webkit-tests --reset-results prints out that it resets passing tests | | |
|---|---|---|---|
| Product: | WebKit | Reporter: | Ojan Vafai &lt;ojan&gt; |
| Component: | Tools / Tests | Assignee: | Nobody &lt;webkit-unassigned&gt; |
| Status: | NEW | | |
| Severity: | Normal | CC: | dpranke, eric, simon.fraser, tmpsantos, tony |
| Priority: | P2 | Keywords: | NRWT |
| Version: | 528+ (Nightly build) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
Description
Ojan Vafai 2012-06-06 12:20:44 PDT
Seems reasonable to only reset the results for tests that fail, I agree. This is quite confusing (and noisy). Also, if it could compare the results and avoid writing new ones (thus changing the mod date) when the results haven't changed, I expect it could make subsequent git/svn operations much faster.

*** Bug 100993 has been marked as a duplicate of this bug. ***

*** Bug 93354 has been marked as a duplicate of this bug. ***

Rolling a couple other similar requests into this bug ... basically, when writing new baselines we should only do so if they're different from the existing baselines. Also, we should optimize the result so that if the new result now matches the next one in the search path, we delete the result instead of writing it. In other words, if foo-expected.txt == "a" and platform/mac/foo-expected.txt == "b" and we write a new result that == "a", we should delete the platform/mac result instead. (And the logging should be clear about what's going on.)

I think we need some way to land new baselines for all tests that are currently expected to pass. For example, when enabling subpixel rendering on Mac, there is no option I can use with RWT to say "make new baselines for tests that are already passing".

Sorry, I should've phrased things differently. I agree with Simon that there does need to be a way to update the baselines for existing tests even if they're passing. It probably just shouldn't do that by default. That said, to make sure I'm not missing something, Simon, doesn't RWT do what you want right now? I think it *always* updates the baselines with that flag, and doesn't ever look at the existing baselines to see if they match or not.

I'm not understanding something. Whether the test is currently expected to pass or not is irrelevant. But, if it actually did pass in the run whose results you are trying to reset, then overwriting the file should be a noop, no?

(In reply to comment #9)
> I'm not understanding something. Whether the test is currently expected to pass or not is irrelevant. But, if it actually did pass in the run that you are trying to reset results, then overwriting the file should be a noop, no?

On the chromium port, yes. However, if you're using fuzzy pixel diffing, things might be different.

(In reply to comment #9)
> I'm not understanding something. Whether the test is currently expected to pass or not is irrelevant.

Not if you're making a change that affects many tests in benign ways (like enabling subpixel layout). I want a way to say "lay down new test results for all tests that are currently expected to pass" for this reason.

> But, if it actually did pass in the run that you are trying to reset results, then overwriting the file should be a noop, no?

It touches the mod date, but otherwise yes.

(In reply to comment #10)
> (In reply to comment #9)
> > I'm not understanding something. Whether the test is currently expected to pass or not is irrelevant. But, if it actually did pass in the run that you are trying to reset results, then overwriting the file should be a noop, no?
>
> On the chromium port, yes. However, if you're using fuzzy pixel diffing, things might be different.

Ah right. Fuzzy diffing is what I wasn't thinking of. I think we *should* always rebaseline passing tests too. We just shouldn't write a file out and print something to stdout if the file we're writing out is the same as the existing one. Is there any downside to doing it this way?
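The behavior the thread converges on can be sketched roughly as follows. This is a hypothetical illustration, not the actual webkitpy code; the function name `update_baseline` and its parameters are made up for the example. It combines the two requests: skip the write (and the log line, and the mod-date change) when the contents are unchanged, and delete a platform-specific baseline when the new result now matches the next baseline in the fallback search path.

```python
import os


def update_baseline(path, new_contents, fallback_contents=None):
    """Write a new baseline only when it actually changes something.

    Hypothetical sketch of the behavior requested in this bug:
      * If the new result matches the next baseline in the fallback
        search path (e.g. platform/mac/foo-expected.txt vs. the
        generic foo-expected.txt), delete the platform-specific file
        instead of writing a duplicate.
      * If the file on disk already holds the same contents, leave it
        alone so its mod date (and subsequent git/svn operations)
        is untouched.
    Returns a short action string suitable for logging.
    """
    if fallback_contents is not None and new_contents == fallback_contents:
        if os.path.exists(path):
            # The more specific baseline would now duplicate the
            # fallback; remove it so the generic result is used.
            os.remove(path)
            return "deleted (matches fallback)"
        return "skipped (fallback already matches)"

    if os.path.exists(path):
        with open(path, "rb") as f:
            if f.read() == new_contents:
                # Passing, unchanged test: don't rewrite or log it.
                return "unchanged (not rewritten)"

    with open(path, "wb") as f:
        f.write(new_contents)
    return "written"
```

With this shape, `--reset-results` could still rebaseline passing tests (covering the fuzzy-diff and subpixel-layout cases above) while only printing output, touching mod dates, or deleting files when something genuinely changed.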