Bug 90121
| Summary: | nrwt should support "-failing" baselines/expectations | ||
|---|---|---|---|
| Product: | WebKit | Reporter: | Dirk Pranke <dpranke> |
| Component: | Tools / Tests | Assignee: | Dirk Pranke <dpranke> |
| Status: | RESOLVED WONTFIX | ||
| Severity: | Normal | CC: | abarth, darin, jacobg, mjs, ojan, rniwa, simon.fraser, tony |
| Priority: | P2 | Keywords: | NRWT |
| Version: | 528+ (Nightly build) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
Dirk Pranke
Currently there's no good way to distinguish between a baseline file that is believed to produce the "correct" result and a baseline file that is believed to be the "current" (but incorrect) result. It would be nice if we could tell which was which; this is part of the reason the Chromium port has historically suppressed so many test failures.
Of course, we also have a large legacy of existing baselines that could be in either state, so we should come up with some way to accommodate that as well. This has two implications: "-failing" should probably be optional, and there should ideally be some way of distinguishing "-expected correct" from "-expected current but correctness unknown".
We would also need a way to deal with the idea that "-expected correct" is no longer current, i.e., the test changes in such a way that the "-expected" file would need to be updated.
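To make the naming idea concrete, here is a minimal sketch of how a runner might look up a baseline when a "-failing" file is optional. Everything in it is an assumption for illustration: the `BaselineKind` enum, the `find_baseline` helper, the `.txt` extension default, and the choice to prefer "-failing" over "-expected" when both exist are hypothetical and are not taken from nrwt.

```python
import posixpath
from enum import Enum
from typing import Iterable, Optional, Tuple


class BaselineKind(Enum):
    # Hypothetical labels, not nrwt names.
    EXPECTED = "expected"  # legacy baseline: current result, correctness unknown
    FAILING = "failing"    # current result that is believed to be incorrect


def find_baseline(
    test_name: str,
    existing_files: Iterable[str],
    ext: str = ".txt",
) -> Optional[Tuple[str, BaselineKind]]:
    """Return (baseline path, kind) for a test, or None if no baseline exists.

    Assumption: a "-failing" baseline, when present, takes precedence over
    the "-expected" one. The bug does not specify this ordering.
    """
    stem, _ = posixpath.splitext(test_name)
    files = set(existing_files)
    for kind in (BaselineKind.FAILING, BaselineKind.EXPECTED):
        candidate = f"{stem}-{kind.value}{ext}"
        if candidate in files:
            return candidate, kind
    return None
```

Because the "-failing" file is optional, existing "-expected" baselines keep working unchanged under this scheme. It does not address the third state mentioned above ("-expected" that is known to be correct), which would need a further marker of some kind.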
Frankly, it's not clear if the overhead of trying to track this is worth it, but it seems like it might be good to try incrementally (e.g., on a few directories of tests on only a subset of the ports).
Anyone else have thoughts, or proposals for how to implement this?
Adam Barth
It sounds too hard to keep this all straight. We have enough trouble just rebaselining all the tests that are correct.
Dirk Pranke
Could be ... I'm going to at least think through the implications of the different options and see if I can come up with something that might work and be no worse than things are today; if not, I'll WONTFIX it.
Dirk Pranke
I'm gonna close this as "WONTFIX". It seems like only the Chromium ports really wanted this, and the other ports would be content to just check in "failing" results as -expected.
I'm still probably gonna work on this in some form over in blink-land.
If anyone objects or is still particularly interested, let me know.