Bug 86691 - Simplify syntax in test_expectations.txt
Summary: Simplify syntax in test_expectations.txt
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Ryosuke Niwa
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-05-16 16:56 PDT by Ryosuke Niwa
Modified: 2012-05-17 19:29 PDT (History)
10 users (show)

See Also:


Attachments
Simplifies the syntax (23.78 KB, patch)
2012-05-16 21:00 PDT, Ryosuke Niwa
dpranke: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ryosuke Niwa 2012-05-16 16:56:01 PDT
Per webkit-dev and contributor's meeting discussion, we should simplify the syntax in test_expectations.txt.

Here's my proposal:
- Replace BUGWK123456 by webkit.org/b/123456 or wkb.ug/123456
- Get rid of ":" and "=" delimiters
- Move all qualifiers on one side of test name.

  e.g. either webkit.org/b/123456 WIN MAC TEXT editing/undo/my-undo-test.html
  or editing/undo/my-undo-test.html webkit.org/b/123456 WIN MAC TEXT
  NOT BUGWK123456 WIN MAC: editing/undo/my-undo-test.html = TEXT
Comment 1 Simon Fraser (smfr) 2012-05-16 17:02:02 PDT
(In reply to comment #0)
> Per webkit-dev and contributor's meeting discussion, we should simplify the syntax in test_expectations.txt.
> 
> Here's my proposal:
> - Replace BUGWK123456 by webkit.org/b/123456 or wkb.ug/123456

Why not just 123456? Do Chromium bugs show up in this list? If so, why?
Comment 2 Ryosuke Niwa 2012-05-16 17:03:42 PDT
(In reply to comment #1)
> Why not just 123456? Do Chromium bugs show up in this list? If so, why?

Yes, because sometimes bugs are caused by Skia, V8, or other Chromium code. But maybe we could special-case those instead since they're rare. e.g. Use crbug.com/12345 for Chromium bugs and 12345 for WebKit bugs.
Comment 3 Ojan Vafai 2012-05-16 17:07:33 PDT
I'm OK with making these changes. Outside of the bug number changes I'm not convinced it's actually an improvement.

If we do make these changes, then I think we should make it so that the expected result of the test is optional. For example:
foo/bar.html 12345
foo/bar.html SLOW

Should both imply that the test is expected to pass.
Comment 4 Ryosuke Niwa 2012-05-16 17:13:24 PDT
(In reply to comment #3)
> I'm OK with making these changes. Outside of the bug number changes I'm not convinced it's actually an improvement.

I've found many people adding entries without : and = and cause test failures :(

> If we do make these changes, then I think we should make it so that the expected result of the test is optional. For example:
> foo/bar.html 12345
> foo/bar.html SLOW

Shouldn't "foo/bar.html 12345" do the same thing as SKIP? Why do we ever want to list a test there if we're expecting it to pass everywhere?

It makes sense for SLOW however.
Comment 5 Ojan Vafai 2012-05-16 17:15:40 PDT
(In reply to comment #4)
> (In reply to comment #3)
> > I'm OK with making these changes. Outside of the bug number changes I'm not convinced it's actually an improvement.
> 
> I've found many people adding entries without : and = and cause test failures :(
> 
> > If we do make these changes, then I think we should make it so that the expected result of the test is optional. For example:
> > foo/bar.html 12345
> > foo/bar.html SLOW
> 
> Shouldn't "foo/bar.html 12345" do the same thing as SKIP? Why do we ever want to list a test there if we're expecting it to pass everywhere?

It's a way of associating a test with a bug. We can use this for cases where we check in the failing result as a way of keeping track of the regression so that it gets fixed eventually.

The bigger problem with implying SKIP, is how do you decide when SKIP isn't applied? That needs some hidden set of rules that's too complicated IMO.
Comment 6 Ryosuke Niwa 2012-05-16 17:19:08 PDT
(In reply to comment #5)
> It's a way of associating a test with a bug. We can use this for cases where we check in the failing result as a way of keeping track of the regression so that it gets fixed eventually.

But we'll be checking in -expected-failures.* for those cases, right (someone needs to implement it). That aligns better with our goal of putting information about tests as close as possible to tests themselves. I even hate the fact I have to look through test_expectations.txt to figure out expectations. However,

> The bigger problem with implying SKIP, is how do you decide when SKIP isn't applied? That needs some hidden set of rules that's too complicated IMO.

That's a very good point.
Comment 7 Adam Barth 2012-05-16 20:29:22 PDT
I agree with smfr that we should optimize for WebKit bugs and perhaps handle other kinds of bugs via comments with URLs.
Comment 8 Ryosuke Niwa 2012-05-16 21:00:01 PDT
Created attachment 142408 [details]
Simplifies the syntax
Comment 9 Dimitri Glazkov (Google) 2012-05-16 21:20:59 PDT
I honestly don't understand how this makes anything better. The jumbling of all of the modifiers into one place seems more confusing, not less confusing. Am I the only one who thinks that way?

If we do make this change, can we also modify run-webkit-tests' output to match the new way? Currently it's copy/paste easy to update expectations.

Also, please consider that a whole bunch of us have been updating test expectations for a long time now, and we've developed a certain mechanical memory of how the expectations are done.
Comment 10 Ryosuke Niwa 2012-05-16 21:31:16 PDT
(In reply to comment #9)
> I honestly don't understand how this makes anything better. The jumbling of all of the modifiers into one place seems more confusing, not less confusing. Am I the only one who thinks that way?
> 
> If we do make this change, can we also modify run-webkit-tests' output to match the new way? Currently it's copy/paste easy to update expectations.
> 
> Also, please consider that a whole bunch of us have been updating test expectations for a long time now, and we've developed a certain mechanical memory of how the expectations are done.

We will continue supporting the old syntax for a while we transition.
Comment 11 Dirk Pranke 2012-05-16 21:36:52 PDT
Various comments (written before rniwa uploaded a patch ... I will look at that next).

First, don't take this too negatively, but I would like to be conservative by default here. There are a large number of people used to the existing format, and many of the changes suggested so far are more or less arbitrary. While I wouldn't hold the existing format up as any sort of marvel, it's not clear that it's particularly broken, either.

Second, I agree that we should optimize for bugs in bugs.webkit.org, but I'm not sure that a bare integer is distinctive enough. Perhaps I just find it aesthetically unpleasing. I do think we need to accomodate bugs filed in other databases, and I'm not sure how "comments with URLs" would work? Also, I'm inclined to continue to support some convention for BUGDPRANKE or something like that.

Third, I think it is useful to separate the configurations the expectation applies to from the expected results, i.e. I would be inclined to keep "MAC DEBUG" somehow separated from "IMAGE". I think moving SKIP and SLOW next to IMAGE would be good, though.

Fourth, I'm not sure that taking an unqualified test as equivalent to PASS makes the most sense. It seems like it makes sense to optimize for either the most common case (and I'm not sure what that would be, but it's not PASS), or try and be compatible with Skipped files (so I would be inclined to make it equivalent to SKIP, if I was doing anything). That said, I also don't understand Ojan's comment about "some hidden set of rules"; why is it any more complicated than "if there's no expectation, assume SKIP"? How is that different from "if there's no expectation, assume PASS"?

I do take the point about wanting some way to associate tests with bugs, but I'm not sure that listing <test-name, bug id> pairs mixed in with the actual expectations is the way to do it.

Lastly, I think there should continue to be a way to support WONTFIX, and (as I've mentioned before), I think we should be skipping WONTFIXed bugs. I'm not sure if we should define WONTFIX as equivalent to SKIP (which is my first inclination) or continue to treat it as an annotation like we do today and still require SKIP.
Comment 12 Dirk Pranke 2012-05-16 21:42:29 PDT
Comment on attachment 142408 [details]
Simplifies the syntax

Okay, I've skimmed through the patch but not looked at it in detail. I wish you had waited until there was some sort of consensus before working on this :(

It seems like Dimitri feels roughly the same way that I do; it's not clear to me that this is a big improvement over the existing code. Also, he's right that we should change the output of the failures from NRWT so that you can copy and paste them in. 

I'm R-'ing this for now just to avoid any confusion while we can come to an agreement on what the syntax should be. If we do decide that the syntax you've implemented is what we want, I'll go back and do a proper review of it.
Comment 13 Ryosuke Niwa 2012-05-16 21:51:23 PDT
(In reply to comment #11)
> Second, I agree that we should optimize for bugs in bugs.webkit.org, but I'm not sure that a bare integer is distinctive enough. Perhaps I just find it aesthetically unpleasing. I do think we need to accomodate bugs filed in other databases, and I'm not sure how "comments with URLs" would work? Also, I'm inclined to continue to support some convention for BUGDPRANKE or something like that.

The problem with the current syntax is that, it selects garbage like BUGWK when I double-click on a text editor. I'm open to any prefix or postfix that don't get in my way.

> Thid, think it is useful to separate the configurations the expectation applies to from the expected results, i.e. I would be inclined to keep "MAC DEBUG" somehow separated from "IMAGE". I think moving SKIP and SLOW next to IMAGE would be good, though.

Okay, I find that extremely annoying but maybe that's just me.

> Fourth, I'm not sure that taking an unqualified test as equivalent to PASS makes the most sense. It seems like it makes sense to optimize for either the most common case (and I'm not sure what that would be, but it's not PASS), or try and be compatible with Skipped files (so I would be inclined to make it equivalent to SKIP, if I was doing anything). That said, I also don't understand Ojan's comment about "some hidden set of rules"; why is it any more complicated than "if there's no expectation, assume SKIP"? How is that different from "if there's no expectation, assume PASS"?

Yeah, that was my thinking as well.

> Lastly, I think there should continue to be a way to support WONTFIX, and (as I've mentioned before), I think we should be skipping WONTFIXed bugs. I'm not sure if we should define WONTFIX as equivalent to SKIP (which is my first inclination) or continue to treat it as an annotation like we do today and still require SKIP.

That sounds like an orthogonal issue.
Comment 14 Dirk Pranke 2012-05-16 22:09:34 PDT
(In reply to comment #13)
> > Lastly, I think there should continue to be a way to support WONTFIX, and (as I've mentioned before), I think we should be skipping WONTFIXed bugs. I'm not sure if we should define WONTFIX as equivalent to SKIP (which is my first inclination) or continue to treat it as an annotation like we do today and still require SKIP.
> 
> That sounds like an orthogonal issue.

It's not an orthogonal issue. You're talking about changing the syntax and the semantics of the file. It makes sense to consider all of the changes we might want to make before we change anything so that we can decide what the right solution to accomodate them all would be, since changing the syntax repeatedly has costs.
Comment 15 Ryosuke Niwa 2012-05-16 23:01:04 PDT
(In reply to comment #14)
> (In reply to comment #13)
> > > Lastly, I think there should continue to be a way to support WONTFIX, and (as I've mentioned before), I think we should be skipping WONTFIXed bugs. I'm not sure if we should define WONTFIX as equivalent to SKIP (which is my first inclination) or continue to treat it as an annotation like we do today and still require SKIP.
> > 
> > That sounds like an orthogonal issue.
> 
> It's not an orthogonal issue. You're talking about changing the syntax and the semantics of the file. It makes sense to consider all of the changes we might want to make before we change anything so that we can decide what the right solution to accomodate them all would be, since changing the syntax repeatedly has costs.

Fair enough, FWIW, we can decide whether we should skip WONTFIX or not in a separate bug. If I recall, people have suggested to rename WontFix to NotImplemented. Would that work with you?
Comment 16 Ojan Vafai 2012-05-17 04:26:15 PDT
(In reply to comment #15)
> Fair enough, FWIW, we can decide whether we should skip WONTFIX or not in a separate bug. If I recall, people have suggested to rename WontFix to NotImplemented. Would that work with you?

Lets keep this a separate bug. But, for the record, NotImplemented is the opposite meaning of what we're going for. I proposed a long time ago that we rename WontFix to NeverFix. WontFix is meant for things that we never intend to fix because they don't make sense for our port/platform. NotImplemented would be things we just haven't gotten around to yet (e.g. new DRT apis).
Comment 17 Ojan Vafai 2012-05-17 04:29:09 PDT
Comment on attachment 142408 [details]
Simplifies the syntax

View in context: https://bugs.webkit.org/attachment.cgi?id=142408&action=review

This patch doesn't change the test_expectations.txt parsing in the flakiness dashboard. I'd like to have that fixed in the same patch the modifies the syntax.

> Tools/ChangeLog:15
> +        Support the legacy syntax while we transition to new syntax.

This is OK, but only if someone will convert the existing file and kill the legacy syntax within a couple days of committing.
Comment 18 Dirk Pranke 2012-05-17 12:57:37 PDT
(In reply to comment #17)
> > Tools/ChangeLog:15
> > +        Support the legacy syntax while we transition to new syntax.
> 
> This is OK, but only if someone will convert the existing file and kill the legacy syntax within a couple days of committing.

I agree with this. Frankly, I'm tempted to say we should just convert the file at the same time and never support both syntaxes, but that might be overly ambitious. I think we should convert things over as quickly as possible without breaking things.
Comment 19 Ryosuke Niwa 2012-05-17 13:06:54 PDT
(In reply to comment #18)
> (In reply to comment #17)
> > > Tools/ChangeLog:15
> > > +        Support the legacy syntax while we transition to new syntax.
> > 
> > This is OK, but only if someone will convert the existing file and kill the legacy syntax within a couple days of committing.
> 
> I agree with this. Frankly, I'm tempted to say we should just convert the file at the same time and never support both syntaxes, but that might be overly ambitious. I think we should convert things over as quickly as possible without breaking things.

I think this will be quite challenging due to conflicts. Also if flakiness dashboard has a separate parser, then we have to support both syntaxes while bots transition (since it's extremely unlikely that all platforms and all ports pick up the patch at same time).
Comment 20 Ojan Vafai 2012-05-17 13:36:32 PDT
(In reply to comment #19)
> (In reply to comment #18)
> > (In reply to comment #17)
> > > > Tools/ChangeLog:15
> > > > +        Support the legacy syntax while we transition to new syntax.
> > > 
> > > This is OK, but only if someone will convert the existing file and kill the legacy syntax within a couple days of committing.
> > 
> > I agree with this. Frankly, I'm tempted to say we should just convert the file at the same time and never support both syntaxes, but that might be overly ambitious. I think we should convert things over as quickly as possible without breaking things.
> 
> I think this will be quite challenging due to conflicts. Also if flakiness dashboard has a separate parser, then we have to support both syntaxes while bots transition (since it's extremely unlikely that all platforms and all ports pick up the patch at same time).

The dashboard grabs the file off of svn.webkit.org. So it will use whatever format of the file that is checked in. Also, it's fine if this is broken for 5 minutes after we checkin and push a new test results server.
Comment 21 Ojan Vafai 2012-05-17 19:29:33 PDT
Seems like there's agreement not to move forward with this as is. I've filed https://bugs.webkit.org/show_bug.cgi?id=86796 for addressing the bits we agreed on in the webkit thread.