Bug 86797 - results.html should let you rebaseline a test and modify its listing in test_expectations.txt/Skipped files from the html page
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
Keywords: NRWT
Reported: 2012-05-17 19:33 PDT by Ojan Vafai
Modified: 2012-06-19 14:28 PDT
Description Ojan Vafai 2012-05-17 19:33:36 PDT
Most of the infrastructure for rebaselining tests and modifying test_expectations.txt already exists for garden-o-matic via webkit-patch commands. We just need to start a garden-o-matic localhost server that results.html talks to in order to execute the webkit-patch commands. I suppose the webkit-patch rebaseline-test command also needs to be able to take a local path to a layout test results directory instead of a buildbot name.

In addition to being super convenient, this would also make for less manual editing of test_expectations.txt files, alleviating some of Ryosuke's frustrations with the file format.
Comment 1 Adam Barth 2012-05-17 19:36:29 PDT
One possibility is to pass rebaseline-test the base URL to the results directory, with the default being the buildbot base URL.
Comment 2 Ojan Vafai 2012-05-17 19:40:06 PDT
(In reply to comment #1)
> One possibility is to pass rebaseline-test the base URL to the results directory, with the default being the buildbot base URL.

Well, normally we pass it a builder name, but we don't need that in this case since you're clearly rebaselining the platform you're on (i.e. however run-webkit-tests figures out the platform you're on, webkit-patch rebaseline-test should do the same). So I think we can pass the base URL instead of the builder name. But I don't feel strongly about how it's implemented; that's just how I would do it.
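
A minimal sketch of the base-URL idea from comments 1 and 2, in Python since webkit-patch is written in it. Everything named here is an assumption for illustration: the helper actual_result_url and the placeholder DEFAULT_BASE_URL do not exist in webkit-patch, and the only thing taken from the layout-test conventions is the "-actual" result file naming. The caller passes a base URL for the results directory, and the buildbot location is only the default.

```python
import posixpath
from urllib.parse import urljoin

# Placeholder default, NOT a real buildbot address (assumption for this sketch).
DEFAULT_BASE_URL = "http://buildbot.example.invalid/results/"


def actual_result_url(test_name, base_url=DEFAULT_BASE_URL, suffix="txt"):
    """Return the URL of the '-actual' result file for a layout test.

    Hypothetical helper: base_url may point at a buildbot results
    directory or at a local results directory via a file:// URL.
    """
    # urljoin only treats the base as a directory if it ends with a slash.
    if not base_url.endswith("/"):
        base_url += "/"
    # fast/css/foo.html -> fast/css/foo
    stem, _extension = posixpath.splitext(test_name)
    return urljoin(base_url, "%s-actual.%s" % (stem, suffix))
```

With this shape, a local rebaseline would pass something like a file:// URL for the local layout test results directory, and a bot-based rebaseline would rely on the default.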
Comment 3 Adam Barth 2012-05-17 19:54:09 PDT
Automatically detecting the port can be tricky, especially in cases where more than one port runs on a given machine.  For example, rebaseline-test would need to distinguish between apple-mac and chromium-mac when running on OS X.
Comment 4 Dirk Pranke 2012-05-17 19:56:25 PDT
(In reply to comment #3)
> Automatically detecting the port can be tricky, especially in cases where more than one port runs on a given machine.  For example, rebaseline-test would need to distinguish between apple-mac and chromium-mac when running on OS X.

If you're starting from results.html, it would be easy to know which port generated the results, no?
Comment 5 Adam Barth 2012-05-17 20:06:27 PDT
> If you're starting from results.html, it would be easy to know which port generated the results, no?

Yeah, we'll probably need to pass that information to rebaseline-test.
Comment 6 Tony Chang 2012-05-18 09:38:24 PDT
Rather than having to have garden-o-matic running on localhost, you could provide a single webkit-patch command line for rebaselining a test that the user would copy/paste.  Maybe we can encode all the necessary information in the command line (port, server to download from, tests to rebase, etc).
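
A rough sketch of what generating such a copy/paste command line could look like, for example from a small helper behind results.html. The function name rebaseline_command and the --port and --results-url flags are hypothetical and are not existing webkit-patch options; they only stand in for "port" and "server to download from" as listed above.

```python
import shlex


def rebaseline_command(port, results_url, tests):
    """Build a shell-safe command line for the user to copy and paste.

    The flags below are hypothetical, for illustration only.
    """
    argv = [
        "webkit-patch", "rebaseline-test",
        "--port", port,                # hypothetical flag
        "--results-url", results_url,  # hypothetical flag
    ]
    argv.extend(tests)
    # Quote each argument so test names with spaces survive the shell.
    return " ".join(shlex.quote(arg) for arg in argv)
```

Quoting each argument matters because the string is meant to be pasted into a shell verbatim, and layout test paths are not guaranteed to be free of spaces.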
Comment 7 Ryosuke Niwa 2012-05-18 09:46:23 PDT
(In reply to comment #6)
> Rather than having to have garden-o-matic running on localhost, you could provide a single webkit-patch command line for rebaselining a test that the user would copy/paste.  Maybe we can encode all the necessary information in the command line (port, server to download from, tests to rebase, etc).

Not that I'm opposed to adding such a command, but having to pass all that information on the command line seems annoying.
Comment 8 Tony Chang 2012-05-18 10:16:12 PDT
(In reply to comment #7)
> Not that I'm opposed to adding such a command, but having to pass all that information on the command line seems annoying.

You could base64 encode a binary format so the command is short.  E.g., webkit-patch rebaseline-tests ja90134jkld914-jsf90a8a=.  Or did you mean annoying to implement?
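
One way the encoding could work, as a sketch only: the comment above suggests a binary format, and this example swaps in compressed JSON as a stand-in because it needs nothing beyond the Python standard library. The function names encode_request and decode_request and the payload keys are assumptions, not part of webkit-patch.

```python
import base64
import json
import zlib


def encode_request(port, base_url, tests):
    """Pack a rebaseline request into a single URL-safe base64 token."""
    payload = json.dumps(
        {"port": port, "base_url": base_url, "tests": tests},
        separators=(",", ":"),  # drop whitespace before compressing
    ).encode("utf-8")
    compressed = zlib.compress(payload, 9)
    return base64.urlsafe_b64encode(compressed).decode("ascii")


def decode_request(token):
    """Inverse of encode_request: token -> dict with port, base_url, tests."""
    compressed = base64.urlsafe_b64decode(token.encode("ascii"))
    return json.loads(zlib.decompress(compressed).decode("utf-8"))
```

The token length depends on how well the test list compresses: paths that share directories compress well, while a list of unrelated paths grows the token roughly in proportion to the number of tests.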
Comment 9 Ryosuke Niwa 2012-05-18 10:25:38 PDT
(In reply to comment #8)
> You could base64 encode a binary format so the command is short.  E.g., webkit-patch rebaseline-tests ja90134jkld914-jsf90a8a=.  Or did you mean annoying to implement?

Would that work when we have 150+ tests to rebaseline like I had to yesterday?
Comment 10 Tony Chang 2012-05-18 10:29:31 PDT
(In reply to comment #9)
> Would that work when we have 150+ tests to rebaseline like I had to yesterday?

Maybe.  Depends on how you encode the data :)
Comment 11 Dirk Pranke 2012-05-18 17:35:38 PDT
I'm confused by how this bug has developed ... we're talking about doing local rebaselines in a checkout, right, not pulling results down from a bot?

Can't you already get most of the way there by copying and pasting suppressions into test_expectations.txt and/or re-running NRWT with --reset-results and a list of tests?

Certainly we can further automate this, but I just want to get on the same page ...
Comment 12 Tony Chang 2012-05-21 09:17:08 PDT
(In reply to comment #11)
> I'm confused by how this bug has developed ... we're talking about doing local rebaselines in a checkout, right, not pulling results down from a bot?

Yes.

> Can't you already get most of the way there by copying and pasting suppressions into test_expectations.txt and/or re-running NRWT with --reset-results and a list of tests?

Yes, but it's error-prone and slow.  More automation here would be better.
Comment 13 Dirk Pranke 2012-05-21 15:42:16 PDT
(In reply to comment #12)
> Yes, but it's error-prone and slow.  More automation here would be better.

Agreed. Isn't all this basically what Mihai's rebaseline server did? Are we talking about rebuilding that functionality on top of the garden-o-matic code base? (I have no objection to doing so; it makes sense. I'm just making sure.)
Comment 14 Ojan Vafai 2012-05-21 15:46:19 PDT
(In reply to comment #13)
> Agreed. Isn't all this basically what Mihai's rebaseline server did? Are we talking about rebuilding that functionality on top of the garden-o-matic code base? (I have no objection to doing so; it makes sense. I'm just making sure.)

More or less, but with the results.html UI. IMO, this results.html UI plus the functionality provided by garden-o-matic obsoletes the rebaseline server. Would be nice to have one fewer tool to maintain, especially as the rebaseline server is only really useful for really large rebaselines (e.g. bringing up a new port or making a skia change) and learning a new tool for that one use-case just doesn't happen much in practice.