50098 – New webkit-patch rebaseline2 command.

Bug 50098 - New webkit-patch rebaseline2 command.

Summary: New webkit-patch rebaseline2 command.

Status:	RESOLVED WONTFIX

Alias:	None

Product:	WebKit
Classification:	Unclassified
Component:	New Bugs (show other bugs)
Version:	528+ (Nightly build)
Hardware:	Other OS X 10.5

Importance:	P2 Normal
Assignee:	Nobody

URL:
Keywords:

Depends on:	50901 52136
Blocks:
	Show dependency tree / graph

Reported:	2010-11-25 22:11 PST by James Kozianski
Modified:	2011-08-06 13:18 PDT (History)
CC List:	8 users (show)

See Also:

Attachments
Patch (65.78 KB, patch) 2010-11-25 22:12 PST, James Kozianski	no flags	Details \| Formatted Diff \| Diff
Patch (68.36 KB, patch) 2010-11-28 19:11 PST, James Kozianski	no flags	Details \| Formatted Diff \| Diff
Patch (87.96 KB, patch) 2010-12-01 03:19 PST, James Kozianski	no flags	Details \| Formatted Diff \| Diff
Patch (74.73 KB, patch) 2010-12-07 21:40 PST, James Kozianski	no flags	Details \| Formatted Diff \| Diff
Patch (74.48 KB, patch) 2010-12-07 22:25 PST, James Kozianski	no flags	Details \| Formatted Diff \| Diff
Patch (76.98 KB, patch) 2010-12-09 14:01 PST, James Kozianski	eric: review-	Details \| Formatted Diff \| Diff
Show Obsolete (5) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description James Kozianski 2010-11-25 22:11:42 PST

New webkit-patch rebaseline2 command.

Comment 1 James Kozianski 2010-11-25 22:12:33 PST

Created attachment 74903 [details]
Patch

Comment 2 James Kozianski 2010-11-28 19:11:24 PST

Created attachment 74983 [details]
Patch

Comment 3 James Kozianski 2010-11-28 19:13:25 PST

Comment on attachment 74983 [details]
Patch

Added ImageDiff support, instead of just failing on images that don't have checksums.

Comment 4 James Kozianski 2010-11-28 19:18:31 PST

This is intended to be an eventual replacement for the various deduping and rebaselining scripts we have. Apologies for the large patch - the individual components don't really mean anything without the context of the overall script.

Comment 5 Adam Barth 2010-11-29 09:58:27 PST

> This is intended to be an eventual replacement for the various deduping and rebaselining scripts we have.

Yay!  Thanks for working on this project.

Comment 6 Adam Barth 2010-11-29 10:18:05 PST

Comment on attachment 74983 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=74983&action=review

This looks cool, but you're re-inventing a bunch of wheels instead of making the existing wheels more awesome.  Also, you've siloed a bunch of your code in a separate package instead of putting it in the right place in the package hierarchy where it can be used by other parts of webkitpy.

In general, the effectiveness of webkitpy should grow like n^2.  The way you've written this patch, the effectiveness is only growing like n because you're not making use of the network effects of sharing code.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/all_tests.py:7
> +unittest.main()

We have some nifty unittest auto-discovery code.  Maybe we should use that here?  It's a pay to register new unit tests with the harness.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/bucket.py:209
> +    def b(self, bucket_name, *args):

This is kind of a short function name.  Can we use something more descriptive?

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:34
> +class RemoteZip(object):

This class seems general and should be in webkitpy.common.net

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:42
> +    def _load(self):
> +        if self._zip_file is None:
> +            self._zip_file = zipfile.ZipFile(urllib.urlretrieve(self._zip_url)[0])

Consider using NetworkTransaction to be robust to network connectivity issues.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:63
> +class ZipFileHandle:

same here.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:100
> +class DirectoryAsZip(object):

Also a general class that should be in webkit.common somewhere.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:121
> +        f = open(os.path.join(self._path, filename))
> +        contents = f.read()
> +        f.close

We generally use "with" to avoid leaking file handles.  Also, we have a FileSystem class we use to abstract these operations.  Notice that you haven't specified a codec, which is something that FileSystem enforces.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:135
> +    def delete(self, filename):
> +        filename = os.path.join(self._path, filename)
> +        os.remove(filename)

This seems like a security vulnerability waiting to happen...  Maybe some sort of requirement that the final filename is actually a subdirectory of self._path?

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:139
> +class Builder(object):
> +    """Retrieves results from zip files stored on a buildbot builder"""

We already have an abstraction for a BuildBot.  Presumably we should add this functionality to that abstraction instead of creating a new abstraction.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:168
> +    def results_for(self, name, target_type=None):

We already have code that does this for the non-zip case.  We should merge this functionality into that code instead of having two silos.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:209
> +class AggregateBuilder(object):

This belongs in the buildbot package somewhere.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:234
> +        return 'http://build.chromium.org/f/chromium/layout_test_results/' + builder_name + '/layout-test-results.zip'

General webkitpy code shouldn't refer to build.chromium.org.  You probably need to add this to the port abstraction.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:264
> +class WebKitBuilder(object):
> +    BUILDERS = {
> +        'mac-snowleopard': 'SnowLeopard%20Intel%20Release%20(Tests)',
> +        'mac-leopard': 'Leopard%20Intel%20Debug%20(Tests)',
> +        'win': 'Windows%207%20Release%20(WebKit2%20Tests)',
> +        'chromium-linux': 'GTK%20Linux%2032-bit%20Release',
> +    }

webkitpy already knows about these things.  Please look at how the rest of the system knows about this stuff.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:61
> +    def _make_bucket_tree(self):
> +        b = bucket.BucketTree()
> +        b.b(None,
> +            b.b('mac',
> +                b.b('mac-snowleopard',
> +                    b.b('mac-leopard',
> +                        b.b('mac-tiger'))),
> +                b.b('win',
> +                    b.b('chromium',
> +                        b.b('chromium-mac'),
> +                        b.b('chromium-win',
> +                            b.b('chromium-win-vista',
> +                                b.b('chromium-win-xp')),
> +                            b.b('chromium-linux',
> +                                b.b('chromium-gpu',
> +                                    b.b('chromium-gpu-linux'),
> +                                    b.b('chromium-gpu-win'),
> +                                    b.b('chromium-gpu-mac'))))))),
> +            b.b('qt',
> +                b.b('qt-linux'),
> +                b.b('qt-mac'),
> +                b.b('qt-win')),
> +            b.b('gtk'))

Is this some kind of description of the structure of the expectations fallback tree?  If so, it should be in code that's shareable with the rest of webkitpy instead of siloed away here.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:63
> +        AGGREGATE_BUCKETS = ['chromium', 'chromium-win', 'win', 'mac', 'qt', 'chromium-gpu']

Why is this magic list here?  We shouldn't have magic constants like this in the code.

> WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/result.py:35
> +class Result(object):
> +    """Represents the result of a single test on a single platform"""

We already have an abstraction for Result.

> WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:29
> +from webkitpy.layout_tests.port import factory

webkitpy.tool cannot import from webkitpy.layout_tests.  Code that's shared between these two packages should go in webkitpy.common.

> WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:53
> +            make_option("--chromium", action="store_true", dest="use_chromium_bots", help="Fetch results from the Chromium bots."),

Boo.  We should use --port to specify the port instead of making a magic special case for Chromium.

> WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:54
> +            make_option("--webkit", action="store_false", dest="use_chromium_bots", help="Fetch results from the WebKit bots (default)."),

It doesn't make sense to have a --webkit command line argument for webkit-patch.  Everything webkit-patch does is related to WebKit!

> WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:57
> +            make_option("--list-builders", action="store_true", dest="list_builders", help="List the builders for each platform."),
> +            make_option("--list-archives", action="store_true", dest="list_archives", help="Show the URLs of the test result archives."),
> +            make_option("--show-archives", action="store_true", dest="show_archives", help="Show the results stored in the specified archives."),

These seem like separate commands.  Why are they arguments to this command?

> WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:63
> +        builder = ChromiumBuilder() if options.use_chromium_bots else WebKitBuilder()

This line of code shouldn't be here.  We should get this information by instantiating a Port object.  How will this line of code scale when there are N contributors to WebKit with as much infrastructure as Chromium?

> WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:73
> +            exit()

Whenever you call exit, you might consider whether what you really want is a separate command.  Calling exit also makes this code difficult to test.

Comment 7 James Kozianski 2010-12-01 03:18:53 PST

Thanks for the prompt and thorough review, Adam!

(In reply to comment #6)
> (From update of attachment 74983 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=74983&action=review
> 
> This looks cool, but you're re-inventing a bunch of wheels instead of making the existing wheels more awesome.  Also, you've siloed a bunch of your code in a separate package instead of putting it in the right place in the package hierarchy where it can be used by other parts of webkitpy.
> 
> In general, the effectiveness of webkitpy should grow like n^2.  The way you've written this patch, the effectiveness is only growing like n because you're not making use of the network effects of sharing code.

Yep, I'm definitely happy to do that. Writing this script for me was a learning process about the test infrastructure as well as the webkitpy module. Having everything be separate allowed me to iterate quickly on my ideas and be confident that I'm not causing regressions in other people's code. Much of what's in webkitpy is stuff that I don't know that I don't know about, so your comments are helpful for me to know what should be merged and where.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/all_tests.py:7
> > +unittest.main()
> 
> We have some nifty unittest auto-discovery code.  Maybe we should use that here?  It's a pay to register new unit tests with the harness.

You're right - these tests will be auto-discovered by that script. I've deleted this file.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/bucket.py:209
> > +    def b(self, bucket_name, *args):
> 
> This is kind of a short function name.  Can we use something more descriptive?

This was an alias for 'make_bucket' intended to allow the call-sites to be concise so that the hierarchical structure of the calls would be easier to parse. I've removed this method and introduced a short alias at the call sites.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:34
> > +class RemoteZip(object):
> 
> This class seems general and should be in webkitpy.common.net

Done.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:42
> > +    def _load(self):
> > +        if self._zip_file is None:
> > +            self._zip_file = zipfile.ZipFile(urllib.urlretrieve(self._zip_url)[0])
> 
> Consider using NetworkTransaction to be robust to network connectivity issues.

Done.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:63
> > +class ZipFileHandle:
> 
> same here.

Done.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:100
> > +class DirectoryAsZip(object):
> 
> Also a general class that should be in webkit.common somewhere.

I've moved it to webkit.common.diraszip.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:121
> > +        f = open(os.path.join(self._path, filename))
> > +        contents = f.read()
> > +        f.close
> 
> We generally use "with" to avoid leaking file handles.  Also, we have a FileSystem class we use to abstract these operations.  Notice that you haven't specified a codec, which is something that FileSystem enforces.

Done.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:135
> > +    def delete(self, filename):
> > +        filename = os.path.join(self._path, filename)
> > +        os.remove(filename)
> 
> This seems like a security vulnerability waiting to happen...  Maybe some sort of requirement that the final filename is actually a subdirectory of self._path?

I've added a check to ensure the final filename doesn't include '..'. Is that enough?

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:139
> > +class Builder(object):
> > +    """Retrieves results from zip files stored on a buildbot builder"""
> 
> We already have an abstraction for a BuildBot.  Presumably we should add this functionality to that abstraction instead of creating a new abstraction.

I've renamed this to ResultSet and moved it to webkitpy.common.net as it is quite a lot more general than the existing BuildBot / builder abstractions. A ResultSet can be used independently of BuildBots / builders as well (for instance, the existing results in one's LayoutTests directory can be the source of a ResultSet), so the name was ill-suited in the first place.

I've also added a method to the Builder class (results()) that returns a ResultSet of the results stored on that Builder.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:168
> > +    def results_for(self, name, target_type=None):
> 
> We already have code that does this for the non-zip case.  We should merge this functionality into that code instead of having two silos.

Are you referring to webkitpy/common/net/layouttestresults.py? I don't think the code there and this code can really be merged - that code is for parsing a particular results.html file to extract the names of failing tests, whereas this code consumes zip files and provides search functionality for retrieving Result objects, which can be compared and saved to disk. Also, as mentioned above, the ResultSet code is quite general and it is useful outside of the context of BuildBot, so I would prefer to keep this in its own class.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:209
> > +class AggregateBuilder(object):
> 
> This belongs in the buildbot package somewhere.

I've moved this to webkitpy.common.net, where buildbot.py lives.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:234
> > +        return 'http://build.chromium.org/f/chromium/layout_test_results/' + builder_name + '/layout-test-results.zip'
> 
> General webkitpy code shouldn't refer to build.chromium.org.  You probably need to add this to the port abstraction.

Done. I've added a method Port.buildbot_resultset() which abstracts how the ResultSet is produced for each platform.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:264
> > +class WebKitBuilder(object):
> > +    BUILDERS = {
> > +        'mac-snowleopard': 'SnowLeopard%20Intel%20Release%20(Tests)',
> > +        'mac-leopard': 'Leopard%20Intel%20Debug%20(Tests)',
> > +        'win': 'Windows%207%20Release%20(WebKit2%20Tests)',
> > +        'chromium-linux': 'GTK%20Linux%2032-bit%20Release',
> > +    }
> 
> webkitpy already knows about these things.  Please look at how the rest of the system knows about this stuff.

What I'm trying to express here is a mapping between platforms (as in the subdirectories of LayoutTests/platform) and the name of the builder that provides the 'canonical' test results for that platform. When a user wants to get new baselines for mac-snowleopard, what builder should those baselines be retrieved from? Looking through the code I can't find anywhere where such a mapping is defined - rebaseline_chromium_webkit_tests.py defines its own mapping for this kind of work, the current webkit-patch rebaseline script prompts the user to select a builder name and the port object itself provides a fallback order, but no mapping between platforms and builders.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:61
> > +    def _make_bucket_tree(self):
> > +        b = bucket.BucketTree()
> > +        b.b(None,
> > +            b.b('mac',
> > +                b.b('mac-snowleopard',
> > +                    b.b('mac-leopard',
> > +                        b.b('mac-tiger'))),
> > +                b.b('win',
> > +                    b.b('chromium',
> > +                        b.b('chromium-mac'),
> > +                        b.b('chromium-win',
> > +                            b.b('chromium-win-vista',
> > +                                b.b('chromium-win-xp')),
> > +                            b.b('chromium-linux',
> > +                                b.b('chromium-gpu',
> > +                                    b.b('chromium-gpu-linux'),
> > +                                    b.b('chromium-gpu-win'),
> > +                                    b.b('chromium-gpu-mac'))))))),
> > +            b.b('qt',
> > +                b.b('qt-linux'),
> > +                b.b('qt-mac'),
> > +                b.b('qt-win')),
> > +            b.b('gtk'))
> 
> Is this some kind of description of the structure of the expectations fallback tree?  If so, it should be in code that's shareable with the rest of webkitpy instead of siloed away here.

Yes, this is the fallback tree that is implied by the various implementations of Port.baseline_search_path() - it is needed to do correct deduping. I would like to see all the different implementations of that method go away and have the tree above be the authoritive fallback tree, but that seemed like it should happen in its own patch.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:63
> > +        AGGREGATE_BUCKETS = ['chromium', 'chromium-win', 'win', 'mac', 'qt', 'chromium-gpu']
> 
> Why is this magic list here?  We shouldn't have magic constants like this in the code.

These are the names of platforms that aren't targeted by any builders because they don't denote real platforms, merely convenient fallback targets. This is significant in deduping: if we have the same expectations in mac-leopard, mac-snowleopard and mac-tiger, but a different result in mac, then we can clobber the result in mac and delete the others. We couldn't do that if mac was a platform in itself, because then it having different expectations would be valid and not a duplication.

I've added a clarifying comment to explain the meaning of the list.

> 
> > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/result.py:35
> > +class Result(object):
> > +    """Represents the result of a single test on a single platform"""
> 
> We already have an abstraction for Result.

Are you referring to the webkitpy/layout_tests/test_types module? Those classes seem like they have a different usage to the ones I'm introducing here - they have no state and perform functions on filenames and file contents passed in to them. There is no point in having more than one instance of those classes, whereas the Result classes defined here are stateful and are meant to represent Results (local or remote) for the purposes of comparing them.

> 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:29
> > +from webkitpy.layout_tests.port import factory
> 
> webkitpy.tool cannot import from webkitpy.layout_tests.  Code that's shared between these two packages should go in webkitpy.common.

Ah, okay - I was taking inspiration from the original rebaseline.py. I've moved the rebaseline package out of layout_tests and into webkitpy/tool/commands/rebaseline2.

> 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:53
> > +            make_option("--chromium", action="store_true", dest="use_chromium_bots", help="Fetch results from the Chromium bots."),
> 
> Boo.  We should use --port to specify the port instead of making a magic special case for Chromium.

Sorry, I was imitating build-webkit's --chromium flag here. I've added Port.buildbot_resultset() so the buildbots used are controlled via the port.

> 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:54
> > +            make_option("--webkit", action="store_false", dest="use_chromium_bots", help="Fetch results from the WebKit bots (default)."),
> 
> It doesn't make sense to have a --webkit command line argument for webkit-patch.  Everything webkit-patch does is related to WebKit!

Haha, of course :-) Removed.

> 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:57
> > +            make_option("--list-builders", action="store_true", dest="list_builders", help="List the builders for each platform."),
> > +            make_option("--list-archives", action="store_true", dest="list_archives", help="Show the URLs of the test result archives."),
> > +            make_option("--show-archives", action="store_true", dest="show_archives", help="Show the results stored in the specified archives."),
> 
> These seem like separate commands.  Why are they arguments to this command?

I found these useful for selecting what platforms to rebase for and for seeing what results are available for those platforms, so they seemed like reasonable arguments with a similar usage to dry-run: you use them to see what the rebase command you construct will do.

> 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:63
> > +        builder = ChromiumBuilder() if options.use_chromium_bots else WebKitBuilder()
> 
> This line of code shouldn't be here.  We should get this information by instantiating a Port object.  How will this line of code scale when there are N contributors to WebKit with as much infrastructure as Chromium?

Done.

> 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:73
> > +            exit()
> 
> Whenever you call exit, you might consider whether what you really want is a separate command.  Calling exit also makes this code difficult to test.

I've removed those flags.

Comment 8 James Kozianski 2010-12-01 03:19:34 PST

Created attachment 75263 [details]
Patch

Comment 9 Adam Barth 2010-12-01 09:19:55 PST

(In reply to comment #7)
> Thanks for the prompt and thorough review, Adam!
> 
> (In reply to comment #6)
> > (From update of attachment 74983 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=74983&action=review
> > 
> > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:264
> > > +class WebKitBuilder(object):
> > > +    BUILDERS = {
> > > +        'mac-snowleopard': 'SnowLeopard%20Intel%20Release%20(Tests)',
> > > +        'mac-leopard': 'Leopard%20Intel%20Debug%20(Tests)',
> > > +        'win': 'Windows%207%20Release%20(WebKit2%20Tests)',
> > > +        'chromium-linux': 'GTK%20Linux%2032-bit%20Release',
> > > +    }
> > 
> > webkitpy already knows about these things.  Please look at how the rest of the system knows about this stuff.
> 
> What I'm trying to express here is a mapping between platforms (as in the subdirectories of LayoutTests/platform) and the name of the builder that provides the 'canonical' test results for that platform. When a user wants to get new baselines for mac-snowleopard, what builder should those baselines be retrieved from? Looking through the code I can't find anywhere where such a mapping is defined - rebaseline_chromium_webkit_tests.py defines its own mapping for this kind of work, the current webkit-patch rebaseline script prompts the user to select a builder name and the port object itself provides a fallback order, but no mapping between platforms and builders.

We should never have let rebaseline_chromium_webkit_tests into the tree.  It's wrong at so many levels.

The mapping between LayoutTest/platform directories and buildbots probably belongs somewhere in the port abstraction.  You'll find that we already have two port abstractions (shame on us), but we should be able to teach one fo them to find it's buildbot.

> > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:61
> > > +    def _make_bucket_tree(self):
> > > +        b = bucket.BucketTree()
> > > +        b.b(None,
> > > +            b.b('mac',
> > > +                b.b('mac-snowleopard',
> > > +                    b.b('mac-leopard',
> > > +                        b.b('mac-tiger'))),
> > > +                b.b('win',
> > > +                    b.b('chromium',
> > > +                        b.b('chromium-mac'),
> > > +                        b.b('chromium-win',
> > > +                            b.b('chromium-win-vista',
> > > +                                b.b('chromium-win-xp')),
> > > +                            b.b('chromium-linux',
> > > +                                b.b('chromium-gpu',
> > > +                                    b.b('chromium-gpu-linux'),
> > > +                                    b.b('chromium-gpu-win'),
> > > +                                    b.b('chromium-gpu-mac'))))))),
> > > +            b.b('qt',
> > > +                b.b('qt-linux'),
> > > +                b.b('qt-mac'),
> > > +                b.b('qt-win')),
> > > +            b.b('gtk'))
> > 
> > Is this some kind of description of the structure of the expectations fallback tree?  If so, it should be in code that's shareable with the rest of webkitpy instead of siloed away here.
> 
> Yes, this is the fallback tree that is implied by the various implementations of Port.baseline_search_path() - it is needed to do correct deduping. I would like to see all the different implementations of that method go away and have the tree above be the authoritive fallback tree, but that seemed like it should happen in its own patch.

It's certainly fine to do that in a separate patch, but we're not going to want to make the situation worse by landing this patch before that one.  That's how we accumulate these technical debts.

> > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:63
> > > +        AGGREGATE_BUCKETS = ['chromium', 'chromium-win', 'win', 'mac', 'qt', 'chromium-gpu']
> > 
> > Why is this magic list here?  We shouldn't have magic constants like this in the code.
> 
> These are the names of platforms that aren't targeted by any builders because they don't denote real platforms, merely convenient fallback targets. This is significant in deduping: if we have the same expectations in mac-leopard, mac-snowleopard and mac-tiger, but a different result in mac, then we can clobber the result in mac and delete the others. We couldn't do that if mac was a platform in itself, because then it having different expectations would be valid and not a duplication.
> 
> I've added a clarifying comment to explain the meaning of the list.

Ok, but how does that scale when we add more ports / platforms to the project?  Will someone have to hunt around all of webkitpy to find all the port-specific logic and update it?  It seems better to centralize that knowledge in the port abstraction.

> > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/result.py:35
> > > +class Result(object):
> > > +    """Represents the result of a single test on a single platform"""
> > 
> > We already have an abstraction for Result.
> 
> Are you referring to the webkitpy/layout_tests/test_types module? Those classes seem like they have a different usage to the ones I'm introducing here - they have no state and perform functions on filenames and file contents passed in to them. There is no point in having more than one instance of those classes, whereas the Result classes defined here are stateful and are meant to represent Results (local or remote) for the purposes of comparing them.

I meant something in buildbot.py.  I'm not sure exactly what.  The current abstraction might be something as simple as a string, but that doesn't mean it shouldn't be improved.

> > > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:29
> > > +from webkitpy.layout_tests.port import factory
> > 
> > webkitpy.tool cannot import from webkitpy.layout_tests.  Code that's shared between these two packages should go in webkitpy.common.
> 
> Ah, okay - I was taking inspiration from the original rebaseline.py. I've moved the rebaseline package out of layout_tests and into webkitpy/tool/commands/rebaseline2.

Do you mean http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/tool/commands/rebaseline.py ?

The only thing that file imports from layout_tests is webkitpy.layout_tests.port, but that's because webkitpy.layout_tests.port shouldn't be in layout_tests.  :P

Some of these invariants aren't perfect, but we'd like to make them better over time.

Comment 10 Ojan Vafai 2010-12-01 09:42:19 PST

> > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:61
> > > > +    def _make_bucket_tree(self):
> > > > +        b = bucket.BucketTree()
> > > > +        b.b(None,
> > > > +            b.b('mac',
> > > > +                b.b('mac-snowleopard',
> > > > +                    b.b('mac-leopard',
> > > > +                        b.b('mac-tiger'))),
> > > > +                b.b('win',
> > > > +                    b.b('chromium',
> > > > +                        b.b('chromium-mac'),

It's a little more complicated than this. chromium-mac falls back to chromium, then does the same fallback order as upstream mac (i.e. !win). See http://codesearch.google.com/codesearch/p#OAMlx_jo-ck/src/third_party/WebKit/WebKitTools/Scripts/webkitpy/layout_tests/port/chromium_mac.py&l=52&exact_package=chromium.

> > > > +                        b.b('chromium-win',
> > > > +                            b.b('chromium-win-vista',
> > > > +                                b.b('chromium-win-xp')),
> > > > +                            b.b('chromium-linux',
> > > > +                                b.b('chromium-gpu',
> > > > +                                    b.b('chromium-gpu-linux'),
> > > > +                                    b.b('chromium-gpu-win'),
> > > > +                                    b.b('chromium-gpu-mac'))))))),

Each platform gpu one should fallback to chromium-gpu and then the equivalent platform non-gpu one. See http://codesearch.google.com/codesearch/p#OAMlx_jo-ck/src/third_party/WebKit/WebKitTools/Scripts/webkitpy/layout_tests/port/chromium_gpu.py&l=99&exact_package=chromium

> > > > +            b.b('qt',
> > > > +                b.b('qt-linux'),
> > > > +                b.b('qt-mac'),
> > > > +                b.b('qt-win')),
> > > > +            b.b('gtk'))
> > > 
> > > Is this some kind of description of the structure of the expectations fallback tree?  If so, it should be in code that's shareable with the rest of webkitpy instead of siloed away here.
> > 
> > Yes, this is the fallback tree that is implied by the various implementations of Port.baseline_search_path() - it is needed to do correct deduping. I would like to see all the different implementations of that method go away and have the tree above be the authoritive fallback tree, but that seemed like it should happen in its own patch.
> 
> It's certainly fine to do that in a separate patch, but we're not going to want to make the situation worse by landing this patch before that one.  That's how we accumulate these technical debts.
> 
> > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:63
> > > > +        AGGREGATE_BUCKETS = ['chromium', 'chromium-win', 'win', 'mac', 'qt', 'chromium-gpu']
> > > 
> > > Why is this magic list here?  We shouldn't have magic constants like this in the code.
> > 
> > These are the names of platforms that aren't targeted by any builders because they don't denote real platforms, merely convenient fallback targets. This is significant in deduping: if we have the same expectations in mac-leopard, mac-snowleopard and mac-tiger, but a different result in mac, then we can clobber the result in mac and delete the others. We couldn't do that if mac was a platform in itself, because then it having different expectations would be valid and not a duplication.
> > 
> > I've added a clarifying comment to explain the meaning of the list.
> 
> Ok, but how does that scale when we add more ports / platforms to the project?  Will someone have to hunt around all of webkitpy to find all the port-specific logic and update it?  It seems better to centralize that knowledge in the port abstraction.

I agree that this logic should be in shared code, probably in the port abstraction. I like this better than what we have in port right now though. In order to understand each platform's fallback rules, you need to look in a bunch of different files and it's very complicated. Having the data in one file would be great. That said, this could just be the underlying data used by the existing port code. For example, the base port class could have this mapping and the baseline_search_path implementations could just index into the mapping. 

What we lose here is that each port file completely encapsulates the data for that port, but the increased code clarity is worth it IMO.

In either case, that seems like it could easily be a separate patch. In the interested of avoiding too much yak shaving, how about you rewrite this patch to use baseline_search_path and leave refactoring this stuff as a FIXME?

Comment 11 Dirk Pranke 2010-12-01 14:38:07 PST

Comment on attachment 75263 [details]
Patch

I just started looking at this patch (sorry for the delay!), so I will confine myself to a couple of comments on topics others haven't raised, and then reply to some of the other bug comments.

Mihai says that the intent of this is to replace both webkit-patch rebaseline and rebaseline-chromium-webkit-tests. If so, great! The latter code is poorly tested and hard to maintain, so it'll be nice to have something theoretically better :)


> WebKitTools/Scripts/webkitpy/common/system/filesystem.py:126
> +        shutil.copyfile(src, dest)

Why copy_file instead of copyfile(), or remove_file() instead of unlink or remove? I would prefer to have the methods be the same as the underlying package/class method names, to minimize the cognitive overhead.

> WebKitTools/Scripts/webkitpy/layout_tests/port/base.py:637
> +

Comment: I'm not wild about this being put into the layout_test/port/* classes, mostly because this has nothing to do with *running* the tests, and partially because this class is already large and hard to understand as a whole. That said, there's not an obvious other place to put them, and I don't want to create a parallel port tree, and it's only one entry point. This is probably something to think about when I/we get around to refactoring/combining this code with the common.config.ports code.

Comment 12 Dirk Pranke 2010-12-01 15:35:09 PST

(In reply to comment #9)
> 
> The mapping between LayoutTest/platform directories and buildbots probably belongs somewhere in the port abstraction.  You'll find that we already have two port abstractions (shame on us), but we should be able to teach one fo them to find it's buildbot.
>

In the absence of an obviously better place to put this, I'm inclined to agree that this belongs as a method on layout_tests/port/*, since those files are likely the basis for whatever persists into the future. That said, my comment above stands, in that this is code that has nothing to do w/ actually running the tests. I think we need to figure out how to break the Port objects up into composites.
 
> > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:61
> > > > +    def _make_bucket_tree(self):
> > > > +        b = bucket.BucketTree()
> > > > +        b.b(None,
> > > > +            b.b('mac',
> > > > +                b.b('mac-snowleopard',
> > > > +                    b.b('mac-leopard',
> > > > +                        b.b('mac-tiger'))),
> > > > +                b.b('win',
> > > > +                    b.b('chromium',
> > > > +                        b.b('chromium-mac'),
> > > > +                        b.b('chromium-win',
> > > > +                            b.b('chromium-win-vista',
> > > > +                                b.b('chromium-win-xp')),
> > > > +                            b.b('chromium-linux',
> > > > +                                b.b('chromium-gpu',
> > > > +                                    b.b('chromium-gpu-linux'),
> > > > +                                    b.b('chromium-gpu-win'),
> > > > +                                    b.b('chromium-gpu-mac'))))))),
> > > > +            b.b('qt',
> > > > +                b.b('qt-linux'),
> > > > +                b.b('qt-mac'),
> > > > +                b.b('qt-win')),
> > > > +            b.b('gtk'))
> > > 
> > > Is this some kind of description of the structure of the expectations fallback tree?  If so, it should be in code that's shareable with the rest of webkitpy instead of siloed away here.
> > 
> > Yes, this is the fallback tree that is implied by the various implementations of Port.baseline_search_path() - it is needed to do correct deduping. I would like to see all the different implementations of that method go away and have the tree above be the authoritive fallback tree, but that seemed like it should happen in its own patch.
> 
> It's certainly fine to do that in a separate patch, but we're not going to want to make the situation worse by landing this patch before that one.  That's how we accumulate these technical debts.
> 

As Ojan has already pointed out, this logic is wrong (e.g., chromium-mac doesn't fall back to win). 

Unfortunately, I don't think you can even draw the logic for all of the ports as a single tree, so this approach might be fatally flawed. At least it's a DAG :)

I agree with Ojan that attempting to figure out the logic in the current codebase is complicated. And, I agree that we don't want to lose port.baseline_search_path() as at least an entry point. 

Maybe this is something best solved with some form of composition, or mixins, or multiple inheritance. Not sure.

I also agree with Adam that it would be really nice if we didn't have two sources of truth in the tree at the same time. That said, I would be okay with it as long as you agreed to immediately fix things after this feature landed. It would be better if we could refactor/change the port code first, in a separate patch, though.

> > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:63
> > > > +        AGGREGATE_BUCKETS = ['chromium', 'chromium-win', 'win', 'mac', 'qt', 'chromium-gpu']
> > > 
> > > Why is this magic list here?  We shouldn't have magic constants like this in the code.
> > 
> > These are the names of platforms that aren't targeted by any builders because they don't denote real platforms, merely convenient fallback targets. This is significant in deduping: if we have the same expectations in mac-leopard, mac-snowleopard and mac-tiger, but a different result in mac, then we can clobber the result in mac and delete the others. We couldn't do that if mac was a platform in itself, because then it having different expectations would be valid and not a duplication.
> > 
> > I've added a clarifying comment to explain the meaning of the list.
> 
> Ok, but how does that scale when we add more ports / platforms to the project?  Will someone have to hunt around all of webkitpy to find all the port-specific logic and update it?  It seems better to centralize that knowledge in the port abstraction.
>

I fear you aren't aware of how we name the platform dirs. If a platform directory indicates an operating system, but not a version of an operating system, then it is meant to indicate "the most recent version of that operating system" (and it includes any older versions if the older version is either untested, or uncared about).  "chromium-win" is what we use on Windows 7 bots (there is no chromium-win-7 directory). "chromium-mac" is similarly equivalent to "chromium-mac-snowleopard" (and/or Lion, which we don't really support). Same thing is true for 'mac', and 'win'. 'qt' is just a regular platform, period.

So, what you said about clobbering 'mac' in the paragraph above is wrong. You would want to clobber the 'mac-leopard' and 'mac-tiger' versions of the results, but leave 'mac' and 'mac-snowleopard' (and we would assume that 'mac' now refers to Lion).

The only truly "dummy" platforms  are 'chromium' and 'chromium-gpu'.  
   
> > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/result.py:35
> > > > +class Result(object):
> > > > +    """Represents the result of a single test on a single platform"""
> > > 
> > > We already have an abstraction for Result.
> > 
> > Are you referring to the webkitpy/layout_tests/test_types module? Those classes seem like they have a different usage to the ones I'm introducing here - they have no state and perform functions on filenames and file contents passed in to them. There is no point in having more than one instance of those classes, whereas the Result classes defined here are stateful and are meant to represent Results (local or remote) for the purposes of comparing them.
> 

Ouch. My head hurts here at introducing this concept. I think what you actually want is something closer to the layout_tests.layout_package.test_output concept, possibly combined with the expected_* methods on the Port class.

Also, please don't call this "Result" - that is too generic and overloaded a term. Even if you were to call it a "TestResult", that's confusing, because is the object referring to the actual expected output, or the results of running the test and comparing against the expectations (e.g., "PASS"/"FAIL", etc.)? 

I've been trying to morph the layout tests code into using something a bit more consistent, so we should line things up here if we can. I would prefer something like "TestOutput" here.

As to the overlap between test_output, test_types, and the port/* routines, those ned to be cleaned up as well. My current thinking is that the test_types classes are legacy objects from much earlier versions of new-run-webkit-tests (predating the port/* abstraction completely), and just need to go away. I.e., it's an abstraction that is confusing things.

In the very near future we'll need to add support for audio files and tests, and this aspect of the design should really be cleaned up (I'm expecting by me) before then.

Also, the work we're doing with reftests will impact this as well, so it would be good to get up to speed on that if you aren't already (and I've cc'ed Ito-san on this to make sure he's aware of it as well).

Comment 13 James Kozianski 2010-12-01 19:10:50 PST

(In reply to comment #9)
> (In reply to comment #7)
> > Thanks for the prompt and thorough review, Adam!
> > 
> > (In reply to comment #6)
> > > (From update of attachment 74983 [details] [details] [details])
> > > View in context: https://bugs.webkit.org/attachment.cgi?id=74983&action=review
> > > 
> > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/builder.py:264
> > > > +class WebKitBuilder(object):
> > > > +    BUILDERS = {
> > > > +        'mac-snowleopard': 'SnowLeopard%20Intel%20Release%20(Tests)',
> > > > +        'mac-leopard': 'Leopard%20Intel%20Debug%20(Tests)',
> > > > +        'win': 'Windows%207%20Release%20(WebKit2%20Tests)',
> > > > +        'chromium-linux': 'GTK%20Linux%2032-bit%20Release',
> > > > +    }
> > > 
> > > webkitpy already knows about these things.  Please look at how the rest of the system knows about this stuff.
> > 
> > What I'm trying to express here is a mapping between platforms (as in the subdirectories of LayoutTests/platform) and the name of the builder that provides the 'canonical' test results for that platform. When a user wants to get new baselines for mac-snowleopard, what builder should those baselines be retrieved from? Looking through the code I can't find anywhere where such a mapping is defined - rebaseline_chromium_webkit_tests.py defines its own mapping for this kind of work, the current webkit-patch rebaseline script prompts the user to select a builder name and the port object itself provides a fallback order, but no mapping between platforms and builders.
> 
> We should never have let rebaseline_chromium_webkit_tests into the tree.  It's wrong at so many levels.
> 
> The mapping between LayoutTest/platform directories and buildbots probably belongs somewhere in the port abstraction.  You'll find that we already have two port abstractions (shame on us), but we should be able to teach one fo them to find it's buildbot.

Yep, I've moved these mappings into the webkit.py and chromium.py Port implementations (the layout_tests/port implementations.

> 
> > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:61
> > > > +    def _make_bucket_tree(self):
> > > > +        b = bucket.BucketTree()
> > > > +        b.b(None,
> > > > +            b.b('mac',
> > > > +                b.b('mac-snowleopard',
> > > > +                    b.b('mac-leopard',
> > > > +                        b.b('mac-tiger'))),
> > > > +                b.b('win',
> > > > +                    b.b('chromium',
> > > > +                        b.b('chromium-mac'),
> > > > +                        b.b('chromium-win',
> > > > +                            b.b('chromium-win-vista',
> > > > +                                b.b('chromium-win-xp')),
> > > > +                            b.b('chromium-linux',
> > > > +                                b.b('chromium-gpu',
> > > > +                                    b.b('chromium-gpu-linux'),
> > > > +                                    b.b('chromium-gpu-win'),
> > > > +                                    b.b('chromium-gpu-mac'))))))),
> > > > +            b.b('qt',
> > > > +                b.b('qt-linux'),
> > > > +                b.b('qt-mac'),
> > > > +                b.b('qt-win')),
> > > > +            b.b('gtk'))
> > > 
> > > Is this some kind of description of the structure of the expectations fallback tree?  If so, it should be in code that's shareable with the rest of webkitpy instead of siloed away here.
> > 
> > Yes, this is the fallback tree that is implied by the various implementations of Port.baseline_search_path() - it is needed to do correct deduping. I would like to see all the different implementations of that method go away and have the tree above be the authoritive fallback tree, but that seemed like it should happen in its own patch.
> 
> It's certainly fine to do that in a separate patch, but we're not going to want to make the situation worse by landing this patch before that one.  That's how we accumulate these technical debts.

Okay, no worries but from the rest of the comments it looks like this tree might be a bit simplistic. Once we know what the right thing to do is I'll either make a patch to rewrite the baseline_search_path() methods to be in terms of a tree like this one, or I'll generate a tree (or graph) from the baseline_search_path()s at runtime.

> 
> > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:63
> > > > +        AGGREGATE_BUCKETS = ['chromium', 'chromium-win', 'win', 'mac', 'qt', 'chromium-gpu']
> > > 
> > > Why is this magic list here?  We shouldn't have magic constants like this in the code.
> > 
> > These are the names of platforms that aren't targeted by any builders because they don't denote real platforms, merely convenient fallback targets. This is significant in deduping: if we have the same expectations in mac-leopard, mac-snowleopard and mac-tiger, but a different result in mac, then we can clobber the result in mac and delete the others. We couldn't do that if mac was a platform in itself, because then it having different expectations would be valid and not a duplication.
> > 
> > I've added a clarifying comment to explain the meaning of the list.
> 
> Ok, but how does that scale when we add more ports / platforms to the project?  Will someone have to hunt around all of webkitpy to find all the port-specific logic and update it?  It seems better to centralize that knowledge in the port abstraction.

I'm happy to have this tree live in Port, that's the right place for it. This won't lead to people needing to hunt for any port-specific stuff - all of the logic surrounding fallbacks (the structure and what platforms are real platforms versus placeholder platforms) will be defined in this tree, which will live in the Port base class.

> 
> > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/result.py:35
> > > > +class Result(object):
> > > > +    """Represents the result of a single test on a single platform"""
> > > 
> > > We already have an abstraction for Result.
> > 
> > Are you referring to the webkitpy/layout_tests/test_types module? Those classes seem like they have a different usage to the ones I'm introducing here - they have no state and perform functions on filenames and file contents passed in to them. There is no point in having more than one instance of those classes, whereas the Result classes defined here are stateful and are meant to represent Results (local or remote) for the purposes of comparing them.
> 
> I meant something in buildbot.py.  I'm not sure exactly what.  The current abstraction might be something as simple as a string, but that doesn't mean it shouldn't be improved.

I've added a results() method to the buildbot.Build class, which returns the ResultSet of that build's '-actual' test results. The only thing I can find in buildbot.py that deals with tests is Build.layout_test_results(), which returns a LayoutTestResults object, which deals with test names, but I don't think it is better to have that return Results instead.

> 
> > > > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:29
> > > > +from webkitpy.layout_tests.port import factory
> > > 
> > > webkitpy.tool cannot import from webkitpy.layout_tests.  Code that's shared between these two packages should go in webkitpy.common.
> > 
> > Ah, okay - I was taking inspiration from the original rebaseline.py. I've moved the rebaseline package out of layout_tests and into webkitpy/tool/commands/rebaseline2.
> 
> Do you mean http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/tool/commands/rebaseline.py ?
> 
> The only thing that file imports from layout_tests is webkitpy.layout_tests.port, but that's because webkitpy.layout_tests.port shouldn't be in layout_tests.  :P
> 
> Some of these invariants aren't perfect, but we'd like to make them better over time.

Comment 14 James Kozianski 2010-12-01 19:51:43 PST

(In reply to comment #10)
> > > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:61
> > > > > +    def _make_bucket_tree(self):
> > > > > +        b = bucket.BucketTree()
> > > > > +        b.b(None,
> > > > > +            b.b('mac',
> > > > > +                b.b('mac-snowleopard',
> > > > > +                    b.b('mac-leopard',
> > > > > +                        b.b('mac-tiger'))),
> > > > > +                b.b('win',
> > > > > +                    b.b('chromium',
> > > > > +                        b.b('chromium-mac'),
> 
> It's a little more complicated than this. chromium-mac falls back to chromium, then does the same fallback order as upstream mac (i.e. !win). See http://codesearch.google.com/codesearch/p#OAMlx_jo-ck/src/third_party/WebKit/WebKitTools/Scripts/webkitpy/layout_tests/port/chromium_mac.py&l=52&exact_package=chromium.

Yeah, this tree was a bit of a simplification. I was wondering if it was possible / a good idea to change our test fallback order so that it does make a tree? In the likely case that it isn't I could change my code to handle DAGs... I think it should be possible to extend the algorithm I use here.

> > > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:63
> > > > > +        AGGREGATE_BUCKETS = ['chromium', 'chromium-win', 'win', 'mac', 'qt', 'chromium-gpu']
> > > > 
> > > > Why is this magic list here?  We shouldn't have magic constants like this in the code.
> > > 
> > > These are the names of platforms that aren't targeted by any builders because they don't denote real platforms, merely convenient fallback targets. This is significant in deduping: if we have the same expectations in mac-leopard, mac-snowleopard and mac-tiger, but a different result in mac, then we can clobber the result in mac and delete the others. We couldn't do that if mac was a platform in itself, because then it having different expectations would be valid and not a duplication.
> > > 
> > > I've added a clarifying comment to explain the meaning of the list.
> > 
> > Ok, but how does that scale when we add more ports / platforms to the project?  Will someone have to hunt around all of webkitpy to find all the port-specific logic and update it?  It seems better to centralize that knowledge in the port abstraction.
> 
> I agree that this logic should be in shared code, probably in the port abstraction. I like this better than what we have in port right now though. In order to understand each platform's fallback rules, you need to look in a bunch of different files and it's very complicated. Having the data in one file would be great. That said, this could just be the underlying data used by the existing port code. For example, the base port class could have this mapping and the baseline_search_path implementations could just index into the mapping. 
> 
> What we lose here is that each port file completely encapsulates the data for that port, but the increased code clarity is worth it IMO.
> 
> In either case, that seems like it could easily be a separate patch. In the interested of avoiding too much yak shaving, how about you rewrite this patch to use baseline_search_path and leave refactoring this stuff as a FIXME?

I'm not sure it's easy to use baseline_search_path() instead of the current code - if I just use one port's baseline_search_path() then it won't be broad enough to be useful[1], and if combine all the baseline_search_path()s then it doesn't form a tree. I'll investigate making my deduping algorithm work for DAGs, and then I can generate the DAG from the baseline_search_paths(), and then as a next step replace the baseline_search_path() implementations with a single one that walks the DAG to calculate its fallback path.

[1] eg: I'm on snowleopard now, so my baseline_search_path() is [mac-snowleopard, mac] which should never cause deduping because we don't know anything about mac-leopard or the other paths that fallback onto mac.

Comment 15 James Kozianski 2010-12-01 19:56:32 PST

(In reply to comment #11)
> (From update of attachment 75263 [details])
> I just started looking at this patch (sorry for the delay!), so I will confine myself to a couple of comments on topics others haven't raised, and then reply to some of the other bug comments.
> 
> Mihai says that the intent of this is to replace both webkit-patch rebaseline and rebaseline-chromium-webkit-tests. If so, great! The latter code is poorly tested and hard to maintain, so it'll be nice to have something theoretically better :)

Cool! :-)

> 
> 
> > WebKitTools/Scripts/webkitpy/common/system/filesystem.py:126
> > +        shutil.copyfile(src, dest)
> 
> Why copy_file instead of copyfile(), or remove_file() instead of unlink or remove? I would prefer to have the methods be the same as the underlying package/class method names, to minimize the cognitive overhead.

I called them *_file because it seemed to fit the above methods in filesystem.py (read_text_file, write_text_file, ...), but then again there are a bunch of methods higher up that are named after the functions they call through to, so I've renamed them 'copyfile' and 'remove'.

Comment 16 James Kozianski 2010-12-01 21:50:04 PST

(In reply to comment #12)
> (In reply to comment #9)
> > 
> > The mapping between LayoutTest/platform directories and buildbots probably belongs somewhere in the port abstraction.  You'll find that we already have two port abstractions (shame on us), but we should be able to teach one fo them to find it's buildbot.
> >
> 
> In the absence of an obviously better place to put this, I'm inclined to agree that this belongs as a method on layout_tests/port/*, since those files are likely the basis for whatever persists into the future. That said, my comment above stands, in that this is code that has nothing to do w/ actually running the tests. I think we need to figure out how to break the Port objects up into composites.
> 
> > > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:61
> > > > > +    def _make_bucket_tree(self):
> > > > > +        b = bucket.BucketTree()
> > > > > +        b.b(None,
> > > > > +            b.b('mac',
> > > > > +                b.b('mac-snowleopard',
> > > > > +                    b.b('mac-leopard',
> > > > > +                        b.b('mac-tiger'))),
> > > > > +                b.b('win',
> > > > > +                    b.b('chromium',
> > > > > +                        b.b('chromium-mac'),
> > > > > +                        b.b('chromium-win',
> > > > > +                            b.b('chromium-win-vista',
> > > > > +                                b.b('chromium-win-xp')),
> > > > > +                            b.b('chromium-linux',
> > > > > +                                b.b('chromium-gpu',
> > > > > +                                    b.b('chromium-gpu-linux'),
> > > > > +                                    b.b('chromium-gpu-win'),
> > > > > +                                    b.b('chromium-gpu-mac'))))))),
> > > > > +            b.b('qt',
> > > > > +                b.b('qt-linux'),
> > > > > +                b.b('qt-mac'),
> > > > > +                b.b('qt-win')),
> > > > > +            b.b('gtk'))
> > > > 
> > > > Is this some kind of description of the structure of the expectations fallback tree?  If so, it should be in code that's shareable with the rest of webkitpy instead of siloed away here.
> > > 
> > > Yes, this is the fallback tree that is implied by the various implementations of Port.baseline_search_path() - it is needed to do correct deduping. I would like to see all the different implementations of that method go away and have the tree above be the authoritive fallback tree, but that seemed like it should happen in its own patch.
> > 
> > It's certainly fine to do that in a separate patch, but we're not going to want to make the situation worse by landing this patch before that one.  That's how we accumulate these technical debts.
> > 
> 
> As Ojan has already pointed out, this logic is wrong (e.g., chromium-mac doesn't fall back to win). 
> 
> Unfortunately, I don't think you can even draw the logic for all of the ports as a single tree, so this approach might be fatally flawed. At least it's a DAG :)
> 
> I agree with Ojan that attempting to figure out the logic in the current codebase is complicated. And, I agree that we don't want to lose port.baseline_search_path() as at least an entry point. 
> 
> Maybe this is something best solved with some form of composition, or mixins, or multiple inheritance. Not sure.

If I change my data structure to a DAG then we can generate it from the different baseline_search_path() implementations at runtime, and then as a next step we could replace all the implementations of baseline_search_path() with something like this...

class Port(object):
    ....
    def baseline_search_path(self):
        return DAG_OF_ALL_FALLBACK_PATHS.path_to_root_from(self.name())

> 
> I also agree with Adam that it would be really nice if we didn't have two sources of truth in the tree at the same time. That said, I would be okay with it as long as you agreed to immediately fix things after this feature landed. It would be better if we could refactor/change the port code first, in a separate patch, though.

Yep, I'll try to get the DAG happening, then we won't need the two sources of truth.

> 
> > > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/rebaseliner.py:63
> > > > > +        AGGREGATE_BUCKETS = ['chromium', 'chromium-win', 'win', 'mac', 'qt', 'chromium-gpu']
> > > > 
> > > > Why is this magic list here?  We shouldn't have magic constants like this in the code.
> > > 
> > > These are the names of platforms that aren't targeted by any builders because they don't denote real platforms, merely convenient fallback targets. This is significant in deduping: if we have the same expectations in mac-leopard, mac-snowleopard and mac-tiger, but a different result in mac, then we can clobber the result in mac and delete the others. We couldn't do that if mac was a platform in itself, because then it having different expectations would be valid and not a duplication.
> > > 
> > > I've added a clarifying comment to explain the meaning of the list.
> > 
> > Ok, but how does that scale when we add more ports / platforms to the project?  Will someone have to hunt around all of webkitpy to find all the port-specific logic and update it?  It seems better to centralize that knowledge in the port abstraction.
> >
> 
> I fear you aren't aware of how we name the platform dirs. If a platform directory indicates an operating system, but not a version of an operating system, then it is meant to indicate "the most recent version of that operating system" (and it includes any older versions if the older version is either untested, or uncared about).  "chromium-win" is what we use on Windows 7 bots (there is no chromium-win-7 directory). "chromium-mac" is similarly equivalent to "chromium-mac-snowleopard" (and/or Lion, which we don't really support). Same thing is true for 'mac', and 'win'. 'qt' is just a regular platform, period.
> 
> So, what you said about clobbering 'mac' in the paragraph above is wrong. You would want to clobber the 'mac-leopard' and 'mac-tiger' versions of the results, but leave 'mac' and 'mac-snowleopard' (and we would assume that 'mac' now refers to Lion).
> 
> The only truly "dummy" platforms  are 'chromium' and 'chromium-gpu'.  

I see, so the mac fallback order [mac-tiger, mac-leopard, mac-snowleopard, mac] is meant to be searching from the oldest revisions to the newest revisions of mac. Thanks for the explanation.

So this would imply that nothing in the mac directory will ever get overwritten by this rebaseline script because we don't have mac-lion builders? IE: "SnowLeopard Intel Release (Tests)" would have its results sent to mac-snowleopard?

> 
> > > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/result.py:35
> > > > > +class Result(object):
> > > > > +    """Represents the result of a single test on a single platform"""
> > > > 
> > > > We already have an abstraction for Result.
> > > 
> > > Are you referring to the webkitpy/layout_tests/test_types module? Those classes seem like they have a different usage to the ones I'm introducing here - they have no state and perform functions on filenames and file contents passed in to them. There is no point in having more than one instance of those classes, whereas the Result classes defined here are stateful and are meant to represent Results (local or remote) for the purposes of comparing them.
> > 
> 
> Ouch. My head hurts here at introducing this concept. I think what you actually want is something closer to the layout_tests.layout_package.test_output concept, possibly combined with the expected_* methods on the Port class.
> 
> Also, please don't call this "Result" - that is too generic and overloaded a term. Even if you were to call it a "TestResult", that's confusing, because is the object referring to the actual expected output, or the results of running the test and comparing against the expectations (e.g., "PASS"/"FAIL", etc.)? 

Maybe we can use the term 'baseline' to mean the thing that this class represents, ie: something that the output of a test can be compared against to determine PASS/FAIL?

That seems to fit with the general phrase 'rebaselining' - it's taking baselines from builders and using them to overwrite our existing baselines. What do you think?

> 
> I've been trying to morph the layout tests code into using something a bit more consistent, so we should line things up here if we can. I would prefer something like "TestOutput" here.

Could you be a bit more specific? How could I change Results to be more suitable?

> 
> As to the overlap between test_output, test_types, and the port/* routines, those ned to be cleaned up as well. My current thinking is that the test_types classes are legacy objects from much earlier versions of new-run-webkit-tests (predating the port/* abstraction completely), and just need to go away. I.e., it's an abstraction that is confusing things.
> 
> In the very near future we'll need to add support for audio files and tests, and this aspect of the design should really be cleaned up (I'm expecting by me) before then.
> 
> Also, the work we're doing with reftests will impact this as well, so it would be good to get up to speed on that if you aren't already (and I've cc'ed Ito-san on this to make sure he's aware of it as well).

I'm not familiar with reftests. Could you point me to somewhere I could learn more about them?

Comment 17 Dirk Pranke 2010-12-02 17:52:08 PST

(In reply to comment #14)
> I was wondering if it was possible / a good idea to change our test fallback order so that it does make a tree? 

Yes, it is theoretically possible. I have no idea how much work it would be. I do think it would be a good idea to evaluate it and see, because the current logic causes no end of confusion.

Not as part of this patch, though :)

> 
> I'm not sure it's easy to use baseline_search_path() instead of the current code - if I just use one port's baseline_search_path() then it won't be broad enough to be useful[1], and if combine all the baseline_search_path()s then it doesn't form a tree. I'll investigate making my deduping algorithm work for DAGs, and then I can generate the DAG from the baseline_search_paths(), and then as a next step replace the baseline_search_path() implementations with a single one that walks the DAG to calculate its fallback path.
> 
> [1] eg: I'm on snowleopard now, so my baseline_search_path() is [mac-snowleopard, mac] which should never cause deduping because we don't know anything about mac-leopard or the other paths that fallback onto mac.

Yes, Victor and I had a lot of discussions about how to do this when he was implementing/maintaining rebaseline-chromium-webkit-tests. You end up carrying around multiple Port objects - the port you're actually running on (to be able to diff images, etc.), the Port you're manipulating results for, and I think we even needed another one for some reason. I suggest you look carefully at that code.

Comment 18 Dirk Pranke 2010-12-02 17:56:21 PST

(In reply to comment #7)
> > > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:53
> > > +            make_option("--chromium", action="store_true", dest="use_chromium_bots", help="Fetch results from the Chromium bots."),
> > 
> > Boo.  We should use --port to specify the port instead of making a magic special case for Chromium.
> 
> Sorry, I was imitating build-webkit's --chromium flag here. I've added Port.buildbot_resultset() so the buildbots used are controlled via the port.
>

Well, --platform, actually, not --port, but otherwise I agree that these switches probably don't make sense in this context.

--chromium is useful in the run-webkit-tests case because we run on multiple operating systems, and so it enables scripts to specify that they want the chromium version regardless of what operating system is actually running.

Adam was half jesting, but one could almost argue that --webkit might make sense for the same reason, if you were attempting to define --webkit as (mac|win). But those two ports are different enough from each other that even the Apple guys don't tend to think of them that way AFAIK.

Comment 19 Dirk Pranke 2010-12-02 18:24:23 PST

(In reply to comment #16)
> I see, so the mac fallback order [mac-tiger, mac-leopard, mac-snowleopard, mac] is meant to be searching from the oldest revisions to the newest revisions of mac. Thanks for the explanation.
> 
> So this would imply that nothing in the mac directory will ever get overwritten by this rebaseline script because we don't have mac-lion builders? IE: "SnowLeopard Intel Release (Tests)" would have its results sent to mac-snowleopard?
> 

Well, it's conceivable that someone might have access to a Lion builder and want to use this script. More generally, I was trying to describe the overall method we use for handling operating system versions.

Now, SL/Lion is an interesting case since we know it will arrive "soonish". I just asked on #webkit to attempt to clarify this. The answer I got back was that platform/mac should be used to refer to "the newest version of the mac operating system". In particular, those of us who don't have access to Lion have no way of knowing whether or not our tests (or results) are SL-specific, or if they will be the same for both Lion and SL (in which case, they should be in platform/mac).

Basically, only people w/ access to Lion who know what they are doing should ever put something into mac-snowleopard. The rest of us should assume mac-snowleopard is read/delete-only (not write).

As to how you address this in the code, I have no brilliant idea at the moment. The best idea I have is to distinguish the "read" path from the "write" path, maybe through a flag passed to baseline_search_path. I suggest that for now, for this patch, assume either mac == lion and we have access to it, or mac == snow-leopard, and pretend that mac-snowleopard doesn't exist.

However, I just clarified the i
> > 
> > > > > > WebKitTools/Scripts/webkitpy/layout_tests/rebaseline/result.py:35
> > > > > > +class Result(object):
> > > > > > +    """Represents the result of a single test on a single platform"""
> > > > > 
> > > > > We already have an abstraction for Result.
> > > > 
> > > > Are you referring to the webkitpy/layout_tests/test_types module? Those classes seem like they have a different usage to the ones I'm introducing here - they have no state and perform functions on filenames and file contents passed in to them. There is no point in having more than one instance of those classes, whereas the Result classes defined here are stateful and are meant to represent Results (local or remote) for the purposes of comparing them.
> > > 
> > 
> > Ouch. My head hurts here at introducing this concept. I think what you actually want is something closer to the layout_tests.layout_package.test_output concept, possibly combined with the expected_* methods on the Port class.
> > 
> > Also, please don't call this "Result" - that is too generic and overloaded a term. Even if you were to call it a "TestResult", that's confusing, because is the object referring to the actual expected output, or the results of running the test and comparing against the expectations (e.g., "PASS"/"FAIL", etc.)? 
> 
> Maybe we can use the term 'baseline' to mean the thing that this class represents, ie: something that the output of a test can be compared against to determine PASS/FAIL?
> 
> That seems to fit with the general phrase 'rebaselining' - it's taking baselines from builders and using them to overwrite our existing baselines. What do you think?
> 
> > 
> > I've been trying to morph the layout tests code into using something a bit more consistent, so we should line things up here if we can. I would prefer something like "TestOutput" here.
> 
> Could you be a bit more specific? How could I change Results to be more suitable?
> 
> > 
> > As to the overlap between test_output, test_types, and the port/* routines, those ned to be cleaned up as well. My current thinking is that the test_types classes are legacy objects from much earlier versions of new-run-webkit-tests (predating the port/* abstraction completely), and just need to go away. I.e., it's an abstraction that is confusing things.
> > 
> > In the very near future we'll need to add support for audio files and tests, and this aspect of the design should really be cleaned up (I'm expecting by me) before then.
> > 
> > Also, the work we're doing with reftests will impact this as well, so it would be good to get up to speed on that if you aren't already (and I've cc'ed Ito-san on this to make sure he's aware of it as well).
> 
> I'm not familiar with reftests. Could you point me to somewhere I could learn more about them?

Comment 20 Dirk Pranke 2010-12-02 18:25:49 PST

(In reply to comment #19)
> As to how you address this in the code, I have no brilliant idea at the moment. The best idea I have is to distinguish the "read" path from the "write" path, maybe through a flag passed to baseline_search_path. I suggest that for now, for this patch, assume either mac == lion and we have access to it, or mac == snow-leopard, and pretend that mac-snowleopard doesn't exist.
> 

Whoops. Hit return too soon. Ignore everything below following the above paragraph.

Comment 21 Dirk Pranke 2010-12-02 18:33:02 PST

(In reply to comment #16)
> Maybe we can use the term 'baseline' to mean the thing that this class represents, ie: something that the output of a test can be compared against to determine PASS/FAIL?
> 
> That seems to fit with the general phrase 'rebaselining' - it's taking baselines from builders and using them to overwrite our existing baselines. What do you think?
>

I'm not a huge fan of this approach, because you get the concepts of "expected baseline" and "actual baseline", one of which seems like an oxymoron and the other redundantly redundant.

I think "output" is my favorite candidate so far, but admittedly it's not great either.

Also, I forgot to mention that I can't see ResultSet without thinking about JDBC and ODBC, but maybe that's just me ... I haven't thought of anything better yet as an alternative.
 
> > 
> > I've been trying to morph the layout tests code into using something a bit more consistent, so we should line things up here if we can. I would prefer something like "TestOutput" here.
> 
> Could you be a bit more specific? How could I change Results to be more suitable?
>

Unfortunately, not at the moment, as I haven't really even started to play with these ideas yet. Hopefully more next week. I need to stare more at how you're using Result to think about commonalities and differences.
 
> > Also, the work we're doing with reftests will impact this as well, so it would be good to get up to speed on that if you aren't already (and I've cc'ed Ito-san on this to make sure he's aware of it as well).
> 
> I'm not familiar with reftests. Could you point me to somewhere I could learn more about them?

reftests are the idea that foo.html should render exactly like foo-expected.html (the "reference" version). They allow you to write a large class of tests that can stay pretty platform-independent (for example, because both tests will draw scroll bars the same way). Mozilla uses them pretty extensively. 

They don't replace every kind of pixel test, though.

Ito-san is working on adding support to NRWT for them. See bug 36065 for more info and references to design, Mozilla's docs, etc.

Comment 22 Dirk Pranke 2010-12-02 18:35:58 PST

(In reply to comment #19)
> As to how you address this in the code, I have no brilliant idea at the moment. The best idea I have is to distinguish the "read" path from the "write" path, maybe through a flag passed to baseline_search_path. I suggest that for now, for this patch, assume either mac == lion and we have access to it, or mac == snow-leopard, and pretend that mac-snowleopard doesn't exist.

I meant to add that assuming mac == lion should address the lion's share of the rebaselining we need to do.

Comment 23 James Kozianski 2010-12-07 21:40:14 PST

Created attachment 75869 [details]
Patch

Comment 24 James Kozianski 2010-12-07 21:48:22 PST

(In reply to comment #23)
> Created an attachment (id=75869) [details]
> Patch

The main changes in this patch are:

- Deduping algorithm now works for DAGs, as opposed to trees.
- Removed big, explicit platform fallback order - now we use one generated from the Port.baseline_search_path()s defined in the Port subclasses.
- Added methods to Port for retrieving ResultSets.

Comment 25 James Kozianski 2010-12-07 21:53:56 PST

(In reply to comment #17)
> (In reply to comment #14)
> > I'm not sure it's easy to use baseline_search_path() instead of the current code - if I just use one port's baseline_search_path() then it won't be broad enough to be useful[1], and if combine all the baseline_search_path()s then it doesn't form a tree. I'll investigate making my deduping algorithm work for DAGs, and then I can generate the DAG from the baseline_search_paths(), and then as a next step replace the baseline_search_path() implementations with a single one that walks the DAG to calculate its fallback path.
> > 
> > [1] eg: I'm on snowleopard now, so my baseline_search_path() is [mac-snowleopard, mac] which should never cause deduping because we don't know anything about mac-leopard or the other paths that fallback onto mac.
> 
> Yes, Victor and I had a lot of discussions about how to do this when he was implementing/maintaining rebaseline-chromium-webkit-tests. You end up carrying around multiple Port objects - the port you're actually running on (to be able to diff images, etc.), the Port you're manipulating results for, and I think we even needed another one for some reason. I suggest you look carefully at that code.

Yes, it is very tricky. In my patch I've tried to take out the stuff from Port that I need (ie: the image diffing functionality and the baseline search paths) so the only time I need to reference / depend on them is in the top level rebaseliner2.py.

Comment 26 James Kozianski 2010-12-07 22:00:04 PST

(In reply to comment #18)
> (In reply to comment #7)
> > > > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2.py:53
> > > > +            make_option("--chromium", action="store_true", dest="use_chromium_bots", help="Fetch results from the Chromium bots."),
> > > 
> > > Boo.  We should use --port to specify the port instead of making a magic special case for Chromium.
> > 
> > Sorry, I was imitating build-webkit's --chromium flag here. I've added Port.buildbot_resultset() so the buildbots used are controlled via the port.
> >
> 
> Well, --platform, actually, not --port, but otherwise I agree that these switches probably don't make sense in this context.
> 
> --chromium is useful in the run-webkit-tests case because we run on multiple operating systems, and so it enables scripts to specify that they want the chromium version regardless of what operating system is actually running.
> 
> Adam was half jesting, but one could almost argue that --webkit might make sense for the same reason, if you were attempting to define --webkit as (mac|win). But those two ports are different enough from each other that even the Apple guys don't tend to think of them that way AFAIK.

I'm currently using the --platform*s* flag to specify what platforms to retrieve new baselines for - are you okay with --platform and --platforms existing?

Comment 27 WebKit Review Bot 2010-12-07 22:01:08 PST

Attachment 75869 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/update-webkit']" exit_code: 2
Updating OpenSource
Incomplete data: Delta source ended unexpectedly at /usr/lib/git-core/git-svn line 5061

Died at WebKitTools/Scripts/update-webkit line 132.


If any of these errors are false positives, please file a bug against check-webkit-style.

Comment 28 James Kozianski 2010-12-07 22:06:00 PST

(In reply to comment #21)
> (In reply to comment #16)
> > Maybe we can use the term 'baseline' to mean the thing that this class represents, ie: something that the output of a test can be compared against to determine PASS/FAIL?
> > 
> > That seems to fit with the general phrase 'rebaselining' - it's taking baselines from builders and using them to overwrite our existing baselines. What do you think?
> >
> 
> I'm not a huge fan of this approach, because you get the concepts of "expected baseline" and "actual baseline", one of which seems like an oxymoron and the other redundantly redundant.
> 
> I think "output" is my favorite candidate so far, but admittedly it's not great either.
> 
> Also, I forgot to mention that I can't see ResultSet without thinking about JDBC and ODBC, but maybe that's just me ... I haven't thought of anything better yet as an alternative.

How about TestOutput? That is probably the most literal name we could have, and it is a bit less ambiguous than just Output.

> > > Also, the work we're doing with reftests will impact this as well, so it would be good to get up to speed on that if you aren't already (and I've cc'ed Ito-san on this to make sure he's aware of it as well).
> > 
> > I'm not familiar with reftests. Could you point me to somewhere I could learn more about them?
> 
> reftests are the idea that foo.html should render exactly like foo-expected.html (the "reference" version). They allow you to write a large class of tests that can stay pretty platform-independent (for example, because both tests will draw scroll bars the same way). Mozilla uses them pretty extensively. 
> 
> They don't replace every kind of pixel test, though.
> 
> Ito-san is working on adding support to NRWT for them. See bug 36065 for more info and references to design, Mozilla's docs, etc.

Cool, cheers.

Comment 29 James Kozianski 2010-12-07 22:25:42 PST

Created attachment 75872 [details]
Patch

Comment 30 Dirk Pranke 2010-12-08 13:31:23 PST

(In reply to comment #26)
> I'm currently using the --platform*s* flag to specify what platforms to retrieve new baselines for - are you okay with --platform and --platforms existing?

Yes.

Comment 31 Dirk Pranke 2010-12-08 13:34:45 PST

(In reply to comment #28)
> 
> How about TestOutput? That is probably the most literal name we could have, and it is a bit less ambiguous than just Output.
> 

I'm good with TestOutput.

Aplogies for the delay in responding ... I'm not sure why I didn't see your comments until just now.

Patch looks pretty good otherwise, although I haven't yet stared at it in detail. It's a bit under-commented for my tastes, but so is the rest of WebKit. Can you either rename one of rebaseline.py or rebaseliner.py or at least add some docs at the top to indicate why there are two files and which is which? Maybe you should just combine them into a single file and/or call the class RebaselineCommand instead of Rebaseline2 ?

Comment 32 Dirk Pranke 2010-12-08 18:19:09 PST

After a decently-long IM conversation w/ James, I've reached the conclusion that what his code is trying to do is different enough from what the code under layout_tests is designed to do that merging the two will be non-trivial.

Where easy, he should use (and add to) the Port abstractions for now, but we need to step back and come up with a common design that works for both run-webkit-tests (which wants to pretend that all it knows about are test names, and test output, and knows nothing about files) and the rebaselining tools (which only knows about filenames and has to try and reconstruct how they map onto test names and outputs).

A resulting design will probably affect, in addition to James' code:

layout_tests/test_types/*
layout_tests/layout_package/gather_files.py
layout_tests/layout_package/test_output.py
layout_tests/port/* (the methods dealing with test name conversions and expectations, at least)

the work Ito-san is doing with reftests.

However, all that stuff is down the road and shouldn't hold up this patch. I will take another look later at the code, with that in mind, but since I'm not a reviewer, someone else should review this bearing what I've just written in mind.

Comment 33 Adam Barth 2010-12-08 19:36:18 PST

Comment on attachment 75872 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=75872&action=review

This looks much better than the previous version.  Thanks for integrating your work with webkitpy.common.  That's very valuable to the project.  I'd like Eric to look over this patch too because he's more familiar with some of these pieces.  Overall, though, I'm happy with this version.  Thanks for working on the tools.  I'd be more than happy to remove the old rebaseline commands if folks like this one better.

> WebKitTools/Scripts/webkitpy/common/diraszip.py:29
> +from __future__ import with_statement
> +from webkitpy.common.net.remotezip import ZipFileHandle
> +from webkitpy.common.system.filesystem import FileSystem
> +import os
> +import shutil

This is totally a style nit, but we usually import the Python standard library stuff first, then have a blank line, then the webkitpy stuff.

> WebKitTools/Scripts/webkitpy/common/diraszip.py:32
> +class DirectoryAsZip(object):

The file name should match the class name.

> WebKitTools/Scripts/webkitpy/common/net/result.py:34
> +class Result(object):

We really need to make a buildbot folder to put all these buildbot related classes in.  Probably not worth worrying about in this patch, but just a thought.

> WebKitTools/Scripts/webkitpy/common/net/resultset.py:38
> +        self._include_expected = kwargs.get('include_expected', True)

why kwargs and not just include_expected=True ?

> WebKitTools/Scripts/webkitpy/common/net/resultset.py:74
> +        target_type = kwargs.get('target_type', None)
> +        exact_match = kwargs.get('exact_match', False)

Here again kwargs seems like overkill.

> WebKitTools/Scripts/webkitpy/common/net/resultset_unittest.py:111
> +    def testTestExtensionIsIgnored(self):

We usually use unix_hacker_style names for tests, but whatever.

> WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/bucket.py:26
> +class IndentedLogger(object):

This doesn't seem specific to this command.  It should be in common somewhere in case someone else wants to use an indented logger.  ;)

> WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/rebaseline2.py:76
> +            i = options.use_zip_as_archive.index(':')
> +            platform, url = options.use_zip_as_archive[:i], options.use_zip_as_archive[i + 1:]

crazy

> WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/rebaseline2.py:91
> +        del(ports['test'])
> +        del(ports['dryrun'])

This doesn't seem scalable.  Can we add something semantic to this port objects so we can know to disregard them?

> WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/rebaseliner.py:141
> +        for k, v in self.calculate_rebaseline().items():

Can we use more semantic names than k and v here?

Comment 34 Dirk Pranke 2010-12-08 20:05:08 PST

(In reply to comment #33)
> 
> > WebKitTools/Scripts/webkitpy/common/net/resultset_unittest.py:111
> > +    def testTestExtensionIsIgnored(self):
> 
> We usually use unix_hacker_style names for tests, but whatever.
>

More accurately, we try to follow PEP 8's style guidelines, which use underscores instead of mixed case for function and method names. We differ from Chromium (and Google) in this regard.
 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/bucket.py:26
> > +class IndentedLogger(object):
> 
> This doesn't seem specific to this command.  It should be in common somewhere in case someone else wants to use an indented logger.  ;)
> 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/rebaseline2.py:76
> > +            i = options.use_zip_as_archive.index(':')
> > +            platform, url = options.use_zip_as_archive[:i], options.use_zip_as_archive[i + 1:]
> 
> crazy

Can you get this with something like options.use_zip_as_archive.index.split(':'), or might there be more than one colon in the value?

> 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/rebaseline2.py:91
> > +        del(ports['test'])
> > +        del(ports['dryrun'])
> 
> This doesn't seem scalable.  Can we add something semantic to this port objects so we can know to disregard them?
>

This will be fixed as part of bug 47578, which I should be addressing shortly. I wouldn't bother doing anything about it here.

Comment 35 James Kozianski 2010-12-09 13:33:57 PST

(In reply to comment #33)
> (From update of attachment 75872 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=75872&action=review
> 
> This looks much better than the previous version.  Thanks for integrating your work with webkitpy.common.  That's very valuable to the project.  I'd like Eric to look over this patch too because he's more familiar with some of these pieces.  Overall, though, I'm happy with this version.  Thanks for working on the tools.  I'd be more than happy to remove the old rebaseline commands if folks like this one better.
> 
> > WebKitTools/Scripts/webkitpy/common/diraszip.py:29
> > +from __future__ import with_statement
> > +from webkitpy.common.net.remotezip import ZipFileHandle
> > +from webkitpy.common.system.filesystem import FileSystem
> > +import os
> > +import shutil
> 
> This is totally a style nit, but we usually import the Python standard library stuff first, then have a blank line, then the webkitpy stuff.

Done.

> 
> > WebKitTools/Scripts/webkitpy/common/diraszip.py:32
> > +class DirectoryAsZip(object):
> 
> The file name should match the class name.

Renamed class to DirAsZip.

> 
> > WebKitTools/Scripts/webkitpy/common/net/resultset.py:38
> > +        self._include_expected = kwargs.get('include_expected', True)
> 
> why kwargs and not just include_expected=True ?

I think using keyword arguments here improves readability at the call sites, and it will be easier to add more orthogonal arguments later if the need arises.

> 
> > WebKitTools/Scripts/webkitpy/common/net/resultset.py:74
> > +        target_type = kwargs.get('target_type', None)
> > +        exact_match = kwargs.get('exact_match', False)
> 
> Here again kwargs seems like overkill.

As above, I think it makes the call sites nicer, but I'm happy to change it if you feel strongly about it.

> 
> > WebKitTools/Scripts/webkitpy/common/net/resultset_unittest.py:111
> > +    def testTestExtensionIsIgnored(self):
> 
> We usually use unix_hacker_style names for tests, but whatever.

That's fine, it won't be too much effort to fix.

> 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/bucket.py:26
> > +class IndentedLogger(object):
> 
> This doesn't seem specific to this command.  It should be in common somewhere in case someone else wants to use an indented logger.  ;)

Haha, of course :-) I put it in webkitpy/common/indented_logger.py

> 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/rebaseline2.py:76
> > +            i = options.use_zip_as_archive.index(':')
> > +            platform, url = options.use_zip_as_archive[:i], options.use_zip_as_archive[i + 1:]
> 
> crazy
> 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/rebaseline2.py:91
> > +        del(ports['test'])
> > +        del(ports['dryrun'])
> 
> This doesn't seem scalable.  Can we add something semantic to this port objects so we can know to disregard them?
> 
> > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/rebaseliner.py:141
> > +        for k, v in self.calculate_rebaseline().items():
> 
> Can we use more semantic names than k and v here?

Done.

Comment 36 James Kozianski 2010-12-09 13:36:44 PST

(In reply to comment #34)
> (In reply to comment #33)
> > > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/rebaseline2.py:76
> > > +            i = options.use_zip_as_archive.index(':')
> > > +            platform, url = options.use_zip_as_archive[:i], options.use_zip_as_archive[i + 1:]
> > 
> > crazy
> 
> Can you get this with something like options.use_zip_as_archive.index.split(':'), or might there be more than one colon in the value?

It could, depending on what the user names their zips. I remember changing it from .split(':') because I broke something.

> 
> > 
> > > WebKitTools/Scripts/webkitpy/tool/commands/rebaseline2/rebaseline2.py:91
> > > +        del(ports['test'])
> > > +        del(ports['dryrun'])
> > 
> > This doesn't seem scalable.  Can we add something semantic to this port objects so we can know to disregard them?
> >
> 
> This will be fixed as part of bug 47578, which I should be addressing shortly. I wouldn't bother doing anything about it here.

Ok.

Comment 37 James Kozianski 2010-12-09 13:49:35 PST

(In reply to comment #31)
> (In reply to comment #28)
> > 
> > How about TestOutput? That is probably the most literal name we could have, and it is a bit less ambiguous than just Output.
> > 
> 
> I'm good with TestOutput.
> 
> Aplogies for the delay in responding ... I'm not sure why I didn't see your comments until just now.
> 
> Patch looks pretty good otherwise, although I haven't yet stared at it in detail. It's a bit under-commented for my tastes, but so is the rest of WebKit. Can you either rename one of rebaseline.py or rebaseliner.py or at least add some docs at the top to indicate why there are two files and which is which? Maybe you should just combine them into a single file and/or call the class RebaselineCommand instead of Rebaseline2 ?

I called it rebaseline2 to distinguish it from the current rebaseline command (which is plain Rebaseline). I'll add comments to clarify.

Comment 38 James Kozianski 2010-12-09 14:01:09 PST

Created attachment 76120 [details]
Patch

Comment 39 Eric Seidel (no email) 2010-12-09 18:48:58 PST

Comment on attachment 76120 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=76120&action=review

I really feel like we're doing way too much in one patch here.  Maybe you and I should talk in person?  I'm glad that this is no longer re-inventing the wheel, but I'm not sure this is really woven into webkitpy like it should be yet.

> WebKitTools/Scripts/webkitpy/common/diraszip.py:34
> +class DirAsZip(object):

I would have prefered "directory", but this is OK.

> WebKitTools/Scripts/webkitpy/common/diraszip.py:36
> +    def __init__(self, path, fs=FileSystem()):

filesystem=None is much better.

> WebKitTools/Scripts/webkitpy/common/diraszip.py:38
> +        self._fs = fs

_filesystem is more readable than _fs in my mind.

> WebKitTools/Scripts/webkitpy/common/diraszip.py:40
> +        if not self._path.endswith(os.path.sep):
> +            self._path += os.path.sep

If you always use os.join then you don't need this step.

> WebKitTools/Scripts/webkitpy/common/diraszip.py:42
> +    def handle_to(self, filename):

So we just call this open?  PYthon doesn't talk about "handles"  you just pass around file-like-objects.

> WebKitTools/Scripts/webkitpy/common/diraszip.py:51
> +    def namelist(self):
> +        results = []
> +        for (path, _, filenames) in os.walk(self._path):
> +            for filename in filenames:
> +                # We drop the path to the directory from our namelist.
> +                results.append(os.path.join(path, filename)[len(self._path):])
> +        return results

This is a list comprehension. :)

What is the purpose of this?
os.path.join(path, filename)[len(self._path):]

Why do we need to append the path only to remove the _path?  Ar the paths from os.walk not relative to the root?  I thought they were.

return list(itertools.chain([filenames for (_, _, filenames) in os.walk(self._path)])
would have done the same thing, minus the strange path re-writing.

> WebKitTools/Scripts/webkitpy/common/diraszip.py:54
> +        return self._fs.read_text_file(os.path.join(self._path, filename))

Seems you want a _full_path(filename) helper function since you do os.path.join all over the place.

> WebKitTools/Scripts/webkitpy/common/diraszip.py:61
> +        path_to_file = os.path.split(filename)[0]

Here is that funny add-the-path-but-remove-it-again dance.  Seems this should be a function if you really need it.

> WebKitTools/Scripts/webkitpy/common/diraszip.py:67
> +        assert '..' not in filename

What about leading / ?  Does filename simply need to be underneeth _path?  There is an os.path.relpath which can tell you that.. there may be others.  (os.path.relpath is 2.6 only but we have an implementation in ospath.py in common)

> WebKitTools/Scripts/webkitpy/common/net/buildbot.py:93
> +    def latest_build(self):
> +        revision_build_pairs = self.revision_build_pairs_with_results()

We didn't have a latest_build method already?  Seems we could get this from buildbot json.  I'm not sure what revision_pairs_with_results does w/o looking.

> WebKitTools/Scripts/webkitpy/common/net/buildbot.py:231
> +        return ResultSet(self._builder.name(), None, RemoteZip(self.results_zip_url()), include_expected=False)

Naming the None arg would make this more readable.

Comment 40 Eric Seidel (no email) 2010-12-09 18:49:40 PST

Comment on attachment 76120 [details]
Patch

r- for the mentioned nits.  I'm not sure the path forward here.  I think this patch is too big to review as is.

Comment 41 Eric Seidel (no email) 2010-12-09 19:16:29 PST

I'm sure we can land a few smaller pieces of this, like the IndentedLogger first, no?

Comment 42 Eric Seidel (no email) 2010-12-09 19:17:52 PST

Also, I think the "IndentedLogger" is technically a "logging formatter" based on python's definition of logging:
http://docs.python.org/library/logging.html

We're slowly trying to move off of my crappy deprecated_logging.py file which I created before any of us even knew logging.py existed in python. :)

Comment 43 James Kozianski 2010-12-09 21:37:37 PST

(In reply to comment #39)
> (From update of attachment 76120 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=76120&action=review
> 
> I really feel like we're doing way too much in one patch here.  Maybe you and I should talk in person?  I'm glad that this is no longer re-inventing the wheel, but I'm not sure this is really woven into webkitpy like it should be yet.

Apologies for the large patch here. I'd be glad to talk about it but unfortunately I'm in Sydney, so talking in person would be difficult. Perhaps I should try breaking the patch up and we can discuss it as we get through it all? This patch could be decomposed into these parts:

- The random utility classes (DirAsZip, RemoteZip together, IndentedLogger in a separate one?)
- Introduce Result and ResultSet classes (includes changes to Buildbot to support returning ResultSets).
- Introduce new 'webkit-patch rebaseline2' command, which requires Bucket/BucketTree/Rebaseliner.

I'll respond to your comments below in the appropriate patches.

Comment 44 Eric Seidel (no email) 2010-12-09 21:45:17 PST

I would like to say that I'm *very* supportive of bringing our various rebaseline commands into the 21st century.  The original chromium rebaseline architecture was wholy separate from webkitpy and does not work with WebKit.  I wrote one for webkit on top of webkitpy in a quick hack one evening, but it's obviously incomplete.  A unified rebaseline command which leverages webkitpy's infrastructure (notice how little code webkit-patch rebaseline is!) is a fantastic goal!

Comment 45 James Kozianski 2010-12-12 20:53:23 PST

I've split out the first sub-patch at https://bugs.webkit.org/show_bug.cgi?id=50901.

(In reply to comment #39)
> (From update of attachment 76120 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=76120&action=review
> 
> > WebKitTools/Scripts/webkitpy/common/diraszip.py:36
> > +    def __init__(self, path, fs=FileSystem()):
> 
> filesystem=None is much better.

Done.

> 
> > WebKitTools/Scripts/webkitpy/common/diraszip.py:38
> > +        self._fs = fs
> 
> _filesystem is more readable than _fs in my mind.

Done.

> 
> > WebKitTools/Scripts/webkitpy/common/diraszip.py:40
> > +        if not self._path.endswith(os.path.sep):
> > +            self._path += os.path.sep
> 
> If you always use os.join then you don't need this step.

Done.

> 
> > WebKitTools/Scripts/webkitpy/common/diraszip.py:42
> > +    def handle_to(self, filename):
> 
> So we just call this open?  PYthon doesn't talk about "handles"  you just pass around file-like-objects.

Done.

> 
> > WebKitTools/Scripts/webkitpy/common/diraszip.py:51
> > +    def namelist(self):
> > +        results = []
> > +        for (path, _, filenames) in os.walk(self._path):
> > +            for filename in filenames:
> > +                # We drop the path to the directory from our namelist.
> > +                results.append(os.path.join(path, filename)[len(self._path):])
> > +        return results
> 
> This is a list comprehension. :)

Cool! Done.

> 
> What is the purpose of this?
> os.path.join(path, filename)[len(self._path):]
> 
> Why do we need to append the path only to remove the _path?  Ar the paths from os.walk not relative to the root?  I thought they were.

os.walk() returns results relative to the path that it is called from so all results from os.walk(X) will have X as a prefix. We want to pretend that the directory at X is a zip file (in the same vein as http://docs.python.org/library/zipfile.html), so namelist() shouldn't expose X in its results.

> 
> return list(itertools.chain([filenames for (_, _, filenames) in os.walk(self._path)])
> would have done the same thing, minus the strange path re-writing.

That would only provide a list of the bare filenames under the directory and not include the relative paths to them which we need for extracting files.

> 
> > WebKitTools/Scripts/webkitpy/common/diraszip.py:54
> > +        return self._fs.read_text_file(os.path.join(self._path, filename))
> 
> Seems you want a _full_path(filename) helper function since you do os.path.join all over the place.

Done.

> 
> > WebKitTools/Scripts/webkitpy/common/diraszip.py:61
> > +        path_to_file = os.path.split(filename)[0]
> 
> Here is that funny add-the-path-but-remove-it-again dance.  Seems this should be a function if you really need it.

This is actually a different thing, but in looking at the code I realise it is unnecessarily circuitous, so I've made it more concise. The os.path.split() is used to ensure that the directories above the file we want to extract exist at the destination. In Python ZipFile.extract('src/file', 'dest') will copy 'src/file' to 'dest/src/file'.

> 
> > WebKitTools/Scripts/webkitpy/common/diraszip.py:67
> > +        assert '..' not in filename
> 
> What about leading / ?  Does filename simply need to be underneeth _path?  There is an os.path.relpath which can tell you that.. there may be others.  (os.path.relpath is 2.6 only but we have an implementation in ospath.py in common)

Ah, I see - I didn't know os.path.join() interpreted leading slashes in the second argument. I've added an _is_under() method that uses ospath.relpath() and moved the assert to the new _full_path() method.

(In reply to comment #44)
> I would like to say that I'm *very* supportive of bringing our various rebaseline commands into the 21st century.  The original chromium rebaseline architecture was wholy separate from webkitpy and does not work with WebKit.  I wrote one for webkit on top of webkitpy in a quick hack one evening, but it's obviously incomplete.  A unified rebaseline command which leverages webkitpy's infrastructure (notice how little code webkit-patch rebaseline is!) is a fantastic goal!

Thanks for the encouragement! I must admit I didn't expect this script to be as involved as it has been, but I'll be happy if it is a positive contribution :-)

Comment 46 Adam Barth 2010-12-13 01:16:38 PST

> Apologies for the large patch here. I'd be glad to talk about it but unfortunately I'm in Sydney, so talking in person would be difficult. Perhaps I should try breaking the patch up and we can discuss it as we get through it all?

I think that's a great idea.  Can you upload the different pieces in separate bugs that block this bug?  I think we're at the point where we should start landing this patch in pieces.

> This patch could be decomposed into these parts:
> 
> - The random utility classes (DirAsZip, RemoteZip together, IndentedLogger in a separate one?)

Yep.  I'd put DirAsZip and RemoteZip together but put IndentedLogger in a separate patch.  It might seem pedantic to break these things up into smaller pieces, but it's very helpful.

> - Introduce Result and ResultSet classes (includes changes to Buildbot to support returning ResultSets).

Yep.

> - Introduce new 'webkit-patch rebaseline2' command, which requires Bucket/BucketTree/Rebaseliner.

I'd have to look at the details, but that sounds like a reasonable plan.  Don't forget:

- Remove old rebaseline commands.  :)

Comment 47 James Kozianski 2010-12-13 22:10:50 PST

(In reply to comment #46)
> > Apologies for the large patch here. I'd be glad to talk about it but unfortunately I'm in Sydney, so talking in person would be difficult. Perhaps I should try breaking the patch up and we can discuss it as we get through it all?
> 
> I think that's a great idea.  Can you upload the different pieces in separate bugs that block this bug?  I think we're at the point where we should start landing this patch in pieces.

Yep, will do. I've split off the first patch into bug 50901.

> 
> > This patch could be decomposed into these parts:
> > 
> > - The random utility classes (DirAsZip, RemoteZip together, IndentedLogger in a separate one?)
> 
> Yep.  I'd put DirAsZip and RemoteZip together but put IndentedLogger in a separate patch.  It might seem pedantic to break these things up into smaller pieces, but it's very helpful.

Oh, not at all. I can see the wisdom of having smaller patches. My original concern was that landing the overall context and direction of the script would be lost in smaller patches, but I think it's good now.

> 
> > - Introduce Result and ResultSet classes (includes changes to Buildbot to support returning ResultSets).
> 
> Yep.
> 
> > - Introduce new 'webkit-patch rebaseline2' command, which requires Bucket/BucketTree/Rebaseliner.
> 
> I'd have to look at the details, but that sounds like a reasonable plan.  Don't forget:
> 
> - Remove old rebaseline commands.  :)

Yep, will do :-)

Comment 48 Adam Barth 2011-08-06 13:18:46 PDT

This bug hasn't been touched in over six months and we now have a bunch of fancy new rebaselining tools and infrastructure.  Please let me know if there are use cases we haven't addressed.