37630 – delete redundant test outputs

Evan Martin

Reported 2010-04-14 18:49:22 PDT

delete redundant test outputs

Evan Martin

Comment 1 2010-04-14 18:51:03 PDT

Created attachment 53398 [details] Patch

Alexey Proskuryakov

Comment 2 2010-04-16 11:37:33 PDT

Comment on attachment 53398 [details] Patch It might be slightly clearer if the script was named deduplicate-webkit-tests.

Evan Martin

Comment 3 2010-04-16 12:54:12 PDT

I can rename it. BTW, this found 1300 redundant files in the current tree.

Eric Seidel (no email)

Comment 4 2010-04-16 13:26:18 PDT

Comment on attachment 53398 [details] Patch Tricky stuff. No clue what this does: + _, _, hash = attrs.split(' ') Does that ignore the first two split values?

Evan Martin

Comment 5 2010-04-16 13:31:04 PDT

(In reply to comment #4) > (From update of attachment 53398 [details]) > Tricky stuff. > > No clue what this does: > + _, _, hash = attrs.split(' ') > > Does that ignore the first two split values? Yes. Two features you might not be aware of: 1) foo, bar = x.split(' ') unpacks into two variables. 2) _ = blah() "_" is the name of the variable meaning "I don't want this variable". That style is common for self-documenting the following: - throw an exception if the thing doesn't split into three components exactly - I don't care about the first two It probably could be aided by including in a comment what kind of input it's parsing. (The git output is designed to be consumed by scripts in this way.)

Adam Barth

Comment 6 2010-04-22 15:54:03 PDT

Comment on attachment 53398 [details] Patch I'm sorry, but we're not accepting any new python code that doesn't have unit tests.

Marcus Bulach

Comment 7 2010-08-09 05:01:10 PDT

Hi Evan, I was working on something pretty similar https://bugs.webkit.org/show_bug.cgi?id=40792 Are you still working on this? I think you can either get some of the tests I wrote in the bug above and go ahead here, or I could take your get_port_fallbacks pieces.. :) Either way is fine by me, which one would you prefer?

Marcus Bulach

Comment 8 2010-08-09 06:55:34 PDT

(In reply to comment #7) > Hi Evan, > > I was working on something pretty similar https://bugs.webkit.org/show_bug.cgi?id=40792 > > Are you still working on this? I think you can either get some of the tests I wrote in the bug above and go ahead here, or I could take your get_port_fallbacks pieces.. :) > Either way is fine by me, which one would you prefer? on a second read: I think your patch using git ls-tree is much simpler than mine (which used python's hashlib.sha1()), so if you're ok, I'd be happy to write some tests to get this patch (rather than mine) in!

Evan Martin

Comment 9 2010-08-09 11:31:41 PDT

I would love it if you finished either this patch or another one! Please be my guest and use whichever code seems appropriate to you.

Marcus Bulach

Comment 10 2010-08-09 11:38:45 PDT

(In reply to comment #9) > I would love it if you finished either this patch or another one! Please be my guest and use whichever code seems appropriate to you. great, thanks Evan! I'll reuse your code and upload a new patch with tests here soon.

Marcus Bulach

Comment 11 2010-08-10 02:22:39 PDT

Created attachment 63990 [details] Patch

Marcus Bulach

Comment 12 2010-08-10 02:27:52 PDT

Thanks Evan! I took over your patch, did some minor changes (using log, optparse, added a glob for filtering file names, etc..), and added tests. I'm also cc:ing everybody that was on https://bugs.webkit.org/show_bug.cgi?id=40792, can you all please take a look?

Marcus Bulach

Comment 13 2010-08-10 02:30:03 PDT

*** Bug 40792 has been marked as a duplicate of this bug. ***

Tony Chang

Comment 14 2010-08-10 09:58:03 PDT

Comment on attachment 63990 [details] Patch > diff --git a/WebKitTools/ChangeLog b/WebKitTools/ChangeLog > + * Scripts/deduplicate_tests.py: Added. > + * Scripts/deduplicate_tests_unittest.py: Added. There are no other unittest.py files or files with _ in the filename in Scripts and there's only one file that ends in .py. Maybe it would be better to put these files in Scripts/webkitpy somewhere and have a wrapper file (e.g., duplicate-tests) in Scripts that just imports the files from webkitpy. I think the unittest needs to be in webkitpy anyway for Scripts/test-webkitpy to find it. > diff --git a/WebKitTools/Scripts/deduplicate_tests.py b/WebKitTools/Scripts/deduplicate_tests.py > +# 2. Redistributions in binary form must reproduce the above copyright > +# notice, this list of conditions and the following disclaimer in the > +# documentation and/or other materials provided with the distribution. The third clause of the license seems to be missing. There seems to be some inconsistency among the existing files in Scripts, but that's a separate issue. I would use something like what's in webkit-tools-completion.sh. > + # Fill in the map. > + cmd = ['git', 'ls-tree', '-r', 'HEAD', 'LayoutTests'] Nit: () instead of [] since cmd doesn't change. The unittests look great.

Marcus Bulach

Comment 15 2010-08-10 10:32:30 PDT

Created attachment 64026 [details] Patch

Marcus Bulach

Comment 16 2010-08-10 10:35:59 PDT

thanks Tony! replies inline, new patch uploaded: (In reply to comment #14) > (From update of attachment 63990 [details]) > > diff --git a/WebKitTools/ChangeLog b/WebKitTools/ChangeLog > > + * Scripts/deduplicate_tests.py: Added. > > + * Scripts/deduplicate_tests_unittest.py: Added. > > There are no other unittest.py files or files with _ in the filename in Scripts and there's only one file that ends in .py. Maybe it would be better to put these files in Scripts/webkitpy somewhere and have a wrapper file (e.g., duplicate-tests) in Scripts that just imports the files from webkitpy. I think the unittest needs to be in webkitpy anyway for Scripts/test-webkitpy to find it. ahn, right! I was running the tests manually, didn't know about test-webkitpy. it's now much clearer with the wrapper in Scripts and the module / unittest in webkitpy, thanks! > > > > diff --git a/WebKitTools/Scripts/deduplicate_tests.py b/WebKitTools/Scripts/deduplicate_tests.py > > +# 2. Redistributions in binary form must reproduce the above copyright > > +# notice, this list of conditions and the following disclaimer in the > > +# documentation and/or other materials provided with the distribution. > > The third clause of the license seems to be missing. There seems to be some inconsistency among the existing files in Scripts, but that's a separate issue. I would use something like what's in webkit-tools-completion.sh. hehe, I'm confused. in this bug I've been told to use the two clause version.. ;) https://bugs.webkit.org/show_bug.cgi?id=40250#c5 do you know what's the real latest? :) > > > + # Fill in the map. > > + cmd = ['git', 'ls-tree', '-r', 'HEAD', 'LayoutTests'] > > Nit: () instead of [] since cmd doesn't change. done. > > The unittests look great. thanks! another look, please?

Tony Chang

Comment 17 2010-08-10 10:41:20 PDT

Comment on attachment 64026 [details] Patch Jeremy probably knows the license situation better than me.

Eric Seidel (no email)

Comment 18 2010-08-10 11:09:35 PDT

Comment on attachment 64026 [details] Patch WebKitTools/ChangeLog:1 + 2010-08-10 Marcus Bulach <bulach@chromium.org> Should be a new line between the author line and the reviewed by line. WebKitTools/Scripts/deduplicate-tests:65 + if options.pretty_print: Is this some sort of verbose mode? WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:47 + def get_port_fallbacks(): I dont' think we usually use get_* in python code. Check PEP8. We certainly don't use get* in our c++ webkit code. WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:58 + _log.error("'%s' lacks baseline_search_path()" % port_name) Do we ever hit this? Wouldn't that be a bug in the port implementation? I think it's OK to be robust here, but we should be more explicit that the port needs fixing. WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:59 + port_fallbacks[port_name] = ['base'] base? Is that a magic string here? Seems we might want a constant for that string instead of repeating it. WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:79 + cmd = ('git', 'ls-tree', '-r', 'HEAD', 'LayoutTests') I feel like the running of ls tree and returning of the split lines should be a separate funciton. WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:98 + _, _, hash = attrs.split(' ') This is a cool pattern, I wish we had understood it better when writing the rest of webkitpy. WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:89 + for line in git.split('\n'): This loop feels like it wants to call some helper function to proces each line. WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:113 + if len(cluster) < 2: This per-test block feels like a function. :) Mostly because I was initially confused as to how you were avoiding deduping the empty file. But it now seems more clear to me that your'e not looping on hashes, but rather test paths (which is important). WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:116 + # Compute the list of platforms we have this particular hash for. Isn't this a function? platforms_with_test_matching_hash()? WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:139 + for test, platform, fallback, path in find_dups(hashes, port_fallbacks): If this was a function, then deduplicate just becomes a list comprehension. WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests_unittest.py:32 + class MockExecutive(object): Doesn't Mock.py already do this for you? We include it in webkitpy/third_party iirc. It's not great, but it might be easier than rolling your own. WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests_unittest.py:57 + '100644 blob 5053240b3353f6eb39f7cb00259785f16d121df2\tLayoutTests/mac/foo-expected.txt\n' Would be nice to have a parsing function split out from the logic so we could test that independently. WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests_unittest.py:64 + self.assertEquals(('git', 'ls-tree', '-r', 'HEAD', 'LayoutTests'), MockExecutive.last_run_command[-1]) I see. Yeah, we don't have quite so slick an interface for this in our existing mocks. If you look at mocktool.py you'll find a MockExecutive I think. But the way we've been asserting is using OutputCapture. Your solution is more pythony, I think. We may wish to move some variant of your MockExecutive into a shared file at some point. WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests_unittest.py:113 + if __name__ == '__main__': I don't think we need this. Did you verify that test-webkitpy will automagically find and run your unit tests? I would expect it to. Overall I think this ia a FANTASTIC patch. I really can't thank you and evan enough for taking this on. I would like to see one more round of this. Don't feel obliged to answer each of my nits above, but you should at least thumb through them. I think this patch is OK to land as is, but it would be nice to at least fix the ChangeLog and consider the nits listed.

Eric Seidel (no email)

Comment 19 2010-08-10 11:10:36 PDT

Oh, and my understanding is that Google uses the 2-clause BSD for all new webkit contributions. I'm not sure what Apple's standard template looks like.

Marcus Bulach

Comment 20 2010-08-10 12:43:55 PDT

Created attachment 64040 [details] Patch

Marcus Bulach

Comment 21 2010-08-10 12:51:27 PDT

thanks Eric! I think I addressed most of your comments, would you mind taking another look please? (In reply to comment #18) > (From update of attachment 64026 [details]) > WebKitTools/ChangeLog:1 > + 2010-08-10 Marcus Bulach <bulach@chromium.org> > Should be a new line between the author line and the reviewed by line. Done > > WebKitTools/Scripts/deduplicate-tests:65 > + if options.pretty_print: > Is this some sort of verbose mode? yep, changed to verbose. > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:47 > + def get_port_fallbacks(): > I dont' think we usually use get_* in python code. Check PEP8. We certainly don't use get* in our c++ webkit code. done. > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:58 > + _log.error("'%s' lacks baseline_search_path()" % port_name) > Do we ever hit this? Wouldn't that be a bug in the port implementation? I think it's OK to be robust here, but we should be more explicit that the port needs fixing. Clarified the message. > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:59 > + port_fallbacks[port_name] = ['base'] > base? Is that a magic string here? > > Seems we might want a constant for that string instead of repeating it. Added as constant. > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:79 > + cmd = ('git', 'ls-tree', '-r', 'HEAD', 'LayoutTests') > I feel like the running of ls tree and returning of the split lines should be a separate funciton. agreed! split. > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:98 > + _, _, hash = attrs.split(' ') > This is a cool pattern, I wish we had understood it better when writing the rest of webkitpy. :) > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:89 > + for line in git.split('\n'): > This loop feels like it wants to call some helper function to proces each line. done. > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:113 > + if len(cluster) < 2: > This per-test block feels like a function. :) Mostly because I was initially confused as to how you were avoiding deduping the empty file. But it now seems more clear to me that your'e not looping on hashes, but rather test paths (which is important). I split this up, let me know if it's any clearer. > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:116 > + # Compute the list of platforms we have this particular hash for. > Isn't this a function? platforms_with_test_matching_hash()? I named "extract_platforms", hope it's ok? > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:139 > + for test, platform, fallback, path in find_dups(hashes, port_fallbacks): > If this was a function, then deduplicate just becomes a list comprehension. yay, more python-foo! :) > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests_unittest.py:32 > + class MockExecutive(object): > Doesn't Mock.py already do this for you? We include it in webkitpy/third_party iirc. It's not great, but it might be easier than rolling your own. well, I'm not doing anything particularly complicated, so I think it's simpler to keep it? > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests_unittest.py:57 > + '100644 blob 5053240b3353f6eb39f7cb00259785f16d121df2\tLayoutTests/mac/foo-expected.txt\n' > Would be nice to have a parsing function split out from the logic so we could test that independently. done. > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests_unittest.py:64 > + self.assertEquals(('git', 'ls-tree', '-r', 'HEAD', 'LayoutTests'), MockExecutive.last_run_command[-1]) > I see. Yeah, we don't have quite so slick an interface for this in our existing mocks. If you look at mocktool.py you'll find a MockExecutive I think. But the way we've been asserting is using OutputCapture. Your solution is more pythony, I think. > > We may wish to move some variant of your MockExecutive into a shared file at some point. I'm happy to move out at any point.. > > WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests_unittest.py:113 > + if __name__ == '__main__': > I don't think we need this. Did you verify that test-webkitpy will automagically find and run your unit tests? I would expect it to. right, it does find. removed it. > > Overall I think this ia a FANTASTIC patch. I really can't thank you and evan enough for taking this on. all credits to Evan, my changes were mostly cosmetic. :) > > I would like to see one more round of this. Don't feel obliged to answer each of my nits above, but you should at least thumb through them. I think this patch is OK to land as is, but it would be nice to at least fix the ChangeLog and consider the nits listed. I think I addressed all of them. another look please?

Eric Seidel (no email)

Comment 22 2010-08-11 09:43:05 PDT

Comment on attachment 64040 [details] Patch WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:53 + back on. All platforms fall back on 'base'. Seems part of this comment belongs next to _BASE_PLATFORM, but ok. WebKitTools/Scripts/webkitpy/layout_tests/deduplicate_tests.py:68 + def parse_git_output(git_output, glob_pattern): I would have named this parse_git_ls_tree_output Looks great though. Thanks for taking this on. We probably should have done more of this as a class, rather than free functions. Eventually it will probably get moved deeper into webkitpy. But I think this is a good starting point.

WebKit Commit Bot

Comment 23 2010-08-11 11:31:27 PDT

Comment on attachment 64040 [details] Patch Clearing flags on attachment: 64040 Committed r65166: <http://trac.webkit.org/changeset/65166>

WebKit Commit Bot

Comment 24 2010-08-11 11:31:34 PDT

All reviewed patches have been landed. Closing bug.

WebKit Review Bot

Comment 25 2010-08-11 11:49:39 PDT

http://trac.webkit.org/changeset/65166 might have broken Chromium Win Release The following changes are on the blame list: http://trac.webkit.org/changeset/65166 http://trac.webkit.org/changeset/65167

Dirk Pranke

Comment 26 2010-08-12 18:26:52 PDT

It looks like this code assumes you're running in a git checkout with the current working directory == the top of the tree, and it seems to gork test-webkitpy so that the rest of the test suite doesn't work right. Can you please fix this?

Eric Seidel (no email)

Comment 27 2010-08-12 18:38:14 PDT

scm.py knows how to handle running in those cases. Perhaps this should use more of webkitpy code to take care of that for you. You can use detect_scm_system to find the checkout_root. http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/common/checkout/scm.py#L41 We've long talked about making it easier to write non-webkit-patch tools which use webkitpy. We just have a couple bugs to fix to make that possible. (Mostly abstracting the data storage out of WebKitPatch into a separate Tool class that can be set as self.tool on Command instances instead of WebKitPatch.)

Dirk Pranke

Comment 28 2010-08-12 20:08:09 PDT

(In reply to comment #27) > scm.py knows how to handle running in those cases. Perhaps this should use more of webkitpy code to take care of that for you. > > You can use detect_scm_system to find the checkout_root. > > http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/common/checkout/scm.py#L41 > > We've long talked about making it easier to write non-webkit-patch tools which use webkitpy. We just have a couple bugs to fix to make that possible. (Mostly abstracting the data storage out of WebKitPatch into a separate Tool class that can be set as self.tool on Command instances instead of WebKitPatch.) detect_scm_system() takes a path as input, which doesn't help if you don't know what path you want to use. It would be nice if it had a way to bootstrap itself better, like we do in webkitpy/layout_tests/port/base.py: if not self._webkit_base_dir: abspath = os.path.abspath(__file__) self._webkit_base_dir = abspath[0:abspath.find('WebKitTools')] _log.debug("Using WebKit root: %s" % self._webkit_base_dir) You could do something like this if path was None.

Eric Seidel (no email)

Comment 29 2010-08-12 20:22:51 PDT

detect_scm_system knows how to crawl up until it finds the root. I think you want something like this: http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/tool/main.py#L80 Which is why that code should be shared on some object other than WebKitPatch's main class.

Dirk Pranke

Comment 30 2010-08-13 13:21:34 PDT

(In reply to comment #29) > detect_scm_system knows how to crawl up until it finds the root. > > I think you want something like this: > http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/tool/main.py#L80 > > Which is why that code should be shared on some object other than WebKitPatch's main class. Well, sort of. That code finds the root given $CWD, which is appropriate in some cases. The code in port.base finds the root given the path to port/base.py, which means that it'll work even if you're not actually in the tree; it uses the root of the tree containing the code you're running. I'm not sure if that would be appropriate for scm-related stuff, but it seems fine for test-related stuff.

Eric Seidel (no email)

Comment 31 2010-08-13 14:09:17 PDT

The code I linked to tries to use the CWD, if its not a webkit checkout, then it falls back to being __file__ relative.

Dirk Pranke

Comment 32 2010-08-13 14:18:57 PDT

(In reply to comment #31) > The code I linked to tries to use the CWD, if its not a webkit checkout, then it falls back to being __file__ relative. Oh, i see. Interesting; I didn't realize that sys.path[0] was defined to be the script directory. That's useful. So, in this case, if I have a webkit-patch from one checkout in my path, and I'm in another checkout, it will use the checkout I'm in, not the checkout the code is running from. Probably the right thing to do in the scm case; possibly the wrong thing to do in the test case. Perhaps in either case you might want to prompt/warn that you're mixing checkouts. But this is a corner condition to be sure.

Dirk Pranke

Comment 33 2010-08-13 17:48:48 PDT

Another issue is that deduplicate_tests.py:44 appears to change the default configuration of the logger, which causes a whole bunch of [ERROR] messages to start showing up when you run test-webkitpy (although the tests succeed). I'll post a separate patch to address these two issues and we can continue the discussion of what to do about this there.

Dirk Pranke

Comment 34 2010-08-13 19:10:03 PDT

(In reply to comment #33) > I'll post a separate patch to address these two issues and we can continue the discussion of what to do about this there. okay, follow up in bug 44001.

Attachments
Patch (6.34 KB, patch) 2010-04-14 18:51 PDT, Evan Martin	abarth: review-	Details Formatted Diff Diff
Patch (14.40 KB, patch) 2010-08-10 02:22 PDT, Marcus Bulach	no flags	Details Formatted Diff Diff
Patch (16.57 KB, patch) 2010-08-10 10:32 PDT, Marcus Bulach	no flags	Details Formatted Diff Diff
Patch (19.86 KB, patch) 2010-08-10 12:43 PDT, Marcus Bulach	no flags	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.