Bug 50635 - [new-run-webkit-tests] expectations parsing is slow
Summary: [new-run-webkit-tests] expectations parsing is slow
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on: 168304
Blocks: 50681
  Show dependency treegraph
 
Reported: 2010-12-07 09:37 PST by Philippe Normand
Modified: 2017-02-14 05:08 PST (History)
6 users (show)

See Also:


Attachments
proposed patch (2.14 KB, patch)
2010-12-07 09:47 PST, Philippe Normand
eric: review-
Details | Formatted Diff | Diff
updated patch (2.16 KB, patch)
2010-12-12 03:36 PST, Philippe Normand
no flags Details | Formatted Diff | Diff
updated patch (2.15 KB, patch)
2010-12-13 05:55 PST, Philippe Normand
no flags Details | Formatted Diff | Diff
updated patch (5.51 KB, patch)
2010-12-13 09:45 PST, Philippe Normand
ojan: review-
Details | Formatted Diff | Diff
updated patch (5.58 KB, patch)
2010-12-14 06:55 PST, Philippe Normand
no flags Details | Formatted Diff | Diff
updated patch (5.58 KB, patch)
2010-12-14 07:04 PST, Philippe Normand
ojan: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Philippe Normand 2010-12-07 09:37:58 PST
In the _read method of the TestExpectationsFile class in test_expectations.py the full list of tests is iterated once for each expectation in _expand_tests(). This could be optimized by checking if the expectation is a test case file or a test category. In the first case the list iteration can be avoided.
Comment 1 Philippe Normand 2010-12-07 09:47:49 PST
Created attachment 75825 [details]
proposed patch
Comment 2 WebKit Review Bot 2010-12-07 10:21:46 PST
Attachment 75825 [details] did not pass style-queue:

Failed to run "[u'git', u'reset', u'--hard', u'refs/remotes/trunk']" exit_code: 128
error: Could not write new index file.
fatal: Could not reset index file to revision 'refs/remotes/trunk'.


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 WebKit Review Bot 2010-12-07 11:22:55 PST
Attachment 75825 [details] did not pass style-queue:

Failed to run "[u'git', u'reset', u'--hard', u'refs/remotes/trunk']" exit_code: 128
error: Could not write new index file.
fatal: Could not reset index file to revision 'refs/remotes/trunk'.


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 WebKit Review Bot 2010-12-07 12:24:15 PST
Attachment 75825 [details] did not pass style-queue:

Failed to run "[u'git', u'reset', u'--hard', u'refs/remotes/trunk']" exit_code: 128
error: Could not write new index file.
fatal: Could not reset index file to revision 'refs/remotes/trunk'.


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 5 WebKit Review Bot 2010-12-07 21:42:36 PST
Attachment 75825 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/update-webkit']" exit_code: 2
Updating OpenSource
Incomplete data: Delta source ended unexpectedly at /usr/lib/git-core/git-svn line 5061

Died at WebKitTools/Scripts/update-webkit line 132.


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 6 Eric Seidel (no email) 2010-12-12 02:23:09 PST
Comment on attachment 75825 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=75825&action=review

Well, instead this causes us to stat the file system for each path.  I guess we're doing so right above, so the inode should be cached...

Would this be faster/just as fast if we just compiled a regexp for the path first?

How did you perf test this?  How much speedup did you see?

> WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py:723
> +        if not os.path.isfile(path):

I guess I would have reversed this if to be positive instead of negative.

> WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py:725
> +            result = [ test for test in self._full_test_list if test.startswith(path) ]

I don't think PEP8 says we put extra spaces around [ ] like this.
Comment 7 Philippe Normand 2010-12-12 02:35:09 PST
(In reply to comment #6)
> 
> How did you perf test this?  How much speedup did you see?
> 

This is easy to perf :) Just try new-run-webkit-tests --platform=gtk with and without the patch. We (Alexg and I) profiled this with the python hotshot profiler.

Without the patch parsing the expectations takes about 20 seconds here and with the patch less than 2 seconds.

> > WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py:723
> > +        if not os.path.isfile(path):
> 
> I guess I would have reversed this if to be positive instead of negative.
> 
> > WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py:725
> > +            result = [ test for test in self._full_test_list if test.startswith(path) ]
> 
> I don't think PEP8 says we put extra spaces around [ ] like this.

OK I can fix those. Thanks for the review
Comment 8 Philippe Normand 2010-12-12 03:36:46 PST
Created attachment 76320 [details]
updated patch
Comment 9 WebKit Review Bot 2010-12-12 03:39:25 PST
Attachment 76320 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style', '--diff-files', u'WebKitTools/ChangeLog', u'WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py']" exit_code: 1
WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py:739:  missing whitespace after ','  [pep8/E231] [5]
Total errors found: 1 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 10 Philippe Normand 2010-12-13 05:55:23 PST
Created attachment 76377 [details]
updated patch

and "style" fixed!
Comment 11 Ojan Vafai 2010-12-13 08:31:32 PST
Comment on attachment 76377 [details]
updated patch

I assume you ran the unittests for this. If not, please make sure to do so before committing.

Also, it would be nice if you could add a test for this case. The closest thing I could find was test_precedence, but that doesn't test the case of a file being listed without it's parent directory also being listed. Minor edge case, but more tests make me feel safer. :)
Comment 12 Philippe Normand 2010-12-13 09:44:00 PST
(In reply to comment #11)
> (From update of attachment 76377 [details])
> I assume you ran the unittests for this. If not, please make sure to do so before committing.
> 

Yes I did check that before sending the patch for review.

> Also, it would be nice if you could add a test for this case. The closest thing I could find was test_precedence, but that doesn't test the case of a file being listed without it's parent directory also being listed. Minor edge case, but more tests make me feel safer. :)

Hum now I figured out my patch should rely on port's filesystem API :) Will send a new patch with a test!
Comment 13 Philippe Normand 2010-12-13 09:45:53 PST
Created attachment 76397 [details]
updated patch
Comment 14 Dirk Pranke 2010-12-13 10:02:57 PST
Nice catch! I knew there had to be a reason chromium's parsing was getting quadratically slower but I hadn't had the time to look into it.

The change basically looks fine but I would make one minor modification. We already know whether the path points to a directory or a file from the check on line 732, so you can merge the two if blocks together and save yourself an extra os call.

Seems like we could probably still speed up the directory/category branch of the loop by switching to a more tree-based storage mechanism for the file list, or by revamping the calling code to do a merge join instead of a nested-loop join but it's probably not worth it for now, so I would leave the code as is since it's a lot more obviously correct. You could consider adding a #FIXME that there's an opportunity for a speed up here.

As far as using a regexp goes, I'd be more than a little surprised if a regexp was faster than a string prefix match.

-- Dirk
Comment 15 Ojan Vafai 2010-12-13 10:12:15 PST
Comment on attachment 76397 [details]
updated patch

r- per Dirk's comments. Specifically, making use of the is_dir check above, which will mean we don't need to add isfile to port either.
Comment 16 Eric Seidel (no email) 2010-12-14 01:26:48 PST
Comment on attachment 76377 [details]
updated patch

Cleared Ojan Vafai's review+ from obsolete attachment 76377 [details] so that this bug does not appear in http://webkit.org/pending-commit.
Comment 17 Philippe Normand 2010-12-14 06:55:00 PST
Created attachment 76532 [details]
updated patch
Comment 18 WebKit Review Bot 2010-12-14 06:56:16 PST
Attachment 76532 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style', '--diff-files', u'WebKitTools/ChangeLog', u'WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py', u'WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations_unittest.py', u'WebKitTools/Scripts/webkitpy/layout_tests/port/test.py']" exit_code: 1
WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:222:  missing whitespace around operator  [pep8/E225] [5]
Total errors found: 1 in 4 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 19 Philippe Normand 2010-12-14 07:04:13 PST
Created attachment 76533 [details]
updated patch
Comment 20 Ojan Vafai 2010-12-14 07:25:20 PST
Comment on attachment 76533 [details]
updated patch

View in context: https://bugs.webkit.org/attachment.cgi?id=76533&action=review

> WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py:742
> +            result = [test for test in self._full_test_list if test.startswith(path)]

nit: I would make this an early return, then you can drop the "else:" line.

> WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:214
> +        if path.find('.') != -1:

This is fine for now given that this is just for testing, but in the real webkit tree, we have the CSS2.1 directory, which we may want to add tests for in the future. Please add a FIXME to make this work for that case.
Comment 21 Philippe Normand 2010-12-14 09:55:12 PST
Committed r74036: <http://trac.webkit.org/changeset/74036>
Comment 22 Dirk Pranke 2010-12-14 12:22:55 PST
Comment on attachment 76533 [details]
updated patch

View in context: https://bugs.webkit.org/attachment.cgi?id=76533&action=review

>> WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:214
>> +        if path.find('.') != -1:
> 
> This is fine for now given that this is just for testing, but in the real webkit tree, we have the CSS2.1 directory, which we may want to add tests for in the future. Please add a FIXME to make this work for that case.

In particular, since the list of files and directories is fixed and small in this case, all you are doing is speeding up the check in line 224, but I doubt that produces a noticeable speedup.
Comment 23 Philippe Normand 2010-12-14 13:33:56 PST
(In reply to comment #22)
> (From update of attachment 76533 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=76533&action=review
> 
> >> WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:214
> >> +        if path.find('.') != -1:
> > 
> > This is fine for now given that this is just for testing, but in the real webkit tree, we have the CSS2.1 directory, which we may want to add tests for in the future. Please add a FIXME to make this work for that case.
> 
> In particular, since the list of files and directories is fixed and small in this case, all you are doing is speeding up the check in line 224, but I doubt that produces a noticeable speedup.

I had to make this change to actually make the tests pass...

The noticeable speedup is in the real tests expectations parsing, eg. avoiding the expensive iteration over the *BIG* list of tests, when possible. :)
Comment 24 Ojan Vafai 2010-12-14 13:40:14 PST
FWIW, this cut parse expectations time on the chromium linux bot from 9 seconds to 2.
Comment 25 Dirk Pranke 2010-12-14 15:56:48 PST
(In reply to comment #23)
> (In reply to comment #22)
> > (From update of attachment 76533 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=76533&action=review
> > 
> > >> WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:214
> > >> +        if path.find('.') != -1:
> > > 
> > > This is fine for now given that this is just for testing, but in the real webkit tree, we have the CSS2.1 directory, which we may want to add tests for in the future. Please add a FIXME to make this work for that case.
> > 
> > In particular, since the list of files and directories is fixed and small in this case, all you are doing is speeding up the check in line 224, but I doubt that produces a noticeable speedup.
> 
> I had to make this change to actually make the tests pass...
> 
> The noticeable speedup is in the real tests expectations parsing, eg. avoiding the expensive iteration over the *BIG* list of tests, when possible. :)

Oh, yeah, obviously there's a big speedup in the real parsing. I'm just surprised that you would need to make a change to test.py for the tests to pass. I'll have to look into that.
Comment 26 Dirk Pranke 2010-12-15 01:08:15 PST
(In reply to comment #25)
> > > >> WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:214
> > > >> +        if path.find('.') != -1:
> > > > 
> > > > This is fine for now given that this is just for testing, but in the real webkit tree, we have the CSS2.1 directory, which we may want to add tests for in the future. Please add a FIXME to make this work for that case.
> > > 
> > 
> > I had to make this change to actually make the tests pass...
> > 
> > 
> I'm just surprised that you would need to make a change to test.py for the tests to pass. I'll have to look into that.

I guess I was justifiably surprised. That line actually causes test-webkitpy to fail for me, because my source tree is rooted in a directory with a "." in it (e.g. /src/foo.bar/WebKit/WebKitTools/...). However, if I comment out those two lines, it passes for me. What fails for you without those lines?
Comment 27 Philippe Normand 2010-12-15 12:44:37 PST
(In reply to comment #26)
> (In reply to comment #25)
> > > > >> WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:214
> > > > >> +        if path.find('.') != -1:
> > > > > 
> > > > > This is fine for now given that this is just for testing, but in the real webkit tree, we have the CSS2.1 directory, which we may want to add tests for in the future. Please add a FIXME to make this work for that case.
> > > > 
> > > 
> > > I had to make this change to actually make the tests pass...
> > > 
> > > 
> > I'm just surprised that you would need to make a change to test.py for the tests to pass. I'll have to look into that.
> 
> I guess I was justifiably surprised. That line actually causes test-webkitpy to fail for me, because my source tree is rooted in a directory with a "." in it (e.g. /src/foo.bar/WebKit/WebKitTools/...). However, if I comment out those two lines, it passes for me. What fails for you without those lines?

Oh I see what happened :) I made that change before doing the early return in the expand_tests code. It was needed in that case because path_isdir() was returning False and the wrong code path was taken, making two test-cases fail.

With the early return in expand_tests that dummy code in test.py is no longer needed.
Comment 28 Dirk Pranke 2010-12-15 12:51:29 PST
(In reply to comment #27)
> 
> Oh I see what happened :) I made that change before doing the early return in the expand_tests code. It was needed in that case because path_isdir() was returning False and the wrong code path was taken, making two test-cases fail.
> 
> With the early return in expand_tests that dummy code in test.py is no longer needed.

That makes sense. I remember doing something similarly before. Of course, my testing has reviewed that test.py is sensitive to where it is being run from (using a real layout_tests_dir()), which it shouldn't be.

So, I'll upload a separate patch that fixes that and removes this line.
Comment 29 Dirk Pranke 2010-12-15 12:52:05 PST
(In reply to comment #28)
> (In reply to comment #27)
> > 
> > Oh I see what happened :) I made that change before doing the early return in the expand_tests code. It was needed in that case because path_isdir() was returning False and the wrong code path was taken, making two test-cases fail.
> > 
> > With the early return in expand_tests that dummy code in test.py is no longer needed.
> 
> That makes sense. I remember doing something similarly before. Of course, my testing has reviewed that test.py is sensitive to where it is being run from (using a real layout_tests_dir()), which it shouldn't be.
> 
> So, I'll upload a separate patch that fixes that and removes this line.

Oh, and add some tests to make sure that we handle directories with "." in them properly.