RESOLVED FIXED 50635
[new-run-webkit-tests] expectations parsing is slow
https://bugs.webkit.org/show_bug.cgi?id=50635
Summary [new-run-webkit-tests] expectations parsing is slow
Philippe Normand
Reported 2010-12-07 09:37:58 PST
In the _read method of the TestExpectationsFile class in test_expectations.py the full list of tests is iterated once for each expectation in _expand_tests(). This could be optimized by checking if the expectation is a test case file or a test category. In the first case the list iteration can be avoided.
Attachments
proposed patch (2.14 KB, patch)
2010-12-07 09:47 PST, Philippe Normand
eric: review-
updated patch (2.16 KB, patch)
2010-12-12 03:36 PST, Philippe Normand
no flags
updated patch (2.15 KB, patch)
2010-12-13 05:55 PST, Philippe Normand
no flags
updated patch (5.51 KB, patch)
2010-12-13 09:45 PST, Philippe Normand
ojan: review-
updated patch (5.58 KB, patch)
2010-12-14 06:55 PST, Philippe Normand
no flags
updated patch (5.58 KB, patch)
2010-12-14 07:04 PST, Philippe Normand
ojan: review+
Philippe Normand
Comment 1 2010-12-07 09:47:49 PST
Created attachment 75825 [details] proposed patch
WebKit Review Bot
Comment 2 2010-12-07 10:21:46 PST
Attachment 75825 [details] did not pass style-queue: Failed to run "[u'git', u'reset', u'--hard', u'refs/remotes/trunk']" exit_code: 128 error: Could not write new index file. fatal: Could not reset index file to revision 'refs/remotes/trunk'. If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Review Bot
Comment 3 2010-12-07 11:22:55 PST
Attachment 75825 [details] did not pass style-queue: Failed to run "[u'git', u'reset', u'--hard', u'refs/remotes/trunk']" exit_code: 128 error: Could not write new index file. fatal: Could not reset index file to revision 'refs/remotes/trunk'. If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Review Bot
Comment 4 2010-12-07 12:24:15 PST
Attachment 75825 [details] did not pass style-queue: Failed to run "[u'git', u'reset', u'--hard', u'refs/remotes/trunk']" exit_code: 128 error: Could not write new index file. fatal: Could not reset index file to revision 'refs/remotes/trunk'. If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Review Bot
Comment 5 2010-12-07 21:42:36 PST
Attachment 75825 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/update-webkit']" exit_code: 2 Updating OpenSource Incomplete data: Delta source ended unexpectedly at /usr/lib/git-core/git-svn line 5061 Died at WebKitTools/Scripts/update-webkit line 132. If any of these errors are false positives, please file a bug against check-webkit-style.
Eric Seidel (no email)
Comment 6 2010-12-12 02:23:09 PST
Comment on attachment 75825 [details] proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=75825&action=review Well, instead this causes us to stat the file system for each path. I guess we're doing so right above, so the inode should be cached... Would this be faster/just as fast if we just compiled a regexp for the path first? How did you perf test this? How much speedup did you see? > WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py:723 > + if not os.path.isfile(path): I guess I would have reversed this if to be positive instead of negative. > WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py:725 > + result = [ test for test in self._full_test_list if test.startswith(path) ] I don't think PEP8 says we put extra spaces around [ ] like this.
Philippe Normand
Comment 7 2010-12-12 02:35:09 PST
(In reply to comment #6) > > How did you perf test this? How much speedup did you see? > This is easy to perf :) Just try new-run-webkit-tests --platform=gtk with and without the patch. We (Alexg and I) profiled this with the python hotshot profiler. Without the patch parsing the expectations takes about 20 seconds here and with the patch less than 2 seconds. > > WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py:723 > > + if not os.path.isfile(path): > > I guess I would have reversed this if to be positive instead of negative. > > > WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py:725 > > + result = [ test for test in self._full_test_list if test.startswith(path) ] > > I don't think PEP8 says we put extra spaces around [ ] like this. OK I can fix those. Thanks for the review
Philippe Normand
Comment 8 2010-12-12 03:36:46 PST
Created attachment 76320 [details] updated patch
WebKit Review Bot
Comment 9 2010-12-12 03:39:25 PST
Attachment 76320 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style', '--diff-files', u'WebKitTools/ChangeLog', u'WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py']" exit_code: 1 WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py:739: missing whitespace after ',' [pep8/E231] [5] Total errors found: 1 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.
Philippe Normand
Comment 10 2010-12-13 05:55:23 PST
Created attachment 76377 [details] updated patch and "style" fixed!
Ojan Vafai
Comment 11 2010-12-13 08:31:32 PST
Comment on attachment 76377 [details] updated patch I assume you ran the unittests for this. If not, please make sure to do so before committing. Also, it would be nice if you could add a test for this case. The closest thing I could find was test_precedence, but that doesn't test the case of a file being listed without it's parent directory also being listed. Minor edge case, but more tests make me feel safer. :)
Philippe Normand
Comment 12 2010-12-13 09:44:00 PST
(In reply to comment #11) > (From update of attachment 76377 [details]) > I assume you ran the unittests for this. If not, please make sure to do so before committing. > Yes I did check that before sending the patch for review. > Also, it would be nice if you could add a test for this case. The closest thing I could find was test_precedence, but that doesn't test the case of a file being listed without it's parent directory also being listed. Minor edge case, but more tests make me feel safer. :) Hum now I figured out my patch should rely on port's filesystem API :) Will send a new patch with a test!
Philippe Normand
Comment 13 2010-12-13 09:45:53 PST
Created attachment 76397 [details] updated patch
Dirk Pranke
Comment 14 2010-12-13 10:02:57 PST
Nice catch! I knew there had to be a reason chromium's parsing was getting quadratically slower but I hadn't had the time to look into it. The change basically looks fine but I would make one minor modification. We already know whether the path points to a directory or a file from the check on line 732, so you can merge the two if blocks together and save yourself an extra os call. Seems like we could probably still speed up the directory/category branch of the loop by switching to a more tree-based storage mechanism for the file list, or by revamping the calling code to do a merge join instead of a nested-loop join but it's probably not worth it for now, so I would leave the code as is since it's a lot more obviously correct. You could consider adding a #FIXME that there's an opportunity for a speed up here. As far as using a regexp goes, I'd be more than a little surprised if a regexp was faster than a string prefix match. -- Dirk
Ojan Vafai
Comment 15 2010-12-13 10:12:15 PST
Comment on attachment 76397 [details] updated patch r- per Dirk's comments. Specifically, making use of the is_dir check above, which will mean we don't need to add isfile to port either.
Eric Seidel (no email)
Comment 16 2010-12-14 01:26:48 PST
Comment on attachment 76377 [details] updated patch Cleared Ojan Vafai's review+ from obsolete attachment 76377 [details] so that this bug does not appear in http://webkit.org/pending-commit.
Philippe Normand
Comment 17 2010-12-14 06:55:00 PST
Created attachment 76532 [details] updated patch
WebKit Review Bot
Comment 18 2010-12-14 06:56:16 PST
Attachment 76532 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style', '--diff-files', u'WebKitTools/ChangeLog', u'WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py', u'WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations_unittest.py', u'WebKitTools/Scripts/webkitpy/layout_tests/port/test.py']" exit_code: 1 WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:222: missing whitespace around operator [pep8/E225] [5] Total errors found: 1 in 4 files If any of these errors are false positives, please file a bug against check-webkit-style.
Philippe Normand
Comment 19 2010-12-14 07:04:13 PST
Created attachment 76533 [details] updated patch
Ojan Vafai
Comment 20 2010-12-14 07:25:20 PST
Comment on attachment 76533 [details] updated patch View in context: https://bugs.webkit.org/attachment.cgi?id=76533&action=review > WebKitTools/Scripts/webkitpy/layout_tests/layout_package/test_expectations.py:742 > + result = [test for test in self._full_test_list if test.startswith(path)] nit: I would make this an early return, then you can drop the "else:" line. > WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:214 > + if path.find('.') != -1: This is fine for now given that this is just for testing, but in the real webkit tree, we have the CSS2.1 directory, which we may want to add tests for in the future. Please add a FIXME to make this work for that case.
Philippe Normand
Comment 21 2010-12-14 09:55:12 PST
Dirk Pranke
Comment 22 2010-12-14 12:22:55 PST
Comment on attachment 76533 [details] updated patch View in context: https://bugs.webkit.org/attachment.cgi?id=76533&action=review >> WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:214 >> + if path.find('.') != -1: > > This is fine for now given that this is just for testing, but in the real webkit tree, we have the CSS2.1 directory, which we may want to add tests for in the future. Please add a FIXME to make this work for that case. In particular, since the list of files and directories is fixed and small in this case, all you are doing is speeding up the check in line 224, but I doubt that produces a noticeable speedup.
Philippe Normand
Comment 23 2010-12-14 13:33:56 PST
(In reply to comment #22) > (From update of attachment 76533 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=76533&action=review > > >> WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:214 > >> + if path.find('.') != -1: > > > > This is fine for now given that this is just for testing, but in the real webkit tree, we have the CSS2.1 directory, which we may want to add tests for in the future. Please add a FIXME to make this work for that case. > > In particular, since the list of files and directories is fixed and small in this case, all you are doing is speeding up the check in line 224, but I doubt that produces a noticeable speedup. I had to make this change to actually make the tests pass... The noticeable speedup is in the real tests expectations parsing, eg. avoiding the expensive iteration over the *BIG* list of tests, when possible. :)
Ojan Vafai
Comment 24 2010-12-14 13:40:14 PST
FWIW, this cut parse expectations time on the chromium linux bot from 9 seconds to 2.
Dirk Pranke
Comment 25 2010-12-14 15:56:48 PST
(In reply to comment #23) > (In reply to comment #22) > > (From update of attachment 76533 [details] [details]) > > View in context: https://bugs.webkit.org/attachment.cgi?id=76533&action=review > > > > >> WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:214 > > >> + if path.find('.') != -1: > > > > > > This is fine for now given that this is just for testing, but in the real webkit tree, we have the CSS2.1 directory, which we may want to add tests for in the future. Please add a FIXME to make this work for that case. > > > > In particular, since the list of files and directories is fixed and small in this case, all you are doing is speeding up the check in line 224, but I doubt that produces a noticeable speedup. > > I had to make this change to actually make the tests pass... > > The noticeable speedup is in the real tests expectations parsing, eg. avoiding the expensive iteration over the *BIG* list of tests, when possible. :) Oh, yeah, obviously there's a big speedup in the real parsing. I'm just surprised that you would need to make a change to test.py for the tests to pass. I'll have to look into that.
Dirk Pranke
Comment 26 2010-12-15 01:08:15 PST
(In reply to comment #25) > > > >> WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:214 > > > >> + if path.find('.') != -1: > > > > > > > > This is fine for now given that this is just for testing, but in the real webkit tree, we have the CSS2.1 directory, which we may want to add tests for in the future. Please add a FIXME to make this work for that case. > > > > > > > I had to make this change to actually make the tests pass... > > > > > I'm just surprised that you would need to make a change to test.py for the tests to pass. I'll have to look into that. I guess I was justifiably surprised. That line actually causes test-webkitpy to fail for me, because my source tree is rooted in a directory with a "." in it (e.g. /src/foo.bar/WebKit/WebKitTools/...). However, if I comment out those two lines, it passes for me. What fails for you without those lines?
Philippe Normand
Comment 27 2010-12-15 12:44:37 PST
(In reply to comment #26) > (In reply to comment #25) > > > > >> WebKitTools/Scripts/webkitpy/layout_tests/port/test.py:214 > > > > >> + if path.find('.') != -1: > > > > > > > > > > This is fine for now given that this is just for testing, but in the real webkit tree, we have the CSS2.1 directory, which we may want to add tests for in the future. Please add a FIXME to make this work for that case. > > > > > > > > > > I had to make this change to actually make the tests pass... > > > > > > > > I'm just surprised that you would need to make a change to test.py for the tests to pass. I'll have to look into that. > > I guess I was justifiably surprised. That line actually causes test-webkitpy to fail for me, because my source tree is rooted in a directory with a "." in it (e.g. /src/foo.bar/WebKit/WebKitTools/...). However, if I comment out those two lines, it passes for me. What fails for you without those lines? Oh I see what happened :) I made that change before doing the early return in the expand_tests code. It was needed in that case because path_isdir() was returning False and the wrong code path was taken, making two test-cases fail. With the early return in expand_tests that dummy code in test.py is no longer needed.
Dirk Pranke
Comment 28 2010-12-15 12:51:29 PST
(In reply to comment #27) > > Oh I see what happened :) I made that change before doing the early return in the expand_tests code. It was needed in that case because path_isdir() was returning False and the wrong code path was taken, making two test-cases fail. > > With the early return in expand_tests that dummy code in test.py is no longer needed. That makes sense. I remember doing something similarly before. Of course, my testing has reviewed that test.py is sensitive to where it is being run from (using a real layout_tests_dir()), which it shouldn't be. So, I'll upload a separate patch that fixes that and removes this line.
Dirk Pranke
Comment 29 2010-12-15 12:52:05 PST
(In reply to comment #28) > (In reply to comment #27) > > > > Oh I see what happened :) I made that change before doing the early return in the expand_tests code. It was needed in that case because path_isdir() was returning False and the wrong code path was taken, making two test-cases fail. > > > > With the early return in expand_tests that dummy code in test.py is no longer needed. > > That makes sense. I remember doing something similarly before. Of course, my testing has reviewed that test.py is sensitive to where it is being run from (using a real layout_tests_dir()), which it shouldn't be. > > So, I'll upload a separate patch that fixes that and removes this line. Oh, and add some tests to make sure that we handle directories with "." in them properly.
Note You need to log in before you can comment on or make changes to this bug.