RESOLVED FIXED 186443
Test262-Runner: Improve files queue to optimize CPU usage/balancing
https://bugs.webkit.org/show_bug.cgi?id=186443
Leo Balter
Reported 2018-06-08 13:44:21 PDT
Test262-Runner: Improve files queue to optimize CPU usage/balancing
Attachments
Patch (6.16 KB, patch)
2018-06-08 13:50 PDT, Leo Balter
no flags
Patch (6.10 KB, patch)
2018-06-08 14:10 PDT, Leo Balter
no flags
Patch (6.23 KB, patch)
2018-06-12 15:15 PDT, Leo Balter
no flags
Patch (5.98 KB, patch)
2018-06-12 15:16 PDT, Leo Balter
no flags
Patch (5.92 KB, patch)
2018-06-12 16:10 PDT, Leo Balter
no flags
Archive of layout-test-results from ews202 for win-future (12.90 MB, application/zip)
2018-06-13 04:43 PDT, EWS Watchlist
no flags
Patch (6.60 KB, patch)
2018-06-13 12:25 PDT, Leo Balter
no flags
Archive of layout-test-results from ews201 for win-future (12.76 MB, application/zip)
2018-06-13 15:36 PDT, EWS Watchlist
no flags
Patch (6.62 KB, patch)
2018-06-14 10:58 PDT, Leo Balter
no flags
Patch (6.51 KB, patch)
2018-06-14 18:47 PDT, Leo Balter
no flags
Archive of layout-test-results from ltilve-gtk-wk2-ews for gtk-wk2 (2.88 MB, application/zip)
2018-06-18 07:30 PDT, Igalia-pontevedra EWS
no flags
Patch (6.57 KB, patch)
2018-06-18 10:57 PDT, Leo Balter
no flags
Leo Balter
Comment 1 2018-06-08 13:50:22 PDT
Leo Balter
Comment 2 2018-06-08 14:10:18 PDT
Filip Pizlo
Comment 3 2018-06-08 15:11:32 PDT
Comment on attachment 342324 [details]
Patch

Why is it necessary to only do for Kim when the number of files is large?

Why is there chunking?

Would it be possible to start nproc = ncpu + 1, and then have a balancer in proc0 that, if asked to do so via a request on a pipe, will bend a test index. All other processes ask proc0 for a test after each test they run. Running a test requires starting a VM, right? So that’s way more expensive than a packet of data on a pipe (or socketpair).

That’s what I meant by load balancer. I’m not sure exactly what this patch achieves - perhaps an improvement in some specific case of load balancing. Maybe you could quote the speedup in the changelog?
Filip Pizlo
Comment 4 2018-06-08 15:12:23 PDT
(In reply to Filip Pizlo from comment #3)
> Comment on attachment 342324 [details]
> Patch
>
> Why is it necessary to only do for Kim when the number of files is large?

Lol I meant forking, not for Kim.

>
> Why is there chunking?
>
> Would it be possible to start nproc = ncpu + 1, and then have a balancer in
> proc0 that, if asked to do so via a request on a pipe, will bend a test
> index. All other processes ask proc0 for a test after each test they run.
> Running a test requires starting a VM, right? So that’s way more expensive
> than a packet of data on a pipe (or socketpair).
>
> That’s what I meant by load balancer. I’m not sure exactly what this patch
> achieves - perhaps an improvement in some specific case of load balancing.
> Maybe you could quote the speedup in the changelog?
Filip Pizlo
Comment 5 2018-06-08 15:13:18 PDT
(In reply to Filip Pizlo from comment #3)
> Comment on attachment 342324 [details]
> Patch
>
> Why is it necessary to only do for Kim when the number of files is large?
>
> Why is there chunking?
>
> Would it be possible to start nproc = ncpu + 1, and then have a balancer in
> proc0 that, if asked to do so via a request on a pipe, will bend a test

Vend a test.

> index. All other processes ask proc0 for a test after each test they run.
> Running a test requires starting a VM, right? So that’s way more expensive
> than a packet of data on a pipe (or socketpair).
>
> That’s what I meant by load balancer. I’m not sure exactly what this patch
> achieves - perhaps an improvement in some specific case of load balancing.
> Maybe you could quote the speedup in the changelog?
Leo Balter
Comment 6 2018-06-12 09:55:47 PDT
I spent a very long time trying to get pipe/socketpair to work here. I also relied on some help from friends such as Rick Waldron and even Perl committers.

Turns out, there is no way to create a queue where we can communicate from a single parent to many children properly. Perl will block on the first child, resulting in 1:1 fetching. I have a non-working example here: https://gist.github.com/leobalter/779cbaa8a4da1e7a0148c475ab822d65

I also tried parallel queues, waiting for the children to signal for more items, but I had no luck there either. With the current resources I end up either blocking the parent itself or closing the sockets earlier than I want. Another example here: https://gist.github.com/leobalter/d5c2817af13cfbe5e59b9ec958071352

I'm avoiding relying on threads, as I'm guessing the targets won't have Perl builds compiled with proper thread support.

I still want to fix this, but at this point I'm way out of hope. Let me know how I should proceed from here.
Leo Balter
Comment 7 2018-06-12 15:15:44 PDT
Leo Balter
Comment 8 2018-06-12 15:16:55 PDT
Leo Balter
Comment 9 2018-06-12 15:19:33 PDT
OK, the latest patch should handle the queue manager as desired. There are some uncommon loops in the processes to avoid blocking states in the parent and the child processes. I have yet to check the final timing results.
Leo Balter
Comment 10 2018-06-12 15:38:35 PDT
I broke something else... I'll need to fix it in the next patch.
Leo Balter
Comment 11 2018-06-12 16:10:19 PDT
Leo Balter
Comment 12 2018-06-12 16:11:17 PDT
Ok, final fix is done.
EWS Watchlist
Comment 13 2018-06-13 04:43:34 PDT
Comment on attachment 342607 [details]
Patch

Attachment 342607 [details] did not pass win-ews (win):
Output: http://webkit-queues.webkit.org/results/8161866

New failing tests:
http/tests/security/canvas-remote-read-remote-video-localhost.html
EWS Watchlist
Comment 14 2018-06-13 04:43:45 PDT
Created attachment 342646 [details]
Archive of layout-test-results from ews202 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews202 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Michael Saboff
Comment 15 2018-06-13 07:51:50 PDT
Comment on attachment 342607 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=342607&action=review

r-

See my update to 1.pm which uses IO::Select at https://gist.github.com/msaboff/8540538562a411f3b2b6b9029aa23cb2.js

> Tools/Scripts/test262/Runner.pm:358
> +        foreach my $child (@children) {
> +            if ($child->getline) {

Instead of busy polling, this code should use IO::Select.
Michael Saboff
Comment 16 2018-06-13 07:52:48 PDT
(In reply to Michael Saboff from comment #15)
> Comment on attachment 342607 [details]
> Patch
>
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=342607&action=review
>
> r-
>
> See my update to 1.pm which uses IO::Select at
> https://gist.github.com/msaboff/8540538562a411f3b2b6b9029aa23cb2.js
>
> > Tools/Scripts/test262/Runner.pm:358
> > +        foreach my $child (@children) {
> > +            if ($child->getline) {
>
> Instead of busy polling, this code should use IO::Select.

This link works better - https://gist.github.com/msaboff/8540538562a411f3b2b6b9029aa23cb2
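For readers unfamiliar with the pattern being requested, the following is a rough Python sketch of a parent that vends one item per request and sleeps in select until some child asks, instead of looping over the children and polling each one. Python's `select.select` stands in for Perl's IO::Select here. The function names and the JSON line protocol are assumptions made for illustration; this is not the code in the patch or in the linked gist, and it assumes a Unix-like system with `os.fork`.

```python
import json
import os
import select
import socket


def _child_loop(sock, items, handle):
    """Child side: request an index, run it, report the result, repeat."""
    rfile, wfile = sock.makefile("r"), sock.makefile("w")
    report = None  # the first message is a bare request for work
    while True:
        wfile.write(json.dumps(report) + "\n")
        wfile.flush()
        index = json.loads(rfile.readline())
        if index is None:  # the parent has nothing left to hand out
            break
        report = [index, handle(items[index])]


def run_balanced(items, workers, handle):
    """Fork `workers` children and vend item indices to them on request."""
    active, pids = {}, []
    for _ in range(workers):
        parent_end, child_end = socket.socketpair()
        pid = os.fork()
        if pid == 0:
            parent_end.close()
            try:
                _child_loop(child_end, items, handle)
            finally:
                os._exit(0)  # never fall back into the parent's code path
        child_end.close()
        active[parent_end] = (parent_end.makefile("r"), parent_end.makefile("w"))
        pids.append(pid)

    results, next_index = {}, 0
    while active:
        # Block until at least one child has written a request.
        ready, _, _ = select.select(list(active), [], [])
        for sock in ready:
            rfile, wfile = active[sock]
            line = rfile.readline()
            if not line:  # the child went away without being told to stop
                del active[sock]
                continue
            message = json.loads(line)
            if message is not None:
                results[message[0]] = message[1]
            if next_index < len(items):
                wfile.write(json.dumps(next_index) + "\n")
                next_index += 1
            else:
                wfile.write("null\n")  # tell this child to exit
                del active[sock]
            wfile.flush()

    for pid in pids:
        os.waitpid(pid, 0)
    return [results.get(i) for i in range(len(items))]
```

Each child writes exactly one line and then waits for the reply, so a buffered read of one line per readable socket does not leave unread requests hidden in the buffer. A call such as `run_balanced(files, 4, run_one_test)`, with `run_one_test` being whatever function executes a single test, would return the results in input order.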
Leo Balter
Comment 17 2018-06-13 12:25:55 PDT
Leo Balter
Comment 18 2018-06-13 12:27:28 PDT
Oh, thanks! The new patch uses select as you suggested.
EWS Watchlist
Comment 19 2018-06-13 15:36:47 PDT
Comment on attachment 342681 [details]
Patch

Attachment 342681 [details] did not pass win-ews (win):
Output: http://webkit-queues.webkit.org/results/8168574

New failing tests:
http/tests/preload/onload_event.html
EWS Watchlist
Comment 20 2018-06-13 15:36:59 PDT
Created attachment 342699 [details]
Archive of layout-test-results from ews201 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews201 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Leo Balter
Comment 21 2018-06-14 10:58:45 PDT
Michael Saboff
Comment 22 2018-06-14 14:16:25 PDT
Comment on attachment 342745 [details] Patch r=me
WebKit Commit Bot
Comment 23 2018-06-14 14:44:19 PDT
Comment on attachment 342745 [details]
Patch

Rejecting attachment 342745 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-03', 'land-attachment', '--force-clean', '--non-interactive', '--parent-command=commit-queue', 342745, '--port=mac']" exit_code: 2 cwd: /Volumes/Data/EWS/WebKit

Logging in as commit-queue@webkit.org...
Fetching: https://bugs.webkit.org/attachment.cgi?id=342745&action=edit
Fetching: https://bugs.webkit.org/show_bug.cgi?id=186443&ctype=xml&excludefield=attachmentdata
Processing 1 patch from 1 bug.
Updating working directory
Processing patch 342745 from bug 186443.
Fetching: https://bugs.webkit.org/attachment.cgi?id=342745
Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force', '--reviewer', u'Michael Saboff']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

Parsed 2 diffs from patch file(s).
patching file Tools/ChangeLog
Hunk #1 succeeded at 1 with fuzz 3.
patching file Tools/Scripts/test262/Runner.pm
Hunk #4 FAILED at 246.
Hunk #5 succeeded at 268 with fuzz 1.
Hunk #7 succeeded at 1061 (offset 34 lines).
1 out of 7 hunks FAILED -- saving rejects to file Tools/Scripts/test262/Runner.pm.rej

Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force', '--reviewer', u'Michael Saboff']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

Parsed 2 diffs from patch file(s).
patching file Tools/ChangeLog
Hunk #1 succeeded at 1 with fuzz 3.
patching file Tools/Scripts/test262/Runner.pm
Hunk #4 FAILED at 246.
Hunk #5 succeeded at 268 with fuzz 1.
Hunk #7 succeeded at 1061 (offset 34 lines).
1 out of 7 hunks FAILED -- saving rejects to file Tools/Scripts/test262/Runner.pm.rej

Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force', '--reviewer', u'Michael Saboff']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit
Updating OpenSource
From https://git.webkit.org/git/WebKit
   22e11a8051f..ab90f1dc70f  master     -> origin/master
Partial-rebuilding .git/svn/refs/remotes/origin/master/.rev_map.268f45cc-cd09-0410-ab3c-d52691b4dbfc ...
Currently at 232852 = 22e11a8051f314f141d406c5e996b1782140e67e
r232853 = af256119d9cbac2ff09685739e5208e4f9648933
r232854 = 6a3a1c2e9fda291c1dbd5df60847132e4c68241c
r232855 = ab90f1dc70f49c274e8de47947c29273c5a3381a
Done rebuilding .git/svn/refs/remotes/origin/master/.rev_map.268f45cc-cd09-0410-ab3c-d52691b4dbfc
First, rewinding head to replay your work on top of it...
Fast-forwarded master to refs/remotes/origin/master.

Full output: http://webkit-queues.webkit.org/results/8185091
Leo Balter
Comment 24 2018-06-14 18:47:26 PDT
Michael Saboff
Comment 25 2018-06-16 10:48:52 PDT
Comment on attachment 342782 [details] Patch r=me
WebKit Commit Bot
Comment 26 2018-06-16 11:15:06 PDT
Comment on attachment 342782 [details]
Patch

Rejecting attachment 342782 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-01', 'land-attachment', '--force-clean', '--non-interactive', '--parent-command=commit-queue', 342782, '--port=mac']" exit_code: 2 cwd: /Volumes/Data/EWS/WebKit

Logging in as commit-queue@webkit.org...
Fetching: https://bugs.webkit.org/attachment.cgi?id=342782&action=edit
Fetching: https://bugs.webkit.org/show_bug.cgi?id=186443&ctype=xml&excludefield=attachmentdata
Processing 1 patch from 1 bug.
Updating working directory
Processing patch 342782 from bug 186443.
Fetching: https://bugs.webkit.org/attachment.cgi?id=342782
Failed to run "['git', 'svn', 'dcommit', '--rmdir']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit
Committing to http://svn.webkit.org/repository/webkit/trunk ...
M Tools/ChangeLog

ERROR from SVN:
Item is out of date: File '/trunk/Tools/ChangeLog' is out of date
W: fb5beed2b61cc9bc0f5e14278ad5dd0c882ec74d and refs/remotes/origin/master differ, using rebase:
:040000 040000 bfac9e8560236c3df3aa663672e9a859a2fe6a6e 2a8c19cc9ab51c68fade1351d0ddd533f6ae630c M Tools
Current branch master is up to date.
ERROR: Not all changes have been committed into SVN, however the committed ones (if any) seem to be successfully integrated into the working tree.
Please see the above messages for details.

Failed to run "['git', 'svn', 'dcommit', '--rmdir']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit
Committing to http://svn.webkit.org/repository/webkit/trunk ...
M Tools/ChangeLog

ERROR from SVN:
Item is out of date: File '/trunk/Tools/ChangeLog' is out of date
W: fb5beed2b61cc9bc0f5e14278ad5dd0c882ec74d and refs/remotes/origin/master differ, using rebase:
:040000 040000 bfac9e8560236c3df3aa663672e9a859a2fe6a6e 2a8c19cc9ab51c68fade1351d0ddd533f6ae630c M Tools
Current branch master is up to date.
ERROR: Not all changes have been committed into SVN, however the committed ones (if any) seem to be successfully integrated into the working tree.
Please see the above messages for details.

Failed to run "['git', 'svn', 'dcommit', '--rmdir']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit
Updating OpenSource
Current branch master is up to date.

Full output: http://webkit-queues.webkit.org/results/8212515
Igalia-pontevedra EWS
Comment 27 2018-06-18 07:30:43 PDT
Comment on attachment 342782 [details]
Patch

Attachment 342782 [details] did not pass gtk-wk2-ews (gtk-wk2):
Output: http://webkit-queues.webkit.org/results/8231543

New failing tests:
http/tests/misc/cached-scripts.html
Igalia-pontevedra EWS
Comment 28 2018-06-18 07:30:48 PDT
Created attachment 342934 [details]
Archive of layout-test-results from ltilve-gtk-wk2-ews for gtk-wk2

The attached test failures were seen while running run-webkit-tests on the gtk-wk2-ews.
Bot: ltilve-gtk-wk2-ews Port: gtk-wk2 Platform: Linux-4.16.0-0.bpo.1-amd64-x86_64-with-debian-9.4
Michael Catanzaro
Comment 29 2018-06-18 10:47:27 PDT
(In reply to Igalia-pontevedra EWS from comment #27)
> Comment on attachment 342782 [details]
> Patch
>
> Attachment 342782 [details] did not pass gtk-wk2-ews (gtk-wk2):
> Output: http://webkit-queues.webkit.org/results/8231543
>
> New failing tests:
> http/tests/misc/cached-scripts.html

Reported bug #186778
Leo Balter
Comment 30 2018-06-18 10:57:47 PDT
WebKit Commit Bot
Comment 31 2018-06-19 11:59:57 PDT
Comment on attachment 342951 [details]
Patch

Clearing flags on attachment: 342951

Committed r232972: <https://trac.webkit.org/changeset/232972>
WebKit Commit Bot
Comment 32 2018-06-19 11:59:59 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 33 2018-06-19 12:02:39 PDT
Dawei Fenton (:realdawei)
Comment 34 2018-06-19 15:01:49 PDT
Seeing perl test failure after this latest revision for instance:
https://build.webkit.org/builders/Apple%20Sierra%20Release%20WK2%20(Tests)/builds/10036/steps/webkitperl-test/logs/stdio

Can't exec "/usr/bin/non-existent-command": No such file or directory at /Volumes/Data/slave/sierra-release-tests-wk2/build/Tools/Scripts/VCSUtils.pm line 2403.
Failed to exec(): No such file or directory at /Volumes/Data/slave/sierra-release-tests-wk2/build/Tools/Scripts/VCSUtils.pm line 2403.

and

# Failed test 'expectations yaml file format'
# at Tools/Scripts/webkitperl/test262_unittest/test262-runner-tests.pl line 157.
# Looks like you failed 2 tests of 13.
Leo Balter
Comment 35 2018-06-19 15:43:56 PDT
Thanks for the report, David. I filed a new patch here: https://bugs.webkit.org/show_bug.cgi?id=186824