RESOLVED FIXED 186635
[GTK][buildbot] Raise timeouts for the step benchmark-test on the GTK perf bot
https://bugs.webkit.org/show_bug.cgi?id=186635
Summary [GTK][buildbot] Raise timeouts for the step benchmark-test on the GTK perf bot
Carlos Alberto Lopez Perez
Reported 2018-06-14 16:07:27 PDT
The step benchmark-test on the GTK perf bot runs all the benchmark plans one by one (one after the other). Each one of this benchmarks plans have different timeouts. The problem is that buildbot itself will kill the whole step when no output is produced after a 1200 seconds timeout, but some of the plans for run-benchmark have higher timeouts than 1200... that means that when things go wrong (or the machine/built-product is slower than expected) the buildbot timeout will happen before the plan one. And when that happens, that means that the whole buildbot step will be stopped instead of having simply one test of the test set failing because its timeout has been reached. This has been an already an issue on the past, and we already fixed this by growing the buildbot timeout for this step on the past But it seems now some plans (like motionmark) have grown the plan timeout further, so we should adjust.
Attachments
Patch (3.71 KB, patch)
2018-06-14 17:04 PDT, Carlos Alberto Lopez Perez
no flags
Carlos Alberto Lopez Perez
Comment 1 2018-06-14 16:10:36 PDT
Example of the issue: * Build: https://build.webkit.org/builders/GTK%20Linux%2064-bit%20Release%20%28Perf%29/builds/3150 * Stdio log: https://build.webkit.org/builders/GTK%20Linux%2064-bit%20Release%20%28Perf%29/builds/3150/steps/benchmark-test/logs/stdio 2018-06-14 00:22:07,450 - INFO - Copied the benchmark into: /tmp/tmpxeBC7R/motionmark patching file resources/runner/motionmark.js 2018-06-14 00:22:07,455 - INFO - Start the iteration 1 of 1 for current benchmark 2018-06-14 00:22:07,456 - INFO - Launching an http server 2018-06-14 00:22:07,460 - INFO - Start to fetching the port number of the http server 2018-06-14 00:22:07,460 - INFO - Server port is not found this time, retry after 0.500000 seconds 2018-06-14 00:22:07,964 - INFO - HTTP Server is serving at port: 36079 2018-06-14 00:22:07,981 - INFO - Executing: /home/slave/webkitgtk/gtk-linux-64-release-perf-tests/build/Tools/Scripts/run-minibrowser --gtk --geometry=1024x768 http://127.0.0.1:36079/motionmark/index.html command timed out: 1200 seconds without output running ['python', './Tools/Scripts/run-benchmark', '--allplans', '--browser', 'minibrowser-gtk'], attempting to kill process killed by signal 9 program finished with exit code -1 elapsedTime=5473.596568 Buildbot aborts at 1200 seconds without output.. but the motionmark timeout was 1800 according to https://svn.webkit.org/repository/webkit/trunk/Tools/Scripts/webkitpy/benchmark_runner/data/plans/motionmark.plan so the buildbot one happened before the plan one.. we need to avoid this
Carlos Alberto Lopez Perez
Comment 2 2018-06-14 17:04:54 PDT
EWS Watchlist
Comment 3 2018-06-14 17:07:05 PDT
Attachment 342779 [details] did not pass style-queue: ERROR: Tools/BuildSlaveSupport/build.webkit.org-config/factories.py:203: [BuildAndPerfTestFactory.__init__] Undefined variable 'RunBenchmarkTests' [pylint/E0602] [5] ERROR: Tools/BuildSlaveSupport/build.webkit.org-config/factories.py:213: [DownloadAndPerfTestFactory.__init__] Undefined variable 'RunBenchmarkTests' [pylint/E0602] [5] Total errors found: 2 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.
Michael Catanzaro
Comment 4 2018-06-14 19:04:24 PDT
Comment on attachment 342779 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=342779&action=review > Tools/BuildSlaveSupport/build.webkit.org-config/steps.py:-831 > - # Buildbot default timeout without output for a step is 1200. > - # The current maximum timeout for a benchmark plan is also 1200. > - # So raise the buildbot timeout to avoid aborting this whole step when a test timeouts. > - timeout = 1500 Weird that this was added if it has been ignored this whole time... maybe it regressed at some point in buildbot...?
Carlos Alberto Lopez Perez
Comment 5 2018-06-14 19:09:04 PDT
(In reply to Michael Catanzaro from comment #4) > Comment on attachment 342779 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=342779&action=review > > > Tools/BuildSlaveSupport/build.webkit.org-config/steps.py:-831 > > - # Buildbot default timeout without output for a step is 1200. > > - # The current maximum timeout for a benchmark plan is also 1200. > > - # So raise the buildbot timeout to avoid aborting this whole step when a test timeouts. > > - timeout = 1500 > > Weird that this was added if it has been ignored this whole time... maybe it > regressed at some point in buildbot...? I think it never worked. I guess I just assumed it would, but it didn't ... Testing buildbot master changes is kind of difficult because its not easy to access to a test environment. Now its a bit easier thanks to the script run-buildbot-test.py that was added on r225336 ... I used that now to verify this will work. But that script was not available back then.
Carlos Alberto Lopez Perez
Comment 6 2018-06-14 19:11:41 PDT
Comment on attachment 342779 [details] Patch Clearing flags on attachment: 342779 Committed r232866: <https://trac.webkit.org/changeset/232866>
Carlos Alberto Lopez Perez
Comment 7 2018-06-14 19:11:45 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.