WebKit Bugzilla
RESOLVED FIXED
167626
REGRESSION(r182916) run-perf-tests never timeouts
https://bugs.webkit.org/show_bug.cgi?id=167626
Summary
REGRESSION(r182916) run-perf-tests never timeouts
Carlos Alberto Lopez Perez
Reported
2017-01-30 19:31:40 PST
Not sure when this started happening, but it seems that the timeout feature of run-perf-tests is broken.

The script run-perf-tests accepts a parameter --time-out-ms with a default value of 600 seconds (600 * 1000). But this parameter is completely ignored, because in webkitpy/performance_tests/perftest.py the driver is created with no_timeout=True unconditionally:

    def _create_driver(self):
        return self._port.create_driver(worker_number=0, no_timeout=True)

This causes the tests to run without any timeout deadline: see Driver.run_test() at webkitpy/port/driver.py when self._no_timeout is true.

So, I think we should:

* Make the --time-out-ms parameter work again as expected.
* Add a --no-timeout parameter for testing purposes.
* Keep the default 600-second timeout value for each performance (sub)test.

This means that:

* Some tests on the perf bots may break if we do what I propose, as they are currently running without any timeout value. But I think these tests should be fixed to not take so long to run.
* There is a general 1200-second timeout value for the whole perf-test step ("command timed out: 1200 seconds without output"), but when this triggers it kills the whole step rather than stopping only the running test.
* If some port (Apple/EFL/GTK+) has any reason to want different defaults, it can override this for its own perf bots in the master.cfg file at Tools/BuildSlaveSupport/build.webkit.org-config/
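A minimal sketch of the option handling proposed above: a --time-out-ms flag defaulting to 600 seconds, plus an explicit --no-timeout flag for debugging. The option names come from this report; the argparse code itself is illustrative, not webkitpy's actual option-parsing code.

```python
import argparse

def parse_perf_test_args(argv):
    # Sketch only: run-perf-tests does not use argparse this way;
    # this just models the proposed flag semantics.
    parser = argparse.ArgumentParser(prog='run-perf-tests')
    parser.add_argument('--time-out-ms', type=int, default=600 * 1000,
                        help='per-test timeout in milliseconds (default 600 s)')
    parser.add_argument('--no-timeout', action='store_true',
                        help='disable the per-test timeout (for testing purposes)')
    return parser.parse_args(argv)

print(parse_perf_test_args([]).time_out_ms)                    # 600000
print(parse_perf_test_args(['--time-out-ms=30000']).time_out_ms)  # 30000
```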
Attachments

Patch (3.72 KB, patch)
2017-01-30 19:49 PST, Carlos Alberto Lopez Perez
no flags

Patch (7.23 KB, patch)
2017-02-03 11:51 PST, Carlos Alberto Lopez Perez
no flags
Carlos Alberto Lopez Perez
Comment 1
2017-01-30 19:49:43 PST
Created attachment 300181 [details]
Patch
Ryosuke Niwa
Comment 2
2017-01-31 00:36:52 PST
Comment on attachment 300181 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=300181&action=review

> Tools/ChangeLog:10
> +        The --time-out-ms value (defaults to 600 seconds) of run-perf-tests
> +        was being ignored because the driver was created with no_timeout=True
> +        unconditionally.
This is intentional. The timeout is enforced by Python code in _run_with_driver. In fact, if we don't use the no-timeout option, DRT/WTR would time out after 10 minutes or so.
Carlos Alberto Lopez Perez
Comment 3
2017-01-31 04:24:40 PST
(In reply to comment #2)
> Comment on attachment 300181 [details]
> Patch
>
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=300181&action=review
>
> > Tools/ChangeLog:10
> > +        The --time-out-ms value (defaults to 600 seconds) of run-perf-tests
> > +        was being ignored because the driver was created with no_timeout=True
> > +        unconditionally.
>
> This is intentional. The timeout is enforced by Python code in
> _run_with_driver. In fact, if we don't use the no-timeout option, DRT/WTR
> would time out after 10 minutes or so.
I don't think that is the case. I have been testing this, and I'm pretty sure that the tests all run without any timeout value (actually a 7-day one). See a more detailed explanation of the issue below.

At Tools/Scripts/webkitpy/performance_tests/perftest.py line 177 you have:

    def _run_with_driver(self, driver, time_out_ms):
        output = self.run_single(driver, self.test_path(), time_out_ms)

So PerfTest.run_single() is effectively called with the timeout value. This is defined as:

    def run_single(self, driver, test_path, time_out_ms, should_run_pixel_test=False):
        return driver.run_test(DriverInput(test_path, time_out_ms, image_hash=None, should_run_pixel_test=should_run_pixel_test), stop_when_done=False)

So it ends up calling Driver.run_test(), which is defined as follows at Tools/Scripts/webkitpy/port/driver.py line 174:

    def run_test(self, driver_input, stop_when_done):
        """Run a single test and return the results.

        Note that it is okay if a test times out or crashes and
        leaves the driver in an indeterminate state. The upper layers
        of the program are responsible for cleaning up and ensuring
        things are okay.

        Returns a DriverOutput object.
        """
        start_time = time.time()
        self.start(driver_input.should_run_pixel_test, driver_input.args)
        test_begin_time = time.time()
        self._driver_timed_out = False
        self._crash_report_from_driver = None
        self.error_from_test = str()
        self.err_seen_eof = False

        command = self._command_from_driver_input(driver_input)

        # Certain timeouts are detected by the tool itself; tool detection is better,
        # because results contain partial output in this case. Make script timeout longer
        # by 5 seconds to avoid racing for which timeout is detected first.
        # FIXME: It's not the job of the driver to decide what the timeouts should be.
        # Move the additional timeout to driver_input.
        if self._no_timeout:
            deadline = test_begin_time + 60 * 60 * 24 * 7  # 7 days. Using sys.maxint causes a hang.
        else:
            deadline = test_begin_time + int(driver_input.timeout) / 1000.0 + 5
        [.....]

See the issue above? The driver object that is called was created at Tools/Scripts/webkitpy/performance_tests/perftest.py line 125:

    def _create_driver(self):
        return self._port.create_driver(worker_number=0, no_timeout=True)

So Driver.run_test() will enter this branch:

    if self._no_timeout:
        deadline = test_begin_time + 60 * 60 * 24 * 7  # 7 days.

causing the driver_input.timeout value to be ignored, and all the tests to run with no timeout (a 7-day one, actually!).
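The deadline branch quoted above can be modeled with a small self-contained sketch. compute_deadline is a made-up name; the constants (7 days, the 5-second grace period) mirror the driver.py code quoted in this comment.

```python
def compute_deadline(test_begin_time, timeout_ms, no_timeout):
    # Model of the deadline logic in Driver.run_test()
    # (Tools/Scripts/webkitpy/port/driver.py).
    if no_timeout:
        # 7 days: effectively no timeout at all.
        return test_begin_time + 60 * 60 * 24 * 7
    # The tool adds 5 seconds so its own timeout detection wins the race
    # against the script timeout.
    return test_begin_time + int(timeout_ms) / 1000.0 + 5

# run-perf-tests passes --time-out-ms (default 600000 ms), but because
# _create_driver() sets no_timeout=True, that value never matters:
print(compute_deadline(0.0, 600000, no_timeout=True))   # 604800 (7 days)
print(compute_deadline(0.0, 600000, no_timeout=False))  # 605.0
```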
Carlos Alberto Lopez Perez
Comment 4
2017-01-31 04:56:58 PST
(In reply to comment #2)
> In fact, if we don't use no timeout option, DRT/WTR would timeout after
> 10 minutes or so.
If that were the case, then I wonder how the following can be happening:

1) The test IndexedDB/large-binary-keys.html is taking 1048 seconds (~17 minutes) to run on the Apple perf bot:

https://build.webkit.org/builders/Apple%20El%20Capitan%20Release%20WK2%20%28Perf%29/builds/4138/steps/perf-test/logs/stdio

    [.............]
    Running IndexedDB/large-binary-keys.html (78 of 162)
    RESULT IndexedDB: large-binary-keys: Time= 12710.29875 ms median= 12767.6 ms, stdev= 192.085687956 ms, min= 12049.8 ms, max= 12873.7 ms
    RESULT IndexedDB: large-binary-keys: JSHeap= 690339155.65 bytes median= 663655063.0 bytes, stdev= 248329564.206 bytes, min= 248397774.0 bytes, max= 1206834295.0 bytes
    RESULT IndexedDB: large-binary-keys: Malloc= 847334707.2 bytes median= 872071168.0 bytes, stdev= 258591471.735 bytes, min= 369184768.0 bytes, max= 1255882752.0 bytes
    Finished: 1048.817416 s

2) On both the GTK+ and EFL perf bots, some of these IndexedDB tests are causing the whole step to abort, because the 1200-second timeout for the whole perf-test step ("command timed out: 1200 seconds without output") triggers before any individual perf test timeout.

GTK perf bot:
https://build.webkit.org/builders/GTK%20Linux%2064-bit%20Release%20%28Perf%29/builds/7761/steps/perf-test/logs/stdio

    [.............]
    Running IndexedDB/index-get.html (75 of 161)
    command timed out: 1200 seconds without output, killing pid 12302
    process killed by signal 9
    program finished with exit code -1
    elapsedTime=7711.394962

EFL perf bot:
https://build.webkit.org/builders/EFL%20Linux%2064-bit%20Release%20WK2%20%28Perf%29/builds/11181/steps/perf-test/logs/stdio

    [.............]
    Running IndexedDB/large-number-of-inserts-responsiveness.html (79 of 162)
    command timed out: 1200 seconds without output
    running ['python', './Tools/Scripts/run-perf-tests', '--output-json-path', 'perf-test-results.json', '--slave-config-json-path', '../../perf-test-config.json', '--no-show-results', '--reset-results', '--test-results-server', 'perf.webkit.org', '--builder-name', 'EFL Linux 64-bit Release WK2 (Perf)', '--build-number', '11181', '--platform', 'efl', '--no-build', '--release'], attempting to kill
    process killed by signal 9
    program finished with exit code -1
    elapsedTime=8274.328622
Ryosuke Niwa
Comment 5
2017-01-31 13:54:48 PST
This regressed in https://trac.webkit.org/changeset/182916. We need to differentiate the timeout in DRT/WTR from the timeout in the tool. I think that changeset tried to disable both, but for perf testing we need to only disable the watchdog timer in DRT/WTR, not the one in the tool.
Alexey Proskuryakov
Comment 6
2017-01-31 14:07:55 PST
To only disable DRT timers, --additional-drt-flag may be best.
Ryosuke Niwa
Comment 7
2017-01-31 14:16:43 PST
(In reply to comment #6)
> To only disable DRT timers, --additional-drt-flag may be best.
Well, this is what run-perf-tests always uses but I suppose we can re-use the same facility used to add additional DRT flags.
Carlos Alberto Lopez Perez
Comment 8
2017-02-01 11:16:58 PST
So... How do you propose we fix this? Is the patch I uploaded not good enough?
Ryosuke Niwa
Comment 9
2017-02-01 13:27:43 PST
(In reply to comment #8)
> How do you propose we fix this?
We need to pass --no-timeout to DRT using the same facility as --additional-drt-flag's
> Is the patch I uploaded not good enough?
Unfortunately not.
Carlos Alberto Lopez Perez
Comment 10
2017-02-01 14:17:27 PST
(In reply to comment #9)
> (In reply to comment #8)
> > How do you propose we fix this?
>
> We need to pass --no-timeout to DRT using the same facility as
> --additional-drt-flag's
I'm confused. How will passing --no-timeout to DRT fix the issue?

The issue is that the driver that runs the tests from run-perf-tests ignores the timeout value:

    Tools/Scripts/run-perf-tests --time-out-ms=${value}

This ${value} defaults to 10 minutes if not specified, but Driver().run_test() ignores any ${value} because no_timeout=True is passed.
Ryosuke Niwa
Comment 11
2017-02-01 17:02:17 PST
(In reply to comment #10)
> (In reply to comment #9)
> > (In reply to comment #8)
> > > How do you propose we fix this?
> >
> > We need to pass --no-timeout to DRT using the same facility as
> > --additional-drt-flag's
>
> I'm confused. How will passing --no-timeout to DRT fix the issue?
>
> The issue is that the driver that runs the tests from run-perf-tests
> ignores the timeout value.
What I'm saying is that we need to make DRT take --no-timeout, and then make the Python code still enforce the timeout passed to run-perf-tests.
> This ${value} defaults to 10 minutes if not specified, but
> Driver().run_test() ignores any ${value} because no_timeout=True is passed.
We need to (1) fix this code, and (2) make sure DRT/WTR is run with --no-timeout inside run-perf-tests. Your patch only does (1), not (2).
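A hedged sketch of that two-part fix: the driver is created without no_timeout, so the Python-side --time-out-ms deadline applies, while the DRT/WTR process itself still receives a --no-timeout flag to disable its internal watchdog. Apart from _create_driver, all class and method names here are invented for illustration and do not match webkitpy's actual code.

```python
class FakePort:
    """Stand-in for webkitpy's Port; create_driver here just records its args."""
    def create_driver(self, worker_number, no_timeout=False):
        return {'worker_number': worker_number, 'no_timeout': no_timeout}

class PerfTestRunnerSketch:
    def __init__(self, port):
        self._port = port

    def _create_driver(self):
        # (1) Create the driver WITHOUT no_timeout, so Driver.run_test()
        # computes a real deadline from --time-out-ms.
        return self._port.create_driver(worker_number=0)

    def _additional_driver_flags(self):
        # (2) Still pass --no-timeout to the DRT/WTR process itself, the way
        # --additional-drt-flag would, so its internal watchdog timer does
        # not kill long-running perf tests.
        return ['--no-timeout']

runner = PerfTestRunnerSketch(FakePort())
print(runner._create_driver())            # {'worker_number': 0, 'no_timeout': False}
print(runner._additional_driver_flags())  # ['--no-timeout']
```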
Carlos Alberto Lopez Perez
Comment 12
2017-02-03 11:51:19 PST
Created attachment 300553 [details]
Patch
Michael Catanzaro
Comment 13
2017-02-10 20:06:53 PST
Hi Ryosuke, does this look good?
Ryosuke Niwa
Comment 14
2017-02-10 21:52:01 PST
Comment on attachment 300553 [details]
Patch

Yup, this looks correct. Thanks for the fix!
WebKit Commit Bot
Comment 15
2017-02-10 22:02:55 PST
Comment on attachment 300553 [details]
Patch

Clearing flags on attachment: 300553

Committed r212182: <http://trac.webkit.org/changeset/212182>
WebKit Commit Bot
Comment 16
2017-02-10 22:03:02 PST
All reviewed patches have been landed. Closing bug.