Bug 167626 - REGRESSION(r182916) run-perf-tests never times out
Summary: REGRESSION(r182916) run-perf-tests never times out
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: WebKit Nightly Build
Hardware: Unspecified
OS: Unspecified
Importance: P2 Normal
Assignee: Carlos Alberto Lopez Perez
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-01-30 19:31 PST by Carlos Alberto Lopez Perez
Modified: 2017-02-10 22:03 PST
10 users

See Also:


Attachments
Patch (3.72 KB, patch)
2017-01-30 19:49 PST, Carlos Alberto Lopez Perez
no flags
Patch (7.23 KB, patch)
2017-02-03 11:51 PST, Carlos Alberto Lopez Perez
no flags

Description Carlos Alberto Lopez Perez 2017-01-30 19:31:40 PST
Not sure when this started happening, but it seems that the timeout feature of run-perf-tests is broken.

The script run-perf-tests accepts a --time-out-ms parameter with a default value of 600 seconds (600 * 1000 ms).

But this parameter is completely ignored because in webkitpy/performance_tests/perftest.py the driver is created with no_timeout=True unconditionally.

def _create_driver(self):
    return self._port.create_driver(worker_number=0, no_timeout=True)

And this causes the tests to run without any timeout deadline: see Driver.run_test() in webkitpy/port/driver.py when self._no_timeout is true.


So, I think we should:

 * Make the --time-out-ms parameter work again as expected (see the sketch at the end of this description).
 * Add a --no-timeout parameter for testing purposes.
 * Keep the default 600 second timeout value for each performance (sub)test.


This means that:

* Some tests on the perf bots may break if we do what I propose, as they are currently running without any timeout value. But I think these tests should be fixed to not take so much time to run.
   *  There is a general 1200 second timeout value for the whole perf-test step (command timed out: 1200 seconds without output), but when this triggers it kills the whole step rather than stopping only the running test.
   *  If some port (Apple/EFL/GTK+) has any reason to want different defaults, they can override this for their own Perf bots in the master.cfg file at Tools/BuildSlaveSupport/build.webkit.org-config/
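
As a rough illustration of the first item above (a minimal sketch, not an actual patch: the self._no_timeout attribute is hypothetical and would be populated from the proposed --no-timeout option):

    def _create_driver(self):
        # Only disable the tool-side deadline when explicitly requested;
        # otherwise Driver.run_test() honors driver_input.timeout, i.e. the
        # --time-out-ms value (600 * 1000 ms by default).
        return self._port.create_driver(worker_number=0, no_timeout=self._no_timeout)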
Comment 1 Carlos Alberto Lopez Perez 2017-01-30 19:49:43 PST
Created attachment 300181 [details]
Patch
Comment 2 Ryosuke Niwa 2017-01-31 00:36:52 PST
Comment on attachment 300181 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=300181&action=review

> Tools/ChangeLog:10
> +        The --time-out-ms value (defaults to 600 seconds) of run-perf-tests
> +        was being ignored because the driver was created with no_timeout=True
> +        unconditionally.

This is intentional. The timeout is enforced by Python code in _run_with_driver.
In fact, if we don't use the no-timeout option, DRT/WTR would time out after 10 minutes or so.
Comment 3 Carlos Alberto Lopez Perez 2017-01-31 04:24:40 PST
(In reply to comment #2)
> Comment on attachment 300181 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=300181&action=review
> 
> > Tools/ChangeLog:10
> > +        The --time-out-ms value (defaults to 600 seconds) of run-perf-tests
> > +        was being ignored because the driver was created with no_timeout=True
> > +        unconditionally.
> 
> This is intentional. The timeout is enforced by Python code in
> _run_with_driver.
> In fact, if we don't use the no-timeout option, DRT/WTR would time out after
> 10 minutes or so.

I don't think that is the case.

I have been testing this and I'm pretty sure that the tests are all run without any timeout value (actually a 7-day one).

See a more detailed explanation of the issue below:



At Tools/Scripts/webkitpy/performance_tests/perftest.py line 177

you have:

    def _run_with_driver(self, driver, time_out_ms):
        output = self.run_single(driver, self.test_path(), time_out_ms)


So PerfTest.run_single() is effectively called with the timeout value.


This is defined as:

    def run_single(self, driver, test_path, time_out_ms, should_run_pixel_test=False):
        return driver.run_test(DriverInput(test_path, time_out_ms, image_hash=None, should_run_pixel_test=should_run_pixel_test), stop_when_done=False)


So it ends up calling Driver.run_test().


And this function is defined as follows at Tools/Scripts/webkitpy/port/driver.py line 174


    def run_test(self, driver_input, stop_when_done):
        """Run a single test and return the results.

        Note that it is okay if a test times out or crashes and leaves
        the driver in an indeterminate state. The upper layers of the program
        are responsible for cleaning up and ensuring things are okay.

        Returns a DriverOutput object.
        """
        start_time = time.time()
        self.start(driver_input.should_run_pixel_test, driver_input.args)
        test_begin_time = time.time()
        self._driver_timed_out = False
        self._crash_report_from_driver = None
        self.error_from_test = str()
        self.err_seen_eof = False

        command = self._command_from_driver_input(driver_input)

        # Certain timeouts are detected by the tool itself; tool detection is better,
        # because results contain partial output in this case. Make script timeout longer
        # by 5 seconds to avoid racing for which timeout is detected first.
        # FIXME: It's not the job of the driver to decide what the timeouts should be.
        # Move the additional timeout to driver_input.
        if self._no_timeout:
            deadline = test_begin_time + 60 * 60 * 24 * 7  # 7 days. Using sys.maxint causes a hang.
        else:
            deadline = test_begin_time + int(driver_input.timeout) / 1000.0 + 5


        [..... ]


See the issue above?


The driver class that is called was created at Tools/Scripts/webkitpy/performance_tests/perftest.py line 125


    def _create_driver(self):
        return self._port.create_driver(worker_number=0, no_timeout=True)


So Driver().run_test() will enter this branch:

        if self._no_timeout:
            deadline = test_begin_time + 60 * 60 * 24 * 7  # 7 days. Using sys.maxint causes a hang.

This causes the driver_input.timeout value to be ignored, and all the tests to run with no timeout (a 7-day one, actually!)
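
To make the numbers concrete, here is the same deadline arithmetic pulled out of run_test() with the default --time-out-ms value (an illustrative snippet, not code from the tree):

    import time

    test_begin_time = time.time()
    time_out_ms = 600 * 1000  # run-perf-tests default for --time-out-ms

    # Branch taken today, because perftest.py passes no_timeout=True:
    deadline_no_timeout = test_begin_time + 60 * 60 * 24 * 7
    # Branch that would honor the requested timeout:
    deadline_with_timeout = test_begin_time + int(time_out_ms) / 1000.0 + 5

    print(deadline_no_timeout - test_begin_time)    # 604800.0 seconds (7 days)
    print(deadline_with_timeout - test_begin_time)  # 605.0 seconds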
Comment 4 Carlos Alberto Lopez Perez 2017-01-31 04:56:58 PST
(In reply to comment #2)
> In fact, if we don't use no timeout option, DRT/WTR would timeout after 10
> minutes or so.

If that were the case, then I wonder how the following can be happening:

 
1) The test IndexedDB/large-binary-keys.html is taking 1048 seconds (~17 minutes) to run on the Apple perf bot:

Apple perf bot

                https://build.webkit.org/builders/Apple%20El%20Capitan%20Release%20WK2%20%28Perf%29/builds/4138/steps/perf-test/logs/stdio
                [.............]                
                Running IndexedDB/large-binary-keys.html (78 of 162)
                RESULT IndexedDB: large-binary-keys: Time= 12710.29875 ms
                median= 12767.6 ms, stdev= 192.085687956 ms, min= 12049.8 ms, max= 12873.7 ms
                RESULT IndexedDB: large-binary-keys: JSHeap= 690339155.65 bytes
                median= 663655063.0 bytes, stdev= 248329564.206 bytes, min= 248397774.0 bytes, max= 1206834295.0 bytes
                RESULT IndexedDB: large-binary-keys: Malloc= 847334707.2 bytes
                median= 872071168.0 bytes, stdev= 258591471.735 bytes, min= 369184768.0 bytes, max= 1255882752.0 bytes
                Finished: 1048.817416 s
                


2) On both the GTK+ and EFL perf bots, some of these IndexedDB tests cause the whole step to abort, because the general 1200 second timeout for the whole perf-test step (command timed out: 1200 seconds without output) triggers before any individual perf test timeout:


GTK perf bot 
                https://build.webkit.org/builders/GTK%20Linux%2064-bit%20Release%20%28Perf%29/builds/7761/steps/perf-test/logs/stdio
                [.............]
                Running IndexedDB/index-get.html (75 of 161)

                command timed out: 1200 seconds without output, killing pid 12302
                process killed by signal 9
                program finished with exit code -1
                elapsedTime=7711.394962
               

               
EFL perf bot 

                https://build.webkit.org/builders/EFL%20Linux%2064-bit%20Release%20WK2%20%28Perf%29/builds/11181/steps/perf-test/logs/stdio
                [.............]
                Running IndexedDB/large-number-of-inserts-responsiveness.html (79 of 162)

                command timed out: 1200 seconds without output running ['python', './Tools/Scripts/run-perf-tests', '--output-json-path', 'perf-test-results.json', '--slave-config-json-path', '../../perf-test-config.json', '--no-show-results', '--reset-results', '--test-results-server', 'perf.webkit.org', '--builder-name', 'EFL Linux 64-bit Release WK2 (Perf)', '--build-number', '11181', '--platform', 'efl', '--no-build', '--release'], attempting to kill
                process killed by signal 9
                program finished with exit code -1
                elapsedTime=8274.328622
Comment 5 Ryosuke Niwa 2017-01-31 13:54:48 PST
That got regressed in https://trac.webkit.org/changeset/182916.

We need to differentiate the timeout in DRT/WTR from the timeout in the tool.

I think https://trac.webkit.org/changeset/182916 is trying to disable both, but for perf testing, we need to only disable the watchdog timer in DRT/WTR, not the one in the tool.
Comment 6 Alexey Proskuryakov 2017-01-31 14:07:55 PST
To only disable DRT timers, --additional-drt-flag may be best.
Comment 7 Ryosuke Niwa 2017-01-31 14:16:43 PST
(In reply to comment #6)
> To only disable DRT timers, --additional-drt-flag may be best.

Well, this is what run-perf-tests always uses, but I suppose we can re-use the same facility used to add additional DRT flags.
Comment 8 Carlos Alberto Lopez Perez 2017-02-01 11:16:58 PST
So...

How do you propose we fix this?

Is the patch I uploaded not good enough?
Comment 9 Ryosuke Niwa 2017-02-01 13:27:43 PST
(In reply to comment #8)
> How do you propose we fix this?

We need to pass --no-timeout to DRT using the same facility as --additional-drt-flag's

> Is the patch I uploaded not good enough?

Unfortunately not.
Comment 10 Carlos Alberto Lopez Perez 2017-02-01 14:17:27 PST
(In reply to comment #9)
> (In reply to comment #8)
> > How do you propose we fix this?
> 
> We need to pass --no-timeout to DRT using the same facility as
> --additional-drt-flag's
> 

I'm confused.

How will passing --no-timeout to DRT fix the issue?

The issue is that the driver that runs the tests from run-perf-tests ignores the timeout value.

Tools/Scripts/run-perf-tests --time-out-ms=${value}

This ${value} defaults to 10 minutes if not specified, but Driver().run_test() ignores any ${value} because no_timeout=True is passed.
Comment 11 Ryosuke Niwa 2017-02-01 17:02:17 PST
(In reply to comment #10)
> (In reply to comment #9)
> > (In reply to comment #8)
> > > How do you propose we fix this?
> > 
> > We need to pass --no-timeout to DRT using the same facility as
> > --additional-drt-flag's
> > 
> 
> I'm confused.
> 
> How will passing --no-timeout to DRT fix the issue?
> 
> The issue is that the driver that runs the tests from run-perf-tests ignores
> the timeout value.

What I'm saying is that we need to make DRT take --no-timeout, and then make the Python code still use the timeout passed to run-perf-tests.

> This ${value} defaults to 10 minutes if not specified, but
> Driver().run_test() ignores any ${value} because no_timeout=True is passed.

We need to (1) fix this code, and (2) make sure DRT/WTR is run with --no-timeout inside run-perf-tests. Your patch only does (1) and not (2).
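
As a rough sketch of the shape that implies in perftest.py (illustrative only; names and the exact wiring of the landed patch may differ):

    def _create_driver(self):
        # (1) Stop passing no_timeout=True, so the Python-side deadline in
        #     Driver.run_test() honors --time-out-ms again.
        return self._port.create_driver(worker_number=0)

    def run_single(self, driver, test_path, time_out_ms, should_run_pixel_test=False):
        # (2) Still disable the watchdog timer inside DRT/WTR itself by
        #     forwarding --no-timeout to the test runner process, the same way
        #     additional DRT/WTR flags are forwarded (the run_test() excerpt in
        #     comment 3 already hands driver_input.args to self.start()).
        return driver.run_test(DriverInput(test_path, time_out_ms, image_hash=None,
                                           should_run_pixel_test=should_run_pixel_test,
                                           args=['--no-timeout']),
                               stop_when_done=False)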
Comment 12 Carlos Alberto Lopez Perez 2017-02-03 11:51:19 PST
Created attachment 300553 [details]
Patch
Comment 13 Michael Catanzaro 2017-02-10 20:06:53 PST
Hi Ryosuke, does this look good?
Comment 14 Ryosuke Niwa 2017-02-10 21:52:01 PST
Comment on attachment 300553 [details]
Patch

Yup, this looks correct. Thanks for the fix!
Comment 15 WebKit Commit Bot 2017-02-10 22:02:55 PST
Comment on attachment 300553 [details]
Patch

Clearing flags on attachment: 300553

Committed r212182: <http://trac.webkit.org/changeset/212182>
Comment 16 WebKit Commit Bot 2017-02-10 22:03:02 PST
All reviewed patches have been landed.  Closing bug.