Streams API tests should have a common timeout
Created attachment 250787 [details]
Patch

Sets a default timeout of 150ms for all testharness tests.
I know this solution sucks, but there is no other way to get the properties set on the tests. setup({timeout: 150}) does not work because those properties apply to the test suite, not to the individual tests. I suggest we land this because it is needed for our bots, and in parallel I'll try to patch testharness upstream.
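For context, testharness.js distinguishes file-level settings passed to setup() from per-test properties passed as the last argument to async_test(). A minimal, runnable sketch of the per-test shape (the stand-in async_test below is an assumption for illustration, not the real harness):

```javascript
// Minimal stand-in for testharness.js's async_test(func, name, properties),
// just to show where a per-test timeout property would go. This is a
// simplified sketch, not the real harness implementation.
function async_test(func, name, properties = {}) {
  const timeout = properties.timeout !== undefined ? properties.timeout : 10000;
  return { name: name, timeout: timeout, step: func };
}

// setup({timeout: 150}) would apply to the whole file; the per-test
// alternative discussed here attaches the property to each test instead:
const t = async_test(() => {}, "stream errors propagate", { timeout: 150 });
// t.timeout now carries the per-test value rather than the harness default.
```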
(In reply to comment #2)
> I suggest we land this because it is needed for our bots and in parallel
> I'll try to patch testharness upstream.

The latest committed Streams API patch failed some tests on slower bots because the timeout value was too low. Hence the idea to increase it.
Comment on attachment 250787 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=250787&action=review

> LayoutTests/resources/testharnessreport.js:16
> +

I would not add this here. Maybe we will need something like this if more WebKit testharness tests appear. At the moment, it might be better to add it in each of the 7 test files we have. That duplicates 7 lines, but in the future, if a specific test file is failing, we could then increase the timeout per test or per test file without increasing it for the other files.
I don't think that 150ms ever makes sense in a regression test, unless we retry when the timeout passes. It is a very short period of time; a test can legitimately stall for seconds.
(In reply to comment #5)
> I don't think that 150ms ever makes sense in a regression test, unless we
> retry when the timeout passes. It is a very short period of time, a test can
> legitimately stall for seconds.

I don't know if you can do that with testharness, at least automatically. The only solution I can think of is splitting all the tests into single files and letting the WebKit test runner handle the timeouts, but we are talking about files with a lot of tests, more than 20 each.
(In reply to comment #6)
> (In reply to comment #5)
> > I don't think that 150ms ever makes sense in a regression test, unless we
> > retry when the timeout passes. It is a very short period of time, a test can
> > legitimately stall for seconds.
>
> I don't know if you can do that with testharness, at least automatically.
> The only solution I can think of is splitting all tests in single files and
> letting the webkit test runner handle the timeouts, but we are speaking of
> files with a lot of test, more than 20.

There are 2 timeouts: the WK test runner one and the testharness one. We want the testharness ones to be short enough that the test runner timeout is not hit. We could set the testharness timeout per file depending on the number of timing-out tests in the file: the fewer timing-out tests in a file, the longer the testharness timeout value can be.
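The budgeting idea above can be sketched as follows. harnessTimeoutFor, the margin, and the numbers are all hypothetical, chosen for illustration, and are not part of WebKit's actual tooling:

```javascript
// Hypothetical helper (all names and values are assumptions): pick a
// per-file testharness timeout small enough that even if every timing-out
// test in the file burns its whole budget, the outer WebKitTestRunner
// timeout is never reached, so we still get per-test output.
function harnessTimeoutFor(runnerTimeoutMs, timingOutTestCount, marginMs = 1000) {
  const budget = runnerTimeoutMs - marginMs; // leave room for result reporting
  if (timingOutTestCount === 0) {
    return budget; // nothing expected to time out; anything under budget works
  }
  return Math.floor(budget / timingOutTestCount);
}

// A file with 4 timing-out tests under a 30s runner timeout must use a much
// shorter harness timeout than a file with only 1 such test.
const crowdedFile = harnessTimeoutFor(30000, 4);
const quietFile = harnessTimeoutFor(30000, 1);
```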
Comment on attachment 250787 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=250787&action=review

> LayoutTests/ChangeLog:23
> + * resources/testharnessreport.js: Added default property variables
> + for tests including a timeout of 150ms.

If any timeout is short (shorter than, say, 5 seconds), then tests will be flaky. That's not OK.

Please find a way to test without introducing flakiness. This may mean finding a different way to structure the tests.
For the moment, we could just comment out the failing tests. We would uncomment them as we add features, and there would be no more need to set specific timeouts.
(In reply to comment #8)
> Comment on attachment 250787 [details]
> Patch
>
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=250787&action=review
>
> > LayoutTests/ChangeLog:23
> > + * resources/testharnessreport.js: Added default property variables
> > + for tests including a timeout of 150ms.
>
> If any timeout is short (shorter than, say, 5 seconds), then tests will be
> flaky. That's not OK.
>
> Please find a way to test without introducing flakiness. This may mean
> finding a different way to structure the tests.

I understand that it is not OK. The issue here is that we have a bunch of tests coming from the spec, with tests related to the same matter placed in the same file. What we did was translate all those tests from tape to testharness (quick and painless), but we kept the file mapping so that they are easier to maintain. The key here is that a file can contain several tests, so as Youenn said, we have two timeouts: testharness and WebKitTestRunner. We can do several things here:

1. Remove (or comment out) the failing tests from the files. This is hard to maintain because for every patch we want to submit we would need to re-add or uncomment the harness tests to see if they pass.

2. Create a 1-1 relation between harness tests and WebKit tests. This would be hell to maintain because we would need to use either random filenames or name the files after the test description, which is a plain string. Every time a test is changed in the spec we would need to rename our test, and that leads to tracing problems, etc.

3. Remove all timeouts. This causes testharness timeouts to trigger WebKit timeouts, which renders the fine-graining of the testharness tests useless, as we don't get their output.

The first two reduce flakiness, but they have severe issues that make our lives much more complicated.
Another solution that introduces some flakiness (which is under control right now) is keeping things as we have them now, but setting that timeout for the testharness tests (we won't touch WebKit's).
It is not OK to introduce any known flaky tests. Each failure has a cost, as bot watchers see a red bubble and need to investigate why it's red.
Flaky tests are painful. Let's comment out the timing-out tests for the moment and remove the specific test timeouts.

If no one objects, we will land these changes tomorrow as an unreviewed patch.
(In reply to comment #12)
> Flaky tests are painful.
> Let's comment out timeouting tests for the moment and remove specific test
> timeouts.
>
> If no one objects, we will land these changes tomorrow as an unreviewed
> patch.

As a first step, let's remove all timeout values from testharness tests, except for individual tests that actually time out. That way, we still ensure that these tests do not crash. If that solves the flakiness issues, we are good. Otherwise, let's comment out the flaky tests and re-add them as the features get added.