WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
NEW
228429
[run-webkit-tests] Run largest shards first
https://bugs.webkit.org/show_bug.cgi?id=228429
Summary
[run-webkit-tests] Run largest shards first
Jonathan Bedard
Reported
2021-07-27 15:16:16 PDT
Alexey pointed this out a few weeks ago, and it's particularly obvious with webgl tests. Essentially, because we run tests in alphabetical order, any given invocation of run-webkit-tests will be inefficiently running towards the end, because there will be a larger shard running on a single process. The solution to this is to run the largest shards first. The potential problem with this is that we're sacrificing predictable test order, which may have an effect on flakiness. I think this trade-off will be worth it.
Attachments
Patch
(1.39 KB, patch)
2021-07-27 15:20 PDT
,
Jonathan Bedard
no flags
Details
Formatted Diff
Diff
Patch
(3.75 KB, patch)
2021-07-27 16:21 PDT
,
Jonathan Bedard
no flags
Details
Formatted Diff
Diff
Patch
(3.74 KB, patch)
2021-07-28 19:08 PDT
,
Jonathan Bedard
no flags
Details
Formatted Diff
Diff
Patch
(2.00 KB, patch)
2022-05-25 15:58 PDT
,
Jonathan Bedard
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Radar WebKit Bug Importer
Comment 1
2021-07-27 15:16:46 PDT
<
rdar://problem/81184667
>
Jonathan Bedard
Comment 2
2021-07-27 15:20:56 PDT
Created
attachment 434344
[details]
Patch
Jonathan Bedard
Comment 3
2021-07-27 16:21:54 PDT
Created
attachment 434382
[details]
Patch
Jonathan Bedard
Comment 4
2021-07-27 16:56:05 PDT
Let's wait for EWS before officially landing this, since we know it could effect flakiness, this will at least help avoid anything obvious.
Alexey Proskuryakov
Comment 5
2021-07-27 17:44:49 PDT
It looks like build 12531 (for this patch) was slower than others on this bot? So not sure if this is a progression or a regression. Perhaps test run time has a huge variance. I didn't look at the logs to dig deeply.
https://ews-build.webkit.org/#/workers/56
Jonathan Bedard
Comment 6
2021-07-28 08:05:33 PDT
The more concerning thing is that we surfaced a debug crash by changing test ordering. That being said, looking through our test runs, it looks to be a 10-15% improvement in most cases, layout tests just have so much variance it's hard to tell.
Jonathan Bedard
Comment 7
2021-07-28 17:43:26 PDT
There aren't any easy answers to this, the two crashes are pretty difficult to reproduce, but EWS is able to reliably do so, and it's a good bet infrastructure will as well. We need to resolve these crashes (and maybe a few others) before landing.
Jonathan Bedard
Comment 8
2021-07-28 19:08:06 PDT
Created
attachment 434490
[details]
Patch
Sam Sneddon [:gsnedders]
Comment 9
2021-07-30 06:03:09 PDT
(In reply to Jonathan Bedard from
comment #7
)
> There aren't any easy answers to this, the two crashes are pretty difficult > to reproduce, but EWS is able to reliably do so, and it's a good bet > infrastructure will as well. We need to resolve these crashes (and maybe a > few others) before landing.
I'm only seeing one crash in WK2 in all the patches? The WK1 crash seems to have affected main as well. But needless to say, the WK2 crash is
bug 224085
; note that other fullscreen tests have been skipped as a result of it. In general: I'm concerned that changing the order that the shards are run effects test results, as it implies little in the way of isolation between the different shards, and any bad state in one shard can end up affecting other shards. That said, some of the problem here is that we allocate all the shards to workers ahead-of-time, instead of (e.g.) reading from a Queue or having any sort of work-stealing, hence we can still end up with imbalances in worker execution time even after this patch.
Jonathan Bedard
Comment 10
2021-08-05 15:48:33 PDT
(In reply to Sam Sneddon [:gsnedders] from
comment #9
)
> (In reply to Jonathan Bedard from
comment #7
) > > There aren't any easy answers to this, the two crashes are pretty difficult > > to reproduce, but EWS is able to reliably do so, and it's a good bet > > infrastructure will as well. We need to resolve these crashes (and maybe a > > few others) before landing. > > I'm only seeing one crash in WK2 in all the patches? The WK1 crash seems to > have affected main as well. But needless to say, the WK2 crash is bug > 224085; note that other fullscreen tests have been skipped as a result of it. > > In general: I'm concerned that changing the order that the shards are run > effects test results, as it implies little in the way of isolation between > the different shards, and any bad state in one shard can end up affecting > other shards. > > That said, some of the problem here is that we allocate all the shards to > workers ahead-of-time, instead of (e.g.) reading from a Queue or having any > sort of work-stealing, hence we can still end up with imbalances in worker > execution time even after this patch.
EWS retries failed tests individually, the flaky failures in something like
https://ews-build.webkit.org/#/builders/59/builds/12630
are still of concern. Also, the TaskPool code doesn't allocate the shards to the workers ahead of time, we are using a Queue that's shared between each process. You will always get some imbalances at the end of the test run, but this patch should minimize them.
Jonathan Bedard
Comment 11
2022-05-25 15:58:06 PDT
Created
attachment 459772
[details]
Patch
Sam Sneddon [:gsnedders]
Comment 12
2022-05-25 22:30:52 PDT
Comment on
attachment 459772
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=459772&action=review
I don't know what your plans are here (and this isn't r? anyway), but…
> Tools/Scripts/webkitpy/layout_tests/controllers/layout_test_runner.py:568 > - shard = TestShard(directory, test_inputs) > - shards.append(shard) > + while len(test_inputs) > MAX_SHARD_SIZE: > + shards.append(TestShard(directory, test_inputs[:MAX_SHARD_SIZE])) > + test_inputs = test_inputs[MAX_SHARD_SIZE:] > + shards.append(TestShard(directory, test_inputs))
I'd rather we didn't change this in the same patch. That said, simple numeric splits have the risk of things moving between shards too often (i.e., adding a new test early on in a shard will change every split), but let's take this design discussion to a new/different bug?
Jonathan Bedard
Comment 13
2022-05-27 15:08:49 PDT
Comment on
attachment 459772
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=459772&action=review
>> Tools/Scripts/webkitpy/layout_tests/controllers/layout_test_runner.py:568 >> + shards.append(TestShard(directory, test_inputs)) > > I'd rather we didn't change this in the same patch. That said, simple numeric splits have the risk of things moving between shards too often (i.e., adding a new test early on in a shard will change every split), but let's take this design discussion to a new/different bug?
Didn't mark this as review because this was just posted for testing, no intention of landing this change as is.
Jonathan Bedard
Comment 14
2022-12-21 14:23:19 PST
Pull request:
https://github.com/WebKit/WebKit/pull/7979
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug