RESOLVED FIXED 314101
Rework ParkingLot spinloop to reduce scheduler thrashing
https://bugs.webkit.org/show_bug.cgi?id=314101
Marcus Plutowski
Reported 2026-05-05 09:38:01 PDT
rdar://176237718

After failing to acquire a contended lock, a thread will optimistically spin for a little while before parking itself on the ParkingLot. This is beneficial to performance because very often the lock will be freed before the waiting thread needs to park. However, at present this loop calls sched_yield 40 times in a row. This causes two problems:

1. It reduces the “handoff quality” of the lock, as any thread acquiring the lock is ~guaranteed to resume execution with a depressed priority.
2. It causes churn in the scheduler, especially when a large number of threads are all attempting to yield at once.

Experimentally, replacing the sched_yield calls with an equal amount of time spent spinning on an appropriate nop has broad benefits across a number of platforms, but especially for A-series chips. Any test with significant GC activity benefits, as do tests with light contention (where threads mostly don’t make it all the way through the parking loop).

However, a small number of benchmarks, particularly JS3’s `bomb-workers`, are contended heavily enough that a large number of lock-acquisition attempts will park regardless of time-to-park, meaning the entire duration of the spinloop is executed on-core. In this scenario, sched_yield has the salubrious effect of reducing waste by allowing the scheduler to schedule this work on e.g. an e-core, reducing the impact. Entirely removing the sched_yield leads to upwards of a 40% regression on that subtest in particular, wiping out the benefits.

The simple approach is to optimistically assume that, of those lock acquisitions which *do* succeed during the spinloop period, most will succeed early on. As such, we can avoid calling sched_yield until some number of iterations have passed (again, replacing it with a suitable nop-loop).
Thereafter, the loop only needs to call sched_yield once every few iterations, since the priority depression lasts for some time. Again optimistically, we want to minimize the depression duration that remains if the thread does succeed in acquiring the lock before the next sched_yield call.
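
The hybrid strategy described above might be sketched roughly as follows. This is an illustration under stated assumptions, not the patch that landed: `SpinConfig`, `SpinStats`, `cpuRelax`, and `trySpinAcquire` are hypothetical names, the lock is reduced to a bare `std::atomic<bool>`, and every threshold except the total of 40 attempts (taken from the description) is a made-up placeholder.

```cpp
#include <atomic>
#include <cassert>
#include <sched.h>

// Hypothetical tuning knobs. Only spinLimit (40) comes from the report;
// the other values are placeholders chosen for illustration.
struct SpinConfig {
    unsigned spinLimit = 40;         // total optimistic attempts before parking
    unsigned yieldFreeAttempts = 10; // assumed: early attempts that never yield
    unsigned yieldInterval = 4;      // assumed: yield once per this many later attempts
    unsigned pauseCount = 64;        // assumed: length of the nop-loop per attempt
};

// Counters so the behaviour of the loop can be observed.
struct SpinStats {
    unsigned attempts = 0;
    unsigned yields = 0;
};

// One "appropriate nop". Which instruction is best is platform-specific;
// the choices here are assumptions.
inline void cpuRelax()
{
#if defined(__aarch64__)
    __asm__ volatile("yield" ::: "memory");
#elif defined(__x86_64__) || defined(__i386__)
    __asm__ volatile("pause" ::: "memory");
#else
    std::atomic_signal_fence(std::memory_order_seq_cst);
#endif
}

// Returns true if the lock was acquired while spinning. A false return
// means the caller would go on to park the thread.
bool trySpinAcquire(std::atomic<bool>& locked, const SpinConfig& config, SpinStats& stats)
{
    assert(config.yieldInterval > 0);
    for (unsigned i = 0; i < config.spinLimit; ++i) {
        ++stats.attempts;
        if (!locked.exchange(true, std::memory_order_acquire))
            return true;

        // Early on, stay on-core. Later, yield only on every
        // yieldInterval-th attempt and nop-spin on the others.
        bool shouldYield = i >= config.yieldFreeAttempts
            && (i - config.yieldFreeAttempts) % config.yieldInterval == 0;
        if (shouldYield) {
            ++stats.yields;
            sched_yield();
        } else {
            for (unsigned j = 0; j < config.pauseCount; ++j)
                cpuRelax();
        }
    }
    return false;
}
```

With these placeholder values, a thread that takes the lock on its first attempt never calls sched_yield, and one that exhausts all 40 attempts yields 8 times rather than 40. Real thresholds would have to be tuned against both the lightly contended tests and heavily contended ones such as `bomb-workers`.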
Marcus Plutowski
Comment 1 2026-05-05 09:52:49 PDT
EWS
Comment 2 2026-05-06 11:04:12 PDT
Committed 312720@main (697f0265b8f6): <https://commits.webkit.org/312720@main> Reviewed commits have been landed. Closing PR #64278 and removing active labels.
WebKit Commit Bot
Comment 3 2026-05-06 17:52:26 PDT
Re-opened since this is blocked by bug 314267
Marcus Plutowski
Comment 4 2026-05-07 10:36:17 PDT
EWS
Comment 5 2026-05-11 20:21:23 PDT
Committed 313051@main (d17da45094f8): <https://commits.webkit.org/313051@main> Reviewed commits have been landed. Closing PR #64477 and removing active labels.