Reproducing steps: 1. Open https://hajimehoshi.github.io/sandbox/worker/ 2. See the numbers for a while There are two counters. The upper is counted by the main frame and the lower is counted by a dedicated worker. 3. Click 'Google.com' 4. Stay for a while (10 seconds or so) 5. Go back to the original page. The worker counter is increased for the time duration of being in google.com. Source code: https://github.com/hajimehoshi/sandbox/tree/main/worker
The issue is that this worker thread is busy looping and thus never processing the message from the main thread asking it to suspend.
Maybe the suspend message should kill the worker if there's no timely reply.
(In reply to Geoffrey Garen from comment #2) > Maybe the suspend message should kill the worker if there's no timely reply. We could do that. I was thinking about calling WTF::Thread::suspend() instead of posting a message to the thread. Any opposition to that?
Would it suspend threads holding mutexes, and freeze the main thread as a consequence?
(In reply to Alexey Proskuryakov from comment #4) > Would it suspend threads holding mutexes, and freeze the main thread as a > consequence? You're right, I think that would be a risk indeed and very likely the reason why our suspension logic currently works the way it does. I guess timeout + kill is the way to go then.
In my suggestion to "kill the worker", I had assumed this was a ServiceWorker. But I see now that it's just a Worker. It's an interesting edge case because there's no plain definition of "kill". Should we exit the process and possibly take down other pages with us? I wonder if the right behavior is just to wait synchronously for the worker to terminate. Then, if the worker is hung, you get the same behavior you would have gotten if the main thread were hung. In practice, that might often trigger some kind of watchdog kill. Or something else. It seems to me that modeling this like a hang of the main thread might be the most accurate thing, if that's possible.
There are other cases where dedicated workers are supposed to be paused or terminated (see https://html.spec.whatwg.org/multipage/workers.html#terminate-a-worker and around). E.g. I don't think that we are allowed to let it spin once it's no longer observable in an active page either. I think that it's mostly implemented in WebKit, but have been out of the loop for too long to point it out.
<rdar://problem/86287415>
(In reply to Chris Dumez from comment #5) > (In reply to Alexey Proskuryakov from comment #4) > > Would it suspend threads holding mutexes, and freeze the main thread as a > > consequence? > > You're right, I think that would be a risk indeed and very likely the reason > why our suspension logic currently works the way it does. > > I guess timeout + kill is the way to go then. Yes. One example is that malloc can have a global mutex for the slow path (e.g. stealing a page from the global heap instead of thread local heap). If we suspend a thread holding that mutex, most likely, soon, the other threads will be blocked once malloc happens.
(In reply to Yusuke Suzuki from comment #9) > (In reply to Chris Dumez from comment #5) > > (In reply to Alexey Proskuryakov from comment #4) > > > Would it suspend threads holding mutexes, and freeze the main thread as a > > > consequence? > > > > You're right, I think that would be a risk indeed and very likely the reason > > why our suspension logic currently works the way it does. > > > > I guess timeout + kill is the way to go then. > > Yes. One example is that malloc can have a global mutex for the slow path > (e.g. stealing a page from the global heap instead of thread local heap). If > we suspend a thread holding that mutex, most likely, soon, the other threads > will be blocked once malloc happens. We can use a mechanism similar to global GC's stop thread mechanism (doesn't exist yet) and stop the target thread at a GC consistent point. Doing so ensure that the target thread won't be in the middle of any malloc operation. That said, it doesn't guarantee yet that there won't be other locks that could be held. If suspension is something we think we need, I can consider it in my GC work. I think with a some work, it is achievable.
Technically the web standard makes reference to a behavior much more like JSC::Watchdog / JSC::VM::throwTerminationException(). For this case, I think just waiting for termination, combined with our process model, will get the job done. For the more challenging case of explicitly calling Worker.terminate() in the middle of an infinite loop, I guess we'll eventually need to adopt / add support for JSC::VM::throwTerminationException() in Worker scripts.
(In reply to Geoffrey Garen from comment #11) > Technically the web standard makes reference to a behavior much more like > JSC::Watchdog / JSC::VM::throwTerminationException(). Exactly what I've been vaguely recalling, yes - see WorkerOrWorkletScriptController::scheduleExecutionTermination().
(In reply to Geoffrey Garen from comment #11) > For the more challenging case of explicitly calling Worker.terminate() in > the middle of an infinite loop, I guess we'll eventually need to adopt / add > support for JSC::VM::throwTerminationException() in Worker scripts. We already do: Worker.terminate() uses the VMTraps mechanism, which allows a worker to terminate while in an infinite loop.
Maybe it's a simple fix just to use VMTraps for shutdown too?