Bug 164305 - Web Inspector: Worker debugging should pause all targets and view call frames in all targets
Summary: Web Inspector: Worker debugging should pause all targets and view call frames...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Inspector (show other bugs)
Version: WebKit Nightly Build
Hardware: All All
: P2 Normal
Assignee: Joseph Pecoraro
URL:
Keywords: InRadar
Depends on: 161951
Blocks:
  Show dependency treegraph
 
Reported: 2016-11-01 17:34 PDT by Joseph Pecoraro
Modified: 2016-11-14 20:04 PST (History)
18 users (show)

See Also:


Attachments
[PATCH] Proposed Fix (109.75 KB, patch)
2016-11-11 23:15 PST, Joseph Pecoraro
buildbot: commit-queue-
Details | Formatted Diff | Diff
[IMAGE] Threads UI (302.95 KB, image/png)
2016-11-11 23:15 PST, Joseph Pecoraro
no flags Details
[IMAGE] Threads GTK UI (360.49 KB, image/png)
2016-11-11 23:16 PST, Joseph Pecoraro
no flags Details
[IMAGE] Context Menu - "Resume Thread" (77.76 KB, image/png)
2016-11-11 23:16 PST, Joseph Pecoraro
no flags Details
Archive of layout-test-results from ews104 for mac-yosemite-wk2 (1.26 MB, application/zip)
2016-11-11 23:58 PST, Build Bot
no flags Details
[PATCH] Proposed Fix (110.89 KB, patch)
2016-11-12 00:01 PST, Joseph Pecoraro
timothy: review+
buildbot: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ews101 for mac-yosemite (986.26 KB, application/zip)
2016-11-14 14:28 PST, Build Bot
no flags Details
Archive of layout-test-results from ews114 for mac-yosemite (1.66 MB, application/zip)
2016-11-14 14:43 PST, Build Bot
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Joseph Pecoraro 2016-11-01 17:34:59 PDT
Summary:
Worker debugging should pause all targets and view call frames in all targets

Steps to Reproduce:
1. Inspect a page with multiple workers performing work
2. Show the Debugger tab in inspector
3. Click the "Pause" button
  => Expect to pause and see call frames in page and worker targets ("threads")
Comment 1 Radar WebKit Bug Importer 2016-11-01 17:35:35 PDT
<rdar://problem/29056192>
Comment 2 Joseph Pecoraro 2016-11-02 14:31:30 PDT
A fundamental problem this needs to solve is that each target will send its own Debugger.paused and Debugger.resumed events. So again we have per-Target state that we need to track.

I see the object relationships as:

    DebuggerManager
      - "paused" computed state
        - true if any Target is paused, false otherwise

    Target DebuggerData
      - "paused"
        - true if Target is paused, false otherwise

We have to handle multiple Targets pausing in different ways. For the descriptions below you can think of "Targets" as "Threads". "Main Page" and "Worker" targets are basically the same as the "Main Thread" and "Worker Thread".

For example in this test case:

    // worker.js
    let before = 1;
    debugger;
    let after = 2;

    // main.html
    <script>
    let worker1 = new Worker("worker.js");
    let worker2 = new Worker("worker.js")
    </script>

The Main page will spawn 2 workers, and when each runs (async, and not tied to the main page) it will pause. So the frontend will receive two independent Debugger.paused events from each of the different targets.

How we should handle this ties into the policy of how we should handle any single target pausing (e.g. hitting a breakpoint or debugger statement), and how we should handle the user clicking "Pause" in the UI:

There are a few policies we can do for pausing multiple contexts:

    (A) Pause all targets
      - This is like pausing all threads in traditional native app debuggers (lldb, Xcode, ...)

    (B) Leave other targets unaffected
      - In this scenario you could be paused and step through a Worker while the page can still run

Ultimately we could implement both policies behind a Setting. I do see value in both behaviors. It seems different browsers implement one of each of these policies whether or not their UI indicates it or not. Currently WebKit implements something like (B) but lacking the UI indicating that other targets are paused, and if multiple targets pause we break.

I'm considering Policy (A) right now because I think it most closely matches what users will expect.

Implementing the policy:

    • When not paused and a Target pauses => pause all targets
      - DebuggerManager receives the first Debugger.paused event from Target 1
        - DebuggerManager goes from paused=false => paused=true
        - DebuggerManager requests immediate pause in all other Targets
        - Either immediately or once all targets are paused, update frontend UI to paused UI
        - The default activeCallFrame/sidebar display should be from the first Target that paused (Target 1)

    • When not paused and user requests immediate pause => pause all targets
      - User clicks the UI's pause button
        - This button is only available when all Targets are unpaused
        - DebuggerManager requests immediate pause in all Targets
        - Either immediately or once all targets are paused, update frontend UI to paused UI
        - The default activeCallFrame/sidebar display should be from a Target of our choosing
          - We should probably default to the Page, or the last Target we had selected

    • When paused and a Target pauses
      - Update this target's pause data.
      - DebuggerManager may wish to act on this information if this means "All targets are now paused"

    • When paused - Step-in, Step-out, Step-out
      - Perform these actions on just the activeCallFrame
      - (‡) Problem: What if this resumes this target and exits the VM? Should this be a "resume everyone" scenario?

    • When paused - Resume
      - Resume all targets

The frontend changes should be:

    • Target DebuggerData have "unpaused" / "paused" / "pause pending" state
    • DebuggerManager behaviors mentioned above
    • Debugger sidebar shows per-target DebuggerData based on its state:
      - "unpaused" => Show pausing message
      - "pause pending" => Show pausing message / Idle message
      - "paused" => Show call frames

The backend changes should be:

    • To solve (‡), add a Debugger event along the lines of "we were paused, you asked to step, and we completed the current task"
      - E.g. "Debugger.reallyResumed", "Debugger.idle", "Debugger.completedRunLoop"
      - This should also let us solve other known issues (LayoutTests, update Pause Reason more accurately)

In Policy (A) there is one sub policy decision we need to consider on what the Debugger's Pause button means. It could mean "Pause Immediately even if  Targets are Idle", or "Pause on Next JavaScript that runs" (which is what it currently means).

Essentially, this comes down to what does the "Debugger.pause" command mean. Should it be split up if we want to support different behaviors (Debugger.pauseImmediately, Debugger.pauseOnNextStatement). And if we support pausing when idle, what change in events do we send to the frontend.

    (X) Add a backend concept of pausing immediately - Pause a Target when it is not running JavaScript
      - New backend logic. Either in the handling of Debugger.pause or a new Debugger command
      - New Debugger event (Debugger.idle) or reuse Debugger.paused with no call frames.
      - Frontend must expect some response to Debugger.pause that is either "Paused with call frames, or Paused without call frames"
      - Backend runs nested runloop even though it is not running JavaScript
      - Backend waits for Debugger.resume to resume

    (Y) Add a frontend concept of pausing immediately - Target is "Pending Pause" and it will Pause as soon as JavaScript runs
      - New per-target frontend state "pause pending"
      - In this state we have sent Debugger.pause but have not yet received Debugger.paused
      - Frontend can receive Debugger.paused as soon as this target tries to run javascript and can choose to update its call stack or not
      - Frontend can resume this target without it ever having actually paused by sending Debugger.resume
      - Problem: No way to determine if we are "Idle" or just haven't paused yet

I'm liking (Y) right now. Ultimately we will either need to add the ability to pause a target in an infinite loop which will have a backend impact, but until then we can just assume that target is "Idle" or "Pending Pause" as we can't pause in whatever JavaScript is running.

--

I will look into implementing (A) and (Y).
Comment 3 BJ Burg 2016-11-07 11:22:02 PST
* Regarding

- (‡) Problem: What if this resumes this target and exits the VM? Should this be a "resume everyone" scenario?

This is never really a scenario in Xcode because there's always a few frames in the call stack below the event loop, no matter what.


* Regarding A vs B, A seems much easier to use and less confusing, though it does make it hard/impossible to debug through both sides of a postMessage via stepping.


* Regarding X vs Y, I prefer being able to pause a context while it's idle and see that it's idle.
 - It's consistent with what you can do in Xcode (stepping mode never exits via stepping).
 - It solves the "I've stepped past my last call frame" problem.
 - As for the UI, I've been in favor of adding fake call frames for VM entry points anyway (timer fired, DOM event dispatched, etc). So in the UI this would show as "Event Loop (idle)" or something like that.

It does seem a little harder to implement, unfortunately. But we could implement both variants and make one of them happen on option-click. I do think that using PauseOnNextStatement as the default will make option A effectively become option B in practice if the worker is blocking on a message receive.
Comment 4 Joseph Pecoraro 2016-11-07 13:57:40 PST
> * Regarding A vs B, A seems much easier to use and less confusing, though it
> does make it hard/impossible to debug through both sides of a postMessage
> via stepping.

That was exactly my concern. If you really want "Pause Thread 1 here" and "Pause Thread 2 here" then you'd need (B). I think it would be reasonable to have a Setting to allow this behavior. Once I get multi-target pausing working in general this wouldn't be hard to do.
Comment 5 Joseph Pecoraro 2016-11-08 13:49:33 PST
> The backend changes should be:
> 
>     • To solve (‡), add a Debugger event along the lines of "we were paused,
> you asked to step, and we completed the current task"
>       - E.g. "Debugger.reallyResumed", "Debugger.idle",
> "Debugger.completedRunLoop"
>       - This should also let us solve other known issues (LayoutTests,
> update Pause Reason more accurately)

I'm addressing this in bug 161951.

The approach I'm taking is actually simpler and doesn't involve creating new events:

    - Debugger.step* WILL produce EITHER a Debugger.paused or Debugger.resumed event.
    - Debugger.continueToLocation behaves like stepping.
    - Debugger.resume WILL produce a Debugger.resumed event.

This makes the contract simpler, and avoids any ambiguity on the frontend. The frontend will now only receive Debugger.resumed events when it actually was resumed, instead of every time we are stepping when it is ambiguous. Lets see what reviewers think!
Comment 6 Joseph Pecoraro 2016-11-11 23:15:27 PST
Created attachment 294596 [details]
[PATCH] Proposed Fix
Comment 7 Joseph Pecoraro 2016-11-11 23:15:48 PST
Created attachment 294597 [details]
[IMAGE] Threads UI
Comment 8 Joseph Pecoraro 2016-11-11 23:16:06 PST
Created attachment 294598 [details]
[IMAGE] Threads GTK UI
Comment 9 Joseph Pecoraro 2016-11-11 23:16:56 PST
Created attachment 294599 [details]
[IMAGE] Context Menu - "Resume Thread"
Comment 10 Build Bot 2016-11-11 23:58:43 PST
Comment on attachment 294596 [details]
[PATCH] Proposed Fix

Attachment 294596 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/2502179

New failing tests:
inspector/unit-tests/target-manager.html
Comment 11 Build Bot 2016-11-11 23:58:48 PST
Created attachment 294604 [details]
Archive of layout-test-results from ews104 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 12 Joseph Pecoraro 2016-11-12 00:00:43 PST
> New failing tests:
> inspector/unit-tests/target-manager.html

Oops, trivial fix.
Comment 13 Joseph Pecoraro 2016-11-12 00:01:52 PST
Created attachment 294605 [details]
[PATCH] Proposed Fix
Comment 14 Timothy Hatcher 2016-11-14 13:35:56 PST
Comment on attachment 294605 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=294605&action=review

Very clean, nice!

> Source/WebInspectorUI/UserInterface/Controllers/DebuggerManager.js:605
> +        // FIXME: Should this be done on the backend?

Yeah, I think so. I worry about races being run into with site code. If a worker paused, then it gets here, and by the time we are here the page thread has hit breakpoint or exception, etc to trigger another pause. Pausing in the worker should give the developer time to prevent all other workers or the page from doing anything.
Comment 15 Build Bot 2016-11-14 14:28:21 PST
Comment on attachment 294605 [details]
[PATCH] Proposed Fix

Attachment 294605 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/2515745

New failing tests:
inspector/worker/debugger-pause.html
Comment 16 Build Bot 2016-11-14 14:28:26 PST
Created attachment 294749 [details]
Archive of layout-test-results from ews101 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews101  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 17 Build Bot 2016-11-14 14:43:30 PST
Comment on attachment 294605 [details]
[PATCH] Proposed Fix

Attachment 294605 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/2515808

New failing tests:
inspector/worker/debugger-pause.html
inspector/worker/debugger-multiple-targets-pause.html
Comment 18 Build Bot 2016-11-14 14:43:35 PST
Created attachment 294752 [details]
Archive of layout-test-results from ews114 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews114  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 19 Joseph Pecoraro 2016-11-14 16:58:48 PST
Comment on attachment 294605 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=294605&action=review

> Source/WebInspectorUI/UserInterface/Views/NavigationSidebarPanel.js:-709
> -            || treeElement instanceof WebInspector.ApplicationCacheManifestTreeElement

Err, this shouldn't have removed ApplicationCacheManifestTreeElement. I'm going to add that back.
Comment 20 Joseph Pecoraro 2016-11-14 17:12:03 PST
(In reply to comment #17)
> Comment on attachment 294605 [details]
> [PATCH] Proposed Fix
> 
> Attachment 294605 [details] did not pass mac-debug-ews (mac):
> Output: http://webkit-queues.webkit.org/results/2515808
> 
> New failing tests:
> inspector/worker/debugger-multiple-targets-pause.html

This test appears to fail on the bots because things are slow.

  • The test triggers 2 worker creation and schedules work to do in 100ms
  • By the time the first worker pauses and attempts to cause the other threads to pause they have apparently already done their work because they do not pause.

Due to the nature of this test (we want to cause background threads to pause naturally) it will be tricky to do things like this. I'm going to try to update the delay to 250ms. This is very unfortunate, but I'm not sure of a better solution.

I can have the page trigger an event when it executes its work, and if that happened before we tried to pause we should be able to fail the test early.
Comment 21 Joseph Pecoraro 2016-11-14 19:58:44 PST
I also rewrote the patch to be a little bit less timing specific. I create the works separately from triggering the work. This made it far more consistent on my machine in Debug builds.
Comment 22 Joseph Pecoraro 2016-11-14 20:03:30 PST
<https://trac.webkit.org/changeset/208725>
Comment 23 Joseph Pecoraro 2016-11-14 20:04:08 PST
(In reply to comment #21)
> I also rewrote the patch

Err, I only rewrote the test heh.