RESOLVED FIXED 188426
JSRunLoopTimer may run part of a member function after it's destroyed
https://bugs.webkit.org/show_bug.cgi?id=188426
Summary JSRunLoopTimer may run part of a member function after it's destroyed
Saam Barati
Reported 2018-08-08 18:33:18 PDT
Let's consider this function: ``` void JSRunLoopTimer::timerDidFire() { JSLock* apiLock = m_apiLock.get(); if (!apiLock) { // Likely a buggy usage: the timer fired while JSRunLoopTimer was being destroyed. return; } // HERE std::lock_guard<JSLock> lock(*apiLock); RefPtr<VM> vm = apiLock->vm(); if (!vm) { // The VM has been destroyed, so we should just give up. return; } doWork(); } ``` Look at the comment "HERE" above. Let's say the thread gets context switched out for X seconds at that point. And let's say that during that X seconds, the VM is destroyed. Transitively, the VM owns this JSRunLoopTimer (e.g, GCActivityCallback). If that happens, we're now accessing fields on a dead object. (An easy way to reproduce this bug is to just drop a sleep at the "HERE" comment).
Attachments
patch (20.75 KB, patch)
2018-08-09 18:28 PDT, Saam Barati
ews-watchlist: commit-queue-
Archive of layout-test-results from ews103 for mac-sierra (2.62 MB, application/zip)
2018-08-09 19:18 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews107 for mac-sierra-wk2 (3.05 MB, application/zip)
2018-08-09 19:46 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews117 for mac-sierra (3.47 MB, application/zip)
2018-08-09 20:07 PDT, EWS Watchlist
no flags
patch (23.19 KB, patch)
2018-08-09 21:49 PDT, Saam Barati
ews-watchlist: commit-queue-
Archive of layout-test-results from ews101 for mac-sierra (2.42 MB, application/zip)
2018-08-09 22:58 PDT, EWS Watchlist
no flags
patch (23.57 KB, patch)
2018-08-09 23:07 PDT, Saam Barati
no flags
patch (23.54 KB, patch)
2018-08-09 23:10 PDT, Saam Barati
no flags
patch (23.57 KB, patch)
2018-08-09 23:14 PDT, Saam Barati
no flags
patch (26.28 KB, patch)
2018-08-10 14:40 PDT, Saam Barati
saam: review-
WIP (43.89 KB, patch)
2018-08-12 22:56 PDT, Saam Barati
ews-watchlist: commit-queue-
Archive of layout-test-results from ews101 for mac-sierra (5.17 MB, application/zip)
2018-08-13 00:14 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews107 for mac-sierra-wk2 (4.99 MB, application/zip)
2018-08-13 00:49 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews126 for ios-simulator-wk2 (4.70 MB, application/zip)
2018-08-13 01:06 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews125 for ios-simulator-wk2 (4.42 MB, application/zip)
2018-08-13 03:11 PDT, EWS Watchlist
no flags
WIP (43.96 KB, patch)
2018-08-13 11:46 PDT, Saam Barati
ews-watchlist: commit-queue-
Archive of layout-test-results from ews103 for mac-sierra (2.29 MB, application/zip)
2018-08-13 13:06 PDT, EWS Watchlist
no flags
WIP (44.22 KB, patch)
2018-08-13 13:07 PDT, Saam Barati
no flags
WIP (44.24 KB, patch)
2018-08-13 13:16 PDT, Saam Barati
no flags
WIP (44.22 KB, patch)
2018-08-13 13:22 PDT, Saam Barati
no flags
WIP (72.19 MB, patch)
2018-08-13 17:27 PDT, Saam Barati
no flags
patch (50.82 KB, patch)
2018-08-13 23:01 PDT, Saam Barati
no flags
patch (51.75 KB, patch)
2018-08-14 12:25 PDT, Saam Barati
mark.lam: review+
ews-watchlist: commit-queue-
Archive of layout-test-results from ews117 for mac-sierra (1.43 MB, application/zip)
2018-08-14 13:32 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews206 for win-future (13.06 MB, application/zip)
2018-08-15 17:42 PDT, EWS Watchlist
no flags
patch for landing (51.96 KB, patch)
2018-08-20 17:35 PDT, Saam Barati
no flags
patch for landing (52.12 KB, patch)
2018-08-20 18:40 PDT, Saam Barati
no flags
attaching logs of stack traces (12.85 KB, text/plain)
2018-08-21 18:14 PDT, Saam Barati
no flags
patch for landing (52.12 KB, patch)
2018-08-22 11:58 PDT, Saam Barati
no flags
Saam Barati
Comment 1 2018-08-09 11:18:51 PDT
After thinking about this, it seems like the only reasonable way to ensure the lifetime is for this class to ref() itself when it sets a timer, and deref when the callback is called. But this get's slightly tricky with how we use these RunLoopTimer objects.
Saam Barati
Comment 2 2018-08-09 18:28:18 PDT
EWS Watchlist
Comment 3 2018-08-09 19:18:14 PDT
Comment on attachment 346884 [details] patch Attachment 346884 [details] did not pass mac-ews (mac): Output: https://webkit-queues.webkit.org/results/8814816 New failing tests: imported/w3c/web-platform-tests/css/cssom/cssstyledeclaration-mutationrecord-001.html
EWS Watchlist
Comment 4 2018-08-09 19:18:16 PDT
Created attachment 346885 [details] Archive of layout-test-results from ews103 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-sierra Platform: Mac OS X 10.12.6
EWS Watchlist
Comment 5 2018-08-09 19:46:46 PDT
Comment on attachment 346884 [details] patch Attachment 346884 [details] did not pass mac-wk2-ews (mac-wk2): Output: https://webkit-queues.webkit.org/results/8815017 New failing tests: editing/selection/select-bidi-run.html
EWS Watchlist
Comment 6 2018-08-09 19:46:48 PDT
Created attachment 346887 [details] Archive of layout-test-results from ews107 for mac-sierra-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-sierra-wk2 Platform: Mac OS X 10.12.6
EWS Watchlist
Comment 7 2018-08-09 20:07:13 PDT
Comment on attachment 346884 [details] patch Attachment 346884 [details] did not pass mac-debug-ews (mac): Output: https://webkit-queues.webkit.org/results/8815036 New failing tests: fast/events/fire-scroll-event-element.html
EWS Watchlist
Comment 8 2018-08-09 20:07:15 PDT
Created attachment 346888 [details] Archive of layout-test-results from ews117 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews117 Port: mac-sierra Platform: Mac OS X 10.12.6
Saam Barati
Comment 9 2018-08-09 20:58:02 PDT
I need to handle the VM dying before the timer.
Saam Barati
Comment 10 2018-08-09 21:49:56 PDT
EWS Watchlist
Comment 11 2018-08-09 22:58:14 PDT
Comment on attachment 346890 [details] patch Attachment 346890 [details] did not pass mac-ews (mac): Output: https://webkit-queues.webkit.org/results/8816225 New failing tests: workers/bomb.html
EWS Watchlist
Comment 12 2018-08-09 22:58:16 PDT
Created attachment 346891 [details] Archive of layout-test-results from ews101 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews101 Port: mac-sierra Platform: Mac OS X 10.12.6
Saam Barati
Comment 13 2018-08-09 23:07:56 PDT
Saam Barati
Comment 14 2018-08-09 23:10:05 PDT
Saam Barati
Comment 15 2018-08-09 23:14:23 PDT
Mark Lam
Comment 16 2018-08-10 12:44:12 PDT
Comment on attachment 346894 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=346894&action=review Some comments for now. Per our offline discussion, I'll wait for the next patch with bug fixes before continuing the review. > Source/JavaScriptCore/heap/GCActivityCallback.cpp:60 > doCollection(); I think you should update the year on the copyright header. > Source/JavaScriptCore/heap/GCActivityCallback.cpp:69 > if (newDelay * timerSlop > m_delay) > return; > Seconds delta = m_delay - newDelay; > m_delay = newDelay; > - CFRunLoopTimerSetNextFireDate(m_timer.get(), CFRunLoopTimerGetNextFireDate(m_timer.get()) - delta.seconds()); > + setTimeUntilFire(timeUntilFire() - delta); This logic is now broken because you no longer reset m_delay to s_decade in cancelTimer(). You still need cancelTimer() just for the purpose of resetting m_delay. > Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:211 > + if (m_isTimerFiringForBalancedDeref) { Use UNLIKELY() for this condition?
Saam Barati
Comment 17 2018-08-10 14:40:54 PDT
Saam Barati
Comment 18 2018-08-12 14:40:59 PDT
Comment on attachment 346929 [details] patch I have an easier way to do this.
Saam Barati
Comment 19 2018-08-12 22:56:17 PDT
EWS Watchlist
Comment 20 2018-08-12 22:58:43 PDT
Attachment 346996 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:89: This { should be at the end of the previous line [whitespace/braces] [4] ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:198: This { should be at the end of the previous line [whitespace/braces] [4] Total errors found: 2 in 19 files If any of these errors are false positives, please file a bug against check-webkit-style.
EWS Watchlist
Comment 21 2018-08-13 00:14:04 PDT
Comment on attachment 346996 [details] WIP Attachment 346996 [details] did not pass mac-ews (mac): Output: https://webkit-queues.webkit.org/results/8841409 New failing tests: http/wpt/resource-timing/rt-initiatorType.worker.html
EWS Watchlist
Comment 22 2018-08-13 00:14:06 PDT
Created attachment 346997 [details] Archive of layout-test-results from ews101 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews101 Port: mac-sierra Platform: Mac OS X 10.12.6
EWS Watchlist
Comment 23 2018-08-13 00:49:39 PDT
Comment on attachment 346996 [details] WIP Attachment 346996 [details] did not pass mac-wk2-ews (mac-wk2): Output: https://webkit-queues.webkit.org/results/8841494 New failing tests: http/wpt/resource-timing/rt-initiatorType.worker.html
EWS Watchlist
Comment 24 2018-08-13 00:49:41 PDT
Created attachment 347000 [details] Archive of layout-test-results from ews107 for mac-sierra-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-sierra-wk2 Platform: Mac OS X 10.12.6
EWS Watchlist
Comment 25 2018-08-13 01:06:46 PDT
Comment on attachment 346996 [details] WIP Attachment 346996 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: https://webkit-queues.webkit.org/results/8841492 New failing tests: http/wpt/resource-timing/rt-initiatorType.worker.html
EWS Watchlist
Comment 26 2018-08-13 01:06:48 PDT
Created attachment 347001 [details] Archive of layout-test-results from ews126 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews126 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.4
EWS Watchlist
Comment 27 2018-08-13 03:11:28 PDT
Comment on attachment 346996 [details] WIP Attachment 346996 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: https://webkit-queues.webkit.org/results/8842144 New failing tests: http/wpt/resource-timing/rt-initiatorType.worker.html
EWS Watchlist
Comment 28 2018-08-13 03:11:30 PDT
Created attachment 347005 [details] Archive of layout-test-results from ews125 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews125 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.4
Saam Barati
Comment 29 2018-08-13 11:46:59 PDT
EWS Watchlist
Comment 30 2018-08-13 11:49:05 PDT
Attachment 347021 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:94: This { should be at the end of the previous line [whitespace/braces] [4] ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:203: This { should be at the end of the previous line [whitespace/braces] [4] Total errors found: 2 in 19 files If any of these errors are false positives, please file a bug against check-webkit-style.
EWS Watchlist
Comment 31 2018-08-13 13:06:13 PDT
Comment on attachment 347021 [details] WIP Attachment 347021 [details] did not pass mac-ews (mac): Output: https://webkit-queues.webkit.org/results/8846401 New failing tests: http/wpt/resource-timing/rt-initiatorType.worker.html
EWS Watchlist
Comment 32 2018-08-13 13:06:15 PDT
Created attachment 347029 [details] Archive of layout-test-results from ews103 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-sierra Platform: Mac OS X 10.12.6
Saam Barati
Comment 33 2018-08-13 13:07:55 PDT
EWS Watchlist
Comment 34 2018-08-13 13:09:49 PDT
Attachment 347030 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:136: This { should be at the end of the previous line [whitespace/braces] [4] ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:241: This { should be at the end of the previous line [whitespace/braces] [4] Total errors found: 2 in 19 files If any of these errors are false positives, please file a bug against check-webkit-style.
Saam Barati
Comment 35 2018-08-13 13:16:35 PDT
EWS Watchlist
Comment 36 2018-08-13 13:17:50 PDT
Attachment 347031 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:136: This { should be at the end of the previous line [whitespace/braces] [4] ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:241: This { should be at the end of the previous line [whitespace/braces] [4] Total errors found: 2 in 19 files If any of these errors are false positives, please file a bug against check-webkit-style.
Saam Barati
Comment 37 2018-08-13 13:22:36 PDT
EWS Watchlist
Comment 38 2018-08-13 13:25:23 PDT
Attachment 347032 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:135: This { should be at the end of the previous line [whitespace/braces] [4] ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:240: This { should be at the end of the previous line [whitespace/braces] [4] Total errors found: 2 in 19 files If any of these errors are false positives, please file a bug against check-webkit-style.
Saam Barati
Comment 39 2018-08-13 17:27:42 PDT
Saam Barati
Comment 40 2018-08-13 23:01:32 PDT
EWS Watchlist
Comment 41 2018-08-13 23:03:58 PDT
Attachment 347065 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:135: This { should be at the end of the previous line [whitespace/braces] [4] ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:233: This { should be at the end of the previous line [whitespace/braces] [4] Total errors found: 2 in 21 files If any of these errors are false positives, please file a bug against check-webkit-style.
Geoffrey Garen
Comment 42 2018-08-14 10:17:03 PDT
Comment on attachment 347065 [details] patch Can't a JSRunLoopTimer just ref() itself inside JSRunLoopTimer::scheduleTimer() and deref() itself inside JSRunLoopTimer::cancelTimer()?
Keith Miller
Comment 43 2018-08-14 10:35:14 PDT
(In reply to Geoffrey Garen from comment #42) > Comment on attachment 347065 [details] > patch > > Can't a JSRunLoopTimer just ref() itself inside > JSRunLoopTimer::scheduleTimer() and deref() itself inside > JSRunLoopTimer::cancelTimer()? No, if another thread calls cancelTimer() after CFRunLoop has called our callback but before the JSRunLoopTimers has managed to ref itself. You'd now be running a function with a freed this.
Geoffrey Garen
Comment 44 2018-08-14 11:06:56 PDT
(In reply to Keith Miller from comment #43) > (In reply to Geoffrey Garen from comment #42) > > Comment on attachment 347065 [details] > > patch > > > > Can't a JSRunLoopTimer just ref() itself inside > > JSRunLoopTimer::scheduleTimer() and deref() itself inside > > JSRunLoopTimer::cancelTimer()? > > No, if another thread calls cancelTimer() after CFRunLoop has called our > callback but before the JSRunLoopTimers has managed to ref itself. You'd now > be running a function with a freed this. If you ref() inside JSRunLoopTimer::scheduleTimer(), you already hold a ref when CFRunLoop calls your callback.
Mark Lam
Comment 45 2018-08-14 11:11:58 PDT
(In reply to Geoffrey Garen from comment #44) > (In reply to Keith Miller from comment #43) > > (In reply to Geoffrey Garen from comment #42) > > > Comment on attachment 347065 [details] > > > patch > > > > > > Can't a JSRunLoopTimer just ref() itself inside > > > JSRunLoopTimer::scheduleTimer() and deref() itself inside > > > JSRunLoopTimer::cancelTimer()? > > > > No, if another thread calls cancelTimer() after CFRunLoop has called our > > callback but before the JSRunLoopTimers has managed to ref itself. You'd now > > be running a function with a freed this. > > If you ref() inside JSRunLoopTimer::scheduleTimer(), you already hold a ref > when CFRunLoop calls your callback. I believe the issue is that the run loop's timer callback may have started but got pre-empted before it can ref the Timer object. Meanwhile on the main thread, the VM shutdowns and frees the Timer object. Eventually, the timer thread gets to run and ref's the now freed Timer object and reads its fields ... which is what we're trying to prevent.
Mark Lam
Comment 46 2018-08-14 11:14:03 PDT
Comment on attachment 347065 [details] patch The mac-debug EWS build failures looks legit.
Mark Lam
Comment 47 2018-08-14 11:15:51 PDT
(In reply to Mark Lam from comment #45) > (In reply to Geoffrey Garen from comment #44) > > (In reply to Keith Miller from comment #43) > > > (In reply to Geoffrey Garen from comment #42) > > > > Comment on attachment 347065 [details] > > > > patch > > > > > > > > Can't a JSRunLoopTimer just ref() itself inside > > > > JSRunLoopTimer::scheduleTimer() and deref() itself inside > > > > JSRunLoopTimer::cancelTimer()? > > > > > > No, if another thread calls cancelTimer() after CFRunLoop has called our > > > callback but before the JSRunLoopTimers has managed to ref itself. You'd now > > > be running a function with a freed this. > > > > If you ref() inside JSRunLoopTimer::scheduleTimer(), you already hold a ref > > when CFRunLoop calls your callback. > > I believe the issue is that the run loop's timer callback may have started > but got pre-empted before it can ref the Timer object. Meanwhile on the > main thread, the VM shutdowns and frees the Timer object. Eventually, the > timer thread gets to run and ref's the now freed Timer object and reads its > fields ... which is what we're trying to prevent. Nevermind. You said "inside JSRunLoopTimer::scheduleTimer()" as in asymmetric ref/deref. I believe Saam identified some complications with cancelTimer() when doing that, but we'll take a second look.
Saam Barati
Comment 48 2018-08-14 11:30:48 PDT
(In reply to Geoffrey Garen from comment #44) > (In reply to Keith Miller from comment #43) > > (In reply to Geoffrey Garen from comment #42) > > > Comment on attachment 347065 [details] > > > patch > > > > > > Can't a JSRunLoopTimer just ref() itself inside > > > JSRunLoopTimer::scheduleTimer() and deref() itself inside > > > JSRunLoopTimer::cancelTimer()? > > > > No, if another thread calls cancelTimer() after CFRunLoop has called our > > callback but before the JSRunLoopTimers has managed to ref itself. You'd now > > be running a function with a freed this. > > If you ref() inside JSRunLoopTimer::scheduleTimer(), you already hold a ref > when CFRunLoop calls your callback. Yes, this would work. But the tricky thing is getting cancel correct. The goal being: we want to be able to deref() in cancel so we don't leak. If you want to see what I came up with to solve this, the patch for that is here: https://bugs.webkit.org/attachment.cgi?id=346929&action=review I think that approach turned out to be a lot more complicated than what I have up for review now. The main issue I ran into with the prior approach is this: 1. I schedule a timer. I ref() b/c of this. 2. I cancel a timer. I'd like to deref() here. However, how do I know I've done that in a way that's safe. E.g, have I cancelled before the timer fired? This turns out to be trickier than I expected.
Saam Barati
Comment 49 2018-08-14 12:25:37 PDT
Created attachment 347102 [details] patch Fix debug build.
EWS Watchlist
Comment 50 2018-08-14 12:28:52 PDT
Attachment 347102 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:135: This { should be at the end of the previous line [whitespace/braces] [4] ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:233: This { should be at the end of the previous line [whitespace/braces] [4] Total errors found: 2 in 21 files If any of these errors are false positives, please file a bug against check-webkit-style.
Geoffrey Garen
Comment 51 2018-08-14 13:03:57 PDT
> 2. I cancel a timer. I'd like to deref() here. However, how do I know I've > done that in a way that's safe. E.g, have I cancelled before the timer > fired? Under what conditions do you cancel a timer, and then it fires afterward anyway?
EWS Watchlist
Comment 52 2018-08-14 13:32:21 PDT
Comment on attachment 347102 [details] patch Attachment 347102 [details] did not pass mac-debug-ews (mac): Output: https://webkit-queues.webkit.org/results/8858423 Number of test failures exceeded the failure limit.
EWS Watchlist
Comment 53 2018-08-14 13:32:23 PDT
Created attachment 347105 [details] Archive of layout-test-results from ews117 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews117 Port: mac-sierra Platform: Mac OS X 10.12.6
Saam Barati
Comment 54 2018-08-14 13:37:23 PDT
(In reply to Geoffrey Garen from comment #51) > > 2. I cancel a timer. I'd like to deref() here. However, how do I know I've > > done that in a way that's safe. E.g, have I cancelled before the timer > > fired? > > Under what conditions do you cancel a timer, and then it fires afterward > anyway? When it's imminently about to fire. For example, let's say we have threads T1 and T2. T1 is running the runloop we're firing on, and T2 is the thread that will cancel the timer. T1. Timer fires T1. CF issues a call to our callback. T1. Before an any callback code is run, T1 gets context switched T2. Cancels In such a scenario, T2 and T1 can't *both* deref.
Geoffrey Garen
Comment 55 2018-08-14 13:47:39 PDT
(In reply to Saam Barati from comment #54) > (In reply to Geoffrey Garen from comment #51) > > > 2. I cancel a timer. I'd like to deref() here. However, how do I know I've > > > done that in a way that's safe. E.g, have I cancelled before the timer > > > fired? > > > > Under what conditions do you cancel a timer, and then it fires afterward > > anyway? > > When it's imminently about to fire. For example, let's say we have threads > T1 and T2. T1 is running the runloop we're firing on, and T2 is the thread > that will cancel the timer. > > T1. Timer fires > T1. CF issues a call to our callback. > T1. Before an any callback code is run, T1 gets context switched > T2. Cancels > > In such a scenario, T2 and T1 can't *both* deref. Is it a requirement that you should be able to call JSRunLoopTimer::cancelTimer() from any thread at any time? Who does this?
Saam Barati
Comment 56 2018-08-14 13:55:11 PDT
(In reply to Geoffrey Garen from comment #55) > (In reply to Saam Barati from comment #54) > > (In reply to Geoffrey Garen from comment #51) > > > > 2. I cancel a timer. I'd like to deref() here. However, how do I know I've > > > > done that in a way that's safe. E.g, have I cancelled before the timer > > > > fired? > > > > > > Under what conditions do you cancel a timer, and then it fires afterward > > > anyway? > > > > When it's imminently about to fire. For example, let's say we have threads > > T1 and T2. T1 is running the runloop we're firing on, and T2 is the thread > > that will cancel the timer. > > > > T1. Timer fires > > T1. CF issues a call to our callback. > > T1. Before an any callback code is run, T1 gets context switched > > T2. Cancels > > > > In such a scenario, T2 and T1 can't *both* deref. > > Is it a requirement that you should be able to call > JSRunLoopTimer::cancelTimer() from any thread at any time? Who does this? API users could cause this to happen in arbitrary ways: - We grab the runloop at VM construction. - We then can run code on that VM from any other thread. That code can do many things that will cancelTimer()
Saam Barati
Comment 57 2018-08-14 14:40:04 PDT
Looks like HashMap<Ref<P>, SomeVaueWhereEmptyValueIsNotZero> is broken...
EWS Watchlist
Comment 58 2018-08-15 17:42:26 PDT
Comment on attachment 347102 [details] patch Attachment 347102 [details] did not pass win-ews (win): Output: https://webkit-queues.webkit.org/results/8873312 New failing tests: http/tests/security/contentSecurityPolicy/userAgentShadowDOM/allow-video.html
EWS Watchlist
Comment 59 2018-08-15 17:42:38 PDT
Created attachment 347232 [details] Archive of layout-test-results from ews206 for win-future The attached test failures were seen while running run-webkit-tests on the win-ews. Bot: ews206 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Mark Lam
Comment 60 2018-08-16 13:59:21 PDT
Comment on attachment 347102 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=347102&action=review r=me. I agree that this patch is much easier to reason about in terms of synchronization because it's now reduced to sync'ing on the Manager's lock, and the Manager is immortal. > Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:194 > + EpochTime fireEpochTime = epochTime(delay); At first, I was concerned that delay may be negative, and that we'll have some underflow issue with fireEpochTime. However, I see that EpochTime is Seconds, and therefore uses a double as storage. I still think that it's strange to have a negative epoch time but I suppose the math can't be messed up because we're doing doubles math. > Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:208 > + if (entry.first.ptr() == &timer) { > + entry.second = fireEpochTime; > + found = true; > + } > + scheduleTime = std::min(scheduleTime, entry.second); Shouldn't we break out of the loop if we've found the timer? Each timer may only be enqueued in data.timers once, right? Shouldn't the setting of "scheduleTime = std::min(scheduleTime, entry.second);" only be done if the timer is found? Oh, I see. You want to find the smallest time of all the JSRunLoopTimers. Not sure if it's worth it but if you just cache this smallest time value in a PerVMData field, then you can bail out as soon as you've found the matching timer. ... On second thought, probably not worth it. This is just an optimization which we can apply later if needed. For now, symmetry with the cancelTimer() case is probably more important to ensure correctness. > Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:325 > + Manager::shared().scheduleTimer(*this, intervalInSeconds); nit: pass the locker to document the contract that we must first lock the JSRunLoopTimer? > Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:337 > + Manager::shared().cancelTimer(*this); nit: pass the locker to document the contract that we must first lock the JSRunLoopTimer? > Source/JavaScriptCore/runtime/JSRunLoopTimer.h:64 > + void willDestroyVM(VM&); nit: name this unregisterVM for symmetry?
Saam Barati
Comment 61 2018-08-20 17:30:51 PDT
Comment on attachment 347102 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=347102&action=review >> Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:325 >> + Manager::shared().scheduleTimer(*this, intervalInSeconds); > > nit: pass the locker to document the contract that we must first lock the JSRunLoopTimer? That feels weird since we'd be passing in the locker for a JSRunLoopTimer and not the Manager. Other things could use Manager and not lock around calls to schedule/cancel >> Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:337 >> + Manager::shared().cancelTimer(*this); > > nit: pass the locker to document the contract that we must first lock the JSRunLoopTimer? ditto >> Source/JavaScriptCore/runtime/JSRunLoopTimer.h:64 >> + void willDestroyVM(VM&); > > nit: name this unregisterVM for symmetry? sounds good.
Mark Lam
Comment 62 2018-08-20 17:33:13 PDT
(In reply to Saam Barati from comment #61) > Comment on attachment 347102 [details] > patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=347102&action=review > > >> Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:325 > >> + Manager::shared().scheduleTimer(*this, intervalInSeconds); > > > > nit: pass the locker to document the contract that we must first lock the JSRunLoopTimer? > > That feels weird since we'd be passing in the locker for a JSRunLoopTimer > and not the Manager. Other things could use Manager and not lock around > calls to schedule/cancel OK.
Saam Barati
Comment 63 2018-08-20 17:35:18 PDT
Created attachment 347584 [details] patch for landing
EWS Watchlist
Comment 64 2018-08-20 18:33:39 PDT
Attachment 347584 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:135: This { should be at the end of the previous line [whitespace/braces] [4] ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:233: This { should be at the end of the previous line [whitespace/braces] [4] Total errors found: 2 in 21 files If any of these errors are false positives, please file a bug against check-webkit-style.
Saam Barati
Comment 65 2018-08-20 18:40:21 PDT
Created attachment 347591 [details] patch for landing
EWS Watchlist
Comment 66 2018-08-20 18:43:35 PDT
Attachment 347591 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:135: This { should be at the end of the previous line [whitespace/braces] [4] ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:235: This { should be at the end of the previous line [whitespace/braces] [4] Total errors found: 2 in 21 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Commit Bot
Comment 67 2018-08-21 00:27:34 PDT
Comment on attachment 347591 [details] patch for landing Clearing flags on attachment: 347591 Committed r235107: <https://trac.webkit.org/changeset/235107>
WebKit Commit Bot
Comment 68 2018-08-21 00:27:36 PDT
All reviewed patches have been landed. Closing bug.
Truitt Savell
Comment 69 2018-08-21 09:13:43 PDT
It looks like https://trac.webkit.org/changeset/235107/webkit Has caused this crash to occur in test: https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=http%2Ftests%2Fwebsocket%2Ftests%2Fhybi%2Fset-protocol.html Crash: https://build.webkit.org/results/Apple%20iOS%2011%20Simulator%20Debug%20WK2%20(Tests)/r235107%20(5999)/http/tests/websocket/tests/hybi/set-protocol-crash-log.txt Excerpt: stderr: ASSERTION FAILED: mapIterator->value.contains(url) /Volumes/Data/slave/ios-simulator-11-debug/build/Source/WebKit/NetworkProcess/FileAPI/NetworkBlobRegistry.cpp(125) : void WebKit::NetworkBlobRegistry::unregisterBlobURL(WebKit::NetworkConnectionToWebProcess *, const WebCore::URL &) 1 0x11ccd3be9 WTFCrash 2 0x10e72271b WTFCrashWithInfo(int, char const*, char const*, int) 3 0x10ec24200 WebKit::NetworkBlobRegistry::unregisterBlobURL(WebKit::NetworkConnectionToWebProcess*, WebCore::URL const&) 4 0x10eb49aed WebKit::NetworkConnectionToWebProcess::unregisterBlobURL(WebCore::URL const&) Looks like one of the assertions added in this patch is causing the crash.
Saam Barati
Comment 70 2018-08-21 13:04:10 PDT
(In reply to Truitt Savell from comment #69) > It looks like https://trac.webkit.org/changeset/235107/webkit > > Has caused this crash to occur in test: > https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard. > html#showAllRuns=true&tests=http%2Ftests%2Fwebsocket%2Ftests%2Fhybi%2Fset- > protocol.html > > Crash: > https://build.webkit.org/results/ > Apple%20iOS%2011%20Simulator%20Debug%20WK2%20(Tests)/r235107%20(5999)/http/ > tests/websocket/tests/hybi/set-protocol-crash-log.txt > > Excerpt: > stderr: > ASSERTION FAILED: mapIterator->value.contains(url) > /Volumes/Data/slave/ios-simulator-11-debug/build/Source/WebKit/ > NetworkProcess/FileAPI/NetworkBlobRegistry.cpp(125) : void > WebKit::NetworkBlobRegistry::unregisterBlobURL(WebKit:: > NetworkConnectionToWebProcess *, const WebCore::URL &) > 1 0x11ccd3be9 WTFCrash > 2 0x10e72271b WTFCrashWithInfo(int, char const*, char const*, int) > 3 0x10ec24200 > WebKit::NetworkBlobRegistry::unregisterBlobURL(WebKit:: > NetworkConnectionToWebProcess*, WebCore::URL const&) > 4 0x10eb49aed > WebKit::NetworkConnectionToWebProcess::unregisterBlobURL(WebCore::URL const&) > > Looks like one of the assertions added in this patch is causing the crash. This is really weird. Do we even make a VM in the network process?
Saam Barati
Comment 71 2018-08-21 14:55:11 PDT
I'm looking into this now.
Saam Barati
Comment 72 2018-08-21 15:15:21 PDT
I can't reproduce this.
Ryan Haddad
Comment 73 2018-08-21 15:33:32 PDT
I can reproduce the crash with a ToT build with the following: run-webkit-tests http/tests/websocket/tests/hybi --debug --exit-after-n-crashes-or-timeouts 1 --no-retry -fg --iter 100 I have not yet tried to repro with a build before https://trac.webkit.org/changeset/235107
Saam Barati
Comment 74 2018-08-21 15:46:25 PDT
(In reply to Ryan Haddad from comment #73) > I can reproduce the crash with a ToT build with the following: > run-webkit-tests http/tests/websocket/tests/hybi --debug > --exit-after-n-crashes-or-timeouts 1 --no-retry -fg --iter 100 > > I have not yet tried to repro with a build before > https://trac.webkit.org/changeset/235107 When I do this, every single test fails. Is that your experience as well?
Saam Barati
Comment 75 2018-08-21 16:03:09 PDT
(In reply to Saam Barati from comment #74) > (In reply to Ryan Haddad from comment #73) > > I can reproduce the crash with a ToT build with the following: > > run-webkit-tests http/tests/websocket/tests/hybi --debug > > --exit-after-n-crashes-or-timeouts 1 --no-retry -fg --iter 100 > > > > I have not yet tried to repro with a build before > > https://trac.webkit.org/changeset/235107 > > When I do this, every single test fails. Is that your experience as well? This is because I had a different apache installed...
Saam Barati
Comment 76 2018-08-21 16:22:32 PDT
(In reply to Ryan Haddad from comment #73) > I can reproduce the crash with a ToT build with the following: > run-webkit-tests http/tests/websocket/tests/hybi --debug > --exit-after-n-crashes-or-timeouts 1 --no-retry -fg --iter 100 > > I have not yet tried to repro with a build before > https://trac.webkit.org/changeset/235107 Did you really run this on the parent directory and not the individual test?
Saam Barati
Comment 77 2018-08-21 16:30:35 PDT
I still can't reproduce this. I need help from y'all on how you're doing it. What OS were you on Ryan? It would be really helpful if someone can confirm that it can repro at my revision and can't repro on the revision before.
Saam Barati
Comment 78 2018-08-21 16:34:42 PDT
(In reply to Saam Barati from comment #77) > I still can't reproduce this. I need help from y'all on how you're doing it. > What OS were you on Ryan? > > It would be really helpful if someone can confirm that it can repro at my > revision and can't repro on the revision before. I spoke too soon. I just got it to reproduce.
Saam Barati
Comment 79 2018-08-21 17:52:56 PDT
This appears to be revealing a bug in the blob code. I've made it so that no JSC timers run, and this crash still happens.
Saam Barati
Comment 80 2018-08-21 18:14:19 PDT
Created attachment 347747 [details] attaching logs of stack traces This code seems super innocuous. We're calling a GC from some JS code injected into the test runner.
Saam Barati
Comment 81 2018-08-21 18:20:58 PDT
This patch is also a performance regression :(
WebKit Commit Bot
Comment 82 2018-08-21 18:21:56 PDT
Re-opened since this is blocked by bug 188832
Saam Barati
Comment 83 2018-08-21 22:06:01 PDT
I can’t reproduce the performance regression on my laptop 🤔 I’ll need to see if I can reproduce on other HW
Saam Barati
Comment 84 2018-08-22 11:58:20 PDT
Created attachment 347825 [details] patch for landing
Saam Barati
Comment 85 2018-08-22 11:59:25 PDT
Comment on attachment 347591 [details] patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=347591&action=review > Source/JavaScriptCore/heap/GCActivityCallback.cpp:72 > + setTimeUntilFire(delta); The perf bug is here. This should be: setTimeUntilFire(newDelay)
EWS Watchlist
Comment 86 2018-08-22 12:00:25 PDT
Attachment 347825 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:135: This { should be at the end of the previous line [whitespace/braces] [4] ERROR: Source/JavaScriptCore/runtime/JSRunLoopTimer.cpp:235: This { should be at the end of the previous line [whitespace/braces] [4] Total errors found: 2 in 21 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Commit Bot
Comment 87 2018-08-23 16:57:10 PDT
Comment on attachment 347825 [details] patch for landing Clearing flags on attachment: 347825 Committed r235261: <https://trac.webkit.org/changeset/235261>
WebKit Commit Bot
Comment 88 2018-08-23 16:57:13 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.