Bug 36646 - REGRESSION: Worker termination via JS timeout may cause worker tests like fast/workers/worker-terminate.html fail.
Summary: REGRESSION: Worker termination via JS timeout may cause worker tests like fas...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P1 Normal
Assignee: Nobody
URL:
Keywords:
: 36585 (view as bug list)
Depends on:
Blocks:
 
Reported: 2010-03-26 00:43 PDT by Eric Seidel (no email)
Modified: 2010-04-06 21:38 PDT (History)
12 users (show)

See Also:


Attachments
Patch (24.66 KB, patch)
2010-04-05 03:26 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (24.64 KB, patch)
2010-04-05 11:36 PDT, Adam Barth
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2010-03-26 00:43:14 PDT
fast/workers/worker-terminate.html failed (timeout) on Leopard Debug Bot

http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r56607%20(11979)/fast/workers/worker-terminate-diffs.txt
--- /Volumes/Big/WebKit-BuildSlave/leopard-intel-debug-tests/build/layout-test-results/fast/workers/worker-terminate-expected.txt	2010-03-26 00:28:17.000000000 -0700
+++ /Volumes/Big/WebKit-BuildSlave/leopard-intel-debug-tests/build/layout-test-results/fast/workers/worker-terminate-actual.txt	2010-03-26 00:28:17.000000000 -0700
@@ -1,3 +1,4 @@
+CONSOLE MESSAGE: line 3: JavaScript execution exceeded timeout.
 Test Worker.terminate().
 
 Received message from worker4.

Test is from 9 months ago:
http://trac.webkit.org/browser/trunk/LayoutTests/fast/workers/worker-terminate.html

I think we've seen several other failures related to "JavaScript execution exceeded timeout."  They seem to be recent.
Comment 1 Dmitry Titov 2010-03-29 17:16:28 PDT
For JSC, workers are terminated like this:

void WorkerScriptController::forbidExecution()
{
   ...
    m_globalData->timeoutChecker.setTimeoutInterval(1); // 1ms is the smallest timeout that can be set.
}

So it seems the timeout exceeded coudl be a normal way for a worker to die. Perhaps we should prevent the console message in this case?
Comment 2 Dmitry Titov 2010-03-29 17:46:30 PDT
Note from Darin Adler in webkit-dev: 
"There is precedent in JavaScriptCore for using a special exception to terminate code. InterruptedExecutionError is used for this purpose. I could imagine having something similar for workers. A timeout sounds like a poor way to do it."

Indeed, this could be better. For workers to be able to use it, it should be possible to inject it from another thread.
Comment 4 Eric Seidel (no email) 2010-03-31 22:50:42 PDT
*** Bug 36585 has been marked as a duplicate of this bug. ***
Comment 5 Eric Seidel (no email) 2010-03-31 22:51:33 PDT
Also causing fast/workers/termination-with-port-messages.html to fail, as seen in Bug 36585.
Comment 6 Eric Seidel (no email) 2010-04-01 20:47:27 PDT
I think we should be able to add the Snow Leopard bots to the core builders once this bug is fixed.  This causes about one failure ever 5 builds or so right now.
Comment 10 Adam Barth 2010-04-02 22:49:38 PDT
Committed r57043: <http://trac.webkit.org/changeset/57043>
Comment 11 Adam Barth 2010-04-02 22:50:31 PDT
I've skipped this test because the failure rate is obscenely high.  That doesn't mean we should drop this test.  I'd like it back, but we can't have tests randomly failing on all the bots 25% of the time.
Comment 12 Adam Barth 2010-04-03 12:09:51 PDT
termination-with-port-messages.html this time:
http://build.webkit.org/builders/Tiger%20Intel%20Release/builds/10385/steps/layout-test/logs/stdio
Comment 13 Eric Seidel (no email) 2010-04-05 00:38:12 PDT
Another one:
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r57062%20(12579)/fast/workers/termination-with-port-messages-diffs.txt

I don't think skipping any single test will solve this problem.

However, this seems to be our #1 cause of flakiness at the moment.
Comment 14 Adam Barth 2010-04-05 00:42:38 PDT
Right.  The correct solution looks to be the one proposed in Comment #2.
Comment 15 Adam Barth 2010-04-05 03:26:33 PDT
Created attachment 52523 [details]
Patch
Comment 16 Adam Barth 2010-04-05 03:27:19 PDT
I'm not 100% happy with this patch, but it gets the job done (as far as I can tell).  Suggestions welcome, as always.  :)
Comment 17 Eric Seidel (no email) 2010-04-05 03:40:41 PDT
Attachment 52523 [details] did not build on mac:
Build output: http://webkit-commit-queue.appspot.com/results/1630248
Comment 18 WebKit Review Bot 2010-04-05 03:57:46 PDT
Attachment 52523 [details] did not build on gtk:
Build output: http://webkit-commit-queue.appspot.com/results/1677070
Comment 19 Early Warning System Bot 2010-04-05 08:23:41 PDT
Attachment 52523 [details] did not build on qt:
Build output: http://webkit-commit-queue.appspot.com/results/1651187
Comment 20 Darin Adler 2010-04-05 08:38:22 PDT
Comment on attachment 52523 [details]
Patch

> +        (JSC::Terminator::termianteSoon):

I took a brief look, and did not review everything -- I hope someone more expert than me at JavaScriptCore will review -- but I did spot a spelling error in the name of this function.

We should s/ermiant/erminat/g
Comment 21 Adam Barth 2010-04-05 11:36:39 PDT
Created attachment 52549 [details]
Patch
Comment 22 Adam Barth 2010-04-05 11:41:37 PDT
> We should s/ermiant/erminat/g

Thanks for looking.  I was vacillating on the names until the last minute (hence the build errors and misspellings).
Comment 23 Eric Seidel (no email) 2010-04-05 17:30:23 PDT
Comment on attachment 52549 [details]
Patch

This looks sane to me, but I am not much of a JavaScriptCore master these days, so I may have missed something subtle.

The CHECK_FOR_TIMEOUT() change and DEFINE_STUB_FUNCTION(int, timeout_check)
change may require a sunspider run, however it looks like all of this code is "slow-path", only run after "tickCount" iterations, or in the case of actual failures, and thus should not affect performance.
Comment 24 Eric Seidel (no email) 2010-04-05 17:31:05 PDT
CCing the real JSC masters in case they spot something wrong that I've missed.
Comment 25 Eric Seidel (no email) 2010-04-06 19:10:07 PDT
Comment on attachment 52549 [details]
Patch

This looks good to me.  As far as I can tell these are all slow-paths, so this should have not affect on performance.
Comment 26 WebKit Commit Bot 2010-04-06 21:38:31 PDT
Comment on attachment 52549 [details]
Patch

Clearing flags on attachment: 52549

Committed r57192: <http://trac.webkit.org/changeset/57192>
Comment 27 WebKit Commit Bot 2010-04-06 21:38:39 PDT
All reviewed patches have been landed.  Closing bug.