Bug 90469

Summary: storage tests are flaky (crashing) on windows
Product: WebKit Reporter: Emil A Eklund <eae>
Component: Tools / TestsAssignee: wez
Status: RESOLVED INVALID    
Severity: Normal CC: abarth, alecflett, dgrogan, haraken, hayato, japhet, jochen, jsbell, rafaelw, webkit.review.bot, wez
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 90517, 91133, 91275, 91403, 91421, 91637, 91672    
Attachments:
Description Flags
Patch
none
Patch none

Description Emil A Eklund 2012-07-03 10:13:11 PDT
storage/indexeddb/mozilla/indexes.html
storage/indexeddb/mozilla/key-requirements-inline-and-passed.html
storage/websql/multiple-databases-garbage-collection.html
Comment 1 Joshua Bell 2012-07-03 10:39:47 PDT
Huh, more than just those - flakiness link:

http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=storage%2Findexeddb%2F

This is recent... something just prior to r121610.
Comment 2 Joshua Bell 2012-07-03 10:47:47 PDT
Partial stack from a debug builder: 

crash log for DumpRenderTree (pid 13452):
STDOUT: <empty>
STDERR: Backtrace:
STDERR: 	WebKit::WebFilterOperations::at [0x114A432F+10164239]
STDERR: 	WebKit::WebFilterOperations::at [0x114A440F+10164463]
STDERR: 	WebKit::WebFilterOperations::at [0x119EE2C1+15710113]
STDERR: 	v8::Locker::StopPreemption [0x01771EFC+420348]
...
STDERR: 	v8::Locker::StopPreemption [0x018E48EF+1938415]
STDERR: 	(No symbol) [0x0570A336]
...
STDERR: 	(No symbol) [0x05712B4A]
STDERR: 	v8::Locker::StopPreemption [0x0176B48C+393100]
STDERR: 	v8::Locker::StopPreemption [0x0176B214+392468]
STDERR: 	v8::Locker::StopPreemption [0x0176EBBD+407229]
STDERR: 	v8::FunctionTemplate::GetFunction [0x016F6A35+357]
STDERR: 	WebKit::WebFilterOperations::at [0x119E091A+15654394]
...
STDERR: 	WebKit::WebFilterOperations::at [0x10B01947+60967]
STDERR: 	v8::Locker::StopPreemption [0x017C240D+749325]
...
STDERR: 	v8::Locker::StopPreemption [0x019E1881+2974593]
STDERR: 	(No symbol) [0x0570A336]
...
STDERR: 	(No symbol) [0x05712B4A]
STDERR: 	v8::Locker::StopPreemption [0x0176B48C+393100]
STDERR: 	v8::Locker::StopPreemption [0x0176B214+392468]
STDERR: 	v8::Script::Run [0x016DF901+593]
STDERR: 	WebKit::WebFilterOperations::at [0x1148FD06+10080742]
...
STDERR: 	WebKit::WebFilterOperations::at [0x1191FFC7+14865575]
STDERR: 	std::_Init_locks::operator= [0x1225C3B0+2330128]
STDERR: 	webkit::npapi::PluginList::DebugPluginLoading [0x03E1FB30+721199]
STDERR: 	(No symbol) [0x0058D6A9]
...
STDERR: 	(No symbol) [0x005942EC]
STDERR: 	base::DelegateSimpleThreadPool::~DelegateSimpleThreadPool [0x0094518F+418550]
STDERR: 	base::DelegateSimpleThreadPool::~DelegateSimpleThreadPool [0x0094BB39+445600]

This stack pattern (crash in WebKit::WebFilterOperations::at coming from base::DelegateSimpleThreadPool destructor) is consistent among these crashes.
Comment 3 Joshua Bell 2012-07-03 10:52:38 PDT
Those stacks make no sense, though; something funky must be going on.
Comment 4 Joshua Bell 2012-07-03 10:57:26 PDT
The storage/websql test has a similar stack:

http://test-results.appspot.com/dashboards/flakiness_dashboard.html#showExpectations=true&tests=storage%2Fwebsql%2Fmultiple-databases-garbage-collection.html

There's basically no code in common between websql and indexeddb, so I don't think this is related to changed in either of those modules.
Comment 6 Emil A Eklund 2012-07-03 11:05:05 PDT
abarth, could this possibly be from the new bot set up?
Comment 7 Joshua Bell 2012-07-03 11:09:28 PDT
Given that the crashes do seem to be module specific (if you ignore the stacks) and a fairly frequent consistency per-test, I'm guessing something is awry in the backtrace generation. So likely real IDB crashes, but the stack symbols are useless.
Comment 8 Joshua Bell 2012-07-03 11:10:04 PDT
(In reply to comment #7)
> Given that the crashes do seem to be module specific (if you ignore the stacks) and a fairly frequent consistency per-test, I'm guessing something is awry in the backtrace generation. So likely real IDB crashes, but the stack symbols are useless.

Trying in English: "and the frequency is consistent per test"
Comment 9 Adam Barth 2012-07-03 11:48:59 PDT
(In reply to comment #6)
> abarth, could this possibly be from the new bot set up?

The new bot setup is only for the cr-linux-ews bots, not any of the buildbots.
Comment 10 Joshua Bell 2012-07-03 13:46:42 PDT
The crashes seem to start around r121610. http://trac.webkit.org/log/?verbose=on&rev=121612&stop_rev=121600 shows changes around that range. r121612 (an IDB change) would be an obvious candidate except there are crashes that occur before that change and it's pretty innocuous. (Many of the IDB crashes are in tests that don't go anywhere near cursors, either.)

http://trac.webkit.org/changeset/121610/ itself catches my eye as it is the earliest revision in all the crash blame ranges. It changes the NPAPI behavior in the ScriptController and webkit::npapi::PluginList::DebugPluginLoading on the stack is still catching my eye. Is there any chance it's running and tickling an underlying issue in the WebSQL and IndexedDB code because they are async tests that have objects that outlive the test itself?
Comment 11 wez 2012-07-03 14:04:23 PDT
(In reply to comment #10)
> The crashes seem to start around r121610. http://trac.webkit.org/log/?verbose=on&rev=121612&stop_rev=121600 shows changes around that range. r121612 (an IDB change) would be an obvious candidate except there are crashes that occur before that change and it's pretty innocuous. (Many of the IDB crashes are in tests that don't go anywhere near cursors, either.)
> 
> http://trac.webkit.org/changeset/121610/ itself catches my eye as it is the earliest revision in all the crash blame ranges. It changes the NPAPI behavior in the ScriptController and webkit::npapi::PluginList::DebugPluginLoading on the stack is still catching my eye. Is there any chance it's running and tickling an underlying issue in the WebSQL and IndexedDB code because they are async tests that have objects that outlive the test itself?

r121610 changes the way in which the window scriptable object is protected from being leaked by badly-behaved plugins - the NPObject passed to the plugin has its reference to the underlying V8 object Dispose()d, and ignores all subsequent attempts at scripting.  From the calling plugin's point of view the behaviour should be no different from the previous NPObjectWrapper behaviour - scripting the object should simply stop working once teardown reaches a clearScriptObjects, so if the CL causes crashes then I think I must have missed a check of the underlying V8 object's validity somewhere.
Comment 12 Joshua Bell 2012-07-09 15:18:31 PDT
*** Bug 90754 has been marked as a duplicate of this bug. ***
Comment 13 Joshua Bell 2012-07-09 15:18:40 PDT
*** Bug 90743 has been marked as a duplicate of this bug. ***
Comment 14 Rafael Weinstein 2012-07-10 10:35:04 PDT
Added another flaky test to TestExpectations in r122235 (storage/indexeddb/cursor-added-bug.html)
Comment 15 wez 2012-07-11 09:47:10 PDT
Created attachment 151717 [details]
Patch
Comment 16 Kentaro Hara 2012-07-12 02:06:55 PDT
Comment on attachment 151717 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=151717&action=review

> Source/WebCore/ChangeLog:10
> +        This patch is intended to resolve flakiness in the storage tests identified in the bug.

Would you list up the flaky tests here?

> Source/WebCore/bindings/v8/NPV8Object.cpp:189
> +    v8NpObject->rootObject = 0;

This line is not related to fixing the flakiness, right? (This line is fine. I just want to confirm my understanding.)
Comment 17 wez 2012-07-12 10:53:17 PDT
Created attachment 151998 [details]
Patch
Comment 18 wez 2012-07-12 10:56:33 PDT
(In reply to comment #16)
> (From update of attachment 151717 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=151717&action=review
> 
> > Source/WebCore/ChangeLog:10
> > +        This patch is intended to resolve flakiness in the storage tests identified in the bug.
> 
> Would you list up the flaky tests here?

I've copied the list from Emil's original description to the ChangeLog.

> > Source/WebCore/bindings/v8/NPV8Object.cpp:189
> > +    v8NpObject->rootObject = 0;
> 
> This line is not related to fixing the flakiness, right? (This line is fine. I just want to confirm my understanding.)

Yes, this is to make sure that if we're failing to correctly check validity prior to rootObject accesses then we'll fail with a NULL dereference.
Comment 19 Kentaro Hara 2012-07-12 10:57:53 PDT
Comment on attachment 151998 [details]
Patch

Thanks for the clarification. Looks OK.
Comment 20 Kentaro Hara 2012-07-12 11:08:50 PDT
Comment on attachment 151998 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=151998&action=review

> Source/WebCore/ChangeLog:8
> +        Add a missing check that the underlying V8 object reference in a V8 NPObject is valid, and zero the NPObject's rootObject member when disposing it, to ensure that it won't be mistakenly touched after that point.

Nit: Next time you write a ChangeLog, please wrap such long lines.
Comment 21 WebKit Review Bot 2012-07-12 11:31:24 PDT
Comment on attachment 151998 [details]
Patch

Clearing flags on attachment: 151998

Committed r122488: <http://trac.webkit.org/changeset/122488>
Comment 22 WebKit Review Bot 2012-07-12 11:31:29 PDT
All reviewed patches have been landed.  Closing bug.
Comment 23 Joshua Bell 2012-07-12 11:32:49 PDT
Thanks, wez@ - I'll keep an eye on the storage/ tests and report back here on the status of the flakiness. If it goes away I'll remove the TextExpectations lines.
Comment 24 Joshua Bell 2012-07-12 14:49:29 PDT
Initial signs show the crashiness is still there, but it will take a while for usable stacks to come in on the debug bots.
Comment 25 Joshua Bell 2012-07-20 11:09:07 PDT
For posterity, as various things point back to this bug:

In http://trac.webkit.org/changeset/123110 we rolled back http://trac.webkit.org/changeset/121610 and http://trac.webkit.org/changeset/122488. This resolved the flakiness seen in this and duplicate/dependent bugs.
Comment 26 Kentaro Hara 2012-07-20 11:14:53 PDT
(In reply to comment #25)
> For posterity, as various things point back to this bug:
> 
> In http://trac.webkit.org/changeset/123110 we rolled back http://trac.webkit.org/changeset/121610 and http://trac.webkit.org/changeset/122488. This resolved the flakiness seen in this and duplicate/dependent bugs.

Maybe the line 'v8NpObject->rootObject = 0' was the culprit? We inserted the line "just in case". I think that the v8NpObject->v8Object.IsEmpty() check was a correct fix.
Comment 27 Kentaro Hara 2012-07-20 11:15:27 PDT
The patch is reverted. Reopening the bug.
Comment 28 wez 2012-07-20 11:39:38 PDT
(In reply to comment #26)
> (In reply to comment #25)
> > For posterity, as various things point back to this bug:
> > 
> > In http://trac.webkit.org/changeset/123110 we rolled back http://trac.webkit.org/changeset/121610 and http://trac.webkit.org/changeset/122488. This resolved the flakiness seen in this and duplicate/dependent bugs.
> 
> Maybe the line 'v8NpObject->rootObject = 0' was the culprit? We inserted the line "just in case". I think that the v8NpObject->v8Object.IsEmpty() check was a correct fix.

That seems unlikely; we specifically added that to ensure that unexpected access to rootObject after Dispose'ing the V8Object reference would trigger well-formed crashes, whereas the flakes look to have trashed stacks.
Comment 29 Joshua Bell 2013-01-30 12:58:11 PST
I don't think this issue as written has anything actionable about it. Future work on npapi plugin cleanup will take place in another bug and the tests this bug talks about are no longer failing (since the triggering change was reverted).

Closing this for now.