Bug 65862 - [Chromium] storage/indexeddb tests sometimes timeout
Summary: [Chromium] storage/indexeddb tests sometimes timeout
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering (show other bugs)
Version: 528+ (Nightly build)
Hardware: All OS X 10.5
: P2 Normal
Assignee: Joshua Bell
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-08-08 10:56 PDT by Jian Li
Modified: 2013-04-08 14:27 PDT (History)
8 users (show)

See Also:


Attachments
Patch (1.60 KB, patch)
2012-02-06 11:37 PST, Joshua Bell
no flags Details | Formatted Diff | Diff
Patch (85.21 KB, patch)
2012-10-03 11:34 PDT, Joshua Bell
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jian Li 2011-08-08 10:56:38 PDT
storage/indexeddb/index-cursor.html timeouts on Mac Debug
Comment 1 Joshua Bell 2012-01-27 13:56:51 PST
Does this still occur? 

I'm fairly sure I know what this was, and it should be fixed now.
Comment 2 Joshua Bell 2012-02-06 11:37:47 PST
Created attachment 125677 [details]
Patch
Comment 3 Joshua Bell 2012-02-06 11:38:30 PST
tony@ - r?
Comment 4 Tony Chang 2012-02-06 11:49:17 PST
FWIW, changes that only touch test_expectations.txt can be landed unreviewed (but make sure to lint the test files!).
Comment 5 Joshua Bell 2012-02-06 11:54:19 PST
Comment on attachment 125677 [details]
Patch

Thanks!
Comment 6 WebKit Review Bot 2012-02-06 14:55:44 PST
Comment on attachment 125677 [details]
Patch

Clearing flags on attachment: 125677

Committed r106857: <http://trac.webkit.org/changeset/106857>
Comment 7 WebKit Review Bot 2012-02-06 14:55:48 PST
All reviewed patches have been landed.  Closing bug.
Comment 8 Joshua Bell 2012-02-07 10:53:37 PST
This is still timing out:

http://test-results.appspot.com/dashboards/flakiness_dashboard.html#showExpectations=true&tests=storage%2Findexeddb%2Findex-cursor.html

Will re-mark it as TIMEOUT and investigate further.
Comment 9 Joshua Bell 2012-02-07 12:01:44 PST
Marked as slow again:

BUGWK65862 MAC DEBUG SLOW : storage/indexeddb/index-cursor.html = PASS

From looking at the output, it appears this is truly just a slow test that trips the timeout on the MAC DEBUG combination. It does several iterations over a dataset, but it's not fetching enough data in each test to take advantage of the prefetch cache and the cache isn't saved between runs.

We'll probably have to keep this marked as SLOW until more performance work is done.
Comment 10 Joshua Bell 2012-02-07 12:02:15 PST
For reference, the re-marking was: http://trac.webkit.org/changeset/106971
Comment 11 Ojan Vafai 2012-10-01 11:52:12 PDT
Turning this into a meta bug for the indexeddb tests being slow. I'm marking them all as slow.

http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=storage%2Findexeddb

A large number of them timeout. It sounds like there are some performance problems that will be resolved and then we can unmark them as slow?
Comment 12 Joshua Bell 2012-10-01 11:59:55 PDT
(In reply to comment #11)
> Turning this into a meta bug for the indexeddb tests being slow. I'm marking them all as slow.
> 
> http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=storage%2Findexeddb
> 
> A large number of them timeout. It sounds like there are some performance problems that will be resolved and then we can unmark them as slow?

Thanks!

What seems to be common to the layout tests that are timing out (other than the index-cursor, which just does WAY too much work and produces a huge amount of output) is that they open/close multiple database connections. This seems to intermittently take much more time than expected, causing the timeout.

We're not sure at this point if it's something to do with running the tests in DRT or a real problem, and of course the issue disappears the minute we look closely at it.
Comment 13 Joshua Bell 2012-10-03 11:34:14 PDT
Created attachment 166929 [details]
Patch
Comment 14 Joshua Bell 2012-10-03 11:37:19 PDT
ojan@ - r? storage/indexeddb/index-cursor.html ?

(This is obviously not a comprehensive fix, but that particular test is atrocious.)
Comment 15 Joshua Bell 2012-10-03 11:38:44 PDT
(In reply to comment #14)
> ojan@ - r? storage/indexeddb/index-cursor.html ?

That should read:

ojan@ - r? on reducing the output of storage/indexeddb/index-cursor.html ?
Comment 16 WebKit Review Bot 2012-10-03 15:41:18 PDT
Comment on attachment 166929 [details]
Patch

Clearing flags on attachment: 166929

Committed r130334: <http://trac.webkit.org/changeset/130334>
Comment 17 WebKit Review Bot 2012-10-03 15:41:22 PDT
All reviewed patches have been landed.  Closing bug.
Comment 18 Joshua Bell 2012-10-03 15:47:00 PDT
Re-opening, as this has become a tracking bug for general storage/indexeddb test slowness.
Comment 19 Joshua Bell 2012-10-05 13:35:22 PDT
FYI, storage/indexeddb/index-cursor.html run times have dropped from 12s/16s/25s linux-debug/win-debug/mac-debug respectively to 2s/3s/2s. w00t!
Comment 20 Joshua Bell 2012-10-08 09:18:35 PDT
Root cause of the slow tests appears to be a leveldb issue:

http://code.google.com/p/leveldb/issues/detail?id=125

Each open() or deleteDatabase() could take an additional second(!) to execute. The persnickety tests that open/delete multiple times would be the most affected, which is what #c12 points to.
Comment 21 Joshua Bell 2012-10-15 12:24:52 PDT
If the leveldb patch helps, I would expect times like this:

http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=%40ToT%20-%20webkit.org&tests=storage%2Findexeddb%2Fintversion-encoding.html

...to drop once chromium r161754 rolls into WebKit.
Comment 22 Joshua Bell 2012-10-15 15:13:20 PDT
(In reply to comment #21)
> If the leveldb patch helps, I would expect times like this:
> 
> http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=%40ToT%20-%20webkit.org&tests=storage%2Findexeddb%2Fintversion-encoding.html
> 
> ...to drop once chromium r161754 rolls into WebKit.

... which it did in http://trac.webkit.org/changeset/131332 and the times are now all < 1 second. Fingers crossed...
Comment 23 Joshua Bell 2012-10-16 09:58:25 PDT
Ojan - I'll let you make the call on un-marking the tests as slow, but the flakiness dashboard looks promising.
Comment 24 Ojan Vafai 2012-10-16 10:52:26 PDT
Overall the performance is much much better. Thanks!

Unfortunately the upstream Chromium Linux Release bot seems to still have the occasional slow run for most tests. If they weren't marked slow, they would have timed out (any time >= 6 would be a timeout on release Chromium bots). Many of the timeouts are after http://trac.webkit.org/changeset/131332.

The vast majority of the time, all these tests take <1 second on this bot. So, it seems like there might be some fishy locking or something going on for each of these tests to intermittently get 10x slower.

http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=%40ToT%20-%20webkit.org&tests=storage%2Findexeddb
Comment 25 Joshua Bell 2012-10-17 09:34:43 PDT
By dropping the timeout to 3s locally and running all storage/indexeddb tests under fully-parallel (32 workers), I was able to semi-reliably get timeouts and add some instrumentation. 

It looks like the symptom for e.g. key-generator.html that the deleteDatabase() call can take around half a second when the system is loaded. This is all going into the commit of the delete write-batch, which is forced to be synchronous on disk via leveldb writeOptions.sync = true which forces an fflush. That would be the "lock" that ojan speculated about.

Setting writeOptions.sync = false drops the times significantly.

Two approaches come to mind:

* Add an Internals API that lets us turn off sync writes for tests
* Switch back to using in-memory backing stores but come up with a mechanism to retain them for the lifetime of a test.
Comment 26 Alec Flett 2012-10-17 12:08:31 PDT
I like 1) - and I wonder if we could synchronously delete the database metadata, but asynchronously delete the data? It may not actually help that much though.
Comment 27 Joshua Bell 2012-10-17 15:20:33 PDT
(In reply to comment #26)
> I wonder if we could synchronously delete the database metadata, but asynchronously delete the data? It may not actually help that much though.

When this is running in DumpRenderTree I don't think we gain much from it being async.
Comment 28 Dimitri Glazkov (Google) 2012-11-13 10:13:34 PST
Looks like now it's just flakey.
Comment 29 Joshua Bell 2013-02-13 12:41:41 PST
FWIW, I moved my /tmp dir to a ramdisk. All 200 IDB layout tests now take only about 2 seconds to run since disk seek time is no longer relevant.

Perhaps we should configure the bots this way?