Bug 79413

Summary: IndexedDB: Handle LevelDB database corruption
Product: WebKit Reporter: Joshua Bell <jsbell>
Component: WebKit Misc.Assignee: Joshua Bell <jsbell>
Status: RESOLVED FIXED    
Severity: Normal CC: dgrogan, hans, kinuko, michaeln, senorblanco, tony, tzik, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 80757    
Bug Blocks:    
Attachments:
Description Flags
Patch
none
Patch
none
Patch for landing
none
Patch
none
Patch for landing
none
Patch for landing
none
Patch for landing none

Description Joshua Bell 2012-02-23 15:54:32 PST
We open leveldb databases with the paranoid_checks option, which can detect some forms of corruption. Currently, the IDB implementation doesn't do anything in this case. We should do "something" here.

Recommendation is that if corruption is detected (db fails to open), use the DestroyDB to attempt resetting to a stable state. If this fails, give up. If the destroy succeeds, retry the open (just once, though, not a loop).

We should NOT use the RepairDB option, as this could leave the database in an unexpected state for applications, which is equivalent to corruption in most cases.
Comment 1 Joshua Bell 2012-02-23 16:04:48 PST
Created attachment 128585 [details]
Patch
Comment 2 Joshua Bell 2012-02-23 16:06:41 PST
Initial patch. Appears to work as intended when I hand corrupt the files (e.g. plaster some null bytes), although the "paranoid checks" didn't notice in one case when I truncated the main .log file.

I need to figure out where to wedge in tests for this.
Comment 3 David Grogan 2012-02-23 20:54:35 PST
Comment on attachment 128585 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=128585&action=review

> Source/WebCore/ChangeLog:8
> +        No new tests. (OOPS!)

Would testing this require a browser restart after writing a database but before corrupting it and trying to reopen?

> Source/WebCore/platform/leveldb/LevelDBDatabase.cpp:118
> +    options.env = leveldb::Env::Default();

I'd think you wouldn't need this line because of:
http://code.google.com/codesearch#OAMlx_jo-ck/src/third_party/leveldatabase/src/util/options.cc&exact_package=chromium&q=file:leveld%20case:yes%20Default&type=cs&l=17

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:166
> +                return 0;

Is there any significance to returning 0 here but returning PassRefPtr<LevelDBDatabase>() below?
Comment 4 Joshua Bell 2012-02-23 21:13:10 PST
(In reply to comment #3)
> (From update of attachment 128585 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=128585&action=review
> 
> > Source/WebCore/ChangeLog:8
> > +        No new tests. (OOPS!)
> 
> Would testing this require a browser restart after writing a database but before corrupting it and trying to reopen?

I was thinking a unit test rather than a layout test, since we'll need access to the file path. That should let us do it within one process since handles are closed when the LevelDBDatabase object is released.

> 
> > Source/WebCore/platform/leveldb/LevelDBDatabase.cpp:118
> > +    options.env = leveldb::Env::Default();
> 
> I'd think you wouldn't need this line because of:
> http://code.google.com/codesearch#OAMlx_jo-ck/src/third_party/leveldatabase/src/util/options.cc&exact_package=chromium&q=file:leveld%20case:yes%20Default&type=cs&l=17

Ah, good point.

> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:166
> > +                return 0;
> 
> Is there any significance to returning 0 here but returning PassRefPtr<LevelDBDatabase>() below?

Thanks, I meant to tidy that up. I put in a 0 because I thought that was more succinct, and was going to simplify the other cases but forgot.
Comment 5 Joshua Bell 2012-03-05 15:08:20 PST
Created attachment 130211 [details]
Patch
Comment 6 Tony Chang 2012-03-05 15:24:40 PST
Comment on attachment 130211 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=130211&action=review

> Source/WebCore/Modules/indexeddb/IDBLevelDBBackingStore.cpp:166
> +                LOG_ERROR("IndexedDB backing store cleanup failed");
> +                return PassRefPtr<IDBBackingStore>();

Nit: The flow might be a bit more clear if you early returned on failure and didn't indent the success case.  *shrug*

> Source/WebKit/chromium/tests/LevelDBTest.cpp:44
> +    virtual int compare(const LevelDBSlice& a, const LevelDBSlice& b) const

Nit: OVERRIDE?

> Source/WebKit/chromium/tests/LevelDBTest.cpp:49
> +    virtual const char* name() const { return "temp_comparator"; }

Nit: OVERRIDE?

> Source/WebKit/chromium/tests/LevelDBTest.cpp:89
> +    PlatformFileHandle handle = openFile(filepath, OpenForWrite);
> +    truncateFile(handle, 0);
> +    closeFile(handle);

Do we want to add tests for more subtle corruption like 0 bytes somewhere in the file?
Comment 7 Joshua Bell 2012-03-05 15:32:36 PST
(In reply to comment #6)
> (From update of attachment 130211 [details])
> 
> > Source/WebKit/chromium/tests/LevelDBTest.cpp:89
> > +    PlatformFileHandle handle = openFile(filepath, OpenForWrite);
> > +    truncateFile(handle, 0);
> > +    closeFile(handle);
> 
> Do we want to add tests for more subtle corruption like 0 bytes somewhere in the file?

I tried that; the openFile appears to truncate immediately (the truncateFile line above is redundant, actually, but I kept it in for clarity) - the file is opened with O_WRONLY on Posix, for example. It did not appear that the FileSystem API supported writing into an existing file. I actually started off investigating ways to enumerate the files and plunk 0s throughout, but directory enumeration turned out to be non-trivial. 

That said, the main point of the test is to ensure that when LevelDB does detect corruption (1) it is reported and (2) recovery can be attempted. More extensive tests around the forms of corruption (random bit flips, file truncations, etc) should properly be added into the leveldb project itself.

(I'll take care of the nits, though!)
Comment 8 Joshua Bell 2012-03-05 15:38:00 PST
Created attachment 130214 [details]
Patch for landing
Comment 9 WebKit Review Bot 2012-03-05 17:33:52 PST
Comment on attachment 130214 [details]
Patch for landing

Clearing flags on attachment: 130214

Committed r109825: <http://trac.webkit.org/changeset/109825>
Comment 10 WebKit Review Bot 2012-03-05 17:33:56 PST
All reviewed patches have been landed.  Closing bug.
Comment 11 Stephen White 2012-03-06 12:13:53 PST
Reverted r109825 for reason:

Broke webkit_unit_tests on Chromium Win

Committed r109946: <http://trac.webkit.org/changeset/109946>
Comment 12 Joshua Bell 2012-03-06 12:17:48 PST
"Seems to be only Win"

http://build.chromium.org/p/chromium.webkit/builders/Webkit%20Win/builds/12119/steps/webkit_unit_tests/logs/stdio

Relevant bits:


[----------] 1 test from LevelDBDatabaseTest
[ RUN      ] LevelDBDatabaseTest.CorruptionTest
.\tests\LevelDBTest.cpp(72): error: Value of: leveldb
  Actual: false
Expected: true
Backtrace:
	(No symbol) [0x00863E6E]
	(No symbol) [0x0050FCE3]

Line 72 is testing the initial open, before the corruption test itself runs. So something is not happy with LevelDB on Win - I'm guessing webkit_support::CreateScopedTempDirectory since that's not used elsewhere.

Will need to build/test on Windows to sort this out. I may re-land the patch with the test disabled on OS_WIN in the meantime if that proves problematic.
Comment 13 Joshua Bell 2012-03-08 11:56:31 PST
This turned out to be two problems. First, an encoding issue with the patch.

-    const char* path = tempDirectory->path().c_str();
+    const String path = String::fromUTF8(tempDirectory->path().c_str());

Second, what appears to be a LevelDB bug that would prevent leveldb::DestroyDB from succeeding on Windows:

http://code.google.com/p/leveldb/issues/detail?id=74

I'll re-upload the patch with the former issue fixed, but we can't land it until the LevelDB issue is resolved.
Comment 14 Joshua Bell 2012-03-08 12:04:40 PST
Created attachment 130869 [details]
Patch
Comment 15 Joshua Bell 2012-03-09 14:02:11 PST
Created attachment 131102 [details]
Patch for landing
Comment 16 WebKit Review Bot 2012-03-09 19:10:26 PST
Comment on attachment 131102 [details]
Patch for landing

Clearing flags on attachment: 131102

Committed r110363: <http://trac.webkit.org/changeset/110363>
Comment 17 WebKit Review Bot 2012-03-09 19:10:32 PST
All reviewed patches have been landed.  Closing bug.
Comment 18 Joshua Bell 2012-03-10 07:39:27 PST
Broke the chromium build, see: https://bugs.webkit.org/show_bug.cgi?id=80757
Comment 19 Joshua Bell 2012-03-10 07:54:26 PST
The chromium component build can't link to webkit_support, so webkit_unit_test files that do use webkit_support are excluded from the component build, in Source/WebKit/chromium/WebKit.gyp: 

 'sources!': [
                                # We should not include files depending on webkit_support.
                                # These tests depend on webkit_support and
                                # functions defined only in !WEBKIT_IMPLEMENTATION.
                                'tests/AssociatedURLLoaderTest.cpp',
                                'tests/FrameTestHelpers.cpp',
                                'tests/PopupMenuTest.cpp',
                                'tests/RenderTableCellTest.cpp',
                                'tests/WebFrameTest.cpp',
                                'tests/WebPageNewSerializerTest.cpp',
                                'tests/WebPageSerializerTest.cpp',
                                'tests/WebViewTest.cpp',
                            ],

Need to add a tests/LevelDBTest.cpp entry here. I'll wait until Monday when I'm actually around to watch, though, and see about a try-bot run.
Comment 20 Joshua Bell 2012-03-12 11:51:20 PDT
Created attachment 131369 [details]
Patch for landing
Comment 21 Joshua Bell 2012-03-12 11:56:10 PDT
Created attachment 131371 [details]
Patch for landing
Comment 22 WebKit Review Bot 2012-03-12 12:57:15 PDT
Comment on attachment 131371 [details]
Patch for landing

Clearing flags on attachment: 131371

Committed r110463: <http://trac.webkit.org/changeset/110463>
Comment 23 WebKit Review Bot 2012-03-12 12:57:21 PDT
All reviewed patches have been landed.  Closing bug.