Bug 57372

Summary: LevelDB backend for IndexedDB
Product: WebKit Reporter: Jeremy Orlow <jorlow>
Component: New BugsAssignee: Hans Wennborg <hans>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, andreip, dglazkov, dgrogan, dgrogan, eric, jorlow, steveblock, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Bug Depends on: 58343    
Bug Blocks:    
Attachments:
Description Flags
patch by hans
none
patch by hans
none
full patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch steveblock: review+

Description Jeremy Orlow 2011-03-29 11:58:31 PDT
LevelDB backend for IndexedDB
Comment 1 Jeremy Orlow 2011-03-29 11:59:00 PDT
Created attachment 87384 [details]
patch by hans
Comment 2 Jeremy Orlow 2011-03-29 12:02:16 PDT
Created attachment 87386 [details]
patch by hans
Comment 3 WebKit Review Bot 2011-03-29 12:03:04 PDT
Attachment 87384 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files']" exit_code: 1

Total errors found: 0 in 0 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 David Grogan 2011-03-29 12:38:54 PDT
Just a few bike-sheddy questions.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:84
> +static int64_t decodeInt(const std::string& s)

It appears leveldb only stores strings and these functions convert our data types to types that leveldb can store?

If these are good encoding routines would it be appropriate to migrate them to the leveldb side?

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:128
> +        unsigned char hi = (u >> 8) & 0xff;

what does &ing with 0xff do?
Comment 5 Jeremy Orlow 2011-03-29 17:22:52 PDT
Comment on attachment 87386 [details]
patch by hans

View in context: https://bugs.webkit.org/attachment.cgi?id=87386&action=review

ok...that's enough for one day

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:29
> +#if ENABLE(INDEXED_DATABASE)

This should have some additional feature define for leveldb.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:41
> +

This file needs a big block comment describing the file format and a couple sentences about the design goals.  There's no way the code can be the documentation for something like this...

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:63
> +    for (size_t i = 0; i < s.length(); i++) {

++i

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:64
> +        unsigned char c = s[i];

this seems slightly unnecessary, but *shrug*

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:65
> +        printf("%02x", (int)c);

static_cast?

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:73
> +    std::string ret;

maybe resize?

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:88
> +    for (size_t i = 0; i  < s.length(); i++) {

++i

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:90
> +        ret = ret << 8 | c;

<<=

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:101
> +        unsigned char c = n & (128 - 1);

0x7F maybe is a bit more clear?

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:111
> +static const char* decodeVarInt(const char *p, int64_t &foundInt)

Anywhere that we pass a pointer, we need to pass the max size.  If this crosses the size, it should probably return NULL.  And the caller should probably handle it.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:116
> +        foundInt = (foundInt << 7) | (*p & (128 - 1));

0x7F seems more clear/standard

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:129
> +        unsigned char lo = u & 0xff;

earlier you skipped the & 0xff...maybe be consistent?

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:144
> +    for (size_t i = 0; i < s.length(); i+=2) {

i += 2

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:193
> +    std::string ret;

Please document with FIXMEs all the places we should be sizing a string ahead of time.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:241
> +static int countBytes(int64_t n)

The name isn't super clear here.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:268
> +static const char* decodePrefix(const char* p, int64_t& databaseId, int64_t& objectStoreId, int64_t& indexId)

Here and elsewhere, we need to be more careful to ensure that we're not reading invalid p memory.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:274
> +    int indexBytes = (firstByte & 0x7) + 1;

this is 0x3 right?

Anything else you can think of that might be off like this?  Can you check?  Cause that'd be bad...

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:308
> +static std::string makeKey(int64_t databaseId, int64_t objectStoreId, int64_t indexId, int nbr, const String& str)

s/nbr/number/...and what number?

These make key methods are kind of confusing and seem somewhat ad-hoc...  Like, if you do a number and then a string, the string is encoded with its length.  Otherwise not.  I feel like you should probably name these proper names rather than just overloading makeKey since subtleties creep in with each overloaded version.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:356
> +    else if (!s.ok()) {

if you just returned, then do an if, not else if

of course, this coudl jsut be an ASSERT

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:379
> +static bool getString(leveldb::DB* db, const std::string& key, String* foundString)

I feel like the names might not be descriptive enough?  It's not clear that, for example, I wouldn't use this to get a normal key out of the object store data that happens to be a string.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:389
> +    else if (!s.ok()) {

ASSERT

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:414
> +class MyComparator : public leveldb::Comparator {

Couldn't we just call it Comparator?

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:445
> +        if (*p != *q) return *p - *q; // FIXME: Think this through.

Yeah, this will screw up locality....  It seems worth doing the initial decode always

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:457
> +        if (a1 == 0) {

This logic would be so much easier to follow if you had good variable names.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:618
> +    void FindShortestSeparator(std::string*, const leveldb::Slice&) const {} // FIXME: Ignore?

No, we should definitely implement this and the following.  Unfortunately, I don't have any great ideas about how not to nearly duplicate all the above logic for each one.  A lot of the ways I could think of to make it cleaner would likely make it slower as well.  Maybe just ignore for now, but "fix" it once we're sure our file format is stable and add a comment saying that whenever one is changed, all need to be?  It's going to be hard to catch errors in this with layout tests tho....

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:620
> +} myComparator;

couldn't we just call it comparator?

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:651
> +        pathBase2 = "/tmp/in-mem-leveldb";

ASSERT_NOT_REACHED here until we figure out what we should do.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:655
> +        // FIXME: Is there any other thing we could possibly do to recover at this point? If so, do it rather than just erroring out.

I think thi sis fine.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:656
> +        LOG_ERROR("Unabled to create LocalStorage database path %s", pathBase.utf8().data());

LocalStorage?

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:659
> +    String path = pathByAppendingComponent(pathBase2, securityOrigin->databaseIdentifier() + ".indexeddb.leveldb");

We're going to have one big one shared eventually...but I guess this is OK for now.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:665
> +    // FIXME: We need a custom comparator. Byte-wise comparison won't work :(

delete

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:1574
> +    IndexCursorImpl(leveldb::DB* db, const std::string& lowKey, bool lowOpen, const std::string& highKey, bool highOpen, bool forward)

I know this class is private, but you shoudl probably hide the constructor and add a create method.

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:1594
> +    //printf("IndexCursor::loadCurrentRow()\n");

If you think this debug code will still be useful, I'm OK with it staying in a bit longer as long as you have a FIXME at the top to remove it soon.  Ideally remove it before landing though or make it use WebKit's standard logging methods (...which I think exist somewhere...?).

> /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:1599
> +    const char* p = m_iterator->key().data();

not a fan of these variable names.  or p or q
Comment 6 Hans Wennborg 2011-03-30 03:25:04 PDT
Created attachment 87495 [details]
full patch
Comment 7 Hans Wennborg 2011-03-30 03:26:46 PDT
Thanks for reviewing, Jeremy. I've uploaded the full patch so it's easier to see how it all fits together.

Will start addressing your comments during the day.
Comment 8 WebKit Review Bot 2011-03-30 03:27:35 PDT
Attachment 87495 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1

Last 3072 characters of output:
nd comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1389:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1390:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1391:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1392:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1406:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1459:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1460:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1461:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1462:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1463:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1464:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1515:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1516:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1517:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1553:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1554:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1555:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1594:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1595:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1596:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1624:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1639:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
Source/WebCore/storage/IDBObjectStoreBackendImpl.cpp:301:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 189 in 17 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 9 Hans Wennborg 2011-03-30 10:30:49 PDT
Adding andreip to keep him in the loop to.


(In reply to comment #4)
> Just a few bike-sheddy questions.
> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:84
> > +static int64_t decodeInt(const std::string& s)
> 
> It appears leveldb only stores strings and these functions convert our data types to types that leveldb can store?
> 
> If these are good encoding routines would it be appropriate to migrate them to the leveldb side?

Yeah, we could look into that when this code is more stable (the functions might still change a bit).

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:128
> > +        unsigned char hi = (u >> 8) & 0xff;
> 
> what does &ing with 0xff do?

I think that was just me being paranoid :) Removed.

(In reply to comment #5)
> (From update of attachment 87386 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=87386&action=review
> 
> ok...that's enough for one day
> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:29
> > +#if ENABLE(INDEXED_DATABASE)
> 
> This should have some additional feature define for leveldb.
I've now put it behind ENABLE(LEVELDB).

> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:41
> > +
> 
> This file needs a big block comment describing the file format and a couple sentences about the design goals.  There's no way the code can be the documentation for something like this...

Yup, will do this.

I'd also like to write some class that can represent, encode, and decode all keys rather than having that stuff spread out everywhere. 

It should make the code much more clear, and it could probably be made more efficient too.

I'll see if I can get something together tomorrow.

> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:63
> > +    for (size_t i = 0; i < s.length(); i++) {
> 
> ++i
Done.

> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:64
> > +        unsigned char c = s[i];
> 
> this seems slightly unnecessary, but *shrug*
Yeah, won't land that anyway.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:65
> > +        printf("%02x", (int)c);
> 
> static_cast?
Ditto.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:73
> > +    std::string ret;
> 
> maybe resize?
Yes, I'll look into this.

Also, does anyone know which is fastest, or otherwise preferred?

string s;
s.reserve(some_size);

Or:

string s(some_size, '\0');

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:88
> > +    for (size_t i = 0; i  < s.length(); i++) {
> 
> ++i
Done.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:90
> > +        ret = ret << 8 | c;
> 
> <<=
Nah, I want to or in c at the same time..?
Adding parentheses to make this more clear.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:101
> > +        unsigned char c = n & (128 - 1);
> 
> 0x7F maybe is a bit more clear?
Done.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:111
> > +static const char* decodeVarInt(const char *p, int64_t &foundInt)
> 
> Anywhere that we pass a pointer, we need to pass the max size.  If this crosses the size, it should probably return NULL.  And the caller should probably handle it.

Adding the max size (but in form of a "limit" pointer). Checking return values in some places, but will add more checks.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:116
> > +        foundInt = (foundInt << 7) | (*p & (128 - 1));
> 
> 0x7F seems more clear/standard
Done.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:129
> > +        unsigned char lo = u & 0xff;
> 
> earlier you skipped the & 0xff...maybe be consistent?
Done.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:144
> > +    for (size_t i = 0; i < s.length(); i+=2) {
> 
> i += 2
Rewrote that code.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:193
> > +    std::string ret;
> 
> Please document with FIXMEs all the places we should be sizing a string ahead of time.
Done.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:241
> > +static int countBytes(int64_t n)
> 
> The name isn't super clear here.
Okay, trying countNonZeroBytes.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:268
> > +static const char* decodePrefix(const char* p, int64_t& databaseId, int64_t& objectStoreId, int64_t& indexId)
> 
> Here and elsewhere, we need to be more careful to ensure that we're not reading invalid p memory.
> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:274
> > +    int indexBytes = (firstByte & 0x7) + 1;
> 
> this is 0x3 right?
Ouch, you're right. Good catch.

> 
> Anything else you can think of that might be off like this?  Can you check?  Cause that'd be bad...
Not that I can think of, but I will keep looking. Btw, is there any way we could test this in our existing suite? Can webkit_unittests reach into WebCore, for example?

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:308
> > +static std::string makeKey(int64_t databaseId, int64_t objectStoreId, int64_t indexId, int nbr, const String& str)
> 
> s/nbr/number/...and what number?
> 
> These make key methods are kind of confusing and seem somewhat ad-hoc...  Like, if you do a number and then a string, the string is encoded with its length.  Otherwise not.  I feel like you should probably name these proper names rather than just overloading makeKey since subtleties creep in with each overloaded version.

Yeah, this needs to be structured better.. it should all go into that key representation/encoding/decoding thing I'm thinking of.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:356
> > +    else if (!s.ok()) {
> 
> if you just returned, then do an if, not else if
> 
> of course, this coudl jsut be an ASSERT
Fixed this.

I think the style bot checks for this.. I'll address all those issues before uploading.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:379
> > +static bool getString(leveldb::DB* db, const std::string& key, String* foundString)
> 
> I feel like the names might not be descriptive enough?  It's not clear that, for example, I wouldn't use this to get a normal key out of the object store data that happens to be a string.

Yeah, I'm not sure it's worth keeping these (getString/putString/getInt/putInt) around actually..

Would renaming to getWTFString be better for now?

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:389
> > +    else if (!s.ok()) {
> 
> ASSERT
Done.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:414
> > +class MyComparator : public leveldb::Comparator {
> 
> Couldn't we just call it Comparator?

Sure. Done.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:445
> > +        if (*p != *q) return *p - *q; // FIXME: Think this through.
> 
> Yeah, this will screw up locality....  It seems worth doing the initial decode always

How do you mean screw up locality?

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:457
> > +        if (a1 == 0) {
> 
> This logic would be so much easier to follow if you had good variable names.

Yes. I want to rewrite this in the context of a class that actually knows what the data means. That would make it easier to assign variable names.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:618
> > +    void FindShortestSeparator(std::string*, const leveldb::Slice&) const {} // FIXME: Ignore?
> 
> No, we should definitely implement this and the following.  Unfortunately, I don't have any great ideas about how not to nearly duplicate all the above logic for each one.  A lot of the ways I could think of to make it cleaner would likely make it slower as well.  Maybe just ignore for now, but "fix" it once we're sure our file format is stable and add a comment saying that whenever one is changed, all need to be?  It's going to be hard to catch errors in this with layout tests tho....

Changing the FIXME.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:620
> > +} myComparator;
> 
> couldn't we just call it comparator?

Done.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:651
> > +        pathBase2 = "/tmp/in-mem-leveldb";
> 
> ASSERT_NOT_REACHED here until we figure out what we should do.

But I need this to be able to run DumpRenderTree. Adding a FIXME.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:655
> > +        // FIXME: Is there any other thing we could possibly do to recover at this point? If so, do it rather than just erroring out.
> 
> I think thi sis fine.
Ok.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:656
> > +        LOG_ERROR("Unabled to create LocalStorage database path %s", pathBase.utf8().data());
> 
> LocalStorage?
Heh, I think this has been copied multiple times :) Fixing.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:659
> > +    String path = pathByAppendingComponent(pathBase2, securityOrigin->databaseIdentifier() + ".indexeddb.leveldb");
> 
> We're going to have one big one shared eventually...but I guess this is OK for now.

Added FIXME.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:665
> > +    // FIXME: We need a custom comparator. Byte-wise comparison won't work :(
> 
> delete
Done.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:1574
> > +    IndexCursorImpl(leveldb::DB* db, const std::string& lowKey, bool lowOpen, const std::string& highKey, bool highOpen, bool forward)
> 
> I know this class is private, but you shoudl probably hide the constructor and add a create method.

Happy to do it, but I thought that was mostly a Chromium WebKit-layer thing? Or does it apply all over?

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:1594
> > +    //printf("IndexCursor::loadCurrentRow()\n");
> 
> If you think this debug code will still be useful, I'm OK with it staying in a bit longer as long as you have a FIXME at the top to remove it soon.  Ideally remove it before landing though or make it use WebKit's standard logging methods (...which I think exist somewhere...?).

Nah, removing. Keeping the printString function around for a little while, though.

> 
> > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:1599
> > +    const char* p = m_iterator->key().data();
> 
> not a fan of these variable names.  or p or q

Yeah, I need to do something about the key handling.
Comment 10 Hans Wennborg 2011-03-30 10:31:29 PDT
Created attachment 87566 [details]
Patch
Comment 11 Jeremy Orlow 2011-03-30 10:52:08 PDT
(In reply to comment #9)
> (In reply to comment #4)
> > This should have some additional feature define for leveldb.
> I've now put it behind ENABLE(LEVELDB).

You need both LEVELDB and INDEXED_DATABASE

> > > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cp
> Also, does anyone know which is fastest, or otherwise preferred?
> 
> string s;
> s.reserve(some_size);

I'd be surprised if this weren't the same or faster than the other.  It seems cleaner too.

> > > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:241
> > > +static int countBytes(int64_t n)
> > 
> > The name isn't super clear here.
> Okay, trying countNonZeroBytes.

But 0 returns 1...so the name is not quite right.  Maybe just usedBytes with a comment that 0 uses 1?


> > Anything else you can think of that might be off like this?  Can you check?  Cause that'd be bad...
> Not that I can think of, but I will keep looking. Btw, is there any way we could test this in our existing suite? Can webkit_unittests reach into WebCore, for example?

I don't see why not.  We should only use this for stuff that layout tests can't cover, but this is definitely one of them.

> > > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:379
> > > +static bool getString(leveldb::DB* db, const std::string& key, String* foundString)
> > 
> > I feel like the names might not be descriptive enough?  It's not clear that, for example, I wouldn't use this to get a normal key out of the object store data that happens to be a string.
> 
> Yeah, I'm not sure it's worth keeping these (getString/putString/getInt/putInt) around actually..
> 
> Would renaming to getWTFString be better for now?

I think you're right that all of this just needs to be in its own class (or namespace) and such.  That should help make it more clear.

> > > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:445
> > > +        if (*p != *q) return *p - *q; // FIXME: Think this through.
> > 
> > Yeah, this will screw up locality....  It seems worth doing the initial decode always
> 
> How do you mean screw up locality?

Hm...last night I thought I had come up with an example where this would make the sort order different than if you just compared the following 3 numbers. But now that I think about it, it seems like it shouldn't affect it.  As long as it doesn't, this is fine.


> > 
> > > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:651
> > > +        pathBase2 = "/tmp/in-mem-leveldb";
> > 
> > ASSERT_NOT_REACHED here until we figure out what we should do.
> 
> But I need this to be able to run DumpRenderTree. Adding a FIXME.

Hmm....well we definitely want DRT to be able to run the file backed version as well as the in-memory version if there's any substantial differences.  We may need a layoutTestController method to control this?

> > 
> > > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:1574
> > > +    IndexCursorImpl(leveldb::DB* db, const std::string& lowKey, bool lowOpen, const std::string& highKey, bool highOpen, bool forward)
> > 
> > I know this class is private, but you shoudl probably hide the constructor and add a create method.
> 
> Happy to do it, but I thought that was mostly a Chromium WebKit-layer thing? Or does it apply all over?

It's a protect ourselves from memory leaks thing.  It's too easy to forget adoptRef.
Comment 12 Hans Wennborg 2011-03-31 10:38:25 PDT
It's still a work in progress, but I've started working on refactoring the key handling stuff.

The idea is to have a class for every key type, that knows how to encode, decode, and compare keys of its type.

I've implemented the classes, and rewritten the Comparator to use it, but still need to replace all the places in the code where keys are created. Hoping to get this done tomorrow.

Please let me know what you think.



(In reply to comment #11)
> (In reply to comment #9)
> > (In reply to comment #4)
> > > This should have some additional feature define for leveldb.
> > I've now put it behind ENABLE(LEVELDB).
> 
> You need both LEVELDB and INDEXED_DATABASE
Done.

> 
> > > > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cp
> > Also, does anyone know which is fastest, or otherwise preferred?
> > 
> > string s;
> > s.reserve(some_size);
> 
> I'd be surprised if this weren't the same or faster than the other.  It seems cleaner too.
> 
> > > > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:241
> > > > +static int countBytes(int64_t n)
> > > 
> > > The name isn't super clear here.
> > Okay, trying countNonZeroBytes.
> 
> But 0 returns 1...so the name is not quite right.  Maybe just usedBytes with a comment that 0 uses 1?
Will do.

> 
> 
> > > Anything else you can think of that might be off like this?  Can you check?  Cause that'd be bad...
> > Not that I can think of, but I will keep looking. Btw, is there any way we could test this in our existing suite? Can webkit_unittests reach into WebCore, for example?
> 
> I don't see why not.  We should only use this for stuff that layout tests can't cover, but this is definitely one of them.

Good, I'll write some tests for this.

> 
> > > > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:379
> > > > +static bool getString(leveldb::DB* db, const std::string& key, String* foundString)
> > > 
> > > I feel like the names might not be descriptive enough?  It's not clear that, for example, I wouldn't use this to get a normal key out of the object store data that happens to be a string.
> > 
> > Yeah, I'm not sure it's worth keeping these (getString/putString/getInt/putInt) around actually..
> > 
> > Would renaming to getWTFString be better for now?
> 
> I think you're right that all of this just needs to be in its own class (or namespace) and such.  That should help make it more clear.
> 
> > > > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:445
> > > > +        if (*p != *q) return *p - *q; // FIXME: Think this through.
> > > 
> > > Yeah, this will screw up locality....  It seems worth doing the initial decode always
> > 
> > How do you mean screw up locality?
> 
> Hm...last night I thought I had come up with an example where this would make the sort order different than if you just compared the following 3 numbers. But now that I think about it, it seems like it shouldn't affect it.  As long as it doesn't, this is fine.
> 
> 
> > > 
> > > > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:651
> > > > +        pathBase2 = "/tmp/in-mem-leveldb";
> > > 
> > > ASSERT_NOT_REACHED here until we figure out what we should do.
> > 
> > But I need this to be able to run DumpRenderTree. Adding a FIXME.
> 
> Hmm....well we definitely want DRT to be able to run the file backed version as well as the in-memory version if there's any substantial differences.  We may need a layoutTestController method to control this?

Yeah, I'll have to look into this.

> 
> > > 
> > > > /Users/jorlow/Desktop/IDBLevelDBBackingStore.cpp:1574
> > > > +    IndexCursorImpl(leveldb::DB* db, const std::string& lowKey, bool lowOpen, const std::string& highKey, bool highOpen, bool forward)
> > > 
> > > I know this class is private, but you shoudl probably hide the constructor and add a create method.
> > 
> > Happy to do it, but I thought that was mostly a Chromium WebKit-layer thing? Or does it apply all over?
> 
> It's a protect ourselves from memory leaks thing.  It's too easy to forget adoptRef.

Ok, will fix.
Comment 13 Hans Wennborg 2011-03-31 10:38:54 PDT
Created attachment 87761 [details]
Patch
Comment 14 Hans Wennborg 2011-04-01 10:15:47 PDT
Created attachment 87877 [details]
Patch

Uploading the latest changes
Comment 15 Hans Wennborg 2011-04-12 08:35:27 PDT
Created attachment 89206 [details]
Patch
Comment 16 Hans Wennborg 2011-04-12 08:38:45 PDT
I've just uploaded a new version of this patch. It's significantly smaller, as it's rebased on top of the patch from Bug 57827.

All encoding, decoding, and comparison of keys are now handled by designated key classes, and I've tried to clean it up as much as possible in general.

I know it's a huge patch, but your input is very much appreciated.
Comment 17 WebKit Review Bot 2011-04-12 08:54:22 PDT
Attachment 89206 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/8396010
Comment 18 Hans Wennborg 2011-04-12 09:14:33 PDT
(In reply to comment #17)
> Attachment 89206 [details] did not build on chromium:
> Build output: http://queues.webkit.org/results/8396010

It fails to build because the leveldb library hasn't been added to the WebKit/chromium/DEPS file.
Comment 19 Eric Seidel (no email) 2011-04-12 10:25:07 PDT
Attachment 89206 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/8392478
Comment 20 Jeremy Orlow 2011-04-12 10:26:19 PDT
(In reply to comment #18)
> (In reply to comment #17)
> > Attachment 89206 [details] [details] did not build on chromium:
> > Build output: http://queues.webkit.org/results/8396010
> 
> It fails to build because the leveldb library hasn't been added to the WebKit/chromium/DEPS file.

It probably should be added in this patch.
Comment 21 Hans Wennborg 2011-04-12 10:30:50 PDT
(In reply to comment #20)
> (In reply to comment #18)
> > (In reply to comment #17)
> > > Attachment 89206 [details] [details] [details] did not build on chromium:
> > > Build output: http://queues.webkit.org/results/8396010
> > 
> > It fails to build because the leveldb library hasn't been added to the WebKit/chromium/DEPS file.
> 
> It probably should be added in this patch.

I already uploaded a patch in Bug 58343.
Comment 22 Jeremy Orlow 2011-04-12 11:41:14 PDT
Comment on attachment 89206 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=89206&action=review

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:46
> +// FIXME: Do this more properly.

Why is this necessary?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:53
> +// LevelDB stores key/value pairs. Keys and values are strings of bytes, normally of type std::string.

Explicitly mention that the std::strings don't have any particular encoding and usually aren't null terminated.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:56
> +// of fields. Each key in the backing store starts with a ternary prefix: (database id, object store id, index id). For each, 0 is reserved for meta-data.

Explain that ordering is important for locality.  Maybe briefly talk about the groupings that are most important to preserve and why.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:67
> +// Database meta-data:

Mention the 4th entry is a byte.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:74
> +// Object store meta-data:

Mention the 4th entry is a byte, 5th is variable length, 6th is byte.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:83
> +// Index meta-data:

Mention the 4th entry is a byte, 5th is variable length, 6th is variable length, 7th is byte.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:91
> +//     <database id, 0, 0, 150, object store id>                   => existence implies the object store id is in the free list [ObjectStoreFreeListKey]

Mention the 4th entry is a byte, 5th is variable length.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:98
> +//     <database id, object store id, 1, user key> => "version", serialized script value [ObjectStoreDataKey]

Mention the 4th entry is a byte.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:102
> +//     <database id, object store id, 2, user key> => "version" [ExistsEntryKey]

Mention the 4th is a byte.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:106
> +//     <database id, object store id, index id, user key, sequence number> => "version", user key [IndexDataKey]

4th is variable length encoded....so this means that the variable length encoding must always place 0-30 (or however many we reserve) before the first index id.

It might be worth changing the schema a bit so this isn't required though.  Given that 99.99% of the data will be object store data or index data, maybe instead of every key having <var, var, var> every field should have <var, var, byte, var>?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:114
> +//     collected.)

Explain how they're GCed.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:136
> +static std::string maxIDBKey()

Not sure this name is good.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:138
> +    return encodeByte(5); // Null type.

Why not 39?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:141
> +static std::string minIDBKey()

ditto

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:173
> +static std::string encodeVarInt(int64_t n)

Once you've done an optimizing pass on this stuff, you should ask Jeff/Sanjay to take a look.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:188
> +static const char* decodeVarInt(const char *p, const char* limit, int64_t* foundInt)

s/*/&/ is the webkit way

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:206
> +    for (unsigned i = 0; i < s.length(); ++i) {

Maybe comment that this attempts to be endian independent?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:242
> +static const char* decodeStringWithLen(const char* p, const char* limit, String* foundString)

s/*/&/ is the webkit way

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:260
> +    char* p = reinterpret_cast<char*>(&x);

Will this be different on different platforms?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:262
> +        ret.append(1, p[i]);

Can you append more than one at a time?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:271
> +    char* x = reinterpret_cast<char*>(d);

I wonder how the compiler handles this register wise.  I wonder if it'd be faster to create an array, decode into that, and then do one cast at the end?  (Same above?)

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:283
> +        return encodeByte(5);

Probably better to make these things constants?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:296
> +    }

Does this compile on all platforms?  You may need an assert not reached and a dummy return.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:299
> +static const char* decodeIDBKey(const char* p, const char* limit, RefPtr<IDBKey>* foundKey)

s/*/&/ is the webkit way

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:306
> +    String s;

You could use {}'s within your case for scoping

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:310
> +    case 5:

This should be non-sparse numbers starting from 0.  If we need to add more in-between, we can use a lookup table.  If its sparse, the switch will be slower.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:336
> +        ASSERT_NOT_REACHED();

Move out of switch, i think.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:353
> +    case 40:

What's 40?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:426
> +static int countNonZeroBytes(int64_t n)

Probably not necessary.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:483
> +    KeyPrefix() {}

Maybe this should do an implicit decode and decode should go away?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:518
> +        int databaseIdBytes = countNonZeroBytes(m_databaseId);

You should probably just encode them and then extract the bytes used.  Should be faster and safer.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:568
> +        if (m_indexId >= 30)

It's probably worth making these constants.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:585
> +        return prefix.encode() + encodeByte(kTypeByte);

I think this would be more clear with a global constant (grouped with other similar ones) that has a more descriptive name.  Same below.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:602
> +class DatabaseFreeListKey {

Everything up until this point r=me with comments.  Don't have time to look at the rest closely tho...

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1453
> +        return false;

don't return a bool

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1560
> +        return false;

ditto

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1586
> +        return false;

Without write batches, this would leave us inconsistent...I guess we're fine though..?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1646
> +    m_db->Delete(leveldb::WriteOptions(), ObjectStoreMetaDataKey::encode(databaseId, objectStoreId, 0));

Just delete the range..?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1653
> +    putString(m_db.get(), ObjectStoreFreeListKey::encode(databaseId, objectStoreId), "");

Why do you need to do a put with the free list key?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1658
> +    if (!deleteRange(m_db.get(), IndexMetaDataKey::encode(databaseId, objectStoreId, 0, 0), IndexMetaDataKey::encode(databaseId, objectStoreId, INT64_MAX, 0)))

Not super fond of ranges with stuff like INT64_MAX.  Would rather deleteRange(<db, os, 0>, CLOSED, <db, os+1, 0>, OPEN)

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1717
> +    // FIXME: Have a think about what happens when this wraps around. It shouldn't be a problem, I think.

What if there's overlap though..?

I'd just assert on it not wrapping for now.
Comment 23 Jeremy Orlow 2011-04-12 11:41:57 PDT
I posted some comments, but I doubt I'm going to be able to finish the review.
Comment 24 Hans Wennborg 2011-04-13 11:55:35 PDT
Thanks very much for the review, Jeremy!

(In reply to comment #22)
> (From update of attachment 89206 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=89206&action=review
> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:46
> > +// FIXME: Do this more properly.
> 
> Why is this necessary?
You mean the FIXME or the #defines themselves?

The #defines are there because I don't think MSVC defines those macros. The FIXME is there because it felt kind of hacky.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:53
> > +// LevelDB stores key/value pairs. Keys and values are strings of bytes, normally of type std::string.
> 
> Explicitly mention that the std::strings don't have any particular encoding and usually aren't null terminated.
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:56
> > +// of fields. Each key in the backing store starts with a ternary prefix: (database id, object store id, index id). For each, 0 is reserved for meta-data.
> 
> Explain that ordering is important for locality.  Maybe briefly talk about the groupings that are most important to preserve and why.

Done. Which groupings are you considering to be more important than others?

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:67
> > +// Database meta-data:
> 
> Mention the 4th entry is a byte.
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:74
> > +// Object store meta-data:
> 
> Mention the 4th entry is a byte, 5th is variable length, 6th is byte.
Done. (Actually, the 6th is a varint.. i have a FIXME for that)

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:83
> > +// Index meta-data:
> 
> Mention the 4th entry is a byte, 5th is variable length, 6th is variable length, 7th is byte.
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:91
> > +//     <database id, 0, 0, 150, object store id>                   => existence implies the object store id is in the free list [ObjectStoreFreeListKey]
> 
> Mention the 4th entry is a byte, 5th is variable length.
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:98
> > +//     <database id, object store id, 1, user key> => "version", serialized script value [ObjectStoreDataKey]
> 
> Mention the 4th entry is a byte.
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:102
> > +//     <database id, object store id, 2, user key> => "version" [ExistsEntryKey]
> 
> Mention the 4th is a byte.
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:106
> > +//     <database id, object store id, index id, user key, sequence number> => "version", user key [IndexDataKey]
> 
> 4th is variable length encoded....so this means that the variable length encoding must always place 0-30 (or however many we reserve) before the first index id.
> 

I'm not sure I understand, please elaborate.

> It might be worth changing the schema a bit so this isn't required though.  Given that 99.99% of the data will be object store data or index data, maybe instead of every key having <var, var, var> every field should have <var, var, byte, var>?

I'm very happy to do any schema changes to make things simpler :) But I'd prefer to do it in later patches.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:114
> > +//     collected.)
> 
> Explain how they're GCed.
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:136
> > +static std::string maxIDBKey()
> 
> Not sure this name is good.

Hmm, I was thinking that it's the maximum value key (in terms of sort order).. name suggestions are welcome.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:138
> > +    return encodeByte(5); // Null type.
> 
> Why not 39?

I use the value 10 for string keys, and these "null" keys have even higher precedence, so I used 5. It's at the other end of the scale from the minIDBKey(), so to speak.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:141
> > +static std::string minIDBKey()
> 
> ditto
> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:173
> > +static std::string encodeVarInt(int64_t n)
> 
> Once you've done an optimizing pass on this stuff, you should ask Jeff/Sanjay to take a look.
Will do.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:188
> > +static const char* decodeVarInt(const char *p, const char* limit, int64_t* foundInt)
> 
> s/*/&/ is the webkit way
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:206
> > +    for (unsigned i = 0; i < s.length(); ++i) {
> 
> Maybe comment that this attempts to be endian independent?
Well, it doesn't really.. and like I think you point out below, the encoding of double's are not endian independent. Do you think we should go for being endian independent? Are LevelDB files themselves endian independent?

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:242
> > +static const char* decodeStringWithLen(const char* p, const char* limit, String* foundString)
> 
> s/*/&/ is the webkit way
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:260
> > +    char* p = reinterpret_cast<char*>(&x);
> 
> Will this be different on different platforms?
Yes, I believe the byte order will be different.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:262
> > +        ret.append(1, p[i]);
> 
> Can you append more than one at a time?
Ah, yeah we probably don't need to loop here, but can call the string constructor directly. Doing that.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:271
> > +    char* x = reinterpret_cast<char*>(d);
> 
> I wonder how the compiler handles this register wise.  I wonder if it'd be faster to create an array, decode into that, and then do one cast at the end?  (Same above?)

In this case, I expect the compiler to unroll it and generate something beautiful :) I'll be happy to revisit when I start to profile and tune.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:283
> > +        return encodeByte(5);
> 
> Probably better to make these things constants?
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:296
> > +    }
> 
> Does this compile on all platforms?  You may need an assert not reached and a dummy return.
Adding that to be sure.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:299
> > +static const char* decodeIDBKey(const char* p, const char* limit, RefPtr<IDBKey>* foundKey)
> 
> s/*/&/ is the webkit way
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:306
> > +    String s;
> 
> You could use {}'s within your case for scoping
> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:310
> > +    case 5:
> 
> This should be non-sparse numbers starting from 0.  If we need to add more in-between, we can use a lookup table.  If its sparse, the switch will be slower.
Ok. Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:336
> > +        ASSERT_NOT_REACHED();
> 
> Move out of switch, i think.
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:353
> > +    case 40:
> 
> What's 40?
It's now kIDBKeyMinKeyTypeByte. It's a key that sorts as lower than any other key, i.e. what minIDBKey() returns.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:426
> > +static int countNonZeroBytes(int64_t n)
> 
> Probably not necessary.
Ah, this correlates with your comment at :518.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:483
> > +    KeyPrefix() {}
> 
> Maybe this should do an implicit decode and decode should go away?
But then we can't return an error status.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:518
> > +        int databaseIdBytes = countNonZeroBytes(m_databaseId);
> 
> You should probably just encode them and then extract the bytes used.  Should be faster and safer.

But I use this for reserving space in the string before-hand. I'm happy to change this if you think it's really better.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:568
> > +        if (m_indexId >= 30)
> 
> It's probably worth making these constants.
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:585
> > +        return prefix.encode() + encodeByte(kTypeByte);
> 
> I think this would be more clear with a global constant (grouped with other similar ones) that has a more descriptive name.  Same below.
I'm not sure what you mean. Do you mean just replace the SchemaVersionKey class with a global std::string constant?

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:602
> > +class DatabaseFreeListKey {
> 
> Everything up until this point r=me with comments.  Don't have time to look at the rest closely tho...
> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1453
> > +        return false;
> 
> don't return a bool
Oops. Thanks for noticing.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1560
> > +        return false;
> 
> ditto
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1586
> > +        return false;
> 
> Without write batches, this would leave us inconsistent...I guess we're fine though..?
hmm yeah, we'll handle this nicely when we have transactions.. but I'll add a FIXME to make this more robust.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1646
> > +    m_db->Delete(leveldb::WriteOptions(), ObjectStoreMetaDataKey::encode(databaseId, objectStoreId, 0));
> 
> Just delete the range..?
Yeah, doing that.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1653
> > +    putString(m_db.get(), ObjectStoreFreeListKey::encode(databaseId, objectStoreId), "");
> 
> Why do you need to do a put with the free list key?
When we delete an object store, we should stick that object store id back in the free list.. doesn't that make sense?

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1658
> > +    if (!deleteRange(m_db.get(), IndexMetaDataKey::encode(databaseId, objectStoreId, 0, 0), IndexMetaDataKey::encode(databaseId, objectStoreId, INT64_MAX, 0)))
> 
> Not super fond of ranges with stuff like INT64_MAX.  Would rather have deleteRange(<db, os, 0>, CLOSED, <db, os+1, 0>, OPEN)

Yeah, I'm not super happy either. But I would rather have e.g. the IndexMetaDataKey class to know the right range to delete.. it feels a little brittle to use explicit numbers in the ranges, because it's hard for the reader to see if there may be anything unintentionally caught in between.

But I can see your point.. because of the way the schema is set up, deleting an object store should ideally be as simple as deleting a single range.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1717
> > +    // FIXME: Have a think about what happens when this wraps around. It shouldn't be a problem, I think.
> 
> What if there's overlap though..?
> 
> I'd just assert on it not wrapping for now.
Done.
Comment 25 Hans Wennborg 2011-04-13 11:56:17 PDT
Created attachment 89423 [details]
Patch
Comment 26 Jeremy Orlow 2011-04-13 12:46:31 PDT
(In reply to comment #24)
> Thanks very much for the review, Jeremy!
> 
> (In reply to comment #22)
> > (From update of attachment 89206 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=89206&action=review
> > 
> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:46
> > > +// FIXME: Do this more properly.
> > 
> > Why is this necessary?
> You mean the FIXME or the #defines themselves?
> 
> The #defines are there because I don't think MSVC defines those macros. The FIXME is there because it felt kind of hacky.

FWIW, the couple places I saw these used were

> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:56
> > > +// of fields. Each key in the backing store starts with a ternary prefix: (database id, object store id, index id). For each, 0 is reserved for meta-data.
> > 
> > Explain that ordering is important for locality.  Maybe briefly talk about the groupings that are most important to preserve and why.
> 
> Done. Which groupings are you considering to be more important than others?

They were mentioned in the design doc.

> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:106
> > > +//     <database id, object store id, index id, user key, sequence number> => "version", user key [IndexDataKey]
> > 
> > 4th is variable length encoded....so this means that the variable length encoding must always place 0-30 (or however many we reserve) before the first index id.
> > 
> 
> I'm not sure I understand, please elaborate.

Never mind.  If you're always decoding the numbers and comparing that way, this isn't an issue.  For some reason, I thought you might do binary compares.

> > It might be worth changing the schema a bit so this isn't required though.  Given that 99.99% of the data will be object store data or index data, maybe instead of every key having <var, var, var> every field should have <var, var, byte, var>?
> 
> I'm very happy to do any schema changes to make things simpler :) But I'd prefer to do it in later patches.

Actually I'm not sure this matters much.

> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:206
> > > +    for (unsigned i = 0; i < s.length(); ++i) {
> > 
> > Maybe comment that this attempts to be endian independent?
> Well, it doesn't really.. and like I think you point out below, the encoding of double's are not endian independent. Do you think we should go for being endian independent? Are LevelDB files themselves endian independent?

If it's easy, sure.  But I guess the chances of people migrating from one endianness to another are probably not large.  On the other hand, the rest of leveldb supports it, so it probably makes sense.

> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:271
> > > +    char* x = reinterpret_cast<char*>(d);
> > 
> > I wonder how the compiler handles this register wise.  I wonder if it'd be faster to create an array, decode into that, and then do one cast at the end?  (Same above?)
> 
> In this case, I expect the compiler to unroll it and generate something beautiful :) I'll be happy to revisit when I start to profile and tune.

Yeah, that's likely.

> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:483
> > > +    KeyPrefix() {}
> > 
> > Maybe this should do an implicit decode and decode should go away?
> But then we can't return an error status.

Hm.  At least add some boolean that keeps track of whether it's initialized or not and add ASSERTs.

> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:518
> > > +        int databaseIdBytes = countNonZeroBytes(m_databaseId);
> > 
> > You should probably just encode them and then extract the bytes used.  Should be faster and safer.
> 
> But I use this for reserving space in the string before-hand. I'm happy to change this if you think it's really better.

Sure.  Save it to a temp, then reserve, then append.  I think it's important for the safety aspect.  It's a bad idea to assume both functions will return the same number when we can simply look at the results' length.

> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:585
> > > +        return prefix.encode() + encodeByte(kTypeByte);
> > 
> > I think this would be more clear with a global constant (grouped with other similar ones) that has a more descriptive name.  Same below.
> I'm not sure what you mean. Do you mean just replace the SchemaVersionKey class with a global std::string constant?

kTypeByte is not clear.  It should be kWhatIsThisByteFor.  And they should all be defined in a block at the top of the file, I think.

> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1653
> > > +    putString(m_db.get(), ObjectStoreFreeListKey::encode(databaseId, objectStoreId), "");
> > 
> > Why do you need to do a put with the free list key?
> When we delete an object store, we should stick that object store id back in the free list.. doesn't that make sense?

Ohhh yeahhhhh.

> 
> > 
> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1658
> > > +    if (!deleteRange(m_db.get(), IndexMetaDataKey::encode(databaseId, objectStoreId, 0, 0), IndexMetaDataKey::encode(databaseId, objectStoreId, INT64_MAX, 0)))
> > 
> > Not super fond of ranges with stuff like INT64_MAX.  Would rather have deleteRange(<db, os, 0>, CLOSED, <db, os+1, 0>, OPEN)
> 
> Yeah, I'm not super happy either. But I would rather have e.g. the IndexMetaDataKey class to know the right range to delete.. it feels a little brittle to use explicit numbers in the ranges, because it's hard for the reader to see if there may be anything unintentionally caught in between.
> 
> But I can see your point.. because of the way the schema is set up, deleting an object store should ideally be as simple as deleting a single range.

I guess you can leave it as is, but it feels dirty.
Comment 27 WebKit Review Bot 2011-04-13 13:26:15 PDT
Attachment 89423 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/8399382
Comment 28 WebKit Review Bot 2011-04-13 20:17:06 PDT
Attachment 89423 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/8401550
Comment 29 Andrei Popescu 2011-04-14 06:28:25 PDT
Comment on attachment 89423 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=89423&action=review

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:45
> +// FIXME: Do this more properly.

Can you include <limits.h>? That is used elsewhere in WebCore and has these defines already, I think.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:52
> +// LevelDB stores key/value pairs. Keys and values are strings of bytes, normally of type std::string.

std::string in WebCore? That's strongly discouraged as far as I can tell. I do realize that std::string is required by leveldb but is this really the right thing to do here? Are there precedents in WebCore where std::string is used?
Comment 30 Hans Wennborg 2011-04-14 06:31:27 PDT
(In reply to comment #26)
> > > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:56
> > > > +// of fields. Each key in the backing store starts with a ternary prefix: (database id, object store id, index id). For each, 0 is reserved for meta-data.
> > > 
> > > Explain that ordering is important for locality.  Maybe briefly talk about the groupings that are most important to preserve and why.
> > 
> > Done. Which groupings are you considering to be more important than others?
> 
> They were mentioned in the design doc.
Done.

> > > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:206
> > > > +    for (unsigned i = 0; i < s.length(); ++i) {
> > > 
> > > Maybe comment that this attempts to be endian independent?
> > Well, it doesn't really.. and like I think you point out below, the encoding of double's are not endian independent. Do you think we should go for being endian independent? Are LevelDB files themselves endian independent?
> 
> If it's easy, sure.  But I guess the chances of people migrating from one endianness to another are probably not large.  On the other hand, the rest of leveldb supports it, so it probably makes sense.
Adding a FIXME for this.

> 
> > > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:483
> > > > +    KeyPrefix() {}
> > > 
> > > Maybe this should do an implicit decode and decode should go away?
> > But then we can't return an error status.
> 
> Hm.  At least add some boolean that keeps track of whether it's initialized or not and add ASSERTs.
Done. I also noticed GCC warns about this in Release builds because of inlining.

> > > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:518
> > > > +        int databaseIdBytes = countNonZeroBytes(m_databaseId);
> > > 
> > > You should probably just encode them and then extract the bytes used.  Should be faster and safer.
> > 
> > But I use this for reserving space in the string before-hand. I'm happy to change this if you think it's really better.
> 
> Sure.  Save it to a temp, then reserve, then append.  I think it's important for the safety aspect.  It's a bad idea to assume both functions will return the same number when we can simply look at the results' length.
You're right. Done.

> > > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:585
> > > > +        return prefix.encode() + encodeByte(kTypeByte);
> > > 
> > > I think this would be more clear with a global constant (grouped with other similar ones) that has a more descriptive name.  Same below.
> > I'm not sure what you mean. Do you mean just replace the SchemaVersionKey class with a global std::string constant?
> 
> kTypeByte is not clear.  It should be kWhatIsThisByteFor.  And they should all be defined in a block at the top of the file, I think.
Done.
Comment 31 Hans Wennborg 2011-04-14 06:31:56 PDT
Created attachment 89574 [details]
Patch
Comment 32 WebKit Review Bot 2011-04-14 09:23:23 PDT
Attachment 89574 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/8401720
Comment 33 Jeremy Orlow 2011-04-14 19:36:24 PDT
(In reply to comment #29)
> (From update of attachment 89423 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=89423&action=review
> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:45
> > +// FIXME: Do this more properly.
> 
> Can you include <limits.h>? That is used elsewhere in WebCore and has these defines already, I think.
> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:52
> > +// LevelDB stores key/value pairs. Keys and values are strings of bytes, normally of type std::string.
> 
> std::string in WebCore? That's strongly discouraged as far as I can tell. I do realize that std::string is required by leveldb but is this really the right thing to do here? Are there precedents in WebCore where std::string is used?

These are bindings to a library which uses it, so avoiding it here isn't going to help anything but will make the code much more complicated and ugly.

I believe angle uses some basic STL types.
Comment 34 Andrei Popescu 2011-04-14 22:57:18 PDT
(In reply to comment #33)
> (In reply to comment #29)
> > (From update of attachment 89423 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=89423&action=review
> > 
> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:45
> > > +// FIXME: Do this more properly.
> > 
> > Can you include <limits.h>? That is used elsewhere in WebCore and has these defines already, I think.
> > 
> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:52
> > > +// LevelDB stores key/value pairs. Keys and values are strings of bytes, normally of type std::string.
> > 
> > std::string in WebCore? That's strongly discouraged as far as I can tell. I do realize that std::string is required by leveldb but is this really the right thing to do here? Are there precedents in WebCore where std::string is used?
> 
> These are bindings to a library which uses it, so avoiding it here isn't going to help anything but will make the code much more complicated and ugly.
> 
> I believe angle uses some basic STL types.

Ok, if there's a precedent,  then fine. I was also thinking that if LevelDB ends up being used in other places, it'd be neater to have a wrapper that only uses Webcore types, similar to how we wrap SQLite. But that can be done once there is another user for LevelDB.
Comment 35 Jeremy Orlow 2011-04-14 23:44:24 PDT
(In reply to comment #34)
> (In reply to comment #33)
> > (In reply to comment #29)
> > > (From update of attachment 89423 [details] [details] [details])
> > > View in context: https://bugs.webkit.org/attachment.cgi?id=89423&action=review
> > > 
> > > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:45
> > > > +// FIXME: Do this more properly.
> > > 
> > > Can you include <limits.h>? That is used elsewhere in WebCore and has these defines already, I think.
> > > 
> > > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:52
> > > > +// LevelDB stores key/value pairs. Keys and values are strings of bytes, normally of type std::string.
> > > 
> > > std::string in WebCore? That's strongly discouraged as far as I can tell. I do realize that std::string is required by leveldb but is this really the right thing to do here? Are there precedents in WebCore where std::string is used?
> > 
> > These are bindings to a library which uses it, so avoiding it here isn't going to help anything but will make the code much more complicated and ugly.
> > 
> > I believe angle uses some basic STL types.
> 
> Ok, if there's a precedent,  then fine. I was also thinking that if LevelDB ends up being used in other places, it'd be neater to have a wrapper that only uses Webcore types, similar to how we wrap SQLite. But that can be done once there is another user for LevelDB.

std::string is only used internally to talk to levelDB.  Much like how WebCore types are used in the WebKit layer but not exposed to the public api.  Anything else that's using levelDB will need to do something similar though.
Comment 36 Hans Wennborg 2011-04-15 05:50:18 PDT
Created attachment 89772 [details]
Patch

Uploading new patch that introduces a wrapper for LevelDB, and makes sure that std::string is only used in WebCore/platform/leveldb. The wrapper is not complete, but I figure we can add to it later.
Comment 37 WebKit Review Bot 2011-04-15 13:45:40 PDT
Attachment 89772 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/8450220
Comment 38 Hans Wennborg 2011-04-18 04:23:21 PDT
Comment on attachment 89772 [details]
Patch

(These are comments from looking over the code with Steve.)

View in context: https://bugs.webkit.org/attachment.cgi?id=89772&action=review

> Source/WebCore/WebCore.gypi:864
> +            'platform/leveldb/LevelDBSlice.h',

probably need to add to all build files...

> Source/WebCore/platform/leveldb/LevelDBDatabase.cpp:42
> +    : m_db(0)

Let's use an OwnPtr

> Source/WebCore/platform/leveldb/LevelDBIterator.cpp:40
> +    delete m_iterator;

OwnPtr

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:41
> +#include <leveldb/slice.h>

Drop these...

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:53
> +// The std::strings don't have any particular encoding, and aren't null terminated.

s/std::string/Vector<char>/

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:358
> +        p = decodeStringWithLen(p, limit, s);

s/WithLen/WithLength/, this probably applies here and there

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:506
> +        : m_databaseId(-1)

have a constant instead of -1

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:571
> +            ASSERT(p.m_indexId == m_indexId);

debug stuff.. should not be landed

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1393
> +            ptrB = IndexMetaDataKey::decode(b.data(), endB, &metaDataKeyB);

could we do some templating to avoid repeating this all the time? possibly a FIXME

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1518
> +class Comparator : public leveldb::Comparator {

Wrap this one too, and then we don't need to ever expose leveldb::Slice.

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1679
> +            LOG_ERROR("it not valid\n");

nicer messages
Comment 39 Hans Wennborg 2011-04-18 07:49:14 PDT
Thanks very much for your help Steve! Uploading new version of the patch...

(In reply to comment #38)
> (From update of attachment 89772 [details])
> (These are comments from looking over the code with Steve.)
> 
> View in context: https://bugs.webkit.org/attachment.cgi?id=89772&action=review
> 
> > Source/WebCore/WebCore.gypi:864
> > +            'platform/leveldb/LevelDBSlice.h',
> 
> probably need to add to all build files...
Done.

> 
> > Source/WebCore/platform/leveldb/LevelDBDatabase.cpp:42
> > +    : m_db(0)
> 
> Let's use an OwnPtr
Done.

> 
> > Source/WebCore/platform/leveldb/LevelDBIterator.cpp:40
> > +    delete m_iterator;
> 
> OwnPtr
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:41
> > +#include <leveldb/slice.h>
> 
> Drop these...
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:53
> > +// The std::strings don't have any particular encoding, and aren't null terminated.
> 
> s/std::string/Vector<char>/
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:358
> > +        p = decodeStringWithLen(p, limit, s);
> 
> s/WithLen/WithLength/, this probably applies here and there
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:506
> > +        : m_databaseId(-1)
> 
> have a constant instead of -1
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:571
> > +            ASSERT(p.m_indexId == m_indexId);
> 
> debug stuff.. should not be landed
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1393
> > +            ptrB = IndexMetaDataKey::decode(b.data(), endB, &metaDataKeyB);
> 
> could we do some templating to avoid repeating this all the time? possibly a FIXME
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1518
> > +class Comparator : public leveldb::Comparator {
> 
> Wrap this one too, and then we don't need to ever expose leveldb::Slice.
Done.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1679
> > +            LOG_ERROR("it not valid\n");
> 
> nicer messages
Done.
Comment 40 Hans Wennborg 2011-04-18 07:49:58 PDT
Created attachment 90035 [details]
Patch
Comment 41 Hans Wennborg 2011-04-18 08:06:17 PDT
Created attachment 90036 [details]
Patch

Rebase.
Comment 42 Steve Block 2011-04-18 08:18:19 PDT
Comment on attachment 90036 [details]
Patch

r=me
Comment 43 Hans Wennborg 2011-04-18 09:13:52 PDT
Committed r84149: <http://trac.webkit.org/changeset/84149>
Comment 44 WebKit Review Bot 2011-04-18 12:20:41 PDT
http://trac.webkit.org/changeset/84149 might have broken GTK Linux 32-bit Debug
Comment 45 David Grogan 2011-04-18 15:27:29 PDT
Comment on attachment 89772 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=89772&action=review

Less of a review and more just basic questions.  I've given up on understanding this in one go.

Is there an option in about:flags to use leveldb?  How did you test this as you were developing it?  Manually loading layout tests in your local build of chromium?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:42
> +#include <string>

The idea is to flesh out the wrapper such that these last 3 includes aren't needed here?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:109
> +//     <database id, 0, 0, 201, object store id, utf16 index name> => index id [IndexNamesKey]

It seems as if 0 in the objectstore id field implies 0 in the index id field.  If that's true, can we leave out writing the second 0?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:309
> +    for (size_t i = 0; i < sizeof(*d); ++i)

Can we not use memcpy here?  Is stuff from <cstring> forbidden?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1527
> +    void FindShortestSeparator(std::string*, const leveldb::Slice&) const {} // FIXME: Implement eventually.

What are these two methods for?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1568
> +        ASSERT_NOT_REACHED(); // FIXME: We need to handle this case for incognito and DumpRenderTree.

This is what ":memory:" is used for in IDBSQLiteBackingStore?  It's a magic value that something up the callstack detects?

> Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1576
> +    // FIXME: We should eventually use the same LevelDB database for all origins.

Why?
Comment 46 Hans Wennborg 2011-04-19 02:18:33 PDT
(In reply to comment #45)
> (From update of attachment 89772 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=89772&action=review
> 
> Less of a review and more just basic questions.  I've given up on understanding this in one go.
> 
> Is there an option in about:flags to use leveldb?  How did you test this as you were developing it?  Manually loading layout tests in your local build of chromium?

There is a command-line flag: --indexeddb-use-leveldb.
To test this, I had a local hack that made DumpRenderTree use leveldb, and ran the layout tests, as well as just opening them with the browser.

But we're going to need something more automated. I'll probably write a browser test that runs the layout tests in the browser with --indexeddb-use-leveldb set.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:42
> > +#include <string>
> 
> The idea is to flesh out the wrapper such that these last 3 includes aren't needed here?

Crap, the whole point of doing the wrappers was to be able to remove this include.. and then I forgot to remove it :-p Fixing at Bug 58872.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:109
> > +//     <database id, 0, 0, 201, object store id, utf16 index name> => index id [IndexNamesKey]
> 
> It seems as if 0 in the objectstore id field implies 0 in the index id field.  If that's true, can we leave out writing the second 0?

Yeah, but I'd rather try and re-think the key coding schema to be more easy to implement (beacuse it's quite a big chunk of code know), rather than packing it tighter.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:309
> > +    for (size_t i = 0; i < sizeof(*d); ++i)
> 
> Can we not use memcpy here?  Is stuff from <cstring> forbidden?

We probably could, but I'm thinking the compiler will unroll this loop, and it will just become one 64-bit load/store pair.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1527
> > +    void FindShortestSeparator(std::string*, const leveldb::Slice&) const {} // FIXME: Implement eventually.
> 
> What are these two methods for?

I don't remember the details. As far as I understand, they're optional to implement, but beneficial to performance, so we really should implement them eventually.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1568
> > +        ASSERT_NOT_REACHED(); // FIXME: We need to handle this case for incognito and DumpRenderTree.
> 
> This is what ":memory:" is used for in IDBSQLiteBackingStore?  It's a magic value that something up the callstack detects?

Yes, I believe SQLite has a flag or something that allows for an in-memory database. LevelDB doesn't have anything like that, so we'll have to build something. Possibly, we could hack up the Chromium env_ or port_ files to support a similar ":memory" prefix that will cause the database to be in-memory, or possibly in /tmp.

> 
> > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1576
> > +    // FIXME: We should eventually use the same LevelDB database for all origins.
> 
> Why?

For performance: having one LevelDB database for each origin is going to cause more fragmentation and worse performance than having them all in one.

I think we'll keep them separate for now though, until we're confident that we can handle error cases and corruption more gracefully.
Comment 47 Jeremy Orlow 2011-04-19 02:59:46 PDT
From the penut gallery:

(In reply to comment #46)
> (In reply to comment #45)
> > (From update of attachment 89772 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=89772&action=review
> > 
> > Less of a review and more just basic questions.  I've given up on understanding this in one go.
> > 
> > Is there an option in about:flags to use leveldb?  How did you test this as you were developing it?  Manually loading layout tests in your local build of chromium?
> 
> There is a command-line flag: --indexeddb-use-leveldb.
> To test this, I had a local hack that made DumpRenderTree use leveldb, and ran the layout tests, as well as just opening them with the browser.
> 
> But we're going to need something more automated. I'll probably write a browser test that runs the layout tests in the browser with --indexeddb-use-leveldb set.
> 
> > 
> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:42
> > > +#include <string>
> > 
> > The idea is to flesh out the wrapper such that these last 3 includes aren't needed here?
> 
> Crap, the whole point of doing the wrappers was to be able to remove this include.. and then I forgot to remove it :-p Fixing at Bug 58872.
> 
> > 
> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:109
> > > +//     <database id, 0, 0, 201, object store id, utf16 index name> => index id [IndexNamesKey]
> > 
> > It seems as if 0 in the objectstore id field implies 0 in the index id field.  If that's true, can we leave out writing the second 0?
> 
> Yeah, but I'd rather try and re-think the key coding schema to be more easy to implement (beacuse it's quite a big chunk of code know), rather than packing it tighter.

Totally agree.  Remember that this data will be a VERY small fraction of all data, so packing it slightly tighter doesn't matter much.  Fast and correct comparing will be more important than a few bytes.

Actually, what might be best overall is to look at how to adjust the schema so it's not super hard to implement the optional methods on the comparator.  Because those will make things faster and more compact.

> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:309
> > > +    for (size_t i = 0; i < sizeof(*d); ++i)
> > 
> > Can we not use memcpy here?  Is stuff from <cstring> forbidden?
> 
> We probably could, but I'm thinking the compiler will unroll this loop, and it will just become one 64-bit load/store pair.
> 
> > 
> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1527
> > > +    void FindShortestSeparator(std::string*, const leveldb::Slice&) const {} // FIXME: Implement eventually.
> > 
> > What are these two methods for?
> 
> I don't remember the details. As far as I understand, they're optional to implement, but beneficial to performance, so we really should implement them eventually.

These are what I'm talking about in my previous comment, btw...

> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1568
> > > +        ASSERT_NOT_REACHED(); // FIXME: We need to handle this case for incognito and DumpRenderTree.
> > 
> > This is what ":memory:" is used for in IDBSQLiteBackingStore?  It's a magic value that something up the callstack detects?
> 
> Yes, I believe SQLite has a flag or something that allows for an in-memory database. LevelDB doesn't have anything like that, so we'll have to build something. Possibly, we could hack up the Chromium env_ or port_ files to support a similar ":memory" prefix that will cause the database to be in-memory, or possibly in /tmp.

If at all possible, we shouldn't write anything to disk...even temporarily.  And if you do, you should go to great effort to make sure it gets deleted even in the event of crashes and such.

> > > Source/WebCore/storage/IDBLevelDBBackingStore.cpp:1576
> > > +    // FIXME: We should eventually use the same LevelDB database for all origins.
> > 
> > Why?
> 
> For performance: having one LevelDB database for each origin is going to cause more fragmentation and worse performance than having them all in one.

Relatedly, it also means that we only need to optimize for the case of big databases rather than also optimizing for small (or uncommonly accessed) databases.  (Hm.  One could even imagine more agressive compression on higher leveldb levels to that effect...)

> I think we'll keep them separate for now though, until we're confident that we can handle error cases and corruption more gracefully.

Sounds sane.
Comment 48 Adam Barth 2011-04-27 16:18:30 PDT
This patch contains many OwnPtr infelicities.  It's almost never correct in WebKit to call delete.  Whenever you're calling delete, you probably meant to use an OwnPtr.  Also, when transferring ownership of memory, please use PassOwnPtr to document that ownership is being transfered and to allow the compiler to help us avoid memory safety bugs.
Comment 49 Adam Barth 2011-04-27 16:21:24 PDT
Also, this patch contains incorrect header includes.  Headers in WTF should be included as follows:

#include <wtf/OwnPtr.h>
Comment 50 Adam Barth 2011-04-27 16:23:49 PDT
Methods that allocate memory should be named with "create" not with "new".  For example:

newIterator => createIterator
Comment 51 Eric Seidel (no email) 2011-04-27 16:25:47 PDT
Sounds like we should roll this out, given Adam's complaints, unless a follow-up bug is underway?
Comment 52 Adam Barth 2011-04-27 16:26:13 PDT
(In reply to comment #51)
> Sounds like we should roll this out, given Adam's complaints, unless a follow-up bug is underway?

I'm fixing it.
Comment 53 Hans Wennborg 2011-04-28 03:14:44 PDT
(In reply to comment #52)
> (In reply to comment #51)
> > Sounds like we should roll this out, given Adam's complaints, unless a follow-up bug is underway?
> 
> I'm fixing it.

Thanks very much for fixing this, Adam.

Living and learning. (For future reference, the fix was in Bug 59656.)