RESOLVED FIXED 62284
IndexedDB: implement compound (array) key support
https://bugs.webkit.org/show_bug.cgi?id=62284
Summary IndexedDB: implement compound (array) key support
Mark Pilgrim (Google)
Reported 2011-06-08 08:35:38 PDT
Created attachment 96426 [details] test case Note in http://dvcs.w3.org/hg/IndexedDB/raw-file/tip/Overview.html#key-construct states "Infinite float values are valid keys. As are empty Arrays." Expected behavior: add a record with an empty array as key Actual behavior: FAIL request = objectStore.add([], []); threw exception Error: TYPE_MISMATCH_ERR: DOM Exception 17 Layout test attached.
Attachments
test case (2.31 KB, text/html)
2011-06-08 08:35 PDT, Mark Pilgrim (Google)
no flags
Patch (70.06 KB, patch)
2011-10-31 17:20 PDT, Joshua Bell
no flags
Patch (81.63 KB, patch)
2011-11-02 13:10 PDT, Joshua Bell
no flags
Patch (81.66 KB, patch)
2011-11-02 17:59 PDT, Joshua Bell
no flags
Patch (122.72 KB, patch)
2011-11-02 22:07 PDT, Joshua Bell
no flags
Patch (123.38 KB, patch)
2011-11-03 13:43 PDT, Joshua Bell
no flags
Patch (126.38 KB, patch)
2011-11-04 16:19 PDT, Joshua Bell
no flags
Patch for landing (126.18 KB, patch)
2011-11-08 11:59 PST, Joshua Bell
no flags
Mark Pilgrim (Google)
Comment 1 2011-06-08 09:00:15 PDT
Turns out we don't accept any kind of array as a key. [1,2] and [[]] should also be valid keys, but they each throw.
Hans Wennborg
Comment 2 2011-06-08 09:08:55 PDT
(In reply to comment #1) > Turns out we don't accept any kind of array as a key. [1,2] and [[]] should also be valid keys, but they each throw. Yep, support for arrays as keys is on the TODO list (but not very high on it).
Joshua Bell
Comment 3 2011-10-31 17:20:23 PDT
Joshua Bell
Comment 4 2011-10-31 17:22:10 PDT
Needs more tests before a proper review, but if hans@ or dgrogan@ want to take a look I'd appreciate it. Several of the inline array-populating methods should probably be factored out.
WebKit Review Bot
Comment 5 2011-10-31 17:23:32 PDT
Please wait for approval from fishd@chromium.org before submitting because this patch contains changes to the Chromium public API.
Darin Fisher (:fishd, Google)
Comment 6 2011-10-31 21:22:46 PDT
Comment on attachment 113111 [details] Patch WebKit API changes LGTM
Hans Wennborg
Comment 7 2011-11-01 10:15:54 PDT
Comment on attachment 113111 [details] Patch Good stuff! Very nice. Some comments below. And you should extend IDBLevelDBCodingTest to cover this. In fact, if you'd like to split up this patch to make it easier to review and land, that would probably be a natural division point: i.e. put the IDBKey, LevelDBCoding, LevelDBCodingTest in one patch, and the rest in another. View in context: https://bugs.webkit.org/attachment.cgi?id=113111&action=review > Source/WebCore/ChangeLog:3 > + IndexedDB add() should accept array as key maybe we should rename the bug to talk about compound keys instead > Source/WebCore/bindings/v8/IDBBindingUtilities.cpp:79 > + return subkey; hmm, why? > Source/WebCore/storage/IDBKey.cpp:52 > + if (result) you could do "if (int result = m_array[i]->compare...)" and save a line :) > Source/WebCore/storage/IDBKey.cpp:56 > + (m_array.size() > other->m_array.size()) ? 1 : 0; hmm, probably easier on the eyes to just do if-elseif-else here > Source/WebCore/storage/IDBLevelDBCoding.cpp:368 > + encodeIDBKey(*((*it).get()), into); I'd probably split *((*it).get()) into two lines: "const RefPtr<IDBKey>& subKey = *it; encodeIdbKey(*subkey, into);" Or maybe you could just use **it? > Source/WebCore/storage/IDBLevelDBCoding.cpp:481 > + p = extractEncodedIDBKey(p, limit, &subkey); ah, nice :) > Source/WebCore/storage/IDBLevelDBCoding.cpp:537 > + if (cmp) again, feel free to save a line with "if (int cmp = ...)" here if you prefer. > Source/WebCore/storage/IDBLevelDBCoding.cpp:541 > + (lenA > lenB) ? 1 : 0; nested ?:s make me scared, i'd just write it out as if-elseif-else. > Source/WebCore/storage/IDBLevelDBCoding.h:59 > +void encodeIDBKey(const IDBKey&, Vector<char>&); nit: I'd give a name to the vector parameter here; it's not obvious that it's "into". > Source/WebKit/chromium/src/WebIDBKey.cpp:133 > +void convertToWebIDBKeyArray(WebVector<WebIDBKey>& result, const IDBKey::KeyArray& array) i think it's more common to have the "output" parameter last. > Source/WebKit/chromium/src/WebIDBKey.cpp:222 > + // FIXME: This is probably sketchy with compound keys why?
Joshua Bell
Comment 8 2011-11-01 10:35:40 PDT
Comment on attachment 113111 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=113111&action=review Thanks, all good stuff. I'm working on the chromium/tests and browser_tests next. >> Source/WebCore/ChangeLog:3 >> + IndexedDB add() should accept array as key > > maybe we should rename the bug to talk about compound keys instead Done >> Source/WebCore/bindings/v8/IDBBindingUtilities.cpp:79 >> + return subkey; > > hmm, why? I didn't want to throw away and re-create an invalid key during each step of the stack unwinding since there's a perfectly serviceable one at hand. I changed the if (!subkey) to ASSERT_NOT_REACHED. Alternately, I could avoid creating an invalid key at all in this method, let a NULL propagating up the stack flag invalid, and let the public create() method create an invalid key for the final result. Opinions? >> Source/WebCore/storage/IDBLevelDBCoding.cpp:481 >> + p = extractEncodedIDBKey(p, limit, &subkey); > > ah, nice :) As an aside, I'm slightly concerned about the about of Vector<char> churn (i.e. lots of small allocs/copies/frees) in the encode/decode/extract methods, but I suppose we'll fix that if/when it shows up on a profile. >> Source/WebKit/chromium/src/WebIDBKey.cpp:222 >> + // FIXME: This is probably sketchy with compound keys > > why? Early misunderstanding about how WebIDBKeys related to WebCore::IDBKeys; removed these notes-to-self.
Joshua Bell
Comment 9 2011-11-02 13:10:39 PDT
Joshua Bell
Comment 10 2011-11-02 17:59:41 PDT
Joshua Bell
Comment 11 2011-11-02 18:01:44 PDT
I think this is "done", review at leisure - I have corresponding chromium-side browser_tests ready to go as well. Better name suggestions for the "codingToOrder" function in IDBLevelDBCoding.cpp appreciated
Joshua Bell
Comment 12 2011-11-02 22:07:48 PDT
Joshua Bell
Comment 13 2011-11-02 22:08:44 PDT
Comment on attachment 113433 [details] Patch Added one more layout test which validates cursor result ordering and that IDBCursor.key yields correct values for array-type keys.
Hans Wennborg
Comment 14 2011-11-03 09:38:00 PDT
Comment on attachment 113433 [details] Patch LGTM with nits View in context: https://bugs.webkit.org/attachment.cgi?id=113433&action=review > Source/WebCore/storage/IDBLevelDBCoding.cpp:520 > { Or maybe KeyTypeByteToKeyType? > Source/WebCore/storage/IDBLevelDBCoding.cpp:549 > + if (int x = codingToOrder(typeB) - codingToOrder(typeA)) Maybe IDBKey should have a "static int compareTypes(IDBKey::Type a, IDBKey::Type b)" that does this.. as it is now, one needs to know what the enum inside IDBKey looks like to see why this subtraction does the right thing. > Source/WebKit/chromium/src/WebIDBKey.cpp:96 > +PassRefPtr<IDBKey> convertFromWebIDBKeyArray(const WebVector<WebIDBKey>& array) should this be static? > Source/WebKit/chromium/src/WebIDBKey.cpp:122 > +void convertToWebIDBKeyArray(const IDBKey::KeyArray& array, WebVector<WebIDBKey>& result) should this be static? > Source/WebKit/chromium/src/WebIDBKey.cpp:124 > + WebVector<WebIDBKey> keys(array.size()), subkeys; I'd probably declare them on one line each > LayoutTests/storage/indexeddb/cursor-key-order.html:31 > + evalAndLog("openreq = indexedDB.open('db')"); we normally use the name of the layout test to name the database
Joshua Bell
Comment 15 2011-11-03 13:43:29 PDT
Hans Wennborg
Comment 16 2011-11-04 04:18:14 PDT
(In reply to comment #15) > Created an attachment (id=113551) [details] > Patch LGTM
Tony Chang
Comment 17 2011-11-04 10:59:44 PDT
Comment on attachment 113551 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=113551&action=review > Source/WebCore/bindings/v8/IDBBindingUtilities.cpp:59 > + PassRefPtr<IDBKey> createInternal(v8::Handle<v8::Value> value) > + { > + if (value->IsNumber() && !isnan(value->NumberValue())) What's the benefit of having a KeyCreator class? Couldn't it just be a static function? > Source/WebCore/bindings/v8/IDBBindingUtilities.cpp:74 > + for (uint32_t i = 0; i < length; i++) { Nit: ++i > Source/WebCore/bindings/v8/IDBBindingUtilities.cpp:91 > + Vector<v8::Handle<v8::Array> > m_stack; Why is this a member variable? > Source/WebCore/bindings/v8/custom/V8IDBKeyCustom.cpp:63 > + case IDBKey::ArrayType: > + array = v8::Array::New(key->array().size()); > + for (size_t i = 0; i < key->array().size(); ++i) > + array->Set(i, toV8(key->array()[i].get())); > + return array; Nit: I would declare array in the case block. You'll need to use {} to scope the variable declartion. > Source/WebCore/storage/IDBLevelDBCoding.cpp:305 > +int compareEncodedStringsWithLength(const char*& p, const char* limitP, const char*& q, const char* limitQ) The pass by reference here seems unnecessary. It looks like the changed values are only used in the unit test. Can you instead ASSERT in this function that p + lenP*2 == limitP? > Source/WebCore/storage/IDBLevelDBCoding.cpp:365 > + size_t prevSize = into.size(); > + size_t len; Nit: Avoid abbreviations for variable names. prevSize -> previousSize and len -> length. > Source/WebCore/storage/IDBLevelDBCoding.cpp:379 > + case IDBKey::ArrayType: > + into.append(encodeByte(kIDBKeyArrayTypeByte)); > + len = key.array().size(); > + into.append(encodeVarInt(len)); > + for (size_t i = 0; i < len; ++i) > + encodeIDBKey(*key.array()[i], into); > + ASSERT_UNUSED(prevSize, into.size() > prevSize); > + return; I would also put this in {} and scope the declaration of length. > Source/WebCore/storage/IDBLevelDBCoding.cpp:410 > + int64_t len; Nit: len -> length > Source/WebCore/storage/IDBLevelDBCoding.cpp:420 > + case kIDBKeyArrayTypeByte: {} and scoped variables > Source/WebCore/storage/IDBLevelDBCoding.cpp:421 > + // Array. I would remove this comment. > Source/WebCore/storage/IDBLevelDBCoding.cpp:471 > + int64_t len; len -> length > Source/WebCore/storage/IDBLevelDBCoding.cpp:480 > + case kIDBKeyArrayTypeByte: > + // Array. Same comments as above. > Source/WebCore/storage/IDBLevelDBCoding.cpp:498 > case kIDBKeyStringTypeByte: > // String. > - p = decodeVarInt(p, limit, stringLen); > + p = decodeVarInt(p, limit, len); I would scope stringLength and get rid of the comment. > Source/WebCore/storage/IDBLevelDBCoding.cpp:534 > + default: No default: on switches so the compiler will give an error if you forget a case. You'll need to move the ASSERT_NOT_REACHED and return InvalidType outside the switch to make some versions of gcc happy. > Source/WebCore/storage/IDBLevelDBCoding.cpp:547 > + int64_t lenA, lenB; lenA -> lengthA, lenB -> lengthB. > Source/WebCore/storage/IDBLevelDBCoding.cpp:558 > + case kIDBKeyArrayTypeByte: > + // Array type. Scoped local variables, no comment. > Source/WebCore/storage/IDBLevelDBCoding.cpp:566 > + if (int cmp = compareEncodedIDBKeys(p, limitA, q, limitB)) Is it possible to nest arrays in arrays? If so, is there a limit? Otherwise I think you can overflow the stack (too much recursion) if you nest lots of arrays. > Source/WebKit/chromium/src/WebIDBKey.cpp:116 > + default: > + ASSERT_NOT_REACHED(); > + break; No default: > Source/WebKit/chromium/src/WebIDBKey.cpp:144 > + default: > + ASSERT_NOT_REACHED(); > + break; No default:
Joshua Bell
Comment 18 2011-11-04 11:13:16 PDT
Comment on attachment 113551 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=113551&action=review All the nits/style feedback is awesome, thanks. >> Source/WebCore/bindings/v8/IDBBindingUtilities.cpp:59 >> + if (value->IsNumber() && !isnan(value->NumberValue())) > > What's the benefit of having a KeyCreator class? Couldn't it just be a static function? This class exists to support cycle detection, i.e. ensuring that arrays don't contain themselves. A class is used to contain the m_stack state, which tracks parent arrays during the recursion. If we want to impose a depth limit on nested arrays we'd do so here (if m_stack.size() > SOME_LIMIT ... return Invalid) since this is where script data enters the IDBKey system. > Source/WebCore/bindings/v8/IDBBindingUtilities.cpp:68 > + if (m_stack.contains(array)) ^^^^ That's the the bit where we prevent cycles. >> Source/WebCore/bindings/v8/custom/V8IDBKeyCustom.cpp:63 >> + return array; > > Nit: I would declare array in the case block. You'll need to use {} to scope the variable declartion. Excellent, I much prefer that style, I just didn't see it used in nearby examples or the style guide. >> Source/WebCore/storage/IDBLevelDBCoding.cpp:305 >> +int compareEncodedStringsWithLength(const char*& p, const char* limitP, const char*& q, const char* limitQ) > > The pass by reference here seems unnecessary. It looks like the changed values are only used in the unit test. Can you instead ASSERT in this function that p + lenP*2 == limitP? They're used when decoding arrays; the string may exist within a larger buffer containing several concatenated strings (or other values). That assertion would be invalid, as limitP is the end of the overall buffer. (The inputs to this function end up being two start and two end pointers; the outputs are the comparison result plus the two updated pointers, for subsequent decoding/comparison operations.) >> Source/WebCore/storage/IDBLevelDBCoding.cpp:566 > > Is it possible to nest arrays in arrays? If so, is there a limit? Otherwise I think you can overflow the stack (too much recursion) if you nest lots of arrays. See above re: nesting and limits. Suggestions on a limit? I can borrow from what SerializedScriptValue uses. (SSV's explicitly support cycles but do have depth limits)
Tony Chang
Comment 19 2011-11-04 13:13:19 PDT
Comment on attachment 113551 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=113551&action=review >>> Source/WebCore/bindings/v8/IDBBindingUtilities.cpp:59 >>> + if (value->IsNumber() && !isnan(value->NumberValue())) >> >> What's the benefit of having a KeyCreator class? Couldn't it just be a static function? > > This class exists to support cycle detection, i.e. ensuring that arrays don't contain themselves. A class is used to contain the m_stack state, which tracks parent arrays during the recursion. > > If we want to impose a depth limit on nested arrays we'd do so here (if m_stack.size() > SOME_LIMIT ... return Invalid) since this is where script data enters the IDBKey system. Can we pass m_stack in as a parameter (pass-by-reference) to createInteral? You could either make 2 static functions or keep the class and have the stack created in the create() method above. As for a limit, it looks like SerializedScriptValue has a limit of 40,000 in JSC and 20,000 in V8. I'm not sure what's appropriate for you, but 20,000 is probably plenty. Make sure to add this case to your tests. >>> Source/WebCore/storage/IDBLevelDBCoding.cpp:305 >>> +int compareEncodedStringsWithLength(const char*& p, const char* limitP, const char*& q, const char* limitQ) >> >> The pass by reference here seems unnecessary. It looks like the changed values are only used in the unit test. Can you instead ASSERT in this function that p + lenP*2 == limitP? > > They're used when decoding arrays; the string may exist within a larger buffer containing several concatenated strings (or other values). That assertion would be invalid, as limitP is the end of the overall buffer. > > (The inputs to this function end up being two start and two end pointers; the outputs are the comparison result plus the two updated pointers, for subsequent decoding/comparison operations.) OK, that makes sense. It makes me a bit nervous because it looks like in release builds, if an assert fails, we'll happily start pointing into random memory. Should we be more cautious here or do we fully control all the serializing and unserializing code? Do we have to worry about files being corrupted on disk?
Joshua Bell
Comment 20 2011-11-04 15:20:15 PDT
(In reply to comment #19) > Can we pass m_stack in as a parameter (pass-by-reference) to createInteral? You could either make 2 static functions or keep the class and have the stack created in the create() method above. Ah, much cleaner, will do. > As for a limit, it looks like SerializedScriptValue has a limit of 40,000 in JSC and 20,000 in V8. I'm not sure what's appropriate for you, but 20,000 is probably plenty. Make sure to add this case to your tests. Agreed. > > (The inputs to this function end up being two start and two end pointers; the outputs are the comparison result plus the two updated pointers, for subsequent decoding/comparison operations.) > > OK, that makes sense. It makes me a bit nervous because it looks like in release builds, if an assert fails, we'll happily start pointing into random memory. Should we be more cautious here or do we fully control all the serializing and unserializing code? Do we have to worry about files being corrupted on disk? We control the code but agree we should be more paranoid. I've added an additional runtime check that p and q haven't gone past their limits, and bail if they have. This matches what the compareEncodedXXX() logic does if e.g. decodeVarInt wanders past the limit.
Joshua Bell
Comment 21 2011-11-04 16:19:48 PDT
Tony Chang
Comment 22 2011-11-04 16:43:36 PDT
Comment on attachment 113727 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=113727&action=review > LayoutTests/storage/indexeddb/key-type-array.html:207 > + evalAndExpectException("indexedDB.cmp(makeArrayOfDepth(100000), 0)", "IDBDatabaseException.DATA_ERR"); You may want to double check that this runs quickly in debug, but I think you'll be fine (looks like in release it's only like 50ms or so to create the structure).
Joshua Bell
Comment 23 2011-11-04 16:48:07 PDT
Comment on attachment 113727 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=113727&action=review >> LayoutTests/storage/indexeddb/key-type-array.html:207 >> + evalAndExpectException("indexedDB.cmp(makeArrayOfDepth(100000), 0)", "IDBDatabaseException.DATA_ERR"); > > You may want to double check that this runs quickly in debug, but I think you'll be fine (looks like in release it's only like 50ms or so to create the structure). Definitely slows down the test... looks like it takes ~650ms in Debug on my box. I'll drop the number to 50k (~350ms) which will still trip the depth check.
Joshua Bell
Comment 24 2011-11-08 11:59:20 PST
Created attachment 114128 [details] Patch for landing
WebKit Review Bot
Comment 25 2011-11-08 12:00:06 PST
Comment on attachment 114128 [details] Patch for landing Rejecting attachment 114128 [details] from commit-queue. jsbell@chromium.org does not have committer permissions according to http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/config/committers.py. - If you do not have committer rights please read http://webkit.org/coding/contributing.html for instructions on how to use bugzilla flags. - If you have committer rights please correct the error in Tools/Scripts/webkitpy/common/config/committers.py by adding yourself to the file (no review needed). The commit-queue restarts itself every 2 hours. After restart the commit-queue will correctly respect your committer rights.
Joshua Bell
Comment 26 2011-11-08 12:01:51 PST
Comment on attachment 114128 [details] Patch for landing drops the max depth to 50k @tony - one more look? (dgrogan and I tried "land-safely" to preserve the r+ but that does not appear to have done what we'd hoped)
Tony Chang
Comment 27 2011-11-08 12:21:55 PST
Comment on attachment 114128 [details] Patch for landing Note that the new patch has "Reviewed by Tony Chang." in it, so it doesn't need to be reviewed again. That's really all land-safely does (change the OOPS to the reviewer name).
WebKit Review Bot
Comment 28 2011-11-08 12:51:33 PST
Comment on attachment 114128 [details] Patch for landing Clearing flags on attachment: 114128 Committed r99609: <http://trac.webkit.org/changeset/99609>
WebKit Review Bot
Comment 29 2011-11-08 12:51:40 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.