WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
104354
IndexedDB: Don't use strings to represent serialized values
https://bugs.webkit.org/show_bug.cgi?id=104354
Summary
IndexedDB: Don't use strings to represent serialized values
Michael Pruett
Reported
2012-12-07 01:16:44 PST
Strings are inappropriate for representing serialized values in IndexedDB; these serialized values are not Unicode character sequences and should instead be represented as byte arrays. While the V8 implementation of SerializedScriptValue uses String as its internal representation, the JSC implementation uses Vector<uint8_t>. Converting a Vector<uint8_t> to a String object only for IndexedDB to convert it back immediately to a Vector<char> is awkward and inefficient.
Attachments
Patch
(20.32 KB, patch)
2012-12-07 01:32 PST
,
Michael Pruett
no flags
Details
Formatted Diff
Diff
Patch
(20.32 KB, patch)
2012-12-07 01:38 PST
,
Michael Pruett
buildbot
: commit-queue-
Details
Formatted Diff
Diff
Patch
(20.60 KB, patch)
2012-12-11 15:27 PST
,
Michael Pruett
no flags
Details
Formatted Diff
Diff
Patch
(19.29 KB, patch)
2012-12-11 16:36 PST
,
Michael Pruett
haraken
: review-
Details
Formatted Diff
Diff
Patch
(22.19 KB, patch)
2012-12-15 07:44 PST
,
Michael Pruett
no flags
Details
Formatted Diff
Diff
Patch
(23.08 KB, patch)
2012-12-16 18:21 PST
,
Michael Pruett
no flags
Details
Formatted Diff
Diff
Show Obsolete
(5)
View All
Add attachment
proposed patch, testcase, etc.
Michael Pruett
Comment 1
2012-12-07 01:32:41 PST
Created
attachment 178182
[details]
Patch
WebKit Review Bot
Comment 2
2012-12-07 01:35:41 PST
Attachment 178182
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/bindings/v8/SerializedScriptValue.cpp:2257: Extra space before ( in function call [whitespace/parens] [4] Total errors found: 1 in 12 files If any of these errors are false positives, please file a bug against check-webkit-style.
Michael Pruett
Comment 3
2012-12-07 01:38:13 PST
Created
attachment 178184
[details]
Patch
Alec Flett
Comment 4
2012-12-07 09:45:09 PST
Comment on
attachment 178184
[details]
Patch Overall this is great - I was just saying to jsbell that this is an important one. (Note to self: we have to fix the V8 implementation to stop using String internally as well) View in context:
https://bugs.webkit.org/attachment.cgi?id=178184&action=review
> Source/WebCore/Modules/indexeddb/IDBLevelDBCoding.h:74 > +Vector<char> encodeIDBValue(const Vector<uint8_t>&);
I'm not sure I see why we need encodeIDBValue/decodeIDBValue now - seems like they're just introducing additional string copies to do conversion from char <-> uint8_t - is there a way to cleverly use static_cast and the like to make the conversion safe? Alternatively, can we just fix the LevelDB code to use uint8_t?
> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1844 > String SerializedScriptValue::toWireString() const
add FIXME's to say "remove when..." for whatever is stopping this method (preferably in the form of a webkit bug) from being completely removed.. is it bindings code? Chromium code? (and add the identical FIXME for all the methods that should be removed - I see a bunch)
Build Bot
Comment 5
2012-12-07 10:34:15 PST
Comment on
attachment 178184
[details]
Patch
Attachment 178184
[details]
did not pass mac-ews (mac): Output:
http://queues.webkit.org/results/15186463
Michael Pruett
Comment 6
2012-12-11 15:27:22 PST
Created
attachment 178894
[details]
Patch With the changes in
r137346
, the toWireString() and createFromWire(const String&) methods of the JSC SerializedScriptValue class can be removed in this patch.
Alec Flett
Comment 7
2012-12-11 15:54:22 PST
Comment on
attachment 178894
[details]
Patch Can you include changes to LevelDBSlice() to accept a Vector<uint8_t> to avoid the extra copy that this patch introduces?
Alec Flett
Comment 8
2012-12-11 15:55:12 PST
(I think you'll have to use reinterpret_cast)
Joshua Bell
Comment 9
2012-12-11 15:58:10 PST
Comment on
attachment 178894
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=178894&action=review
> Source/WebCore/Modules/indexeddb/IDBBackingStore.h:81 > + virtual bool putRecord(IDBBackingStore::Transaction*, int64_t databaseId, int64_t objectStoreId, const IDBKey&, const Vector<uint8_t>& value, RecordIdentifier*);
Should still have WARN_UNUSED_RETURN?
Michael Pruett
Comment 10
2012-12-11 16:36:41 PST
Created
attachment 178915
[details]
Patch I've added back WARN_UNUSED_RETURN which I erroneously removed from IDBBackingStore::putRecord(). I have also eliminated encodeIDBValue() and decodeIDBValue() and have replaced those with calls to Vector::appendVector() and Vector::appendRange(). I've also updated IDBBackingStore::Cursor::value() to return a const Vector<uint8_t>& rather than Vector<uint8_t>. I believe these changes should eliminate the extra memory copies introduced in my previous patch.
Alec Flett
Comment 11
2012-12-12 13:35:12 PST
Comment on
attachment 178915
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=178915&action=review
> Source/WebCore/bindings/js/SerializedScriptValue.h:110 > + return adopt(bytes);
I'm confused how/why this works - I'm not familiar with the JSC SSV implementation, but aren't you adopting a stack-allocated Vector? or maybe can you just adopt data instead? (ah.. it looks like the constructor is relying on swap(), but maybe it should just be copying it in SerializedScriptValue::SerializedScriptValue(Vector<uint8_t>& buffer) instead of swapping?
> Source/WebCore/bindings/v8/SerializedScriptValue.cpp:2262 > + dst[i] = ntohs(src[i]);
I'm actually now a little confused as to why the ntohs is necessary... This may be the most prudent fix for right now, but could this also be fixed by the V8 SSV parser/serializer simply unpacking/packing bytes in the right order? (it seems unfortunate that the act of creating or extracting an SSV requires this conversion, and that that cost should be paid by the parser/serializer in case the SSV is never deserialized)
> Source/WebCore/bindings/v8/SerializedScriptValue.cpp:2324 > + dst[i] = htons(src[i]);
(ditto for htons here)
Michael Pruett
Comment 12
2012-12-12 14:10:58 PST
(In reply to
comment #11
)
> (From update of
attachment 178915
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=178915&action=review
> > > Source/WebCore/bindings/js/SerializedScriptValue.h:110 > > + return adopt(bytes); > > I'm confused how/why this works - I'm not familiar with the JSC SSV implementation, but aren't you adopting a stack-allocated Vector? or maybe can you just adopt data instead? > > (ah.. it looks like the constructor is relying on swap(), but maybe it should just be copying it in SerializedScriptValue::SerializedScriptValue(Vector<uint8_t>& buffer) instead of swapping?
I agree that adding a constructor which takes a const Vector<uint8_t>& would be cleaner.
> > Source/WebCore/bindings/v8/SerializedScriptValue.cpp:2262 > > + dst[i] = ntohs(src[i]); > > I'm actually now a little confused as to why the ntohs is necessary... This may be the most prudent fix for right now, but could this also be fixed by the V8 SSV parser/serializer simply unpacking/packing bytes in the right order? (it seems unfortunate that the act of creating or extracting an SSV requires this conversion, and that that cost should be paid by the parser/serializer in case the SSV is never deserialized) > > > Source/WebCore/bindings/v8/SerializedScriptValue.cpp:2324 > > + dst[i] = htons(src[i]); > > (ditto for htons here)
I have added byte swapping to the V8 implementation of SSV's toWireBytes() and createFromWire(const Vector<uint8_t>&) in order to maintain binary compatibility with existing IDB databases. In the previous call path, the string in toWireString() and createFromWire(const String&) was swapped in IDBLevelDBCoding.h's encodeString() and decodeString(). Certainly swapping for deserialization could be performed at a later stage such as in SerializedScriptValue::deserialize() if performance is a concern, but that approach would be significantly more complex. I believe the approach in the proposed patch has equivalent performance to the status quo as every value which was written to or read from a LevelDB database was swapped.
Alec Flett
Comment 13
2012-12-12 14:35:57 PST
> Certainly swapping for deserialization could be performed at a later stage such as in SerializedScriptValue::deserialize() if performance is a concern, but that approach would be significantly more complex. I believe the approach in the proposed patch has equivalent performance to the status quo as every value which was written to or read from a LevelDB database was swapped.
Yeah, that's kind of what I figured - I just wanted to confirm. The parser is a total mess right now anyway (it manually puts the value into a String, pulls out 16-bit characters one at a time, and then parses them as a sequence of 8-bit characters. so no sense trying to make that better right now.
Kentaro Hara
Comment 14
2012-12-12 23:29:36 PST
Comment on
attachment 178915
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=178915&action=review
BTW, don't you have any performance concern about this change? (Is there any benchmark that we can use to make sure that this change won't regress performance?)
> Source/WebCore/Modules/indexeddb/IDBBackingStore.cpp:1406 > + Vector<uint8_t> m_emptyValue;
Do you need this member?
> Source/WebCore/Modules/indexeddb/IDBBackingStore.cpp:1528 > + Vector<uint8_t> m_emptyValue;
Ditto.
> Source/WebCore/Modules/indexeddb/IDBBackingStore.cpp:1610 > + Vector<uint8_t> m_value;
Nit: Rename to m_currentValue, as you're using m_currentValue above.
Kentaro Hara
Comment 15
2012-12-12 23:32:03 PST
Comment on
attachment 178915
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=178915&action=review
(It looks like a couple of my comments were gone away...)
> Source/WebCore/bindings/js/SerializedScriptValue.h:107 > + static PassRefPtr<SerializedScriptValue> createFromWire(const Vector<uint8_t>& data)
It's a bit confusing that there are two createFromWire()s. Shall we rename the method?
> Source/WebCore/bindings/js/SerializedScriptValue.h:112 > + const Vector<uint8_t>& toWireBytes() const { return m_data; }
Is this correct? You're implementing toWireBytes() in the cpp file too.
> Source/WebCore/bindings/v8/SerializedScriptValue.cpp:2326 > + return result;
Can we avoid copying the vector ?
Michael Pruett
Comment 16
2012-12-15 07:41:42 PST
(In reply to
comment #15
)
> (From update of
attachment 178915
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=178915&action=review
> > (It looks like a couple of my comments were gone away...) > > > Source/WebCore/bindings/js/SerializedScriptValue.h:107 > > + static PassRefPtr<SerializedScriptValue> createFromWire(const Vector<uint8_t>& data) > > It's a bit confusing that there are two createFromWire()s. Shall we rename the method?
Sure, I'll rename it to createFromWireBytes().
> > Source/WebCore/bindings/js/SerializedScriptValue.h:112 > > + const Vector<uint8_t>& toWireBytes() const { return m_data; } > > Is this correct? You're implementing toWireBytes() in the cpp file too.
Yes I believe this is correct. The toWireBytes() method is defined in the header file for the JSC SerializedScriptValue but in the source file for the V8 SerializedScriptValue.
> > Source/WebCore/bindings/v8/SerializedScriptValue.cpp:2326 > > + return result; > > Can we avoid copying the vector ?
I don't think so, but the copying that is now happening here used to happen in encodeString(), so I don't think there's any change.
Michael Pruett
Comment 17
2012-12-15 07:44:47 PST
Created
attachment 179599
[details]
Patch
Kentaro Hara
Comment 18
2012-12-16 16:38:27 PST
Comment on
attachment 179599
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=179599&action=review
> Source/WebCore/Modules/indexeddb/IDBBackingStore.cpp:1392 > + virtual const Vector<uint8_t>* value() const { ASSERT_NOT_REACHED(); return 0; }
I would prefer returning 'const Vector<uint8_t>&' because m_currentValue is a member variable. Here you can write: virtual const Vector<uint8_t>& value() const { ASSERT_NOT_REACHED(); return new Vector<uint8_t>(); } Performance of this code wouldn't be important as this is a code that must not be reached.
> Source/WebCore/Modules/indexeddb/IDBBackingStore.cpp:1447 > + virtual const Vector<uint8_t>* value() const { return &m_currentValue; }
Ditto.
> Source/WebCore/Modules/indexeddb/IDBBackingStore.cpp:1508 > + virtual const Vector<uint8_t>* value() const { ASSERT_NOT_REACHED(); return 0; }
Ditto.
> Source/WebCore/Modules/indexeddb/IDBBackingStore.cpp:1587 > + virtual const Vector<uint8_t>* value() const { return &m_currentValue; }
Ditto.
> Source/WebCore/Modules/indexeddb/IDBBackingStore.h:118 > + virtual const Vector<uint8_t>* value() const = 0;
Ditto.
> Source/WebCore/bindings/js/SerializedScriptValue.h:110 > + Vector<uint8_t> bytes(data); > + return adopt(bytes);
Let's apply the comment from Alec Flett.
> Source/WebCore/bindings/v8/SerializedScriptValue.cpp:2262 > + dst[i] = ntohs(src[i]);
Alec Flett: Does this ntohs() looks OK ?
Michael Pruett
Comment 19
2012-12-16 18:21:05 PST
Created
attachment 179678
[details]
Patch I've changed the return type of IDBBackingStore::Cursor::value() from const Vector<uint8_t>* to const Vector<uint8_t>& and have added a constructor for JSC SerializedScriptValue which takes const Vector<uint8_t>& rather than Vector<uint8_t>&.
Kentaro Hara
Comment 20
2012-12-16 18:23:09 PST
Comment on
attachment 179678
[details]
Patch LGTM. Let's wait for alec's comment before landing.
Alec Flett
Comment 21
2012-12-17 09:02:18 PST
yes this is much better. You'll notice I filed
bug 105186
to address my own note to self in
comment 8
.
WebKit Review Bot
Comment 22
2012-12-17 16:00:53 PST
Comment on
attachment 179678
[details]
Patch Clearing flags on attachment: 179678 Committed
r137954
: <
http://trac.webkit.org/changeset/137954
>
WebKit Review Bot
Comment 23
2012-12-17 16:00:58 PST
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug