WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED INVALID
99310
[JSC] Indexeddb unable to deserialize the string record
https://bugs.webkit.org/show_bug.cgi?id=99310
Summary
[JSC] Indexeddb unable to deserialize the string record
Charles Wei
Reported
2012-10-15 05:15:51 PDT
The indexeddb for jsc binding failed to work (tested with BlackBerry porting), with the log: Type error: unable to deserialize data.
Attachments
indexeddb test case
(15.57 KB, text/html)
2012-10-15 05:16 PDT
,
Charles Wei
no flags
Details
Patch
(2.35 KB, patch)
2012-10-15 05:59 PDT
,
Charles Wei
no flags
Details
Formatted Diff
Diff
Patch
(2.59 KB, patch)
2012-11-12 22:19 PST
,
Charles Wei
no flags
Details
Formatted Diff
Diff
Patch
(2.93 KB, patch)
2012-11-13 19:30 PST
,
Charles Wei
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Charles Wei
Comment 1
2012-10-15 05:16:35 PDT
Created
attachment 168683
[details]
indexeddb test case
Charles Wei
Comment 2
2012-10-15 05:59:34 PDT
Created
attachment 168694
[details]
Patch
Rob Buis
Comment 3
2012-10-15 06:47:33 PDT
Comment on
attachment 168694
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=168694&action=review
I think someone with more knowledge about IndexDB should review.
> Source/WebCore/ChangeLog:3 > + Indexeddb doesn't wrong with the attached test case and "Unable to deserialize data"
Title seems odd?
Charles Wei
Comment 4
2012-10-15 07:07:17 PDT
(In reply to
comment #3
)
> (From update of
attachment 168694
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=168694&action=review
> > I think someone with more knowledge about IndexDB should review. > > > Source/WebCore/ChangeLog:3 > > + Indexeddb doesn't wrong with the attached test case and "Unable to deserialize data" > > Title seems odd?
Sorry, that comes from the wrong title of the bug,which has been fixed. Will resubmit the patch with the correct title by s/wrong/work
Joshua Bell
Comment 5
2012-10-15 11:08:26 PDT
Comment on
attachment 168694
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=168694&action=review
> Source/WebCore/ChangeLog:15 > + The test case is attached at:
https://bugs.webkit.org/show_bug.cgi?id=99310
Please convert the test case to a LayoutTest in storage/indexeddb so regressions will be caught automatically when JSC ports enable IDB.
> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1008 > + // If the value contains odd number of content, append UndefinedTag at the end
Just to clarify what's going on: "content" => "bytes"
> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1010 > + // String is UChar (2 types) internal and takes even number of value.
"types" => "bytes" "value" => "bytes"
> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1012 > + writeLittleEndian<uint8_t>(const_cast<Vector<uint8_t>& >(value), UndefinedTag);
This is violating the "const" constraint and modifying the vector that is passed in, which is bad. Can you find another way to do this? I haven't traced through to see if adding an extra UndefinedTag at the end of the data is safe; the JSC toWireString(Vector<char>)/create(String) code is much more complicated than it seems like it needs to be, so there's stuff going on I'm not familiar with. Why doesn't this issue affect non-IDB use of SerializedScriptValue?
Charles Wei
Comment 6
2012-11-12 21:41:56 PST
Comment on
attachment 168694
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=168694&action=review
>> Source/WebCore/ChangeLog:15 >> + The test case is attached at:
https://bugs.webkit.org/show_bug.cgi?id=99310
> > Please convert the test case to a LayoutTest in storage/indexeddb so regressions will be caught automatically when JSC ports enable IDB.
Actually the existing test cases cover this case. This happens when IndexedDB.put('string', 'key'); The following existing test cases fail without the patch and succeed with the fix. LayoutTests/storage/indexeddb/cursor_delete.html LayoutTests/storage/indexeddb/cursor_inconsistency.html LayoutTests/storage/indexeddb/cursor_prev_no_duplicate.html because the internal of the SerializedScriptValue for String is: <version,2 bytes> <StringTag, 1 byte> <Length 2 bytes> <UChar 2*n bytes> That is an odd bytes. While when persist the value, we call String SerializedScriptValue::toWireString() to make it a String for persistence. if we don't pad with one byte, we lose the last byte of the SerializedScriptValue and can't convert it back with SerializedScriptValue::createFromWire() and SerializedScriptValue::deserialize().
>> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1008 >> + // If the value contains odd number of content, append UndefinedTag at the end > > Just to clarify what's going on: > > "content" => "bytes"
ok.
>> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1010 >> + // String is UChar (2 types) internal and takes even number of value. > > "types" => "bytes" > "value" => "bytes"
Ok.
>> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1012 >> + writeLittleEndian<uint8_t>(const_cast<Vector<uint8_t>& >(value), UndefinedTag); > > This is violating the "const" constraint and modifying the vector that is passed in, which is bad. > > Can you find another way to do this? > > I haven't traced through to see if adding an extra UndefinedTag at the end of the data is safe; the JSC toWireString(Vector<char>)/create(String) code is much more complicated than it seems like it needs to be, so there's stuff going on I'm not familiar with. > > Why doesn't this issue affect non-IDB use of SerializedScriptValue?
There's no other better way to cast away the const, another way is to remove the const from the argument. why doesn't this issue affect non-IDB use ? Because there's no others are using this except IDB. toWireString and createFromWire are inside INDEXEDDB macro.
Charles Wei
Comment 7
2012-11-12 22:19:08 PST
Created
attachment 173823
[details]
Patch
Yong Li
Comment 8
2012-11-13 07:10:55 PST
Comment on
attachment 173823
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=173823&action=review
> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1011 > + if (value.size() % sizeof(UChar))
I remember Darin told me before UChar could be 4 bytes?
Yong Li
Comment 9
2012-11-13 07:14:10 PST
Comment on
attachment 173823
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=173823&action=review
> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1012 > + writeLittleEndian<uint8_t>(const_cast<Vector<uint8_t>& >(value), NullTag);
I don't like this const_cast, nor the unneeded null in String. Why don't we change "length" instead?
Yong Li
Comment 10
2012-11-13 07:38:54 PST
Comment on
attachment 173823
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=173823&action=review
>> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1012 >> + writeLittleEndian<uint8_t>(const_cast<Vector<uint8_t>& >(value), NullTag); > > I don't like this const_cast, nor the unneeded null in String. Why don't we change "length" instead?
I was wrong about "length". But still, don't like the const_cast here.
Charles Wei
Comment 11
2012-11-13 08:24:18 PST
Comment on
attachment 173823
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=173823&action=review
>> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1011 >> + if (value.size() % sizeof(UChar)) > > I remember Darin told me before UChar could be 4 bytes?
In what scenario that Uchar is 4 bytes?
>>> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1012 >>> + writeLittleEndian<uint8_t>(const_cast<Vector<uint8_t>& >(value), NullTag); >> >> I don't like this const_cast, nor the unneeded null in String. Why don't we change "length" instead? > > I was wrong about "length". But still, don't like the const_cast here.
Just changing the length will leave an arbitrary byte at the end which might be interpreted differently. we intentionally put a NullTag at the end to make it even bytes, otherwise we will lose one bytes. The padding NullTag won't be interpreted to anything meaningful. I also don't like the const_cast,but if we don't use it, we have to make a copy of the data and operate on the copy which is an even worse solution I think. We have to use if if we have no other better solutions. Or what do you suggest, yong?
Joshua Bell
Comment 12
2012-11-13 08:55:40 PST
Note that
webkit.org/b/45110
has a fix for the same problem and doesn't do a const_cast - it does a data copy. While that's unfortunate I don't think a const_cast is acceptable because it breaks the API contract and could affect unrelated functions. Either doing a copy or changing the function signature is a better approach.
Yong Li
Comment 13
2012-11-13 09:01:30 PST
(In reply to
comment #12
)
> Note that
webkit.org/b/45110
has a fix for the same problem and doesn't do a const_cast - it does a data copy. While that's unfortunate I don't think a const_cast is acceptable because it breaks the API contract and could affect unrelated functions. Either doing a copy or changing the function signature is a better approach.
Or CloneDeserializer::readString(start, end, str, length - 1), and then make up an 1-char buffer for the last char? Also I'm not sure adding NullTag is safe. Charles, could we just ignore the last char? Or use "?"
Charles Wei
Comment 14
2012-11-13 19:30:40 PST
Created
attachment 174052
[details]
Patch
Charles Wei
Comment 15
2012-11-13 19:32:11 PST
(In reply to
comment #12
)
> Note that
webkit.org/b/45110
has a fix for the same problem and doesn't do a const_cast - it does a data copy. While that's unfortunate I don't think a const_cast is acceptable because it breaks the API contract and could affect unrelated functions. Either doing a copy or changing the function signature is a better approach.
Ok, the new patch just does a copy and operates on the copy instead of the original data, even though I don't like this solution.
Charles Wei
Comment 16
2012-11-13 19:33:35 PST
(In reply to
comment #13
)
> (In reply to
comment #12
) > > Note that
webkit.org/b/45110
has a fix for the same problem and doesn't do a const_cast - it does a data copy. While that's unfortunate I don't think a const_cast is acceptable because it breaks the API contract and could affect unrelated functions. Either doing a copy or changing the function signature is a better approach. > > Or CloneDeserializer::readString(start, end, str, length - 1), and then make up an 1-char buffer for the last char? > > Also I'm not sure adding NullTag is safe. Charles, could we just ignore the last char? Or use "?"
Adding NullTag is safer than anything else, I think. we can't just ignore the last char, otherwise, it will be an arbitrary byte and unpredictable.
Oliver Hunt
Comment 17
2012-11-14 09:43:44 PST
Let me have a look at this, I need to work out what's going on, and the serialization code shouldn't be having any IDB specific issues. For reference can you tell me what architecture and byte order blackberry uses? Also UChar are only 2 bytes, a character may be four bytes (two UChar's), this is one of those great character vs. code point things. We also like to try and store strings using a byte characters where possible. Oh, and all changes to serialization should bump the version number.
Joshua Bell
Comment 18
2012-11-14 09:49:20 PST
SerializedScriptValue serializes to a Vector<char> - an opaque binary blob - then tries to store that result in a String (in toWireString). If the length of the Vector<char> is odd the data is truncated by one byte. Nothing to do with code points vs. characters. It's purely about storing an opaque blob of binary data in a String and an issue with odd lengths. There's nothing IDB specific about this, but perhaps IDB is the only thing exercising the toWireString() method.
Joshua Bell
Comment 19
2012-11-14 09:54:19 PST
(In reply to
comment #18
)
> Nothing to do with code points vs. characters.
(er, code units vs. code points; sorry, caffeine is still kicking in.)
Oliver Hunt
Comment 20
2012-11-14 10:09:47 PST
(In reply to
comment #19
)
> (In reply to
comment #18
) > > Nothing to do with code points vs. characters. > > (er, code units vs. code points; sorry, caffeine is still kicking in.)
I was responding to the "UChar could be four bytes" comment. Bigger question: wtf is idb using Strings as its storage format? It's storing a binary blob, not a string.
Joshua Bell
Comment 21
2012-11-14 10:20:46 PST
(In reply to
comment #20
)
> Bigger question: wtf is idb using Strings as its storage format? It's storing a binary blob, not a string.
Now that's a good question... the code around that hasn't been touched in the time I've been working on IDB. It doesn't look like there's a good reason; it's just being used as a length + bytes mechanism for serialization. Would be easy to change, apart from back-compat (we still need to be able to read old data that's already on disk). From an IDB perspective it would make sense to remove the toWireString/createFromWire(String) methods in favor of producing/consuming Vector<char>.
Joshua Bell
Comment 22
2012-11-14 15:42:48 PST
***
Bug 102293
has been marked as a duplicate of this bug. ***
Charles Wei
Comment 23
2012-11-14 18:30:45 PST
(In reply to
comment #21
)
> (In reply to
comment #20
) > > Bigger question: wtf is idb using Strings as its storage format? It's storing a binary blob, not a string. > > Now that's a good question... the code around that hasn't been touched in the time I've been working on IDB. It doesn't look like there's a good reason; it's just being used as a length + bytes mechanism for serialization. Would be easy to change, apart from back-compat (we still need to be able to read old data that's already on disk). > > From an IDB perspective it would make sense to remove the toWireString/createFromWire(String) methods in favor of producing/consuming Vector<char>.
yes, agree. toWireString actually goes to a String and createFromWire's input is String, that's what it is now , for both JSC and V8 bindings, even though I don't think it's reasonable and what's the background of that design. What this patch is about is to make it work with this mechanism, refactor of this is out of the scope of this patch.
George Staikos
Comment 24
2012-11-19 21:43:06 PST
(In reply to
comment #23
)
> (In reply to
comment #21
) > > (In reply to
comment #20
) > > > Bigger question: wtf is idb using Strings as its storage format? It's storing a binary blob, not a string. > > > > Now that's a good question... the code around that hasn't been touched in the time I've been working on IDB. It doesn't look like there's a good reason; it's just being used as a length + bytes mechanism for serialization. Would be easy to change, apart from back-compat (we still need to be able to read old data that's already on disk). > > > > From an IDB perspective it would make sense to remove the toWireString/createFromWire(String) methods in favor of producing/consuming Vector<char>. > > yes, agree. toWireString actually goes to a String and createFromWire's input is String, that's what it is now , for both JSC and V8 bindings, even though I don't think it's reasonable and what's the background of that design. What this patch is about is to make it work with this mechanism, refactor of this is out of the scope of this patch.
I think Charles is right. This patch looks like it fixes the bug, but we all seem to agree that it should move away from String at some point. Any objections?
Charles Wei
Comment 25
2012-11-19 22:02:54 PST
Just ran the test on Mac , with and without this patch. The result shows that with the patch, no regressions found, instead, some failed test cases succeeded now. details: Tests that failed text/pixel/audio diff (13) : same as without the patch. Tests that had stderr output (1): It was 8 without the patch. Flaky tests (failed the first run and passed on retry) (13): it was 17 without the patch. Tests expected to fail but passed (5): same as without the patch.
Joe Mason
Comment 26
2013-03-15 06:19:50 PDT
(In reply to
comment #24
)
> I think Charles is right. This patch looks like it fixes the bug, but we all seem to agree that it should move away from String at some point. Any objections?
Seems to me that if this fixes the bug it should be committed ASAP and we should file a followup bug to convert away from String.
Michael Pruett
Comment 27
2013-03-15 10:04:42 PDT
(In reply to
comment #26
)
> Seems to me that if this fixes the bug it should be committed ASAP and we should file a followup bug to convert away from String.
As of the changes made in
bug 104354
, IndexedDB serialization no longer uses strings. This bug is no longer relevant and should be closed.
Charles Wei
Comment 28
2013-03-25 23:40:55 PDT
This patch doesn't seem to apply anymore with the change upstream.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug