Summary: | [EFL] Ewk_Intent does not expose WebCore::Intent::data() | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Chris Dumez <cdumez> | ||||||||
Component: | WebKit EFL | Assignee: | Chris Dumez <cdumez> | ||||||||
Status: | RESOLVED INVALID | ||||||||||
Severity: | Normal | CC: | abarth, gbillock, gyuyoung.kim, lucas.de.marchi, oliver, rakuco, webkit.review.bot | ||||||||
Priority: | P2 | ||||||||||
Version: | 528+ (Nightly build) | ||||||||||
Hardware: | Unspecified | ||||||||||
OS: | Unspecified | ||||||||||
Bug Depends on: | |||||||||||
Bug Blocks: | 86841, 86868 | ||||||||||
Attachments: |
|
Description
Chris Dumez
2012-05-23 03:58:25 PDT
Created attachment 143528 [details]
Patch
Comment on attachment 143528 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=143528&action=review > Source/WebCore/bindings/js/SerializedScriptValue.cpp:1791 > + characters.append(data.characters8(), data.length()); Do you need to check data.is8bit() here and create an intermediate representation if it isn't? Or should wire format strings always be 8-bit? Comment on attachment 143528 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=143528&action=review >> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1791 >> + characters.append(data.characters8(), data.length()); > > Do you need to check data.is8bit() here and create an intermediate representation if it isn't? Or should wire format strings always be 8-bit? This isn't right. You should create a CString temporary using data.utf8() and then use the accessors on the CString to get the chars out. Created attachment 143747 [details]
Patch
Use String::utf8() as advised by abarth.
Comment on attachment 143747 [details]
Patch
It's interesting that the JSC version of SerializedScriptValue uses a Vector<unsigned char> for m_data but the V8 version uses String. I wonder why they're different.
(In reply to comment #5) > (From update of attachment 143747 [details]) > It's interesting that the JSC version of SerializedScriptValue uses a Vector<unsigned char> for m_data but the V8 version uses String. I wonder why they're different. After looking at the GIT history for both implementation, I don't see any obvious reason for this difference. I'm adding Oliver in CC in case he has input of this. Does the patch look valid otherwise? Comment on attachment 143747 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=143747&action=review > Source/WebCore/bindings/js/SerializedScriptValue.cpp:1793 > + CString dataChars = data.utf8(); > + Vector<unsigned char> characters; > + characters.append(reinterpret_cast<const unsigned char*>(dataChars.data()), dataChars.length()); > + return adopt(characters); This looks like it has an extra copy in it. Here you're copying the UTF-8 data into the CString and then you're copying it into the Vector. Is there really no way to deserialize it without the extra copy? Created attachment 144351 [details]
Patch
After thinking a bit more about it, I went back to my initial proposal.
Note that WTF::String is used here purely has a container for our char buffer. In the JSC case, this is a vector of unsigned char (or LChar), so a 8-Bit buffer.
The JSC::SerializedScriptValue::toWireString() can only return a String that is 8-Bit (as per its implementation in the present patch). Only a String returned from JSC::SerializedScriptValue::toWireString() can be passed to JSC::SerializedScriptValue::createFromWire() and therefore the String passed as argument as to be 8-Bit or it is invalid. I added ASSERTIONS to be safe.
I believe that calling WTF::String::utf8() in this case is wrong because it may cause converting of the data stored in the buffer internally. We specifically don't want the char buffer to be altered since WTF::String is used purely as a container (which is slightly confusing but matching the V8 API). A binary image might be stored in that buffer and converting to a different encoding will lead to unexpected result.
For short, the "wire" String here is used merely as an exchange/storage format. It is used in the following scenario in IndexedDB for example:
- Store output of serializedData.toWireString() to the database
- Read data (aka "wire string") back from database
- Construct back a SerializedScriptValue from the wire string using SerializedScriptValue::createFromWire().
Comment on attachment 144351 [details]
Patch
Clearing flags. I'll provide the wire string code for JSC::SerializedScriptValue in a separate patch and not use it for the EFL Intent code. Instead, I plan to write an EFL wrapper for SerializedScriptValue (similar to WebSerializedScriptValue).
Relying on the internal representation of the WTF::String in the EFL Intent code is unsafe, especially considering JSC and V8 implementations of SerializedScriptValue use different representations (JSC uses 8bit, V8 appears to be using 16-Bit).
Ewk_Intent does not need to publicly expose WebCore::Intent::data(). |