Bug 87245

Summary: [EFL] Ewk_Intent does not expose WebCore::Intent::data()
Product: WebKit Reporter: Chris Dumez <cdumez>
Component: WebKit EFLAssignee: Chris Dumez <cdumez>
Status: RESOLVED INVALID    
Severity: Normal CC: abarth, gbillock, gyuyoung.kim, lucas.de.marchi, oliver, rakuco, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 86841, 86868    
Attachments:
Description Flags
Patch
abarth: review-
Patch
none
Patch none

Description Chris Dumez 2012-05-23 03:58:25 PDT
EFL's Ewk_Intent does not expose WebCore::Intent::data() yet. The issue is that JSC's SerializedScriptValue does not expose its wire string.

In order to expose WebCore::Intent::data() in Ewk_Intent, we need to implement SerializedScriptValue::toWireString() and make this method accessible even if INDEXED_DATABASE is disabled.
Comment 1 Chris Dumez 2012-05-23 04:16:50 PDT
Created attachment 143528 [details]
Patch
Comment 2 Greg Billock 2012-05-23 08:03:04 PDT
Comment on attachment 143528 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=143528&action=review

> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1791
> +    characters.append(data.characters8(), data.length());

Do you need to check data.is8bit() here and create an intermediate representation if it isn't? Or should wire format strings always be 8-bit?
Comment 3 Adam Barth 2012-05-23 13:19:34 PDT
Comment on attachment 143528 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=143528&action=review

>> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1791
>> +    characters.append(data.characters8(), data.length());
> 
> Do you need to check data.is8bit() here and create an intermediate representation if it isn't? Or should wire format strings always be 8-bit?

This isn't right.  You should create a CString temporary using data.utf8() and then use the accessors on the CString to get the chars out.
Comment 4 Chris Dumez 2012-05-23 23:59:36 PDT
Created attachment 143747 [details]
Patch

Use String::utf8() as advised by abarth.
Comment 5 Adam Barth 2012-05-24 08:35:38 PDT
Comment on attachment 143747 [details]
Patch

It's interesting that the JSC version of SerializedScriptValue uses a Vector<unsigned char> for m_data but the V8 version uses String.  I wonder why they're different.
Comment 6 Chris Dumez 2012-05-25 03:09:11 PDT
(In reply to comment #5)
> (From update of attachment 143747 [details])
> It's interesting that the JSC version of SerializedScriptValue uses a Vector<unsigned char> for m_data but the V8 version uses String.  I wonder why they're different.

After looking at the GIT history for both implementation, I don't see any obvious reason for this difference. I'm adding Oliver in CC in case he has input of this.

Does the patch look valid otherwise?
Comment 7 Adam Barth 2012-05-25 15:40:10 PDT
Comment on attachment 143747 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=143747&action=review

> Source/WebCore/bindings/js/SerializedScriptValue.cpp:1793
> +    CString dataChars = data.utf8();
> +    Vector<unsigned char> characters;
> +    characters.append(reinterpret_cast<const unsigned char*>(dataChars.data()), dataChars.length());
> +    return adopt(characters);

This looks like it has an extra copy in it.  Here you're copying the UTF-8 data into the CString and then you're copying it into the Vector.  Is there really no way to deserialize it without the extra copy?
Comment 8 Chris Dumez 2012-05-28 06:54:09 PDT
Created attachment 144351 [details]
Patch

After thinking a bit more about it, I went back to my initial proposal.

Note that WTF::String is used here purely has a container for our char buffer. In the JSC case, this is a vector of unsigned char (or LChar), so a 8-Bit buffer.
The JSC::SerializedScriptValue::toWireString() can only return a String that is 8-Bit (as per its implementation in the present patch). Only a String returned from JSC::SerializedScriptValue::toWireString() can be passed to JSC::SerializedScriptValue::createFromWire() and therefore the String passed as argument as to be 8-Bit or it is invalid. I added ASSERTIONS to be safe.

I believe that calling WTF::String::utf8() in this case is wrong because it may cause converting of the data stored in the buffer internally. We specifically don't want the char buffer to be altered since WTF::String is used purely as a container (which is slightly confusing but matching the V8 API). A binary image might be stored in that buffer and converting to a different encoding will lead to unexpected result.

For short, the "wire" String here is used merely as an exchange/storage format. It is used in the following scenario in IndexedDB for example:
- Store output of serializedData.toWireString() to the database
- Read data (aka "wire string") back from database
- Construct back a SerializedScriptValue from the wire string using SerializedScriptValue::createFromWire().
Comment 9 Chris Dumez 2012-05-28 07:40:50 PDT
Comment on attachment 144351 [details]
Patch

Clearing flags. I'll provide the wire string code for JSC::SerializedScriptValue in a separate patch and not use it for the EFL Intent code. Instead, I plan to write an EFL wrapper for SerializedScriptValue (similar to WebSerializedScriptValue).

Relying on the internal representation of the WTF::String in the EFL Intent code is unsafe, especially considering JSC and V8 implementations of SerializedScriptValue use different representations (JSC uses 8bit, V8 appears to be using 16-Bit).
Comment 10 Chris Dumez 2012-06-01 03:27:52 PDT
Ewk_Intent does not need to publicly expose WebCore::Intent::data().