Bug 31168 - Add [HintAtomic] to IDLs to remove unnecessary string conversions in generated V8 DOM bindings
Summary: Add [HintAtomic] to IDLs to remove unnecessary string conversions in generate...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P3 Enhancement
Assignee: Jens Alfke
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-11-05 09:29 PST by Jens Alfke
Modified: 2009-11-18 12:10 PST (History)
3 users (show)

See Also:


Attachments
patch (24.55 KB, patch)
2009-11-05 10:02 PST, Jens Alfke
no flags Details | Formatted Diff | Diff
patch 2 (29.58 KB, patch)
2009-11-10 16:32 PST, Jens Alfke
abarth: review-
Details | Formatted Diff | Diff
path with different implementation (13.73 KB, patch)
2009-11-17 14:34 PST, Jens Alfke
darin: review+
Details | Formatted Diff | Diff
patch with updated SVG headers (15.23 KB, patch)
2009-11-18 10:26 PST, Jens Alfke
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jens Alfke 2009-11-05 09:29:17 PST
The IDL files use the [HintAtomic] annotation on a DOMString parameter to tell the code generator that the corresponding native method takes an AtomicString, not a regular String, for that parameter. Depending on the generated code, this can save some overhead by not requiring an additional implicit conversion from String to AtomicString.

This does not appear to make a difference in the JSC bindings, which use the C++ UString class regardless, but it does for V8. If [HintAtomic] is present, the generated V8 binding will call a utility function that returns an AtomicString directly (if possible using one that's already been cached for that V8 string); if not, it calls a function that returns a String, and will incur an implicit AtomicString(String) constructor call.

I found 40 "in DOMString" parameters in the IDL files that should be marked with "[HintAtomic]" but aren't. This patch adds the annotations.

(How I found them: I edited AtomicString.h and added the keyword 'explicit' before the constructor that takes a String. The effect is to make it a compile error to implicitly convert a String to an AtomicString. Then I compiled the generated V8 DOM bindings, went through the resulting errors and added the appropriate annotations to the IDL, verifying that afterwards the errors went away.)
Comment 1 Jens Alfke 2009-11-05 10:02:53 PST
Created attachment 42573 [details]
patch
Comment 2 Eric Seidel (no email) 2009-11-05 15:26:43 PST
This doens't seem like a very good long term solution.  The optimal state you've found with this patch is fragile.  Can't we get to this state some other way?  At minimum with a script to reproduce the work you had to do to know which attributes need this property?
Comment 3 Jens Alfke 2009-11-06 11:07:39 PST
I agree this isn't optimal, but it's the way things are already implemented, and all I did was a bit of cleanup to bring the IDL annotations into sync with the current code. That's a win, if a small one.

I don't think there's any way to automatically keep in sync, because people can edit the native DOM headers whenever they want, and the code generator doesn't know anything about those headers, only what's described in the IDL.

One improvement would be to cause the bindings to fail to compile if this impedance mismatch occurred, i.e. after someone changed a parameter type from String to AtomicString in the DOM. Then at least we'd find out right away and update the IDL. The reason this compiles anyway is something I filed bug 30187 "implicit AtomicString constructor considered harmful" about earlier.

We could perhaps use an #if in AtomicString.h, then do something like this in DerivedSourcesAllInOne.cpp:
    #define PREVENT_IMPLICIT_ATOMICSTRING
    #include "AtomicString.h"
This #define would activate the 'explicit' keyword in AtomicString.h and cause compile errors on implicit String --> AtomicString coercions.

Does that seem reasonable?
Comment 4 Jens Alfke 2009-11-10 16:32:17 PST
Created attachment 42907 [details]
patch 2

I've implemented what I described the other day:
- AtomicString.h adds the 'explicit' keyword to the AtomicString(String) constructor, if NO_IMPLICIT_ATOMICSTRINGS is defined.
- DerivedSourcesAllInOne.cpp now #defines NO_IMPLICIT_ATOMICSTRINGS before #Including any headers.
The result is that missing any [HintAtomic] annotations in the IDL now causes a compile error.

I also had to add explicit AtomicString constructor calls to two headers that are included indirectly by DerivedSourcesAllInOne, to keep them from failing to compile in this new mode.
Comment 5 Eric Seidel (no email) 2009-11-10 16:46:02 PST
Comment on attachment 42907 [details]
patch 2

I think svn-create-patch will order the patches so the ChangeLog is at the top.  (It will also handle binary data if you ever need that.)

I think this change is a good idea.  I think we should consider getting rid of implicit String->AtomicString conversions all together.

I'm not sure the naming is perfect here. I'm not sure I have any better suggestion besides HintAtomic.

Maybe NO_IMPLICIT_ATOMICSTRING should be DISABLE_IMPLICIT_ATOMICSTRING?

ATOMICSTRING_CONVERSION seems like it might make more sense as EXPENSIVE_CONVERSION since it's hinting about the constructors.

Do we know how much work it would be to remove these implicit conversions all together?  Probably more work than it makes sense to try and do in one bug.

Clearly those SVG toString  methods should be changed to return String instead of AtomicString. :)  We should file a bug about that.

LGTM.  Please wait for a bit for Darin to see this and have a chance to comment if he wants to before you commit this.
Comment 6 Eric Seidel (no email) 2009-11-10 16:46:22 PST
CC folks who would have an opinion about this change.
Comment 7 Jens Alfke 2009-11-10 16:57:38 PST
Yeah, I was trying out "svn diff --cl ..." but it doesn't seem to order the files as nicely. I'll go back to prepare-changelog.

>I'm not sure I have any better suggestion besides HintAtomic.

It's a pre-existing flag; do you think it's worth the trouble of renaming it?

> Maybe NO_IMPLICIT_ATOMICSTRING should be DISABLE_IMPLICIT_ATOMICSTRING?
> ATOMICSTRING_CONVERSION seems like it might make more sense as
> EXPENSIVE_CONVERSION since it's hinting about the constructors.

I slightly prefer NO_, but it's not a big deal. I like EXPENSIVE_CONVERSION. If I end up needing to roll another revision of the patch due to Darin's feedback, I'll make these changes as well.
Comment 8 Sam Weinig 2009-11-10 19:00:56 PST
I am really not excited about this change and would rather it not go in.  Cluttering the IDLs does not seem like a good solution here.  Can you please go into more detail as to why v8 cannot use the same strategy JSC uses?
Comment 9 Jens Alfke 2009-11-10 22:00:52 PST
Sam - You misunderstand. The [HintAtomic] annotation went in a long time ago, probably in the initial implementation of V8 bindings; it would have been better to raise this objection back then. All I'm doing is correcting a glitch where the annotation is missing in _some_ of the places where it's supposed to go.

In other words, the bug here is the _inconsistent_ use of [HintAtomic]. If you think it shouldn't be used at all, that's a separate issue; you could file a bug on that.

> Can you please go
> into more detail as to why v8 cannot use the same strategy JSC uses?

It's harder for V8 and C++ to share string data because V8 has a different memory model than JSC: it uses a generational GC heap instead of malloc/free. This means V8 objects don't exist in the same heap as native objects; they're also relocatable, which doesn't play well with StringImpl.

This complicates the bridging between JavaScript strings and WebCore. One of the side effects is that it incurs extra instructions and runtime work to convert a JS string into first a String and then an AtomicString, than to convert it directly to an AtomicString.

I appreciate your interest in the V8 bindings, and understand that from your point of view the IDL annotations not used by JSC are noise. However, this is a fact of life when different technologies are in use; from Chrome's perspective, the JSC-specific stuff in, for example, StringImpl.cpp is noise. Let's all just get along, and let patches specific to one environment be reviewed by people who are skilled in that environment.
Comment 10 Darin Adler 2009-11-10 23:37:17 PST
(In reply to comment #9)
> One of
> the side effects is that it incurs extra instructions and runtime work to
> convert a JS string into first a String and then an AtomicString, than to
> convert it directly to an AtomicString.

The same is true of JavaScriptCore. This requirement is identical and the problem is similar. However, when we solved the problem for JavaScriptCore we did it without adding additional hints to the IDL file.

Perhaps the tradeoff is additional clutter in AtomicString.h and PlatformString.h vs. clutter in the IDL files? If that is the case, I think the right tradeoff is to put the complexity in one place rather than having to keep the IDL files in sync with the underlying DOM classes.

I did not review the patch that added these new attributes to IDL files for V8, and I feel they create an unnecessary maintenance burden; we should be able to move bindings from String to AtomicString without changing IDL files. V8's bindings should use the same approach the JavaScriptCore bindings use to avoid creating the need to keep the IDL files in sync.

Don't be so quick to attribute this to hostility to the V8 project. I just think it's a design mistake that makes WebCore changes harder to make in the future.

However, also please do not equate JavaScriptCore, WebKit’s own JavaScript engine which has been part of the project since the beginning and evolved in lock step with the rest of WebKit, with V8, an external engine that was retrofitted in to WebKit by Google and which is not used by most people who use WebKit.

I'm sorry we didn't notice this before. Last time Eric did not bring Sam and me in to the review process last time. That's the difference. We're not picking on you.
Comment 11 Sam Weinig 2009-11-11 11:36:36 PST
Firstly, I didn't misunderstand, I just don't want this to propagate.

Darin didn't mention it, but, there is a way to do this without cluttering the IDLs and PlatformString.h/AtomicString.h using C++ magic in the form of a StringConverter class.  A quick sketch of what this might look like is as follows:

class StringConverter {
    explicit StringConverter(const YourStringType&);

    operator String();
    operator AtomicString();

private:
    const YourStringType& m_string;
};

...

Then, when you want to pass a V8String to the WebCore DOM, you simply do (in this case for Document::getElementsByTagNameNS)

document->getElementsByTagNameNS(StringConverter(myV8String), StringConverter(myV8String));
Comment 12 Jens Alfke 2009-11-12 16:11:50 PST
Sam -- I was coming to a similar conclusion about how to handle it. I'll look into implementing it that way. (The problem I can see is that multiple uses of a StringConverter will result in multiple conversions, but that may not be an issue given the way the bindings are structured.)
Comment 13 Adam Barth 2009-11-12 18:49:31 PST
Comment on attachment 42907 [details]
patch 2

Sam wants a different implementation.
Comment 14 Jens Alfke 2009-11-17 14:34:59 PST
Created attachment 43381 [details]
path with different implementation

Here's a different implementation :)

No more IDL annotations. Instead, use a 'polymorphic return value' pattern to generate the appropriate conversion function when a V8 value is assigned to either a String or AtomicString parameter. The V8Parameter class in V8Binding.h implements this — it's a wrapper for a v8 value with two conversion operators, to String or AtomicString. Each inlines the appropriate conversion function.

V8Parameter is templatized because there are three possible modes for conversion, depending on how null/undefined JavaScript values are to be converted.

A note on CodeGeneratorPM: In a few functions like GetNativeTypeFromSignature I changed a boolean 'isParameter' parameter to a numeric 'parameterIndex'. This isn't strictly necessary for this patch, but in the future I want to do some further optimization that will require knowing the parameter index at that point; I didn't manage to get that ready for this patch but I want to leave the parameter change in.
Comment 15 Jens Alfke 2009-11-17 14:36:11 PST
Oh, FYI: this results in a 9% speedup in the Dromaeo DOM Query test. (Other Dromaeo tests are unaffected within margin of error.)
Comment 16 Darin Adler 2009-11-17 22:06:57 PST
Comment on attachment 43381 [details]
path with different implementation

Looks good. Good approach.

>      void setNameInternal(const String& name)
>      {   
> -        m_name = name;
> +        m_name = AtomicString(name);
>      }

This function is never called. It should be removed instead of modified.

> -        static AtomicString toString(bool type) { return type ? "true" : "false"; }
> +        static AtomicString toString(bool type) { return AtomicString(type ? "true" : "false"); }

> -        static AtomicString toString(int type) { return String::number(type); }
> +        static AtomicString toString(int type) { return AtomicString(String::number(type)); }

> -        static AtomicString toString(long type) { return String::number(type); }
> +        static AtomicString toString(long type) { return AtomicString(String::number(type)); }

> -        static AtomicString toString(const SVGLength& type) { return type.valueAsString(); }
> +        static AtomicString toString(const SVGLength& type) { return AtomicString(type.valueAsString()); }

> -        static AtomicString toString(float type) { return String::number(type); }
> +        static AtomicString toString(float type) { return AtomicString(String::number(type)); }

> -        static AtomicString toString(const FloatRect& type) { return String::format("%f %f %f %f", type.x(), type.y(), type.width(), type.height()); }
> +        static AtomicString toString(const FloatRect& type) { return AtomicString(String::format("%f %f %f %f", type.x(), type.y(), type.width(), type.height())); }

> -        static AtomicString toString(const String& type) { return type; }
> +        static AtomicString toString(const String& type) { return AtomicString(type); }

I do not understand why these functions return AtomicString objects! If an AtomicString is really needed here, then we should have versions of these functions that make the atomic string directly because making the String first is not good for cases where the AtomicString already exists.

I haven't reviewed V8 binding code before, but elsewhere WebKit coding style says the data member should be called m_v8Object, not _v8Object.

It also seems curious to me to use a template class instead of three different classes for V8Parameter, and also a bit strange that class name doesn't mention "string" at all.'

Otherwise seems fine. I'll say r=me but I would like some followup cleanup to the AtomicString issues if you're willing.
Comment 17 Jens Alfke 2009-11-18 10:24:34 PST
Thanks, Darin.

>This function is never called. It should be removed instead of modified.

I just tried deleting it, but it turns out it is called, at CSSGrammar.y:700.

>I do not understand why these functions return AtomicString objects!

Looks like they're only called by the PropertySynchronizer class, at SVGAnimatedProperty:477. The value gets stored into an Attribute as an AtomicString. So I can change those methods to return String and then just update PropertySynchronizer.

>If an
>AtomicString is really needed here, then we should have versions of these
>functions that make the atomic string directly

That would require extending AtomicString's API to add number() and format() factory methods. Doesn't seem worth the effort unless this is a known performance bottleneck; is it?

>It also seems curious to me to use a template class instead of three different
>classes for V8Parameter, and also a bit strange that class name doesn't mention
>"string" at all.'

Using templates was the most compact way to do it since all I had to repeat were the method implementations, not the rest of the class boilerplate. The name doesn't have "String" because I think this class could be extended in the future to handle more of the the parameter bindings, by adding typecast operators for other types like bool, int, etc.
Comment 18 Jens Alfke 2009-11-18 10:26:17 PST
Created attachment 43441 [details]
patch with updated SVG headers

Here are the changes to SVGAnimatedTemplate and SVGAnimatedProperty (and renaming the member variable). Look OK to you?
Comment 19 Jens Alfke 2009-11-18 11:58:49 PST
Committed revision 51125.
Comment 20 Eric Seidel (no email) 2009-11-18 12:10:21 PST
Comment on attachment 43441 [details]
patch with updated SVG headers

Thank you jens for fixing the SVG property mess.