Bug 25908 - Implement method calls
Summary: Implement method calls
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.5
: P2 Normal
Assignee: Gavin Barraclough
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-05-20 22:16 PDT by Gavin Barraclough
Modified: 2009-05-22 18:48 PDT (History)
1 user (show)

See Also:


Attachments
The patch (114.41 KB, patch)
2009-05-21 02:06 PDT, Gavin Barraclough
no flags Details | Formatted Diff | Diff
Moar gud speeling in teh ChangeLog. (Hi bdash!) (114.43 KB, patch)
2009-05-21 02:24 PDT, Gavin Barraclough
no flags Details | Formatted Diff | Diff
Add changes to WebCore (133.46 KB, patch)
2009-05-21 04:03 PDT, Gavin Barraclough
no flags Details | Formatted Diff | Diff
two tiny fixes to asserts - remove assert in interpreter, and fix assert in JITStubs.cpp (133.66 KB, patch)
2009-05-21 15:04 PDT, Gavin Barraclough
ggaren: review-
Details | Formatted Diff | Diff
Fix lack of structure ref', revert putDirect interface (add putDirectFunction). (93.43 KB, patch)
2009-05-22 16:48 PDT, Gavin Barraclough
ggaren: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Gavin Barraclough 2009-05-20 22:16:23 PDT
When making a method call (e.g. foo.bar()), we load the function from the base object with a regular get_by_id access, and then call to the function with op_call.

However if we add information to the Structure class to allow it to track where a Structure property is known to be a particular function (updating the Structure id if the property is modified), we could skip the load - just checking the object's structure would be sufficient.
Comment 1 Gavin Barraclough 2009-05-21 02:06:45 PDT
Created attachment 30532 [details]
The patch
Comment 2 Gavin Barraclough 2009-05-21 02:24:29 PDT
Created attachment 30534 [details]
Moar gud speeling in teh ChangeLog.  (Hi bdash!)
Comment 3 Gavin Barraclough 2009-05-21 04:03:11 PDT
Created attachment 30539 [details]
Add changes to WebCore
Comment 4 Gavin Barraclough 2009-05-21 15:04:04 PDT
Created attachment 30561 [details]
two tiny fixes to asserts - remove assert in interpreter, and fix assert in JITStubs.cpp
Comment 5 Geoffrey Garen 2009-05-21 19:09:58 PDT
+        static const int patchOffsetMethodCheckPunFunction = 50;

typo: Should be "patchOffsetMethodCheckPutFunction".


+        void setNewProperty(JSObject* base, size_t offset, JSFunction* function)
         {
-            m_type = NewProperty;
+            // If function is non-zero, then for now let's leave the type as
+            // uncachable.  We can come back and add these cases later if necessary -
+            // all cached transitions will just need to check the value being passed
+            // matches too.  But it is not clear that this is an important case,
+            // just not caching function property puts may be fine for now.
+            if (!function)
+                m_type = NewProperty;
+            m_base = base;
+            m_offset = offset;
+        }
+
+        void setDespecifyFunctionProperty(JSObject* base, size_t offset)
+        {
+            // Leave the type as uncachable for now.  'DespecifyFunction' is like
+            // an 'Existing' property write, except that it *does* require a structure
+            // change to record that we can no longer optimize calls to this property.

I think you could fold these two new PropertySlot functionalities into a simple "setUncacheable" function. The comment about why we would want to make the property uncacheable probably belongs in JSObject.cpp or Structure.cpp.

+    JSFunction* specificFunction = getJSFunction(globalData, value);
+
     if (m_structure->isDictionary()) {

I guess the value of checking getJSFunction inside JSObject::putDirect is that it automatically catches all cases of property putting, creating optimizable function properties if possible. 

But in my search through JavaScriptCore, I see only two places where we putDirect a property, not already knowing whether we care to record it as a specific function or not: JSObject::put and JSObject::putWithAttributes. (There may be one or two more cases inside WebCore -- I skimmed that.) 

To me, it's a shame to change so many callers, and to add two branches to all puts, just to make things automatic for these two callers.

I think I'd prefer to see JSObject::put and JSObject::putWithAttributes test for whether a property is a function, and pass an optional, defaulted to null, parameter to putDirect for "specificFunction".

+        size_t get(const UString::Rep* rep, unsigned& attributes, JSFunction*& specificFunction);
+        size_t get(const Identifier& propertyName, unsigned& attributes, JSFunction*& specificFunction)

+bool JSObject::getPropertySpecificFunction(ExecState*, const Identifier& propertyName, JSFunction*& specificFunction) const
+{
+    unsigned attributes;
+    if (m_structure->get(propertyName, attributes, specificFunction) != WTF::notFound)
+        return true;
+
+    // This could be a function within the static table? - should probably
+    // also look in the hash?  This currently should not be a problem, since
+    // we've currently always call 'get' first, which should have populated
+    // the normal storage.
+    return false;
+}

These functions are becoming an unwieldy and inefficient way to do things. In the future, we should probably move toward storing a PropertyMapEntry* in a PropertySlot, instead. That way, a single lookup can get you attributes, specificFunction, offset, or whatever else you may need. I'm not going to ask you to do that in this patch, though.

I'm going to say r+ because I don't see any show-stopper problems, but I'd really prefer to see this patch land with "getJSFunction" removed from JSObject::putDirect.
Comment 6 Geoffrey Garen 2009-05-21 20:01:33 PDT
This GC problem just occurred to me:

get_by_id instruction G references Structure S, which points to specificFunction F1.

All objects of type S are garbage collected, so F1 is garbage collected, too. (S is still around, though, since it's referenced by G.)

A new function, F2, is allocated at F1's address.

A new object, O, is allocated, and transitions along a path to Structure S, but this time storing F2 instead of F1.

get_by_id instruction G executes on O. G's structure check passes, since O has Structure S, and G attempts to invoke F1 => CRASH.
Comment 7 Geoffrey Garen 2009-05-21 20:02:50 PDT
Comment on attachment 30561 [details]
two tiny fixes to asserts - remove assert in interpreter, and fix assert in JITStubs.cpp

I should probably say r- until we sort out the GC issue.
Comment 8 Geoffrey Garen 2009-05-21 21:22:38 PDT
> gbarra: ggaren: if there are no objects that refer to that function, then there can be no objects left with that Structure id,
> since to have that structure you must point at that object.
> [8:11pm] ggaren: gbarra: true, but a new object could come along, and transition along a path to that same structure
> [8:11pm] gbarra: ggaren: but that could only happen if the new object was at exactly the same location...
> [8:12pm] gbarra: ggaren: in which case if you did a get by id for that property, you should get exactly the same value!
> [8:12pm] gbarra: the contract still holds, albeit in a weird ghostly way!

I don't completely understand what you were driving at. Maybe we should talk this over in person. But let me try to outline an example scenario more clearly, so we can discuss it:

function MyObject(x, y, f) {
    this.x = 1;
    this.y = 2;
    this.f = f;
}

function callF(myObject) {
    myObject.f();
}

First, a program does a bunch of "callF(new MyObject(1, 2, function f1() {...}))". This creates the following transition chain:

{} -> {x} => {x, y} => {x, y, f=0xAAAAAAAA}

callF specializes a op_get_by_id followed by a op_call for the last structure in the transition chain, {x, y, f=0xAAAAAAAA}.

Second, all MyObject objects are GC'd, along with f1, but the specializations in MyObject and callF keep the above structure transition chain alive.

Third, the program does a "callF(new MyObject(1, 2, function f2() {...}))". Unluckily, f2 is allocated at the exact address f1 was allocated at (0xAAAAAAAA). Because f2 is pointer-equal to f1, your new MyObject follows the transition chain to {x, y, f=0xAAAAAAAA}. callF's structure check matches, and callF executes the hot path for its op_get_by_id + op_call.

However, f2's JITCode is allocated in a different place than f1's. So, when callF tries to use its cached JITCode pointer, it jumps to freed memory. (A similar problem applies to the implicitly cached assumption that the caller matches the callee's arity.)

> gbarra: ggaren: so, I was not entirely keen on passing globalData to all the putDirects, but I think an extra optional 
> parameter may be a bit spooky and magically here.  What do you think to reverting putDirect's external interface, 
> and adding a putDirectFunction() method for adding functions?

I think that would be fine. (Note that the name "putDirectFunction" is currently taken -- I'm sure you can work around that, though.)

(Side note: As you mentioned on IRC, in this patch, op_method_check doesn't retain the structure it checks; it needs to.)
Comment 9 Geoffrey Garen 2009-05-21 22:07:40 PDT
> Second, all MyObject objects are GC'd, along with f1, but the specializations
> in MyObject and callF keep the above structure transition chain alive.

Aha!

I guess f1's destructor would also unlink all optimized calls to f1, eliminating the stale cached call problem.

I'm bummed that this code will be so subtle / fragile, but I guess I don't have a better alternative off-hand.

Two additional things about that, then:

1. Be sure to update the patch to ref/deref the structure used by op_method_check.

2. It's not really appropriate to have a generic name / interface for "specificFunction" -- the Structure doesn't guarantee that an object possesses that specific function -- it just holds onto information shared between a function and op_method_check. I would recommend a name like "cachedMethod" or even "opMethodCheckCache" with a warning comment that says "used by op_method_check only -- an object of this structure is not guaranteed to have this method in this slot".
Comment 10 Gavin Barraclough 2009-05-22 01:26:38 PDT
(In reply to comment #9)
> > Second, all MyObject objects are GC'd, along with f1, but the specializations
> > in MyObject and callF keep the above structure transition chain alive.
> 
> Aha!
> 
> I guess f1's destructor would also unlink all optimized calls to f1,
> eliminating the stale cached call problem.

Yes, exactly.  My patch is not changing how the actually op_call part of this works at all, so if there were a life-cycle problem relating to the call, then it would have to exist in ToT too!

> I'm bummed that this code will be so subtle / fragile, but I guess I don't have
> a better alternative off-hand.
> 
> Two additional things about that, then:
> 
> 1. Be sure to update the patch to ref/deref the structure used by
> op_method_check.
> 
> 2. It's not really appropriate to have a generic name / interface for
> "specificFunction" -- the Structure doesn't guarantee that an object possesses
> that specific function -- it just holds onto information shared between a
> function and op_method_check. I would recommend a name like "cachedMethod" or
> even "opMethodCheckCache" with a warning comment that says "used by
> op_method_check only -- an object of this structure is not guaranteed to have
> this method in this slot".

Agreed, I think we should fix this – though I'd suggest turning the fix around slightly.  I don't think the fact that the Structures may outlive the objects that they represent is inherently a problem.  Fundamentally, everything will still work fine, whether they're reused or not.  However the one thing that seems to me to be really subtle here is the fact that in a scenario similar to the one you describe, the new object allocated over a freed function might *not* be a new function, in which case the name specificFunction is badly misleading, and storing a JSFunction* to it is clearly incorrect.  Instead, I'd suggest simply generalizing the field a little - make it be a JSCell*, called something like  'specificValue' or 'cachedCell' or 'fixedType'.

That way the field is never incorrectly claiming something to be a function when it is not one, and it may be useful for further optimizations in the future – we may come up with other heuristics for caching specific values other than functions.  And I'm really not worried myself about the fact that structures may continue to exist after all the objects that they could validly represent go away.  Either they'll just go away too when the codeblock does, or they might happen to be useful again in the future – in which case - great!
Comment 11 Geoffrey Garen 2009-05-22 11:56:58 PDT
>  However the one thing that
> seems to me to be really subtle here is the fact that in a scenario similar to
> the one you describe, the new object allocated over a freed function might
> *not* be a new function, in which case the name specificFunction is badly
> misleading, and storing a JSFunction* to it is clearly incorrect.  Instead, I'd
> suggest simply generalizing the field a little - make it be a JSCell*, called
> something like  'specificValue' or 'cachedCell' or 'fixedType'.

Agreed.

> And I'm really not worried myself about the fact that
> structures may continue to exist after all the objects that they could validly
> represent go away.  Either they'll just go away too when the codeblock does, or
> they might happen to be useful again in the future – in which case - great!

Yeah, it's definitely a feature that optimized code and structures can stick around and come back to life. I was just worried about the extra wrinkle of optimizing for a specific object -- but I guess op_call has done that all along.
Comment 12 Gavin Barraclough 2009-05-22 16:48:28 PDT
Created attachment 30597 [details]
Fix lack of structure ref', revert putDirect interface (add putDirectFunction).
Comment 13 Geoffrey Garen 2009-05-22 17:06:33 PDT
Comment on attachment 30597 [details]
Fix lack of structure ref', revert putDirect interface (add putDirectFunction).

r=me
Comment 14 Gavin Barraclough 2009-05-22 18:48:51 PDT
Transmitting file data ..............................
Committed revision 44076.