Bug 117390

Summary: Introducing StackIterator class.
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: JavaScriptCoreAssignee: Mark Lam <mark.lam>
Status: RESOLVED FIXED    
Severity: Normal CC: fpizlo, ggaren, mhahnenberg, msaboff, oliver
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
the patch.
ggaren: review+
revised patch.
none
patch #3
ggaren: review+
Fixed cause of layout test failures.
none
Delta between new and old rolled out patch.
none
revised patch
none
revised patch as a delta from rolled out patch
ggaren: review-
C++ idiomatic patch
none
C++ idiomatic patch as a delta from rolled out patch ggaren: review+

Description Mark Lam 2013-06-09 23:31:04 PDT
Unify how we iterate the JS stack and remove all the special purpose stack iteration / accessor functions in Interpreter.cpp.
Comment 1 Mark Lam 2013-06-09 23:38:01 PDT
Created attachment 204136 [details]
the patch.
Comment 2 Mark Lam 2013-06-09 23:46:10 PDT
Comment on attachment 204136 [details]
the patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=204136&action=review

> Source/JavaScriptCore/interpreter/StackIterator.cpp:308
> +    // this -> The callee; this is either an inlined callee in which case it already has
> +    //    a pointer to the true caller. Otherwise it contains current PC in the machine
> +    //    caller.
> +    //
> +    // machineCaller -> The caller according to the machine, which may be zero or
> +    //    more frames above the true caller due to inlining.

The above comments are stale, and came from CallFrame::trueCallerFrame().  I'll remove them.

> Source/JavaScriptCore/profiler/ProfileNode.h:133
> +        typedef Vector<RefPtr<ProfileNode> >::const_iterator StackIterator;

I made this private to avoid a name conflict with the new StackIterator class.
Comment 3 Geoffrey Garen 2013-06-10 08:23:52 PDT
Comment on attachment 204136 [details]
the patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=204136&action=review

r=me, but I think this needs a little re-working at the interface level before landing.

> Source/JavaScriptCore/ChangeLog:24
> +        The StackIterator implements a Frame class that inherits from CallFrame.
> +        The StackIterator::Frame serves as reader of the CallFrame that makes
> +        it easier to access information about the frame. The StackIterator::Frame
> +        only adds functions, and no additional data fields.

I like this design. In the past, you've talked about trying to hide or factor out some of CallFrame's interfaces. In future, I think you might be able to use this design for that purpose, too.

> Source/JavaScriptCore/ChangeLog:80
> +          There are 2 versions that can be use as follows:

Grammar: Should be "used".

> Source/JavaScriptCore/ChangeLog:87
> +          2. When you have a CallFrame*, you can print its content
> +             as follows:
> +                 StackIterator::Frame::print(callFrame, indentLevel);

The C++/object-oriented way to do this is to construct a StackIterator and then call iter->print(). Why this additional interface? I don't see it used.

> Source/JavaScriptCore/jsc.cpp:332
> +    for (iter.begin(); !iter.end(); iter++, i++)

This iteration code is not idiomatic C++. It should be something like:
    for (StackIterator iter = exec->begin(); iter != exec->end(); ++iter, ++i)
or, slightly less idiomatic but still an improvement:
    for (StackIterator iter = StackIterator::begin(exec); iter != StackIterator::end(exec); ++iter, ++i)

> Source/JavaScriptCore/API/JSContextRef.cpp:216
> +        // If callee is unknown, but we've not added any frame yet, we should
> +        // still add the frame, because something called us, and gave us arguments.
> +        if (!callee && i)
> +            break;

Would be better to explain when this can happen and why.

> Source/JavaScriptCore/interpreter/StackIterator.cpp:67
> +    seekFrameForFunction(calleeFunctionObj);

We usually use the word "find".

> Source/JavaScriptCore/interpreter/StackIterator.h:73
> +        static Frame* castFromCallFrame(CallFrame* f) { return reinterpret_cast<Frame*>(f); }

I think this would read slightly better as a constructor function. That way, you can hide the detail that this is a simple pointer cast.

> Source/JavaScriptCore/interpreter/StackIterator.h:74
> +        static CallFrame* castToCallFrame(Frame* f) { return reinterpret_cast<CallFrame*>(f); }

Ditto here, as a regular accessor function.

> Source/JavaScriptCore/interpreter/StackIterator.h:85
> +    JS_EXPORT_PRIVATE void begin(FrameFilter = nullptr);
> +    JS_EXPORT_PRIVATE void beginAt(CallFrame* startFrame, FrameFilter = nullptr);
> +    JS_EXPORT_PRIVATE void beginAt(JSFunction* calleeFunctionObj, FrameFilter = nullptr);

These should be constructor functions.

> Source/JavaScriptCore/interpreter/StackIterator.h:92
> +    void operator++(int) { gotoNextFrame(); }

I don't think we want to support post-increment. For this kind of iterator, it would be slow.
Comment 4 Mark Lam 2013-06-10 18:14:19 PDT
(In reply to comment #3)
> > Source/JavaScriptCore/ChangeLog:87
> > +          2. When you have a CallFrame*, you can print its content
> > +             as follows:
> > +                 StackIterator::Frame::print(callFrame, indentLevel);
> 
> The C++/object-oriented way to do this is to construct a StackIterator and then call iter->print(). Why this additional interface? I don't see it used.

This 2nd form is a convenience function that we can call from a debugger.  I've move it into the global namespace and renamed it to debugPrintCallFrame().
 
> > Source/JavaScriptCore/jsc.cpp:332
> > +    for (iter.begin(); !iter.end(); iter++, i++)
> 
> This iteration code is not idiomatic C++. It should be something like:
>     for (StackIterator iter = exec->begin(); iter != exec->end(); ++iter, ++i)

I've adopted this idiom.

> > Source/JavaScriptCore/API/JSContextRef.cpp:216
> > +        // If callee is unknown, but we've not added any frame yet, we should
> > +        // still add the frame, because something called us, and gave us arguments.
> > +        if (!callee && i)
> > +            break;
> 
> Would be better to explain when this can happen and why.

As discussed offline, this was a pre-existing comment that I altered slightly.  Leaving in place.

The rest has been fixed.  A new patch will be uploaded shortly.
Comment 5 Mark Lam 2013-06-10 18:15:31 PDT
Created attachment 204272 [details]
revised patch.
Comment 6 Mark Lam 2013-06-10 18:17:00 PDT
Comment on attachment 204272 [details]
revised patch.

Oops.  Need to update build files to add a new header file.  Also need to add the new file.
Comment 7 Mark Lam 2013-06-10 18:22:27 PDT
Created attachment 204274 [details]
patch #3
Comment 8 Geoffrey Garen 2013-06-10 18:59:24 PDT
Comment on attachment 204274 [details]
patch #3

View in context: https://bugs.webkit.org/attachment.cgi?id=204274&action=review

r=me

> Source/JavaScriptCore/interpreter/CallFrame.h:280
> +        JS_EXPORT_PRIVATE StackIterator beginAt(JSFunction* calleeFunctionObj, StackIterator::FrameFilter = nullptr);

In idiomatic C++, a function that finds a location in a list and returns an iterator is named "find".

Since our plan is to switch to a callback style and remove this interface soon, I guess it's OK to be expedient leave this as-is for now.
Comment 9 Mark Lam 2013-06-10 21:49:27 PDT
Landed in r151420: <http://trac.webkit.org/changeset/151420>.
Comment 10 Mark Lam 2013-06-12 08:04:55 PDT
Rolled out in r151500.
Comment 11 Mark Lam 2013-06-13 18:54:23 PDT
Created attachment 204661 [details]
Fixed cause of layout test failures.
Comment 12 Mark Lam 2013-06-13 21:46:00 PDT
Created attachment 204671 [details]
Delta between new and old rolled out patch.
Comment 13 Geoffrey Garen 2013-06-14 07:28:13 PDT
Comment on attachment 204671 [details]
Delta between new and old rolled out patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=204671&action=review

> Source/JavaScriptCore/ChangeLog:36
> +        - Changed the implementation of StackIterator::end() to return a stub
> +          empty frame instead of nullptr. This ensures that we can never
> +          segfault by deref'ing the iterator even when it has reached end().

I don't think this is an improvement. I don't think we can possibly protect clients from all of the invalid pointer accesses they may write, and pretending that we can is just confusing. (For example, you don't protect me from dereferencing ++end().) And it makes the design more complex.

> Source/JavaScriptCore/interpreter/StackIteratorPrivate.h:57
> +    void begin(FrameFilter = nullptr);
> +    void beginAt(CallFrame* startFrame, FrameFilter = nullptr);
> +    void beginAt(JSFunction* calleeFunctionObj, FrameFilter = nullptr);

Can these be constructor functions?
Comment 14 Mark Lam 2013-06-14 16:08:49 PDT
Created attachment 204742 [details]
revised patch
Comment 15 Mark Lam 2013-06-14 16:10:01 PDT
Created attachment 204743 [details]
revised patch as a delta from rolled out patch
Comment 16 Mark Lam 2013-06-14 16:14:19 PDT
(In reply to comment #13)
> (From update of attachment 204671 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=204671&action=review
> 
> > Source/JavaScriptCore/ChangeLog:36
> > +        - Changed the implementation of StackIterator::end() to return a stub
> > +          empty frame instead of nullptr. This ensures that we can never
> > +          segfault by deref'ing the iterator even when it has reached end().
> 
> I don't think this is an improvement. I don't think we can possibly protect clients from all of the invalid pointer accesses they may write, and pretending that we can is just confusing. (For example, you don't protect me from dereferencing ++end().) And it makes the design more complex.

OK, I've changed it to work this way.

> > Source/JavaScriptCore/interpreter/StackIteratorPrivate.h:57
> > +    void begin(FrameFilter = nullptr);
> > +    void beginAt(CallFrame* startFrame, FrameFilter = nullptr);
> > +    void beginAt(JSFunction* calleeFunctionObj, FrameFilter = nullptr);
> 
> Can these be constructor functions?

Sorry that I didn't catch on the first time you brought this up.  It's fixed now.
Comment 17 Geoffrey Garen 2013-06-16 12:41:02 PDT
Comment on attachment 204743 [details]
revised patch as a delta from rolled out patch

View in context: https://bugs.webkit.org/attachment.cgi?id=204743&action=review

exec->find(exec) is a strange design pattern. We're not looking for exec -- we already have it. exec->begin() would be better.

> Source/JavaScriptCore/interpreter/CallFrame.cpp:102
>      if (!this)

When can 'this' be NULL? This is a strange design pattern. Usually, invoking a member function on a null pointer is not allowed.

> Source/JavaScriptCore/interpreter/CallFrame.cpp:109
>      if (!this)

Ditto.

> Source/JavaScriptCore/interpreter/CallFrame.cpp:116
> +    if (!this)

Ditto.

> Source/JavaScriptCore/interpreter/Interpreter.cpp:413
> +    StackIterator iter = callFrame->find(callFrame);
> +    ++iter;

Don't we need to check for end() here?

> Source/JavaScriptCore/interpreter/StackIterator.cpp:48
> +    if (m_vm)

When can vm be NULL?

> Source/JavaScriptCore/interpreter/StackIterator.cpp:57
> +    if (m_vm)

Ditto.

> Source/JavaScriptCore/interpreter/StackIterator.cpp:66
> +    if (m_vm) {

Ditto.

> Source/JavaScriptCore/interpreter/StackIterator.cpp:76
> +    if (!frame)

When can frame be NULL?

> Source/JavaScriptCore/runtime/InitializeThreading.cpp:71
> +    StackIterator::initializeStatics();

This is no longer needed.

> Source/JavaScriptCore/runtime/JSFunction.cpp:208
>      ++iter;

Don't we need to check for end() here?

> Source/JavaScriptCore/runtime/JSGlobalObjectFunctions.cpp:716
>      ++iter;

Ditto.

> Source/JavaScriptCore/runtime/JSGlobalObjectFunctions.cpp:734
>      ++iter;

Ditto.

> Source/JavaScriptCore/runtime/ObjectConstructor.cpp:144
>      ++iter;

Ditto.
Comment 18 Mark Lam 2013-06-16 14:17:07 PDT
Comment on attachment 204743 [details]
revised patch as a delta from rolled out patch

View in context: https://bugs.webkit.org/attachment.cgi?id=204743&action=review

>> Source/JavaScriptCore/interpreter/CallFrame.cpp:102
>>      if (!this)
> 
> When can 'this' be NULL? This is a strange design pattern. Usually, invoking a member function on a null pointer is not allowed.

This is similar in the way that given a CallFrame* callFrame, we can always invoke callFrame->removeHostCallFrameFlag() on the callFrame pointer even if the pointer is null.  Another example of this approach was when you recommended that we instantiated a Vector<StackFrame> of size 2 in JSXMLHttplRequestCustom.cpp's JSXMLHttpRequest::send(), just so we don't have to check if the caller frame pointer is null.  The result is less verbose code in all the call sites that uses it.

Similarly here, I intended for us to always be able to provide a StackIterator even if the callFrame provided is null.  The intent is that the call sites can uniformly just get a StackIterator, and use it as if the CallFrame pointer is not null, and the StackIterator will behave correctly.  Again, this is for the purpose of reducing the verbosity at all the call sites.  All the checks below for a null frame* and null vm* results from this design goal.  To go with this idiom, StackOperator::operator++() is a no-op once we've reached end().  This is also why, in my previous patch, the StackIterator will point to a stub end "frame" when it reaches end(), but that is now removed.

The alternative to this approach is to change most of the call sites to explicitly check if CallFrame* is null (if not already guaranteed) before creating an iterator.   From surveying the call sites, it looks like the guarantee of a non-null exec is there in most cases.  Hence, this is not such a big deal right now, assuming you let me keep the part of calls to operator++() being a no-op once the iterator reaches end().  Otherwise, the client code will become slightly more cluttered with null checks. 

Do you still prefer to go with putting the burden of explicit null checks on the client sites?

>> Source/JavaScriptCore/interpreter/CallFrame.cpp:109
>>      if (!this)
> 
> Ditto.

See above.

>> Source/JavaScriptCore/interpreter/CallFrame.cpp:116
>> +    if (!this)
> 
> Ditto.

See above.

>> Source/JavaScriptCore/interpreter/Interpreter.cpp:413
>> +    ++iter;
> 
> Don't we need to check for end() here?

No because the iterator ensures that we can never iterate pass end().  After it reaches end(), operator++() is a nop.  That said, I do have to check for end() for the line below though.

>> Source/JavaScriptCore/interpreter/StackIterator.cpp:48
>> +    if (m_vm)
> 
> When can vm be NULL?

See above.

>> Source/JavaScriptCore/interpreter/StackIterator.cpp:57
>> +    if (m_vm)
> 
> Ditto.

See above.

>> Source/JavaScriptCore/interpreter/StackIterator.cpp:66
>> +    if (m_vm) {
> 
> Ditto.

See above.

>> Source/JavaScriptCore/interpreter/StackIterator.cpp:76
>> +    if (!frame)
> 
> When can frame be NULL?

See above.

>> Source/JavaScriptCore/runtime/InitializeThreading.cpp:71
>> +    StackIterator::initializeStatics();
> 
> This is no longer needed.

This is stale from my companion workspace that I used to generate this minimal delta patch.  I forgot to remove it here.  This was already removed in the actual patch.

>> Source/JavaScriptCore/runtime/JSFunction.cpp:208
>>      ++iter;
> 
> Don't we need to check for end() here?

No.  See above.

>> Source/JavaScriptCore/runtime/JSGlobalObjectFunctions.cpp:716
>>      ++iter;
> 
> Ditto.

See above.

>> Source/JavaScriptCore/runtime/JSGlobalObjectFunctions.cpp:734
>>      ++iter;
> 
> Ditto.

See above.

>> Source/JavaScriptCore/runtime/ObjectConstructor.cpp:144
>>      ++iter;
> 
> Ditto.

See above.
Comment 19 Mark Lam 2013-06-16 14:23:53 PDT
(In reply to comment #17)
> (From update of attachment 204743 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=204743&action=review
> 
> exec->find(exec) is a strange design pattern. We're not looking for exec -- we already have it. exec->begin() would be better.

It is not correct that exec->begin() is adequate.  This is because exec is not always the same as vm.topCallFrame.  This faulty assumption was what resulted in the previous batch of layout test failures.

As stated in my ChangeLog, …

          There are 3 types of users of the StackIterator. We need to cater
          to each of these as follows:

          1. Users who want a stack trace
             - Here we need to start iterating from vm.topCallFrame.
             - To do this, create the StackIterator as follows:
                  StackIterator iter = exec->begin();

          2. Users who want their caller frame
             - Here we need to start iterating from the specified CallFrame*.
             - To do this, create the StackIterator as follows:
                  StackIterator iter = exec->find(callFrame);

          3. Users who want the frame of a specific JSFunction object
             - Here we need to start iterating from vm.topCallFrame and find
               the frame that has a callee matching the specified JSFunction*.
             - To do this, create the StackIterator as follows:
                  StackIterator iter = exec->find(functionObj);

we need to distinguish between exec->begin() (which starts iterating from vm.topCallFrame) and exec->find(exec) (which starts iterating from exec).  Currently, exec->find(callFrame) is similar in form as exec->find(functionObj).

I can change all the find's to be overloaded begin's if you think that that is better.  Previously, I had them named beginAt(), but you said find() is the proper idiom, though I may have misunderstood where you were going with that comment.  Please advise as to how you want me to proceed.
Comment 20 Geoffrey Garen 2013-06-16 15:36:22 PDT
> The alternative to this approach is to change most of the call sites to explicitly check if CallFrame* is null (if not already guaranteed) before creating an iterator.

Which call sites? When is this not already guaranteed?

> Hence, this is not such a big deal right now, assuming you let me keep the part of calls to operator++() being a no-op once the iterator reaches end().  Otherwise, the client code will become slightly more cluttered with null checks. 

Ah, I guess you're conflating "null check" and "end() check" based on your knowledge of the implementation detail that end() is implemented as null?

Let me re-state how idiomatic C++ iterators work:

- They start at begin()
- They end one prior to end()
- It is an error to dereference end()
- It is an error to ++ end()
- The operation to look for an item is named find(), and it returns end() if the item could not be found.

This iterator should be idiomatic.

> It is not correct that exec->begin() is adequate.  This is because exec is not always the same as vm.topCallFrame

Clients that want to iterate from exec should call exec->begin(). Clients that want to iterate from vm.topCallFrame should call vm.topCallFrame->begin(). Clients that want to iterate from the first instance of x relative to exec should call exec->find(x).
Comment 21 Mark Lam 2013-06-16 16:54:54 PDT
(In reply to comment #20)
> > The alternative to this approach is to change most of the call sites to explicitly check if CallFrame* is null (if not already guaranteed) before creating an iterator.
> 
> Which call sites? When is this not already guaranteed?

It's not true that most of the sires are not guaranteed.  I wrote that before I did the survey of call sites, which led me to my conclusion that it may not be a big deal to do the end() checks in the call sites.  There were 1 or 2 call sites that I wasn't sure about.  But I'll double check.

> > Hence, this is not such a big deal right now, assuming you let me keep the part of calls to operator++() being a no-op once the iterator reaches end().  Otherwise, the client code will become slightly more cluttered with null checks. 
> 
> Ah, I guess you're conflating "null check" and "end() check" based on your knowledge of the implementation detail that end() is implemented as null?

I did conflate my terminology of the null check with the end() check, but I was not confused about the 2 necessarily being the same at the logical level.  I just think of the problem in a different way (sort of like CallFrame* as a smart handle where null is a legal value, and not a raw pointer) and solved it accordingly.  But since that is not idiomatic C++ as you've pointed out, it's moot.  So, moving on.

> Let me re-state how idiomatic C++ iterators work:
> 
> - They start at begin()
> - They end one prior to end()
> - It is an error to dereference end()
> - It is an error to ++ end()
> - The operation to look for an item is named find(), and it returns end() if the item could not be found.
> 
> This iterator should be idiomatic.

OK.  I will make it so.

> > It is not correct that exec->begin() is adequate.  This is because exec is not always the same as vm.topCallFrame
> 
> Clients that want to iterate from exec should call exec->begin(). Clients that want to iterate from vm.topCallFrame should call vm.topCallFrame->begin(). Clients that want to iterate from the first instance of x relative to exec should call exec->find(x).

Sounds good.  I will apply this pattern.
Comment 22 Mark Lam 2013-06-17 09:49:41 PDT
Created attachment 204830 [details]
C++ idiomatic patch
Comment 23 Mark Lam 2013-06-17 09:51:10 PDT
Created attachment 204831 [details]
C++ idiomatic patch as a delta from rolled out patch
Comment 24 Mark Lam 2013-06-17 09:58:51 PDT
Comment on attachment 204831 [details]
C++ idiomatic patch as a delta from rolled out patch

View in context: https://bugs.webkit.org/attachment.cgi?id=204831&action=review

> Source/JavaScriptCore/interpreter/Interpreter.cpp:413
> +    ++iter;

Geoff, previously, you had asked if we need to do an end() check here before iterating.  The answer is no.  Since we're working on the premise that the callFrame must be valid in order to instantiate the iterator, and we have just instantiated the iterator, we are guaranteed that at least the first ++iter is legal and valid.
Comment 25 Geoffrey Garen 2013-06-17 11:31:14 PDT
Comment on attachment 204831 [details]
C++ idiomatic patch as a delta from rolled out patch

View in context: https://bugs.webkit.org/attachment.cgi?id=204831&action=review

r=me, but please fix the missing end() check before landing.

> Source/JavaScriptCore/runtime/JSFunction.cpp:208
> +    StackIterator iter = exec->find(functionObj, skipOverBoundFunctions);
>      ++iter;

I think we need an end() check here: functionObj may not be on the stack.
Comment 26 Mark Lam 2013-06-17 11:55:33 PDT
(In reply to comment #25)
> (From update of attachment 204831 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=204831&action=review
> 
> r=me, but please fix the missing end() check before landing.
> 
> > Source/JavaScriptCore/runtime/JSFunction.cpp:208
> > +    StackIterator iter = exec->find(functionObj, skipOverBoundFunctions);
> >      ++iter;
> 
> I think we need an end() check here: functionObj may not be on the stack.

Thanks for catching that.  It's fixed.

Landed in r151651: <http://trac.webkit.org/changeset/151651>.