Bug 125431 - All observable PageLoadState properties should change in an atomic fashion, with properly nested change notifications
Summary: All observable PageLoadState properties should change in an atomic fashion, w...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: mitz
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-12-08 15:28 PST by mitz
Modified: 2013-12-10 18:05 PST (History)
3 users (show)

See Also:


Attachments
Add PageLoadState::commitChangesIfNeeded and Change objects to manage committing changes (33.53 KB, patch)
2013-12-08 15:45 PST, mitz
no flags Details | Formatted Diff | Diff
Add PageLoadState::commitChanges and Transaction objects to manage committing changes (36.10 KB, patch)
2013-12-09 22:15 PST, mitz
andersca: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description mitz 2013-12-08 15:28:34 PST
All observable PageLoadState properties should change in an atomic fashion, with properly nested change notifications
Comment 1 mitz 2013-12-08 15:45:05 PST
Created attachment 218719 [details]
Add PageLoadState::commitChangesIfNeeded and Change objects to manage committing changes
Comment 2 Darin Adler 2013-12-09 10:40:05 PST
Comment on attachment 218719 [details]
Add PageLoadState::commitChangesIfNeeded and Change objects to manage committing changes

View in context: https://bugs.webkit.org/attachment.cgi?id=218719&action=review

The basic approach here seems good. Making sure we don’t forget to commit changes by requiring a change object to make the changes is a clever technique. To me the change object seems like a change delivery deferral object, more like PageGroupLoadDeferrer than anything else. So it’s a little irritating that it’s called “a change”; that doesn’t make it clear that the timing of its destruction is important and observable.

Also kind of sad that the change object doesn’t help us get m_needsCommitting set correctly.

I am not so happy with the unpredictable timing for committing changes when they happen in functions that lack an explicit call to commitChangesIfNeeded. I assume that there is some good reason that this is guaranteed to be OK, but it’s not entirely obvious to me from reading the code. Deferring to the end of the last function that was making a change seems sort of “too far to defer” and yet “not deferred far enough”, but maybe that’s just an unfounded or aesthetic worry.

> Source/WebKit2/UIProcess/PageLoadState.cpp:32
> +    : m_data()

It seems to me that making m_data const is not the right way to express its semantics. Can we leave it non-const and just take care to modify it only in the commit function? I would suggest that instead of using const we give this a name that makes it clear you should not modify it. It is equally incorrect to modify m_pendingData without setting m_needsCommitting as to modify m_data, so I think the const does not help us enough here; I could argue that we need m_pendingData to be const.

If you decide that you like the const trick, there’s a different const trick we can play with m_pendingData, where we make a function that you have to call to get a non-const reference to it, and that call sets m_needsCommitting so you can’t forget to do that.

Maybe m_uncommittedState and m_committedState are good names. And maybe m_needsCommitting could be renamed m_mayHaveUncommittedChanges, because the boolean doesn’t really indicate whether we need committing. Although I suppose it’s like needsDisplay and such so maybe doesn’t need a name change.

> Source/WebKit2/UIProcess/PageLoadState.cpp:34
> +    , m_changeCount(0)

I’m not sure that “change count” is a good name for this because that sounds like a count of the total number of changes to page load state. To me it seems to be a count of the number of page load changes that are currently being done, which the names does not make clear. Maybe “outstanding” or “underway” in the name will help.

> Source/WebKit2/UIProcess/PageLoadState.cpp:37
> +    m_pendingData.state = State::Finished;
> +    m_pendingData.estimatedProgress = 0;

This code could go in a PageLoadState::Data constructor and then we could avoid both the explicit construction of m_data above and avoid these lines of code.

> Source/WebKit2/UIProcess/PageLoadState.cpp:38
> +    *const_cast<Data*>(&m_data) = m_pendingData;

If we keep it, no need to involve a pointer in this type cast:

    const_cast<Data&>(m_data) = m_pendingData;

> Source/WebKit2/UIProcess/PageLoadState.cpp:69
> +void PageLoadState::commitChangesIfNeeded()

I notice that the changes are committed in a particular order. Did we put specific thought into the ordering? If so, please add a brief comment explaining the thinking, including the fact that the “did” ordering is the reverse of the “will”.

> Source/WebKit2/UIProcess/PageLoadState.cpp:76
> +    bool titleChanged = m_pendingData.title != m_data.title;

The other comparisons below have m_data on left and m_pendingData on the right. Would be nice to be consistent.

> Source/WebKit2/UIProcess/PageLoadState.cpp:90
> +    *const_cast<Data*>(&m_data) = m_pendingData;

Same const_cast pointer/reference thing here.

> Source/WebKit2/UIProcess/PageLoadState.cpp:107
>      setState(State::Finished);

Strange that this is done through a function while all the rest is done by setting members of m_pendingData directly.

> Source/WebKit2/UIProcess/PageLoadState.cpp:118
> -    m_pendingAPIRequestURL = String();
> -    m_provisionalURL = String();
> -    m_url = String();
> +    m_pendingData.pendingAPIRequestURL = String();
> +    m_pendingData.provisionalURL = String();
> +    m_pendingData.url = String();
>  
> -    m_unreachableURL = String();
> +    m_pendingData.unreachableURL = String();
>      m_lastUnreachableURL = String();
>  
> -    callObserverCallback(&Observer::willChangeTitle);
> -    m_title = String();
> -    callObserverCallback(&Observer::didChangeTitle);
> -
> -    callObserverCallback(&Observer::willChangeEstimatedProgress);
> -    m_estimatedProgress = 0;
> -    callObserverCallback(&Observer::didChangeEstimatedProgress);
> +    m_pendingData.title = String();
> +
> +    m_pendingData.estimatedProgress = 0;

Maybe PageLoadState::Data should have a reset member function for this.

> Source/WebKit2/UIProcess/PageLoadState.cpp:126
> -String PageLoadState::activeURL() const
> +String PageLoadState::activeURL(const Data& data)

Seems like this could be a member of PageLoadState::Data.

> Source/WebKit2/UIProcess/PageLoadState.cpp:157
>  // Always start progress at initialProgressValue. This helps provide feedback as
>  // soon as a load starts.
>  
>  static const double initialProgressValue = 0.1;

I’d like this better at the top of the file, and I’d like the comment better as a single line.

> Source/WebKit2/UIProcess/PageLoadState.cpp:331
>  void PageLoadState::setState(State state)

This function is not helpful any more. I suggest inlining it at the call sites.

> Source/WebKit2/UIProcess/PageLoadState.cpp:335
> +    m_pendingData.state = state;

Why do we not set m_needsCommitting here? Couldn’t isLoading or activeURL change based on a state change? This question only arises because we left this as a separate function.

> Source/WebKit2/UIProcess/PageLoadState.h:62
> +        WTF_MAKE_NONCOPYABLE(Change);

Since we have a member that is a reference, we don’t need to use this macro. I like the economy of not using it.

> Source/WebKit2/UIProcess/PageLoadState.h:67
> +        Change(Change&& other)
> +            : m_pageLoadState(other.m_pageLoadState)
> +        {
> +        }

I think the compiler will write this move constructor for us, exactly like this, as long as we don’t declare any move or copy constructors. Unless maybe explicitly defining the private constructor inhibits that? I suggest trying without this and I expect we will see that it “just works”.

> Source/WebKit2/UIProcess/PageLoadState.h:75
> +        friend class PageLoadState;

I suspect that this is unneeded. I seem to recall that a nested class always bestows friendship on the class it is nested in.

> Source/WebKit2/UIProcess/PageLoadState.h:89
> +    Change change() { return Change(*this); }

Since the constructor of Change is not explicit, I don’t think we need to invoke the constructor explicitly here. If we do want to be explicit, then I suggest we use the “explicit” keyword when declaring the constructor.

> Source/WebKit2/UIProcess/PageLoadState.h:90
> +    void commitChangesIfNeeded();

This function doesn’t need “if needed” in its name. It always commits changes if they exist and never commits changes that don’t exist, so I think “commit changes” is a good name for it.

> Source/WebKit2/UIProcess/PageLoadState.h:92
> +    void reset(const Change&);

Since the Change object argument is debug-only it would be a little nicer if we could have found an idiom that optimizes the pointer out entirely. Maybe the assertion would be in an inline function and then the function to do the real work would not take the Change object.

> Source/WebKit2/UIProcess/PageLoadState.h:131
> +    void beginChange() { m_changeCount++; }

I’d use pre-increment since that’s more idiomatic in our code (for reasons that are presumably obsolete).

> Source/WebKit2/UIProcess/PageLoadState.h:154
> +    static String activeURL(const Data&);
> +    static double estimatedProgress(const Data&);

I think both of these would be good as members of Data.

> Source/WebKit2/UIProcess/PageLoadState.h:159
> +    String m_lastUnreachableURL;

It’s not obvious that this is part of m_pendingData and needs to participate in the idiom where you need to set m_needsCommitting if you change it. One way to possibly make that clearer would be to have a struct for uncommitted data that derives from the struct for committed data and includes this field.

> Source/WebKit2/UIProcess/PageLoadState.h:162
> +    int m_changeCount;

I suggest considering unsigned instead of int.

> Source/WebKit2/UIProcess/WebPageProxy.cpp:604
> +    auto change = m_pageLoadState.change();
> +
> +    m_pageLoadState.setPendingAPIRequestURL(change, url);

I think it’s unclear that these change objects are RAII objects and people might overlook the importance of their lifetimes.

Should I be concerned that this change will be committed all the way at the end of this loadURL function rather than earlier, say right after this function call? To put this another way, I would be tempted to write this:

    m_pageLoadState.setPendingAPIRequestURL(m_pageLoadState.change(), url);

Is there something wrong with writing it that way in a case like this where there is only a single change?

> Source/WebKit2/UIProcess/WebPageProxy.cpp:688
>      if (!isValid())
>          reattachToWebProcess();
>  
> -    m_pageLoadState.setUnreachableURL(unreachableURL);
> +    auto change = m_pageLoadState.change();
> +
> +    m_pageLoadState.setUnreachableURL(change, unreachableURL);

Strangely, in this function we do the work after calling reattachToWebProcess. Inconsistent with the other cases above.

> Source/WebKit2/UIProcess/WebPageProxy.cpp:2420
> +    const String& pendingAPIRequestURL = m_pageLoadState.pendingAPIRequestURL();
> +    auto change = m_pageLoadState.change();
> +
> +    if (request.url() != pendingAPIRequestURL)
> +        m_pageLoadState.clearPendingAPIRequestURL(change);

I think the pendingAPIRequestURL local variable makes this code harder to read. I suggest not using a local for it.
Comment 3 mitz 2013-12-09 22:15:31 PST
Created attachment 218834 [details]
Add PageLoadState::commitChanges and Transaction objects to manage committing changes
Comment 4 mitz 2013-12-09 22:40:48 PST
(In reply to comment #2)
> (From update of attachment 218719 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=218719&action=review

Thank you for the detailed review. I’ve posted an updated patch which addresses some of your comments. I will also address some of them below:

> The basic approach here seems good. Making sure we don’t forget to commit changes by requiring a change object to make the changes is a clever technique. To me the change object seems like a change delivery deferral object, more like PageGroupLoadDeferrer than anything else. So it’s a little irritating that it’s called “a change”; that doesn’t make it clear that the timing of its destruction is important and observable.

I have renamed it Transaction, even though arguably the semantics don’t fully match.

> Also kind of sad that the change object doesn’t help us get m_needsCommitting set correctly.

I have addressed this by making the mutating function take a Transaction::Token. Constructing a Token, which is now necessary to cause mutation, now sets the flag correctly. Callers can’t construct a Token themselves, only cause it to be constructed as a side effect of calling a mutating function.

> I am not so happy with the unpredictable timing for committing changes when they happen in functions that lack an explicit call to commitChangesIfNeeded. I assume that there is some good reason that this is guaranteed to be OK, but it’s not entirely obvious to me from reading the code. Deferring to the end of the last function that was making a change seems sort of “too far to defer” and yet “not deferred far enough”, but maybe that’s just an unfounded or aesthetic worry.

PageLoadState is for consumption by clients of the WebKit API, so it is mostly important to commit changes before returning to the client, and sometimes before calling out to the client, but otherwise, while executing in the framework, it’s not important to commit changes. By deferring committing changes until we are almost ready to leave framework code, we may be saving ourselves some reentrancy trouble.

We’ll still need to consider how this mechanism interacts with clients that call the API from multiple threads, to the extent that we’re allowing it.

> > Source/WebKit2/UIProcess/PageLoadState.cpp:32
> > +    : m_data()
> 
> It seems to me that making m_data const is not the right way to express its semantics. Can we leave it non-const and just take care to modify it only in the commit function? I would suggest that instead of using const we give this a name that makes it clear you should not modify it. It is equally incorrect to modify m_pendingData without setting m_needsCommitting as to modify m_data, so I think the const does not help us enough here; I could argue that we need m_pendingData to be const.

I dropped the constness.

> If you decide that you like the const trick, there’s a different const trick we can play with m_pendingData, where we make a function that you have to call to get a non-const reference to it, and that call sets m_needsCommitting so you can’t forget to do that.

I didn’t do this, and I addressed changing the flag differently (see above).

> Maybe m_uncommittedState and m_committedState are good names.

I switched to those names.

> And maybe m_needsCommitting could be renamed m_mayHaveUncommittedChanges, because the boolean doesn’t really indicate whether we need committing. Although I suppose it’s like needsDisplay and such so maybe doesn’t need a name change.

I changed to this name anyway.

> 
> > Source/WebKit2/UIProcess/PageLoadState.cpp:34
> > +    , m_changeCount(0)
> 
> I’m not sure that “change count” is a good name for this because that sounds like a count of the total number of changes to page load state. To me it seems to be a count of the number of page load changes that are currently being done, which the names does not make clear. Maybe “outstanding” or “underway” in the name will help.

Renamed to m_outstandingTransactionCount.

> 
> > Source/WebKit2/UIProcess/PageLoadState.cpp:37
> > +    m_pendingData.state = State::Finished;
> > +    m_pendingData.estimatedProgress = 0;
> 
> This code could go in a PageLoadState::Data constructor and then we could avoid both the explicit construction of m_data above and avoid these lines of code.

Moved to the Data constructor.

> I notice that the changes are committed in a particular order. Did we put specific thought into the ordering? If so, please add a brief comment explaining the thinking, including the fact that the “did” ordering is the reverse of the “will”.

The ordering doesn’t matter much, and I haven’t put much thought into it. We may find that we need to change it, but the order should not be part of the API. The “did” being in reverse order to the “will” is a requirement of Cocoa Key-Value Observing (which is our only concrete Observer so far), and I added a comment to the effect.

> > Source/WebKit2/UIProcess/PageLoadState.cpp:76
> > +    bool titleChanged = m_pendingData.title != m_data.title;
> 
> The other comparisons below have m_data on left and m_pendingData on the right. Would be nice to be consistent.

Fixed.

> > Source/WebKit2/UIProcess/PageLoadState.cpp:107
> >      setState(State::Finished);
> 
> Strange that this is done through a function while all the rest is done by setting members of m_pendingData directly.

Fixed.

> 
> > Source/WebKit2/UIProcess/PageLoadState.cpp:118
> > -    m_pendingAPIRequestURL = String();
> > -    m_provisionalURL = String();
> > -    m_url = String();
> > +    m_pendingData.pendingAPIRequestURL = String();
> > +    m_pendingData.provisionalURL = String();
> > +    m_pendingData.url = String();
> >  
> > -    m_unreachableURL = String();
> > +    m_pendingData.unreachableURL = String();
> >      m_lastUnreachableURL = String();
> >  
> > -    callObserverCallback(&Observer::willChangeTitle);
> > -    m_title = String();
> > -    callObserverCallback(&Observer::didChangeTitle);
> > -
> > -    callObserverCallback(&Observer::willChangeEstimatedProgress);
> > -    m_estimatedProgress = 0;
> > -    callObserverCallback(&Observer::didChangeEstimatedProgress);
> > +    m_pendingData.title = String();
> > +
> > +    m_pendingData.estimatedProgress = 0;
> 
> Maybe PageLoadState::Data should have a reset member function for this.

I think of Data as a simple way to get two instances of the data members we need to duplicate. I don’t want to put logic into it.

> > Source/WebKit2/UIProcess/PageLoadState.cpp:126
> > -String PageLoadState::activeURL() const
> > +String PageLoadState::activeURL(const Data& data)
> 
> Seems like this could be a member of PageLoadState::Data.

See above.

> > Source/WebKit2/UIProcess/PageLoadState.cpp:157
> >  // Always start progress at initialProgressValue. This helps provide feedback as
> >  // soon as a load starts.
> >  
> >  static const double initialProgressValue = 0.1;
> 
> I’d like this better at the top of the file, and I’d like the comment better as a single line.

Done.

> 
> > Source/WebKit2/UIProcess/PageLoadState.cpp:331
> >  void PageLoadState::setState(State state)
> 
> This function is not helpful any more. I suggest inlining it at the call sites.

Done.

> > Source/WebKit2/UIProcess/PageLoadState.h:62
> > +        WTF_MAKE_NONCOPYABLE(Change);
> 
> Since we have a member that is a reference, we don’t need to use this macro. I like the economy of not using it.

Forgot to remove this before posting the patch. Maybe I’ll remember to do it before committing it!

> 
> > Source/WebKit2/UIProcess/PageLoadState.h:67
> > +        Change(Change&& other)
> > +            : m_pageLoadState(other.m_pageLoadState)
> > +        {
> > +        }
> 
> I think the compiler will write this move constructor for us, exactly like this, as long as we don’t declare any move or copy constructors. Unless maybe explicitly defining the private constructor inhibits that? I suggest trying without this and I expect we will see that it “just works”.

It didn’t work, at least not with the WTF_MAKE_NONCOPYABLE in place.

> > Source/WebKit2/UIProcess/PageLoadState.h:75
> > +        friend class PageLoadState;
> 
> I suspect that this is unneeded. I seem to recall that a nested class always bestows friendship on the class it is nested in.

It is needed.

> > Source/WebKit2/UIProcess/PageLoadState.h:89
> > +    Change change() { return Change(*this); }
> 
> Since the constructor of Change is not explicit, I don’t think we need to invoke the constructor explicitly here. If we do want to be explicit, then I suggest we use the “explicit” keyword when declaring the constructor.

Changed to return *this.

> 
> > Source/WebKit2/UIProcess/PageLoadState.h:90
> > +    void commitChangesIfNeeded();
> 
> This function doesn’t need “if needed” in its name. It always commits changes if they exist and never commits changes that don’t exist, so I think “commit changes” is a good name for it.

Removed ifNeeded from the name.

> > Source/WebKit2/UIProcess/PageLoadState.h:92
> > +    void reset(const Change&);
> 
> Since the Change object argument is debug-only it would be a little nicer if we could have found an idiom that optimizes the pointer out entirely. Maybe the assertion would be in an inline function and then the function to do the real work would not take the Change object.

The mutating functions now take a Token, which have no data members or virtual functions when assertions are disabled. I don’t know whether that allows the compiler to not pass a pointer, but I’d like to think it does.

> > Source/WebKit2/UIProcess/PageLoadState.h:131
> > +    void beginChange() { m_changeCount++; }
> 
> I’d use pre-increment since that’s more idiomatic in our code (for reasons that are presumably obsolete).

Done!

> > Source/WebKit2/UIProcess/PageLoadState.h:154
> > +    static String activeURL(const Data&);
> > +    static double estimatedProgress(const Data&);
> 
> I think both of these would be good as members of Data.

I disagree.

> > Source/WebKit2/UIProcess/PageLoadState.h:159
> > +    String m_lastUnreachableURL;
> 
> It’s not obvious that this is part of m_pendingData and needs to participate in the idiom where you need to set m_needsCommitting if you change it.

It’s not obvious because it’s not true, as far as I understand.

> > Source/WebKit2/UIProcess/PageLoadState.h:162
> > +    int m_changeCount;
> 
> I suggest considering unsigned instead of int.

Done.

> > Source/WebKit2/UIProcess/WebPageProxy.cpp:604
> > +    auto change = m_pageLoadState.change();
> > +
> > +    m_pageLoadState.setPendingAPIRequestURL(change, url);
> 
> I think it’s unclear that these change objects are RAII objects and people might overlook the importance of their lifetimes.

It’s possible. Hopefully people will look at existing code and at least copy it if they don’t understand why it is the way it is.

> Should I be concerned that this change will be committed all the way at the end of this loadURL function rather than earlier, say right after this function call? To put this another way, I would be tempted to write this:
> 
>     m_pageLoadState.setPendingAPIRequestURL(m_pageLoadState.change(), url);
> 
> Is there something wrong with writing it that way in a case like this where there is only a single change?

No, but I’ve deliberately avoided this pattern so that people aren’t tempted to do it in functions that do multiple changes. I wish there was a way to prevent passing a temporary Change (now: Transaction::Token).

> 
> > Source/WebKit2/UIProcess/WebPageProxy.cpp:688
> >      if (!isValid())
> >          reattachToWebProcess();
> >  
> > -    m_pageLoadState.setUnreachableURL(unreachableURL);
> > +    auto change = m_pageLoadState.change();
> > +
> > +    m_pageLoadState.setUnreachableURL(change, unreachableURL);
> 
> Strangely, in this function we do the work after calling reattachToWebProcess. Inconsistent with the other cases above.

The change log mentions how this addresses a FIXME. Maybe the inconsistency is more important than the FIXME. Maybe we can rely more on KVO in our Cocoa API and remove some of the delegate methods so that this ordering issue becomes moot (for Cocoa, at least).

I think it would be interesting to see how far we can go with an API that doesn’t have a callback like “the provisional request was redirected” because clients can just observe that the provisional URL has changed. Is that not enough information for a client?

> > Source/WebKit2/UIProcess/WebPageProxy.cpp:2420
> > +    const String& pendingAPIRequestURL = m_pageLoadState.pendingAPIRequestURL();
> > +    auto change = m_pageLoadState.change();
> > +
> > +    if (request.url() != pendingAPIRequestURL)
> > +        m_pageLoadState.clearPendingAPIRequestURL(change);
> 
> I think the pendingAPIRequestURL local variable makes this code harder to read. I suggest not using a local for it.

Removed. I made this change back when I had a draconian version of the PageLoadState API that disallowed reading whenever there were outstanding Changes.
Comment 5 Anders Carlsson 2013-12-10 17:39:20 PST
Comment on attachment 218834 [details]
Add PageLoadState::commitChanges and Transaction objects to manage committing changes

View in context: https://bugs.webkit.org/attachment.cgi?id=218834&action=review

> Source/WebKit2/ChangeLog:26
> +        (WebKit::PageLoadState::endTransaction): Added. Calles when a Transaction is desctructed.

Desctructed.

> Source/WebKit2/ChangeLog:69
> +        and made its consturctor initialize state and estimatedProgress.

Consturctor.

> Source/WebKit2/UIProcess/PageLoadState.h:67
> +        Transaction(Transaction&& other)
> +            : m_pageLoadState(other.m_pageLoadState)
> +        {
> +        }

I think this move constructor should “empty out” the old object. 

Otherwise you could get into situations where a moved-from transaction object still invokes endTransaction(). One way to do this is to have a boolean flag. I would just make m_pageLoadState a pointer instead and null it out when moving and then check that it’s not null before calling endTransaction inside the Transaction destructor.

> Source/WebKit2/UIProcess/PageLoadState.h:77
> +        Transaction(PageLoadState& pageLoadState)

I think you should make this constructor explicit.
Comment 6 mitz 2013-12-10 18:05:14 PST
Fixed in <http://trac.webkit.org/r160405>.