Bug 29021 - Explore allowing pages with unload handlers into the Page Cache
Summary: Explore allowing pages with unload handlers into the Page Cache
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Brady Eidson
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2009-09-07 22:05 PDT by Brady Eidson
Modified: 2010-01-26 07:51 PST (History)
14 users (show)

See Also:


Attachments
Disable the unload handler check on Mac + Windows (4.27 KB, patch)
2009-09-14 20:05 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff
Slightly better preprocessor style this time (4.26 KB, patch)
2009-09-14 20:14 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff
Don't fire unload events when leaving a page if the document has been cached. (1.51 KB, patch)
2009-09-16 23:17 PDT, Brady Eidson
beidson: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Brady Eidson 2009-09-07 22:05:31 PDT
We should explore allowing pages with unload handlers into the Page Cache.

Historically, we decided to discount these pages under the assumption that unload handlers do critical work.  Firefox seemed to follow suite a few years later, as evidenced by https://developer.mozilla.org/En/Using_Firefox_1.5_caching

In recent months, while exploring enhancements we can make to the page cache, it's become clear that *many* pages are discounted from our page cache because of unload handlers.  And - quite frankly - most of them don't do important work, don't do interesting work, or don't even do any work at all.

We've now added the pagehide/pageshow events that Firefox added to help developers work around this page caching limitation.  Also, the landscape of the web has changed quite a bit since 2003.

We should explore whether or not keeping this limitation is worthwhile.  This exploration might be as simple as checking in the one line change that will allow these pages in to the page cache, and watch and see if anything important breaks.

In radar as <rdar://problem/7196485>
Comment 1 Sam Weinig 2009-09-09 18:18:18 PDT
I would prefer if this went behind an ENABLE #define to allow ports to easily choose their behavior while this is still under investigation.  In general, this seems like it could impose a functional regression and one with subtle and hard to diagnose fallout at that, and thus giving flexibility during the exploration seems prudent.
Comment 2 Brady Eidson 2009-09-14 20:05:55 PDT
Created attachment 39585 [details]
Disable the unload handler check on Mac + Windows
Comment 3 Brady Eidson 2009-09-14 20:06:56 PDT
Once this gets checked in, I have a blog post prepared that will discuss the state of the page cache, and put a call out to site/library developers to experiment and comment.
Comment 4 Brady Eidson 2009-09-14 20:14:18 PDT
Created attachment 39586 [details]
Slightly better preprocessor style this time
Comment 5 Brady Eidson 2009-09-14 21:33:52 PDT
Committed r48380: <http://trac.webkit.org/changeset/48380>
Comment 6 Antti Koivisto 2009-09-14 23:15:27 PDT
I think you mean http://trac.webkit.org/changeset/48388
Comment 7 Brady Eidson 2009-09-14 23:27:15 PDT
Yes, I did, and I'm never using bugzilla-tool again!
Comment 8 Eric Seidel 2009-09-15 09:40:22 PDT
(In reply to comment #7)
> Yes, I did, and I'm never using bugzilla-tool again!

Hum.  I've never seen it post the wrong revision number like that, but I would love to find out why!  I'll follow up with you via email and file/fix any necessary bugs.
Comment 9 Brady Eidson 2009-09-15 18:07:56 PDT
Re-opening, as this bug needs to track the possibility of putting this check back in to place.
Comment 10 Brady Eidson 2009-09-16 23:17:15 PDT
Created attachment 39681 [details]
Don't fire unload events when leaving a page if the document has been cached.

As long as we're running this experiment, lets run it right!  Don't fire unload events if the document was put in the Page Cache.
Comment 11 Sam Weinig 2009-09-16 23:29:23 PDT
Comment on attachment 39681 [details]
Don't fire unload events when leaving a page if the document has been cached.

You sure this can't be tested?
Comment 12 Brady Eidson 2009-09-16 23:35:54 PDT
I can't think of a good way right now that wouldn't some *very* ugly hacking in WebCore simply for DRTs benefit.

Perhaps it's my lack of caffiene and sleep.  If I think of one in the morning, I'll add it.
Comment 13 Brady Eidson 2009-09-16 23:37:03 PDT
http://trac.webkit.org/changeset/48456
Comment 14 Jake Archibald 2009-09-17 15:39:37 PDT
You could save the page to the page cache, then fire the unload event and navigate away. If back is clicked, the page will be returned in the state it was BEFORE unload was fired. This gives you the best of both worlds. (this was suggested in Chrome's bug tracker)

If you're going to fire unload when the page is removed from memory (rather than when it is navigated away from) you need to be careful with "Your work is unsaved, are you sure you want to navigate away?" type messages. Applications like Gmail use these. However, the solution above works around this.

Also, (and webkit may do this already) you need to make sure the element that the mouse is currently over has its mouseout event fired. Else you might get sticky hover effects.

It's great that this is being looked into, as many js libraries use the unload event to clean up in IE & prevent the sticky hover issue, breaking the page cache.

Jake.
Comment 15 Paul Sowden 2009-09-21 23:02:49 PDT
Hey, I just read the blog post over on Sufin' Safari (really good btw) and I thought I'd throw out our use case for the onunload event as a data point.

I work at Meebo and we have two pieces of JavaScript that make heavy use of this event, meebo.com and a Meebo chat bar that is syndicated on partner Web sites.  In both of these cases we're using long polling to maintain a connection to our servers to power IM presence for the user.  We use the onunload event to send a "quit" command to the server to log out of IM when the page is navigated away.

For meebo.com (which is a destination IM Web site) we also use the onbeforeunload event to warn the user that navigating from the page will cause their session to be disconnected.  When the user navigates away from the page we rely on the onunload event in order to disconnect the user from their session, and I actually think in this case it would be undesirable for the quick "Page Cache" backwards navigation, as this would imply that we would reconnect you to IM, when I feel like navigating away from this page should probably tear down the page and land you back on a blank slate if you were to navigate back.  Not firing the onunload event is also undesirable as you'll appear to stay online for another minute or two  after you've left before our server times out the session.

For the Meebo chat bar (which is conceptually very similar to Facebook's chat bar) your presence should actually follow you based on whether you're on the page or not so in this case implementing the pageshow/pagehide events rather than the onunload event would be ideal, and I hope that we'll be able to do this in the not too distant future!

I hope these use cases are useful!
Comment 16 Timothy Hatcher 2009-09-22 21:01:56 PDT
On the flipside, meebo.com could use pagehide to disconnect and pageshow to reconnect you. This would let you keep the same IM "windows" open, and all other sate.

That is what I would expect as a meebo.com user. But I guess there are security concerns…
Comment 17 Eric Seidel 2009-10-05 11:02:47 PDT
I assume this can be closed now that it's landed?
Comment 18 Brady Eidson 2009-10-05 11:30:10 PDT
No.

Unfortunately bugzilla doesn't have a great notion of "temporarily change the code in a given manner and track an experiment", but that's what we're doing here.
Comment 19 Eric Seidel 2009-10-05 11:38:34 PDT
Sounds good.  I'm going to clear the r+ flags since our "to-be-committed" queue doesn't understand "temporary change in code to try an experiment" either. :(
Comment 20 Eric Seidel 2009-10-05 11:39:08 PDT
Comment on attachment 39586 [details]
Slightly better preprocessor style this time

Clearing darin's review on this obsolete patch.
Comment 21 Eric Seidel 2009-10-05 11:39:25 PDT
Comment on attachment 39681 [details]
Don't fire unload events when leaving a page if the document has been cached.

Clearing sam's review now that this patch has been landed.
Comment 22 Brady Eidson 2009-10-27 09:43:11 PDT
We haven't seen a rash of bug reports in the WebKit project as we expected we might.  

But we do know:
-Disabling this exemption cause a couple of benchmark performance regressions
-Some important corporate intranet apps rely on the status-quo unload handler behavior, and this change breaks them.
-Other browsers have seen substantial negative effects in analytics/marketshare/usage numbers when they fired unload handlers less often than the status quo.

While the perf. regressions could be explored and fixed, then other two points are pretty damning.

For now I think we're re-enabling this check.
Comment 23 Brady Eidson 2009-10-27 09:45:18 PDT
Re-enabled in http://trac.webkit.org/projects/webkit/changeset/50153

I'll close this bug for now.  If/when we come up with new ideas to improve this situation in the future, I'll file a new bug.
Comment 24 Jake Archibald 2009-10-27 10:00:55 PDT
What about my earlier comment about caching the state of the page before unload? If it's being ignored because the comment should have been made elsewhere, let me know.

"We haven't seen a rash of bug reports in the WebKit project as we expected we
might."

I'm very surprised by this. The latest version of jQuery (1.3.2) detaches ALL event listeners on unload. This should have broken most interactivity on the page added via jQuery.

I've just checked in the jQuery trunk, they've changed this code so it only runs in IE, however that isn't in an official jQuery release yet.

"Other browsers have seen substantial negative effects in
analytics/marketshare/usage numbers when they fired unload handlers less often
than the status quo."

Is this really a deal breaker? Shouldn't the priority be the user experience rather than usage stats? Could you work around this by sending an async request to the page after it's called back from the page cache?

"Some important corporate intranet apps rely on the status-quo unload handler
behavior, and this change breaks them."

Do these also operate in standards mode? Either way, caching the state of the page before unload is called should solve this.

Jake.
Comment 25 Brady Eidson 2009-10-27 10:18:02 PDT
I thought I replied to your suggestion when you originally made it.  Sometimes I fail at using bugzilla correctly  ;)

"You could save the page to the page cache, then fire the unload event and
navigate away. If back is clicked, the page will be returned in the state it
was BEFORE unload was fired."

There's two major problems with this approach.
1 - It's technologically VERY very difficult.  No, not impossible, and we're not afraid of doing difficult things, but this would require a major re-architecturing of both JavaScripteCore and WebCore to accomplish.  A major undertaking that we'd need to conclusively know was worth it.
2 - Loads and unloads need to be balance.  What you're suggesting opens the possibility that unload is fired twice (Or more!) for only a single corresponding load event.  This is bad(tm)  :)

>Is this really a deal breaker? Shouldn't the priority be the user experience rather than usage stats?
It's not only usage stats - it's ALL negative side effects from us not running code that programmers think we'll run.  This whole experiment was about pushing the limits in balancing user experience and fulfilling an API contract with developers.  We learned a lot.  We raised awareness.  This is good for users in the long run.

>Could you work around this by sending an async request to the page after it's called back from the page cache?
I encourage you to follow progress in https://bugs.webkit.org/show_bug.cgi?id=30457 and https://bugs.webkit.org/show_bug.cgi?id=30458.  But without some heuristic manner of determining all of the side effects of running an unload handler (which we most certainly do not have) that still doesn't solve the page cache problem.
Comment 26 Grace Kloba 2009-12-02 20:32:36 PST
Brady, what happen if we call unload as before, which means reverting http://trac.webkit.org/changeset/48456, then we call m_document->dispatchWindowLoadEvent() in CachedFrameBase::restore(). This way, loadEvent and unloadEvent are balanced. Even unload may destroy some critical part, it is better than not cache at all.
Comment 27 Jake Archibald 2009-12-03 01:21:49 PST
It'd be interesting to test, but I predict it would break pages.

It's very rare for an unload handler to be the opposite of a load handler. Load tends to be used to modify elements on the page & start timers, unless people are reverting these things in unload, you'll start getting duplicates appearing on the page.