Bug 69774

Summary: REGRESSION: High frequency memory warnings cause Safari to hog the CPU doing useless garbage collection
Product: WebKit Reporter: Michael Saboff <msaboff>
Component: WebCore Misc.Assignee: Michael Saboff <msaboff>
Status: RESOLVED FIXED    
Severity: Normal CC: aroben, ggaren, ossy
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Mac   
OS: OS X 10.7   
Attachments:
Description Flags
Proposed patch
none
Updated patch with additional comment for the choice of 5 seconds
ggaren: review-
Patch that cancels event handlers and reinstalls after delay ggaren: review+

Description Michael Saboff 2011-10-10 11:50:18 PDT
In a low memory situation, Mac OS  can start sending a flurry of low memory events. The current memory pressure handling code will always try to free up memory. This can end up with WebKit processes consuming large amounts of CPU bring to free up memory without any real benefit.
Comment 1 Michael Saboff 2011-10-10 12:04:14 PDT
Created attachment 110379 [details]
Proposed patch

This patch throttles low memory events to no more frequent than once every 5 seconds.  It will drop low memory events, but since the original problem is too many events we'll respond when needed later.
Comment 2 Adam Roben (:aroben) 2011-10-10 13:48:57 PDT
Comment on attachment 110379 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=110379&action=review

> Source/WebCore/platform/mac/MemoryPressureHandlerMac.mm:55
> +static const time_t s_secondsBetweenMemoryCleanup = 5;

How was this value chosen? It would be useful to have a comment explaining so that people looking at this in the future know how to evaluate whether this value is still reasonable.
Comment 3 Filip Pizlo 2011-10-10 13:52:59 PDT
Comment on attachment 110379 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=110379&action=review

>> Source/WebCore/platform/mac/MemoryPressureHandlerMac.mm:55
>> +static const time_t s_secondsBetweenMemoryCleanup = 5;
> 
> How was this value chosen? It would be useful to have a comment explaining so that people looking at this in the future know how to evaluate whether this value is still reasonable.

I'd recomment putting stuff like this into runtime/Heuristics.h.  Then you can name it and also automatically get a story for tuning it.
Comment 4 Michael Saboff 2011-10-10 14:11:54 PDT
Created attachment 110402 [details]
Updated patch with additional comment for the choice of 5 seconds

Given that this change is in WebCore and that the constant is fairly simple, I didn't feel that it made sense to place it in JSC/runtime/Heuristics.h.
Comment 5 Geoffrey Garen 2011-10-10 16:29:38 PDT
Given my understanding of how the warning works on OS X, even with this patch, under memory pressure, the kernel will still constantly call into WebKit's GCD source. It's an improvement to return early, but we could do even better, and avoid the constant calls into WebKit's GCD source as well.

To get the full benefit, you could unregister the GCD source entirely, and set a timer to re-register it after a constant delay.

I think this improvement is worth it, since your comment mentions "greatly reducing CPU usage". Without exact numbers I don't know for sure, but it sounds like your testing showed that this patch didn't eliminate WebKit CPU usage within the constant window. Unregistering the GCD source would eliminate CPU usage within the constant window.
Comment 6 Michael Saboff 2011-10-10 17:04:34 PDT
(In reply to comment #5)
> Given my understanding of how the warning works on OS X, even with this patch, under memory pressure, the kernel will still constantly call into WebKit's GCD source. It's an improvement to return early, but we could do even better, and avoid the constant calls into WebKit's GCD source as well.
> 
> To get the full benefit, you could unregister the GCD source entirely, and set a timer to re-register it after a constant delay.
> 
> I think this improvement is worth it, since your comment mentions "greatly reducing CPU usage". Without exact numbers I don't know for sure, but it sounds like your testing showed that this patch didn't eliminate WebKit CPU usage within the constant window. Unregistering the GCD source would eliminate CPU usage within the constant window.

"Greatly reduced CPU usage" means WebProcess goes from ~100% to ~8%.  My test is to use the notify event "org.WebKit.lowMemory" instead of the system memory pressure event as this is easier to produce and control.  In a terminal I start "while true; do notifyutil -p org.WebKit.lowMemory; done" and then look at the CPU usage using the activity monitor.  Without the change, the WebProcess uses ~ one CPU (~100%) responding to events.  With the change, it uses between 6-10% of a CPU throwing away events.  The UI process isn't impacted much, probably since the GC and other memory clean-up can happen within the time between events.

Note that without the change, it takes a long time (tens of seconds) for the WebProcess to catch up, that is after the shell while loop is killed, CPU stays at 100%.  With the change, the WebProcess returns back to pre-test CPU usage (~2% with the windows/tabs I had open) almost immediately.

Note that I typed this response in a Safari with the change while the test was running!  Before the change, Safari is completely unusable with my test.

If we want to unregister and reregister instead of throwing away events, we can do that.  I was just trying to keep the fix simple since this is a corner case.
Comment 7 Geoffrey Garen 2011-10-12 10:27:02 PDT
I definitely want to acknowledge that your current patch is a big improvement. Based on your numbers, though, it sounds like we're leaving ~6% on the table (~2% baseline load, ~8% load under notification). I think it's worth going after that last ~6%.

I want that last ~6% because I don't think this is a corner case. I think that systems under memory pressure tend to stay that way, as they page out and in different parts of the user's working set.
Comment 8 Michael Saboff 2011-10-12 17:34:55 PDT
Created attachment 110778 [details]
Patch that cancels event handlers and reinstalls after delay

This patch implements the approach that Geoff recommends.

Using the same artificial test using "while 1 - notifyutil" and the WebProcess fluctuates between 2% and 4% CPU with similar pages loaded.
Comment 9 Geoffrey Garen 2011-10-13 15:24:07 PDT
Comment on attachment 110778 [details]
Patch that cancels event handlers and reinstalls after delay

View in context: https://bugs.webkit.org/attachment.cgi?id=110778&action=review

r=me

> Source/WebCore/platform/mac/MemoryPressureHandlerMac.mm:71
> +    if (_timer_event_source) {
> +        dispatch_source_cancel(_timer_event_source);
> +        _timer_event_source = 0;
> +    }

I think this timer cleanup code should go in the timer handler.

If you want, maybe you should add an early return here, so that if the timer is set, and we're supposed to hold off, we ignore new install calls, which would violate our rate limiting plan.
Comment 10 Michael Saboff 2011-10-13 18:15:51 PDT
Committed r97433: <http://trac.webkit.org/changeset/97433>
Comment 11 Csaba Osztrogon√°c 2011-10-13 23:28:21 PDT
... and Qt-win buildfix landed in http://trac.webkit.org/changeset/97443