Bug 195325

Summary: Canvas context allocation fails because "Total canvas memory use exceeds the maximum limit"
Product: WebKit Reporter: Esseb <esseb>
Component: Canvas Assignee: Dean Jackson <dino>
Status: RESOLVED FIXED    
Severity: Normal CC: bugzillawebkit, dino, fpizlo, ggaren, gsnedders, info, james, mark.lam, maros.pis, patrik+webkit, paul.neave, sabouhallawa, sergemorel_, simon.fraser, tomac, webkit-bug-importer, webkit, wing, ysuzuki
Priority: P2 Keywords: InRadar
Version: Safari 12   
Hardware: iPhone / iPad   
OS: iOS 12   
URL: https://output.jsbin.com/ganafaketa/1
Attachments:
Description: Testcase (Flags: none)
Description: Screenshot from TimeLine in Safari developer tools (Flags: none)

Description Esseb 2019-03-05 05:11:55 PST
Created attachment 363637 [details]
Testcase

Canvas elements without strong references do not get garbage collected immediately in Safari on iOS 12. There are no issues in iOS 11.

There is more information on https://stackoverflow.com/questions/52532614/total-canvas-memory-use-exceeds-the-maximum-limit-safari-12 which indicates that it can also occur in Safari on Mac, but the higher canvas memory limit makes it harder to trigger the bug there.

The regression between Safari on iOS 11 and iOS 12 could be a result of https://github.com/WebKit/webkit/commit/5d5b478917c685e50d1032ccf761ca53fc8f1b74#diff-b411cd4839e4bbc17b00570536abfa8f which makes the bug easier to hit with the lower canvas memory limit.

If I explicitly set the canvas elements' width and height to 0 before removing references to the canvas elements, the issue disappears.
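A minimal sketch of that workaround, assuming the page holds its canvases in variables it controls. `releaseCanvas` is a hypothetical helper name used here for illustration, not an existing API.

```javascript
// Hypothetical helper: shrink a canvas before dropping the last reference,
// so its backing store does not have to wait for garbage collection.
function releaseCanvas(canvas) {
  canvas.width = 0;
  canvas.height = 0;
}

// Usage in a page (assumes a DOM environment):
//   let canvas = document.createElement('canvas');
//   ... draw ...
//   releaseCanvas(canvas);
//   canvas = null; // only now remove the reference
```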

This bug causes a problem with the map on https://www.yr.no/en/map/radar/1-72837/Norway/Oslo/Oslo/Oslo which uses OpenLayers. OpenLayers is also tracking this issue at https://github.com/openlayers/openlayers/issues/9166

I've attached a testcase, but the testcase is also available on https://output.jsbin.com/ganafaketa/1

I've tested this bug with iOS 12.1.4 on an iPhone X.
Comment 1 Esseb 2019-03-05 05:29:08 PST
The error message I get when the bug occurs is:

TypeError: null is not an object (evaluating 'context.fillRect')
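The message appears because `getContext('2d')` returned null and the page then called `fillRect` on it. A minimal sketch of a guard follows; `tryFillRect` is a hypothetical helper name, and what the caller does on failure is left open.

```javascript
// Hypothetical helper: returns true if the rectangle was drawn, false if
// the browser did not hand out a 2D context (getContext returned null).
function tryFillRect(canvas, x, y, width, height) {
  const context = canvas.getContext('2d');
  if (context === null) {
    return false; // caller can release other canvases or show a fallback
  }
  context.fillRect(x, y, width, height);
  return true;
}
```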
Comment 2 Esseb 2019-03-05 05:35:57 PST
Created attachment 363639 [details]
Screenshot from TimeLine in Safari developer tools

I've added a screenshot from TimeLine in Safari developer tools.

I followed the testcase instructions when recording that timeline. The button click that triggered the error message happened just after the 13 second mark.
Comment 3 Sam Sneddon [:gsnedders] 2019-03-05 05:55:42 PST
Interestingly, I can reproduce this in STP 76 on macOS Mojave, but not on Safari 12.0.3.

In STP, there are very few GCs happening while memory consumption increases as you press the button, and none of them lead to the memory reported in the Timeline decreasing.
Comment 4 Radar WebKit Bug Importer 2019-03-05 12:28:47 PST
<rdar://problem/48609162>
Comment 5 Simon Fraser (smfr) 2019-08-05 12:41:04 PDT
Console says:
[Warning] Total canvas memory use exceeds the maximum limit (384 MB). (runner, line 31)
Comment 6 James 2020-03-24 17:55:24 PDT
We are also experiencing this bug in the same way. After hundreds (or thousands) of small canvas elements being created and successfully drawn to with a 2d context, at some point it just gives up and returns null for the context. In our case we were able to switch to HTMLImageElements and then no problem, which shows that the memory budget doesn't have to be so small. These regression bugs are really frustrating for developers.
Comment 7 Peter 2021-11-19 02:50:05 PST
We've been seeing the same issue for a long time. Affects iOS 14/15 as well. Is there any information we can provide?
Comment 8 Serge M. 2021-12-06 02:49:16 PST
Same here. Mobile Safari and especially WKWebView are susceptible to this. Creating a photo editing website is next to impossible with this bug.
Comment 9 patrik 2022-01-31 02:15:14 PST
My Sentry account is getting flooded with errors regarding this: 50 in just the last 24 hours, to be exact, and I have no way of fixing this.
Comment 10 Maros 2022-06-15 03:56:11 PDT
Our app is suffering from this bug and Sentry is flooded, even on the newest iOS. Is there any plan for fixing this?
Comment 11 Dean Jackson 2022-06-17 18:58:58 PDT
Finally looking into this, sorry. :(

I don't think there are any leaks here, just that the canvases are not collected soon enough. At least, I see the HTMLCanvasElements getting destroyed. I'll check that their backing store goes away.
Comment 12 Dean Jackson 2022-06-21 11:36:43 PDT
Oops. Never pressed submit on this comment from Friday:

Easy to reproduce if I hardcode the canvas limit to 256MB. It seems that the code that stops new canvases from being created has nothing to do with the code that triggers a garbage collection, and so we hit the canvas limit.
Comment 13 Dean Jackson 2022-06-21 11:38:05 PDT
This would be expected behaviour if there were still references to the canvas elements, but there aren't any in this test.

The fix is to see if it is ok to trigger GC while the HTMLCanvasElement is trying to create backing store, but only if there isn't enough memory, and then retry the allocation.
Comment 14 Dean Jackson 2022-06-21 20:04:03 PDT
After a rather lengthy discussion with folks about this, we're down to three options for a "solution":

A. Do nothing - this is expected behaviour [1]. The workaround of setting the canvas size to 0 is probably the best solution.

B. Remove the canvas-specific limit. This would allow pages to create more/bigger canvas elements, although this means they are more likely to hit the overall Web page limit (iOS process limit) at which point the whole page will crash. [2]

C. Implement a different algorithm such as LRU eviction. The canvas that was used least recently would get evicted. A "use" would be any paint or API call (i.e. a canvas visible on the page wouldn't get evicted).

It is too late in the iOS 16 cycle to consider C. Also, there is no way to tell an existing canvas/context that its backing store has been evicted (WebGL has a way, but 2d contexts do not).
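To make option C concrete, here is a small sketch of a byte-budgeted LRU registry. It is an illustration only: the class name, the budget, and the idea of tracking backing stores by id are all assumptions, and none of it is WebKit code.

```javascript
// Illustration only: tracks backing-store sizes by id and evicts the
// least recently used entries when a new one would exceed the budget.
class LruCanvasBudget {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.usedBytes = 0;
    this.entries = new Map(); // Map iterates in insertion order, oldest first
  }

  // Record a "use" (a paint or API call): move the id to the newest end.
  touch(id) {
    if (!this.entries.has(id)) return false;
    const bytes = this.entries.get(id);
    this.entries.delete(id);
    this.entries.set(id, bytes);
    return true;
  }

  // Admit a backing store and return the ids evicted to make room.
  // A store larger than the whole budget is still admitted in this sketch.
  add(id, bytes) {
    if (this.entries.has(id)) {
      this.usedBytes -= this.entries.get(id);
      this.entries.delete(id);
    }
    const evicted = [];
    while (this.usedBytes + bytes > this.maxBytes && this.entries.size > 0) {
      const [oldestId, oldestBytes] = this.entries.entries().next().value;
      this.entries.delete(oldestId);
      this.usedBytes -= oldestBytes;
      evicted.push(oldestId);
    }
    this.entries.set(id, bytes);
    this.usedBytes += bytes;
    return evicted;
  }
}
```

With a budget of 100, adding `a` (40) and `b` (40), touching `a`, then adding `c` (40) evicts `b`, because `a` was used more recently.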

[1] The canvas objects that are set to null may still be referenced from other places, such as their 2d contexts, or possibly internal references such as the stack or JIT tiers. i.e. it's not always easy to determine if they are reachable.

[2] Is it worse for the page to break more often because a canvas didn't get created, or for the entire page to disappear because too many canvases were created? At least the former is recoverable, and has a partial workaround. One could also wait for a bit until GC actually kicks in and then try creating the canvas again.
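The "wait and try again" idea in [2] could look roughly like the sketch below. `createCanvasWithRetry` and `makeCanvas` are hypothetical names, the attempt count and delay are arbitrary illustration values, and nothing guarantees that a garbage collection actually runs between attempts.

```javascript
// Hypothetical helper: create a canvas and ask for a 2D context; if the
// browser returns null, release the canvas, wait, and try a fresh one.
async function createCanvasWithRetry(makeCanvas, attempts = 3, delayMs = 500) {
  for (let i = 0; i < attempts; i++) {
    const canvas = makeCanvas();
    const context = canvas.getContext('2d');
    if (context !== null) {
      return { canvas, context };
    }
    // Shrink the failed canvas explicitly rather than leaving it to GC.
    canvas.width = 0;
    canvas.height = 0;
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return null; // still failing; the caller needs a fallback
}

// Usage in a page (assumes a DOM environment):
//   const result = await createCanvasWithRetry(
//     () => document.createElement('canvas'));
```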
Comment 15 Dean Jackson 2022-06-21 20:09:31 PDT
Here is someone complaining and explaining the workaround: https://pqina.nl/blog/total-canvas-memory-use-exceeds-the-maximum-limit/
(although I'm not sure you have to clear the contents - just setting the dimensions should do that for you)

I think in the short term our options are A or B above.

Option A is definitely frustrating for developers. I sympathise :(

Option B is fine, until it isn't. The user might have entered a lot of information onto the page before it is killed by the system. Seems better to have a broken page than killing it completely.

Then again, we don't know how many pages would create so much canvas backing store to trigger a jetsam. I expect it is a small %.

For the moment I'm choosing option A, but looking out for good arguments in the comments.
Comment 16 Peter 2022-06-22 05:09:26 PDT
Hi Dean,

Thank you for looking into this!

> A. Do nothing - this is expected behaviour [1]. The workaround of setting the canvas size to 0 is probably the best solution.
> Option A is definitely frustrating for developers. I sympathise :(
None of the other browsers have a problem with this, so I’m not sure this is expected behavior. The problem is not only that canvases no longer referenced anywhere on the page aren’t properly collected (where you could use the workaround), but mainly that when you refresh a page, the memory stays allocated somewhere. I haven’t seen this with anything else in Safari: memory for a page that is gone remaining allocated after a reload.

Resetting the canvas size is also frustrating to users, because when you’re leaving a page, resetting the canvas size creates a delay and a content shift before the user is navigated to the next page.

We can hit the issue just by rendering our pages, which draw charts on canvases. Each page renders fine on its own, but when the user navigates between multiple pages that contain canvases (full reloads, not an SPA), they hit the memory limit after some time and the canvases can no longer be allocated.

I made a video showcasing this issue on iPad (6th gen) OS 15.5. You can see at 0:55, after a minute of browsing, the browser is no longer able to allocate canvases.
https://drive.google.com/file/d/1Yz22DSRCHe-OTAAAszId4lZLp_TejKRQ/view?usp=sharing

Happy to hop on a call or provide more minimal reproducible examples if needed.
Comment 17 Dean Jackson 2022-06-22 11:35:40 PDT
I still need to investigate this part:

> We are also experiencing this bug in the same way. After hundreds (or thousands) of small canvas elements being created and successfully drawn to with a 2d context, at some point it just gives up and returns null for the context.
Comment 18 Dean Jackson 2022-06-22 11:40:50 PDT
Hi Peter,

> > A. Do nothing - this is expected behaviour [1]. The workaround of setting the canvas size to 0 is probably the best solution.
> > Option A is definitely frustrating for developers. I sympathise :(

> None of the other browsers have problem with this, so I’m not sure this is
> expected behavior. 

As far as I can tell, it's because they have chosen option B by default. i.e. they'll let you create canvas objects until everything blows up.

That might be the best thing to do in this situation. I'm not sure.

> The problem is that not only the canvases no longer
> referenced anywhere on the page aren’t properly collected (and you could do
> the workaround here), but mainly, that when you refresh a page the memory
> still keeps being allocated somewhere. 

Interesting! That would be a different bug - so I'll try to reproduce it.

Garbage Collection does not necessarily run during a reload though, so you might be seeing the same issue just in a slightly different way.

> Resetting canvas size is also frustrating to users, because when you’re
> leaving a page, resetting the canvas size creates a delay and content shift
> before user is navigated to next page.

OK. This likely is GC not running yet.

Maybe removing the canvas limit is a good idea, because the system memory pressure would force GC.

> Happy to hop on a call or provide more minimal reproducible examples if
> needed.

Thanks!
Comment 19 Dean Jackson 2022-06-22 13:19:47 PDT
Pull request: https://github.com/WebKit/WebKit/pull/1693
Comment 20 Dean Jackson 2022-06-22 13:28:27 PDT
I opened a fake pull request because I assume it is easier to discuss this on GitHub. I'm going to ask for more opinions on the internet.

https://github.com/WebKit/WebKit/pull/1693
Comment 21 Sam Sneddon [:gsnedders] 2022-06-22 21:40:13 PDT
(In reply to Dean Jackson from comment #13)
> This would be expected behaviour if there were still references to the
> canvas elements, but there isn't in this test.
> 
> The fix is to see if it is ok to trigger GC while the HTMLCanvasElement is
> trying to create backing store, but only if there isn't enough memory, and
> then retry the allocation.

Has this been ruled out as an option? Or is there any reason why GC triggers can't be altered to make this less likely?

(In reply to Dean Jackson from comment #14)
> C. Implement a different algorithm such as LRU eviction. The canvas that was
> used least recently would get evicted. A "use" would be any paint or API
> call (i.e. a canvas visible on the page wouldn't get evicted).
> 
> It is too late in the iOS 16 cycle to consider C. Also, there is no way to
> tell an existing canvas/context that its backing store has been evicted
> (WebGL has a way, but 2d contexts do not).

i.e., move away from getContext() returning null, and move towards evicting some context? As for telling contexts: bug 227280; this has been spec'd since last year, and implemented in Blink (but not Gecko).
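For reference, a page could react to that kind of notification roughly as sketched below. This assumes the `contextlost` and `contextrestored` events described in the HTML Standard for 2D canvases are fired at the canvas element; `watchContextLoss` and `redraw` are hypothetical names.

```javascript
// Sketch: track whether the 2D context is currently lost and repaint once
// it is restored. `redraw` is a caller-supplied function.
function watchContextLoss(canvas, redraw) {
  const state = { lost: false };
  canvas.addEventListener('contextlost', () => {
    state.lost = true; // skip drawing while the backing store is gone
  });
  canvas.addEventListener('contextrestored', () => {
    state.lost = false;
    redraw(); // the restored context starts over, so repaint from app state
  });
  return state;
}
```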
Comment 22 Geoffrey Garen 2022-09-23 13:57:25 PDT
(In reply to Sam Sneddon [:gsnedders] from comment #21)
> (In reply to Dean Jackson from comment #13)
> > This would be expected behaviour if there were still references to the
> > canvas elements, but there isn't in this test.
> > 
> > The fix is to see if it is ok to trigger GC while the HTMLCanvasElement is
> > trying to create backing store, but only if there isn't enough memory, and
> > then retry the allocation.
> 
> Has this been ruled out as an option? Or is there any reason why GC triggers
> can't be altered to make this less likely?

The short answer is that everyone thinks they want this until they have it, and then they realize that it is No Fun At All (TM).

If you think about this as a singular failure, it seems obvious that a last ditch synchronous GC would reclaim the dead canvases, release their graphics memory, and then allow you to carry on. Yay!

But the problem is that you almost certainly got into this condition because you were doing something repeatedly. So it will happen again. Suddenly that "last ditch" behavior is actually the steady state of your program.

A steady state algorithm that does a sync GC after every new allocation (or after every N allocations for any fixed value of N) is O(N^2). So, we replace a broken site with a hung site (or a dead battery, or however you prefer to express the consequences of an O(N^2) steady state).
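One way to see the quadratic growth is a toy cost model. The model is an assumption made for illustration, not a measurement of JavaScriptCore: every allocation stays live, and each synchronous GC costs one unit of work per live object.

```javascript
// Toy model: count the total GC work for `allocations` objects when a full
// synchronous collection runs after every `gcEvery` allocations.
function totalGcWork(allocations, gcEvery) {
  let live = 0;
  let work = 0;
  for (let i = 1; i <= allocations; i++) {
    live += 1;
    if (i % gcEvery === 0) {
      work += live; // a full collection walks the whole live heap
    }
  }
  return work;
}
```

Under this model `totalGcWork(1000, 1)` is 500500 and `totalGcWork(2000, 1)` is 2001000, so doubling the allocations roughly quadruples the work. Collecting every 10 allocations instead gives `totalGcWork(1000, 10)` = 50500, which only divides the total by a constant.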

This doesn't mean that we can't consider some sort of last ditch GC feature to help with some cases; but it does mean that it will not solve most cases.

> (In reply to Dean Jackson from comment #14)
> > C. Implement a different algorithm such as LRU eviction. The canvas that was
> > used least recently would get evicted. A "use" would be any paint or API
> > call (i.e. a canvas visible on the page wouldn't get evicted).
> > 
> > It is too late in the iOS 16 cycle to consider C. Also, there is no way to
> > tell an existing canvas/context that its backing store has been evicted
> > (WebGL has a way, but 2d contexts do not).
> 
> i.e., move away from getContext() returning null, and move towards evicting
> some context? As for telling contexts: bug 227280; this has been spec'd
> since last year, and implemented in Blink (but not Gecko).

Yup!

This is my preferred option.

Note that this problem is not unique to graphics. The same applies to WebSocket, FileSystem, WebGL, and any other finite resource on a computer that is not GC-allocated memory.
Comment 23 Paul Neave 2023-04-12 06:25:56 PDT
I still see this issue in Safari on iOS 16.4.

The real problem is that there is currently no way to resolve this issue when it does happen. Reloading the page doesn't help as Safari immediately complains that "Total canvas memory use exceeds the maximum limit (384 MB)." Navigating to a new URL then back again doesn't fix this issue either. The only fix is to force-quit Safari, which isn't exactly an ideal UX.

A page reload should clear all canvas memory usage back to zero. That way we can at least show a prompt to the user to ask them to reload the page to fix the problem.
Comment 24 Jake 2023-04-18 03:45:56 PDT
The fact the memory doesn't clear between refreshes is really bad.

I made a quick way to test this here:
https://www.mathsuniverse.com/experiments/safari-canvases.html
... hit 'Add' a couple of times to add 10x 6000x6000 canvases each time, and you should get the out of memory error (at least if memory limit is ~2GB as it is on the Mac I'm trying on... but the iPhone SE I'm also testing with has a limit just over 200MB instead).

Now untick the 'release on reload' checkbox then refresh and press 'Add' just once this time, and you'll get the error again because there's a bunch of canvas memory left from before the refresh.

That checkbox toggles whether onbeforeunload tries to release the canvases manually (by setting canvas size to 1x1). Doing that (onbeforeunload) seems like the only valid solution for me and others to use before this bug gets fixed.
Comment 25 Paul Neave 2023-05-25 05:02:37 PDT
Thanks Jake. I can confirm that your example works and demonstrates this bug. Using onbeforeunload works as a fix on desktop Safari, but it doesn't work on iOS.

Ideally Safari would handle this automatically and purge all canvases from memory upon reload.
Comment 26 Peter 2023-05-25 05:26:18 PDT
Hi Paul,

Using onbeforeunload isn't really a solution, because leaving a page then becomes slower: the user sees the page blink when the canvases are removed, and only then does the browser leave the page, which creates a weird UX.
Comment 27 Paul Neave 2023-05-25 06:05:38 PDT
It's not ideal, but it's an improvement over the current UX where all of the canvas elements break and can never be recreated.

After a little more research, it seems like the 'pagehide' event works to purge canvases more reliably on mobile Safari. But again, this should really be handled by Safari automatically.
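A minimal sketch of the 'pagehide' approach, assuming the page keeps its own list of canvas elements. `purgeCanvasesOnPageHide` is a hypothetical helper name, and `target` is expected to be `window` in a browser.

```javascript
// Hypothetical helper: shrink every tracked canvas when the page is hidden,
// using 1x1 as in the test page above.
function purgeCanvasesOnPageHide(target, canvases) {
  target.addEventListener('pagehide', () => {
    for (const canvas of canvases) {
      canvas.width = 1;
      canvas.height = 1;
    }
  });
}

// Usage in a page (assumes a DOM environment):
//   purgeCanvasesOnPageHide(window, myCanvases);
```

If the page can later be shown again without a reload (for example from the back/forward cache), the canvases would presumably need to be resized and repainted at that point.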
Comment 28 Dean Jackson 2023-05-25 17:05:31 PDT
Pull request: https://github.com/WebKit/WebKit/pull/14375
Comment 29 EWS 2023-06-29 13:44:25 PDT
Committed 265628@main (6bd11f3792f0): <https://commits.webkit.org/265628@main>

Reviewed commits have been landed. Closing PR #14375 and removing active labels.
Comment 30 Peter 2023-07-12 09:26:37 PDT
Which iOS version will contain the fix?
Comment 31 Holger Jeromin 2023-07-26 05:38:13 PDT
> Which iOS version will contain the fix?

This issue is linked in the blog post:
https://webkit.org/blog/14390/release-notes-for-safari-technology-preview-174/