Bug 225559 - Implement standards-compliant user gesture tracking
Summary: Implement standards-compliant user gesture tracking
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Misc.
Version: Safari Technology Preview
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on: 247159
Blocks:
Reported: 2021-05-08 03:57 PDT by Ashley Gullen
Modified: 2022-10-28 04:38 PDT

Description Ashley Gullen 2021-05-08 03:57:48 PDT
Safari has a big pile of inconsistent, API-specific code for tracking user gestures. This makes it hard to know if any given code will work with user gestures. It also seems WebKit keeps adding support for propagating user gestures with various new APIs and specific situations, deepening the complexity and presumably adding lots of complicated code.

This is clearly not sustainable in the long term, and it still does not even cover all use cases (e.g. postMessage across a worker is still omitted and is something we need). There needs to be a general purpose solution. Google designed and developed User Activation v2 which solves this (see https://mustaqahmed.github.io/user-activation-v2/) which is apparently now in the spec. This amounts to just two bits of state and a short timeout, and solves all use cases.

Some examples of the various hacks WebKit has collected over the years: user gestures are allowed to propagate through nested timers (issue 172173, from 2017), loading script elements (issue 174959, from 2017), XMLHttpRequest (issue 197428, from 2019), fetch (issue 214444, from 2020), and requestAnimationFrame (issue 223775, from 2021). What about all other APIs? Where will this end?

Please implement the user gesture tracking state as is now described in the spec. It covers all APIs and use cases, including ones WebKit doesn't handle such as postMessage() to a worker and back, makes it simpler for web developers to understand what the rules are, and presumably means you could delete all the hacks added over the years for the various special cases that have been implemented so far.
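
For illustration, the state being requested here can be modelled in a few lines. This is a minimal sketch loosely following the timestamp-based formulation in the HTML spec, not WebKit's or Chrome's actual implementation. The class name `UserActivationModel`, its method names, and the 1000 ms `TRANSIENT_MS` value are assumptions made for the example; the real duration is implementation-defined.

```javascript
// Simplified model of per-window user activation state.
// TRANSIENT_MS is an assumed value; the real duration is implementation-defined.
const TRANSIENT_MS = 1000;

class UserActivationModel {
  constructor(now = () => Date.now()) {
    this.now = now;                 // injectable clock, handy for testing
    this.lastActivation = Infinity; // +Infinity: no user gesture seen yet
  }

  // Call from a real input event handler (click, keydown, ...).
  notifyUserGesture() {
    this.lastActivation = this.now();
  }

  // "Sticky" activation: the user has interacted at some point.
  get hasBeenActive() {
    return this.now() >= this.lastActivation;
  }

  // "Transient" activation: interacted recently and not yet consumed.
  get isActive() {
    const t = this.now();
    return t >= this.lastActivation && t < this.lastActivation + TRANSIENT_MS;
  }

  // Gated APIs such as window.open() would consume the transient activation.
  consume() {
    if (!this.isActive) return false;
    this.lastActivation = -Infinity; // sticky stays true, transient becomes false
    return true;
  }
}
```

Note that nothing in this model depends on which API carried the work between the gesture and the gated call; only elapsed time and consumption matter.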
Comment 1 Ryosuke Niwa 2021-05-10 16:02:07 PDT
Do you have a specific use case / test case where Safari's heuristics doesn't work?

We're aware of the User Activation v2 work Google did, but we have some security and privacy concerns with regard to things like allowing system pasteboard access.

It is actually precisely the fact it allows all APIs to propagate the user activation that is of concern since it would make it hard to identify exactly how this new spec'ed behavior will impact various API's user activation requirement.
Comment 2 Ashley Gullen 2021-05-11 01:01:40 PDT
> Do you have a specific use case / test case where Safari's heuristics doesn't work?

- When copying to the clipboard, we have to do some async work to prepare the data to be copied. Currently we use FileReader APIs for that. The async work loses the user gesture and then the copy is blocked. In future we might want to use other APIs to prepare the data to read, such as posting to a worker to process off-thread, DecompressionStream, createImageBitmap, IndexedDB...

- We make a game engine that is capable of running entirely in a Web Worker with OffscreenCanvas. Input events are forwarded to the worker via postMessage() where we then run logic, and any DOM calls are forwarded back to the DOM via postMessage() again. This currently loses the user gesture in Safari and thus blocks our use of OffscreenCanvas in Safari.

> It is actually precisely the fact it allows all APIs to propagate the user activation that is of concern since it would make it hard to identify exactly how this new spec'ed behavior will impact various API's user activation requirement.

Doesn't the simplified model actually make it easier to evaluate?

I'm not clear what is being protected against - surely abusive web content will just make any abusive calls in a synchronous user input event. Blocking a subsequent call up to, say, 1 second later, seems only to block legitimate web content that had to do a bit of async work first.
Comment 3 Ryosuke Niwa 2021-05-11 14:13:02 PDT
(In reply to Ashley Gullen from comment #2)
> > Do you have a specific use case / test case where Safari's heuristics doesn't work?
> 
> - When copying to the clipboard, we have to do some async work to prepare
> the data to be copied. Currently we use FileReader APIs for that. The async
> work loses the user gesture and then the copy is blocked. In future we might
> want to use other APIs to prepare the data to read, such as posting to a
> worker to process off-thread, DecompressionStream, createImageBitmap,
> IndexedDB...

Funny you bring that up. As we mentioned in https://bugs.webkit.org/show_bug.cgi?id=222262, the correct way to use async clipboard API is to initiate the write immediately where each item will be written via promise.

> - We make a game engine that is capable of running entirely in a Web Worker
> with OffscreenCanvas. Input events are forwarded to the worker via
> postMessage() where we then run logic, and any DOM calls are forwarded back
> to the DOM via postMessage() again. This currently loses the user gesture in
> Safari and thus blocks our use of OffscreenCanvas in Safari.

I'm a bit confused here. OffscreenCanvas isn't supported in Safari. Are you talking about the hypothetical future in which Safari supports OffscreenCanvas? But drawing into OffscreenCanvas or painting it in the main thread wouldn't require any user gesture. So what exactly are we talking about here?
Comment 4 Ashley Gullen 2021-05-12 05:19:28 PDT
> As we mentioned in https://bugs.webkit.org/show_bug.cgi?id=222262, the correct way to use async clipboard API is to initiate the write immediately where each item will be written via promise.

Oh, I didn't know that. I guess it doesn't work for writeText() though, so you need to go through the write() API for that. Part of my point is these rules are complicated and I don't believe Safari documents them all in any one place, so it's hard to figure out combinations of what works and what doesn't work.
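
For readers following along, the pattern described in bug 222262 looks roughly like the sketch below: the write is started synchronously in the gesture handler, and only the data is deferred through a promise. The function name `copyWithDeferredData` and the `prepareData` helper are hypothetical; `clipboard` and `ClipboardItemCtor` are passed in as parameters so the sketch can run outside a browser, where they would be `navigator.clipboard` and `ClipboardItem`.

```javascript
// Starts the clipboard write immediately and defers only the data preparation.
// `prepareData` is a hypothetical async helper returning a Promise of a Blob
// (standing in for the FileReader work mentioned in this thread).
function copyWithDeferredData(clipboard, ClipboardItemCtor, prepareData) {
  // Kick off the async work, but do not await it here.
  const dataPromise = prepareData();

  // The item is constructed with a promise as its value.
  const item = new ClipboardItemCtor({ "text/plain": dataPromise });

  // write() is called in the same task as the user gesture.
  return clipboard.write([item]);
}
```

In a page this would be called directly from a click handler, for example `button.onclick = () => copyWithDeferredData(navigator.clipboard, ClipboardItem, prepareData)`.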

Another case I can illustrate is: suppose we want to do a bit of async work and then open a popup window. Safari blocks the popup, but Chrome allows it.

> I'm a bit confused here. OffscreenCanvas isn't supported in Safari. Are you talking about the hypothetical future in which Safari supports OffscreenCanvas?

Yes - it looks like OffscreenCanvas is nearly ready to ship in Safari, but we will not be able to use it on account of user gesture restrictions in Safari.

In Chrome, we have an architecture where we run our entire game engine in a Web Worker, rendering with OffscreenCanvas. This is good for performance isolation as it moves the entire performance overhead of our engine off the main thread. Input events are forwarded to the worker via postMessage(). However workers do not have access to APIs like requestFullscreen(). So if we need a DOM API, we postMessage() back to the DOM with an instruction to make the corresponding call. This works in Chrome, but in Safari this does not propagate the user gesture so blocks the call.
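
A rough sketch of that round trip is below. It is not our engine's actual code: the message shapes (`type`, `key`, `method`), the function names, and the `domApi` table are all assumptions for illustration, and `port` stands for anything with `postMessage`/`onmessage` (a `Worker` on the page side, the global scope on the worker side).

```javascript
// Worker side: runs logic on forwarded input and asks the page for DOM work.
function installWorkerSide(port) {
  port.onmessage = (e) => {
    // Hypothetical game rule: pressing "f" should enter fullscreen.
    if (e.data.type === "input" && e.data.key === "f") {
      port.postMessage({ type: "dom-call", method: "requestFullscreen" });
    }
  };
}

// Page side: forwards input to the worker and performs requested DOM calls.
// `domApi` is a hypothetical table of the DOM calls the worker may request.
function installMainSide(port, domApi) {
  port.onmessage = (e) => {
    const { type, method } = e.data;
    if (type === "dom-call" && Object.prototype.hasOwnProperty.call(domApi, method)) {
      // By this point the call is two postMessage() hops away from the gesture.
      domApi[method]();
    }
  };
  // Returns the function an input event handler would call.
  return (key) => port.postMessage({ type: "input", key });
}
```

On a page, `domApi` might be `{ requestFullscreen: () => canvas.requestFullscreen() }`, with the returned function called from a `keydown` listener.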

To illustrate this all more clearly I made a quick demo: https://downloads.scirra.com/labs/safariusergesture/index.html

'Open popup after 500ms timer' shows that opening a popup in a setTimeout() callback is allowed, up to a limit. It looks like Safari uses a limit of 1 second in other places, so I guess this applies to setTimeout() too.

'Open popup after async blob read' shows that if you await a FileReader operation for so much as a fraction of a second, Safari loses the user gesture and blocks the subsequent popup - even though it would have been allowed for up to 1 second with setTimeout(). This serves only to impede legitimate use cases.

'Open popup after postMessage roundtrip to worker' demonstrates the architecture we would use with OffscreenCanvas. Again a postMessage() roundtrip loses the user gesture, even though it completes well within 1 second.

All three cases work in Chrome, but only the first works in Safari.

I think the main question is: if user gestures are allowed for up to 1 second anyway, why make it API-specific? It just restricts legitimate use cases, and does not offer any privacy/security benefit as abusive content can still do what it wants within a 1 second window via setTimeout(). Chrome's user gesture model also uses a short timeout, but it doesn't limit the APIs used. So it seems this is just a web compatibility win with no downside.
Comment 5 Radar WebKit Bug Importer 2021-05-15 03:58:17 PDT
<rdar://problem/78054106>
Comment 6 Sam Sneddon [:gsnedders] 2021-08-24 02:02:28 PDT
This is now https://html.spec.whatwg.org/multipage/interaction.html#tracking-user-activation
Comment 7 John Ozbay 2022-04-04 08:23:16 PDT
(In reply to Ryosuke Niwa from comment #1)
> Do you have a specific use case / test case where Safari's heuristics
> doesn't work?

Hello, we were affected by this as well this week, so I thought I'd stop by with an easy-to-reproduce, real-life example on jsfiddle that demonstrates where Safari's heuristics don't work.

Here's the scenario (and I have a jsfiddle link below which you can test on iOS 15.4.1) :

– user presses a 'download file' button.
– we show downloading progress indicator

In an async function:
– we need to "await fetch" a few json files,
– do something with the json files client side (e.g. await doSomething(), such as combining or decrypting the contents client side – basically anything),
– then use navigator.share() to let the user share the contents (say, an image file, or some textual contents from the json files).

Intuitively, end users shouldn't have to press three separate buttons for each step of : fetch, decrypt, and finally to trigger navigator.share(). 
They should just press "download file", and after a brief loading spinner, they should be presented with the native navigator.share popup.

But right now, since all three steps are in an async function, if either the download or, say, the decryption takes even a few seconds (e.g. due to network conditions with a 10-20 MB file), we lose user activation by the time we get to navigator.share(), and since navigator.share() requires user activation, it throws:

"NotAllowedError: The request is not allowed by the user agent or the platform in the current context, possibly because the user denied permission."

--

Here's a barebone example I prepared to demonstrate this on jsfiddle : 

https://jsfiddle.net/dw6g2opr/8/

If you try this example on iOS 15.4.1 with a reasonably fast internet connection in Europe (~100 Mbps on cellular, so maybe try 30-50 Mbps to be safe?), you'll see that navigator.share() fails about half the time, depending on network conditions and the time it takes to fetch the first two files.

Consequently, at the moment we cannot use navigator.share() on iOS. Instead we use alternatives such as displaying the resulting images and instructing users to tap and hold on them to share, or, for a PDF or another file, opening it in a new tab using a blob URL and asking users to press share from there.

--

Expected behavior / what would be great:  
Users should be able to async fetch a large file, then use navigator.share() to open the native share dialog for that file with a single tap/click.
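
For reference, the flow described in this comment has roughly the shape sketched below. This is not the code from the jsfiddle: `downloadAndShare`, `fetchJson`, `decrypt` and `share` are hypothetical names, injected as parameters so the sketch can run outside a browser (in a page they would wrap `fetch()`, the app's own processing, and `navigator.share()`). The `"needs-second-tap"` return value is likewise an assumption about how a page might fall back.

```javascript
// One tap starts everything; share() is reached only after the awaits complete.
async function downloadAndShare({ fetchJson, decrypt, share, urls }) {
  // Network time here counts against the user activation window.
  const parts = await Promise.all(urls.map((url) => fetchJson(url)));

  // So does any client-side processing.
  const text = await decrypt(parts);

  try {
    await share({ text });
    return "shared";
  } catch (err) {
    // Safari rejects with NotAllowedError once the activation has lapsed.
    if (err && err.name === "NotAllowedError") return "needs-second-tap";
    throw err;
  }
}
```

With a general, time-based activation model the `catch` branch would only be hit when the work genuinely takes too long, rather than whenever an unsupported API sits between the tap and the share call.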

--

Hoping this helps :-)
Many thanks!