Bug 206924 - Content blocker: add an option to replace response with static content
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Misc.
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-01-29 02:35 PST by Andrey Meshkov
Modified: 2021-08-11 09:24 PDT

Description Andrey Meshkov 2020-01-29 02:35:35 PST
AdGuard and uBlock Origin provide an option to replace the matching request's response with some static content.

The rule modifier is called $redirect, and it is used very often nowadays.

Examples of the rules that could be used:
||example.org/script.js$script,redirect=noopjs -- return an empty JS file
||example.org/style.css$style,redirect=noopcss -- return an empty CSS file
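
To make the anatomy of such rules concrete, here is a minimal sketch that splits a rule into its URL pattern and its modifiers. It only handles the simple `pattern$mod1,mod2=value` shape shown above, and `parse_rule` and `ParsedRule` are hypothetical names used for illustration, not part of AdGuard or uBlock Origin.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedRule:
    pattern: str  # URL pattern, e.g. "||example.org/script.js"
    modifiers: dict = field(default_factory=dict)  # modifier name -> value or None


def parse_rule(rule: str) -> ParsedRule:
    """Split a basic filter rule at the last '$' into pattern and modifiers."""
    pattern, sep, options = rule.rpartition("$")
    if not sep:
        # No '$' at all: the whole rule is a pattern with no modifiers.
        return ParsedRule(pattern=rule)
    modifiers = {}
    for option in options.split(","):
        name, _, value = option.partition("=")
        # Flag-style modifiers such as "script" carry no value.
        modifiers[name.strip()] = value.strip() or None
    return ParsedRule(pattern=pattern, modifiers=modifiers)


rule = parse_rule("||example.org/script.js$script,redirect=noopjs")
print(rule.pattern)    # ||example.org/script.js
print(rule.modifiers)  # {'script': None, 'redirect': 'noopjs'}
```

The `redirect` value is only a name; the blocker looks it up in its own list of bundled resources.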

Here's what I suggest: add a new content blocker action type that provides the same functionality.
Keep a list of named, pre-defined resources that can be used to replace the response.


Here's how it could look:

    "action": {
        "type": "redirect",
        "resource": "blank-css"
    }

Here's the list of resources that would be helpful:

* Blank CSS file
* Blank HTML document (to replace iframes)
* Blank JS file
* Blank text file
* "Blank" MP3 (1-sec mp3 file)
* "Blank" MP4 (1-sec mp4 file)
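
As a rough illustration of the proposal, the sketch below pairs a fixed table of named resources with one complete rule. The `trigger` keys (`url-filter`, `resource-type`) follow the existing content blocker JSON format. The `redirect` action type, the `resource` key, and all resource names and bodies are assumptions modeled on this proposal, not an existing WebKit API.

```python
import json

# Hypothetical built-in resource table: name -> (MIME type, body).
# Names and bodies are illustrative; binary media stubs are omitted.
BUILTIN_RESOURCES = {
    "blank-css": ("text/css", b""),
    "blank-html": ("text/html", b"<!DOCTYPE html><title></title>"),
    "blank-js": ("text/javascript", b""),
    "blank-text": ("text/plain", b""),
}

# A rule in the existing content blocker shape, carrying the proposed action.
rule = {
    "trigger": {
        "url-filter": r"^https?://example\.org/style\.css",
        "resource-type": ["style-sheet"],
    },
    "action": {"type": "redirect", "resource": "blank-css"},
}


def resolve(action: dict):
    """Return (mime_type, body) for a redirect action, or None otherwise."""
    if action.get("type") != "redirect":
        return None
    return BUILTIN_RESOURCES[action["resource"]]


print(json.dumps([rule], indent=4))  # the rule list as it would be serialized
print(resolve(rule["action"]))       # ('text/css', b'')
```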
Comment 1 Radar WebKit Bug Importer 2020-01-29 13:53:44 PST
<rdar://problem/59005386>
Comment 2 Sam Sneddon [:gsnedders] 2021-05-19 16:52:03 PDT
Are empty ("empty") resources sufficient for the vast majority of cases here?

(I'm aware there are some cases, like Google Analytics, where stub implementations of the API might be necessary, c.f. bug 199173 and bug 199704, but these seem to be exceptionally rare in comparison to just wanting the load to succeed.)
Comment 3 Andrey Meshkov 2021-05-20 02:41:37 PDT
(In reply to Sam Sneddon [:gsnedders] from comment #2)
> Are empty ("empty") resources sufficient for the vast majority of cases here?
> 
> (I'm aware there are some cases, like Google Analytics, where stub
> implementations of the API might be necessary, c.f. bug 199173 and bug
> 199704, but these seem to be exceptionally rare in comparison to just
> wanting the load to succeed.)

In AdGuard filters there're over 1000 "redirect" rules.
~800 of them are different kinds of "noop" rules.

There's an important thing about them, though. About 100 of them are different versions of "noopvast" and "noopvmap": special "stubs" for replacing VAST/VPAID XML files (files with video ads metadata).

I haven't checked exact numbers, but I am pretty sure the situation with uBlock Origin filters is similar to ours. 

Here're the full sources for static non-JS "stubs" we're using:
https://github.com/AdguardTeam/Scriptlets/blob/master/src/redirects/static-redirects.yml
Comment 4 Andrey Meshkov 2021-05-20 02:43:43 PDT
> ~800 of them are different kinds of "noop" rules.

By "noop" I mean "empty" resource stubs -- empty JS, CSS, HTML resources and such.
Comment 5 Sam Sneddon [:gsnedders] 2021-05-20 06:22:15 PDT
(In reply to Andrey Meshkov from comment #3)
> (In reply to Sam Sneddon [:gsnedders] from comment #2)
> > Are empty ("empty") resources sufficient for the vast majority of cases here?
> > 
> > (I'm aware there are some cases, like Google Analytics, where stub
> > implementations of the API might be necessary, c.f. bug 199173 and bug
> > 199704, but these seem to be exceptionally rare in comparison to just
> > wanting the load to succeed.)
> 
> In AdGuard filters there're over 1000 "redirect" rules.
> ~800 of them are different kinds of "noop" rules.
> 
> There's an important thing about them, though. About 100 of them are
> different versions of "noopvast" and "noopvmap" - special "stubs" for
> replacing VAST/VPAID XML files (files with video ads metadata).
> 
> I haven't checked exact numbers, but I am pretty sure the situation with
> uBlock Origin filters is similar to ours. 
> 
> Here're the full sources for static non-JS "stubs" we're using:
> https://github.com/AdguardTeam/Scriptlets/blob/master/src/redirects/static-redirects.yml

Right; realistically if WebKit owns the set of possible replacement files then it doesn't matter if they are "empty" or not, but the real desire is to keep them static to avoid a malicious content blocker from being able to load different replacement resources per-user and therefore add a trivial fingerprinting mechanism (e.g., imagine generating different window.uniqueUser = 2314321 JS files as replacements).
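
The constraint described here can be sketched as a validation step: a rule may only name a resource from a fixed set shipped with the engine, and it may never carry its own payload. The set contents, the `content` key, and the function name below are hypothetical and only illustrate the idea.

```python
# Hypothetical fixed set of replacement names shipped with the engine.
ALLOWED_RESOURCES = frozenset({"blank-css", "blank-html", "blank-js", "blank-text"})


def validate_redirect_action(action: dict) -> None:
    """Raise ValueError unless the action refers only to a built-in resource."""
    extra_keys = set(action) - {"type", "resource"}
    if extra_keys:
        # Any extra key (e.g. inline "content") could smuggle per-user data.
        raise ValueError(f"unexpected keys: {sorted(extra_keys)}")
    if action.get("resource") not in ALLOWED_RESOURCES:
        raise ValueError(f"unknown resource: {action.get('resource')!r}")


# A reference to a built-in name passes silently.
validate_redirect_action({"type": "redirect", "resource": "blank-js"})

# An attempt to supply per-user content is refused.
try:
    validate_redirect_action(
        {"type": "redirect", "content": "window.uniqueUser = 2314321"}
    )
except ValueError as error:
    print("rejected:", error)
```

Because every user of a given engine version gets byte-identical replacements, the replacement itself cannot encode a user identifier.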

Yes, this makes it slightly more costly in terms of adding new files to the set, especially given you're blocked on the next release of the framework rather than being able to modify it within the extension, but hopefully adding a new file would be of comparable difficulty to modifying your YAML file and something other WebKit contributors (yourself or others!) could write patches for.

I guess there'd be some concern about the set of files becoming massive, but it doesn't look like the current set in that repo is very large.
Comment 6 Andrey Meshkov 2021-05-20 08:19:08 PDT
(In reply to Sam Sneddon [:gsnedders] from comment #5)

> Right; realistically if WebKit owns the set of possible replacement files
> then it doesn't matter if they are "empty" or not, but the real desire is to
> keep them static to avoid a malicious content blocker from being able to
> load different replacement resources per-user and therefore add a trivial
> fingerprinting mechanism (e.g., imagine generating different
> window.uniqueUser = 2314321 JS files as replacements).

I completely agree. Also, that's exactly how it works in the ad blockers I mentioned.

Each of them maintains a list of static replacements which then can be used in a declarative manner.

For instance, `||example.org^$redirect=noopjs` is a rule that basically says "Replace example.org responses with the specified replacement resource called noopjs".

> I guess there'd be some concern about the set of files becoming massive, but it doesn't look like the current set in that repo is very large.

These "empty" static resources do indeed change rarely, and I doubt there'll be any massive change in the future.

There're also some more complicated replacement resources though (for instance, a stub for Google Analytics). Here's the full list of replacement resources we use at AdGuard: https://github.com/AdguardTeam/Scriptlets/blob/master/dist/redirects.yml
Comment 7 Andrey Meshkov 2021-05-20 08:31:24 PDT
(In reply to Sam Sneddon [:gsnedders] from comment #5)
> and something other WebKit contributors (yourself or others!) could write patches for.

That'd be really great. We are ready to contribute, send pull requests, and help to keep it up-to-date.
Comment 8 Michael Catanzaro 2021-08-10 15:57:47 PDT
(In reply to Andrey Meshkov from comment #7)
> That'd be really great. We are ready to contribute, send pull requests, and
> help to keep it up-to-date.

Sounds like Apple is OK with this feature? At least it seems pretty reasonable for content blockers to be able to inject predefined "empty" responses instead of just blocking content outright.

Sam, if you could confirm that we've reached "patches welcome" state here, then that would allow potential contributors (Andrey?) to feel more comfortable investing time in trying to implement it. Or we could ask on https://lists.webkit.org/pipermail/webkit-dev/.
Comment 9 Sam Sneddon [:gsnedders] 2021-08-11 05:11:50 PDT
(In reply to Michael Catanzaro from comment #8)
> (In reply to Andrey Meshkov from comment #7)
> > That'd be really great. We are ready to contribute, send pull requests, and
> > help to keep it up-to-date.
> 
> Sounds like Apple is OK with this feature? At least it seems pretty
> reasonable for content blockers to be able to inject predefined "empty"
> responses instead of just blocking content outright.
> 
> Sam, if you could confirm that we've reached "patches welcome" state here,
> then that would allow potential contributors (Andrey?) to feel more
> comfortable investing time in trying to implement it. Or we could ask on
> https://lists.webkit.org/pipermail/webkit-dev/.

My understanding is that there's no objection from the Apple WebKit team to this. We do want to be able to vet the substitution files, as I wrote in comment 5, hence we do want some control over them (e.g. by having them as part of WebKit rather than the content blocker). With that restriction, I'm pretty sure patches are welcome!
Comment 10 Alex Christensen 2021-08-11 09:24:09 PDT
I don't think there's objection to a small number of straightforward replacements.  It might be a little tricky to implement with its integration in the loader and the content blocker engine, but it can be done.  The simple resources in your first comment seem ok to me.
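
For readers unfamiliar with where such a feature would sit, the following is a deliberately simplified sketch of the decision a loader would make for each request. It is not WebKit code: the rule list, the resource table, the "first matching rule wins" policy, and the `decide` function are all assumptions chosen to keep the example short.

```python
import re
from typing import Optional, Tuple

# Hypothetical built-in replacements: name -> (MIME type, body).
RESOURCES = {"blank-js": ("text/javascript", b"")}

# Hypothetical compiled rule list; "redirect" is the proposed action type.
RULES = [
    {
        "trigger": {"url-filter": r"^https?://example\.org/script\.js"},
        "action": {"type": "redirect", "resource": "blank-js"},
    },
    {
        "trigger": {"url-filter": r"^https?://ads\.example\.org/"},
        "action": {"type": "block"},
    },
]


def decide(url: str) -> Tuple[str, Optional[Tuple[str, bytes]]]:
    """Return ("allow" | "block" | "redirect", replacement or None)."""
    for rule in RULES:
        if not re.search(rule["trigger"]["url-filter"], url):
            continue
        action = rule["action"]
        if action["type"] == "redirect":
            # Answer locally with the static body instead of hitting the network.
            return "redirect", RESOURCES[action["resource"]]
        return action["type"], None
    # No rule matched: the request proceeds normally.
    return "allow", None


print(decide("https://example.org/script.js"))
print(decide("https://ads.example.org/banner.png"))
print(decide("https://example.com/"))
```

The point of the third outcome is that the page sees a successful load with an empty body, rather than a failed load as with plain blocking.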