Bug 205719 - Raise WKContentRuleList limit to 300k rules
Summary: Raise WKContentRuleList limit to 300k rules
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Misc.
Version: Safari 13
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-01-03 05:38 PST by Krzysztof Jan Modras [:chrmod]
Modified: 2021-05-28 07:41 PDT
CC List: 6 users

See Also:


Attachments

Description Krzysztof Jan Modras [:chrmod] 2020-01-03 05:38:58 PST
The 50k rule limit, as a technical limitation, is mostly an inconvenience to developers. 50k is simply not enough to provide sufficient coverage, so in most cases developers simply add multiple lists of fewer than 50k rules each. AdGuard, for instance, uses 6 Safari extensions, and even this is not enough to include all block lists.
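For illustration, the workaround of adding multiple lists could be sketched as below. This is a minimal sketch, not AdGuard's actual code: `split_rules`, `MAX_RULES_PER_LIST`, and the generated `ads{i}.example.com` rules are hypothetical names and data introduced here, and the 50k cap is simply the limit described in this report.

```python
from typing import Iterator

# Assumed per-list cap, taken from the 50k limit described in this report.
MAX_RULES_PER_LIST = 50_000


def split_rules(rules: list[dict], limit: int = MAX_RULES_PER_LIST) -> Iterator[list[dict]]:
    """Yield consecutive slices of `rules`, each holding at most `limit` rules."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    for start in range(0, len(rules), limit):
        yield rules[start:start + limit]


# Hypothetical input: 120,000 simple block rules in content-blocker JSON form.
rules = [
    {"trigger": {"url-filter": f"ads{i}\\.example\\.com"}, "action": {"type": "block"}}
    for i in range(120_000)
]

chunks = list(split_rules(rules))
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```

Each chunk would then have to be compiled and shipped as its own rule list. A naive positional split like this takes no account of rules that depend on earlier rules in the same list, so a real splitter would likely need to keep such rules together.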
Comment 1 Radar WebKit Bug Importer 2020-01-03 20:22:33 PST
<rdar://problem/58312879>
Comment 2 Maciej Stachowiak 2020-02-19 02:44:18 PST
The factors that affect the limit are compile speed for a rule list, memory use for a compiled rule list, and memory use while compiling (if too extreme; we don't want the compile step to result in a memory use kill).

And I guess match speed, but that doesn't impose as strict a constraint.

We could test on more recent hardware to find other reasonable values.
Comment 3 Maciej Stachowiak 2020-02-19 02:45:19 PST
Assuming there is a limit, what limit do you think would be reasonable?
Comment 4 Maciej Stachowiak 2020-02-19 02:45:47 PST
<rdar://problem/36115489>
Comment 5 Andrey Meshkov 2020-02-19 03:17:57 PST
Well, compilation speed and memory usage are indeed an issue; they are quite problematic even with the current limit.

We may try looking into it, but with the current prefix-tree implementation I don't think this would be easy to solve.

Would you consider an alternative implementation of the `trigger.url-filter` syntax (maybe with a different name)? Instead of regular expressions, we could use a different syntax, the one that's supported by almost every content blocker: https://kb.adguard.com/en/general/how-to-create-your-own-ad-filters#examples

In Manifest V3 Google uses pretty much the same syntax.
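To make the difference between the two syntaxes concrete, here is a rough sketch of how one common pattern from that filter syntax, `||example.com^`, might be translated into a `url-filter` regular expression today. Everything in it is an assumption for illustration: `to_url_filter` and `to_rule` are hypothetical helpers, only the `||domain^` form is handled, the separator `^` is approximated by `[:/]`, and the resulting regex is checked here with Python's `re` module only, not with WebKit's rule compiler.

```python
import re


def to_url_filter(pattern: str) -> str:
    """Translate a `||domain^` pattern into a url-filter style regex (simplified)."""
    if not (pattern.startswith("||") and pattern.endswith("^")):
        raise ValueError("only ||domain^ patterns are handled in this sketch")
    domain = pattern[2:-1]
    # Escape the dots so they match literally rather than any character.
    escaped = domain.replace(".", "\\.")
    # Scheme, optional subdomains, the domain itself, then a separator.
    return "^[^:]+://+([^:/]+\\.)?" + escaped + "[:/]"


def to_rule(pattern: str) -> dict:
    """Wrap the translated regex in a content-blocker style block rule."""
    return {"trigger": {"url-filter": to_url_filter(pattern)}, "action": {"type": "block"}}


rule = to_rule("||example.com^")
url_filter = rule["trigger"]["url-filter"]
print(url_filter)  # ^[^:]+://+([^:/]+\.)?example\.com[:/]

# Sanity check with Python's regex engine (an approximation of real matching).
print(bool(re.search(url_filter, "https://ads.example.com/banner.js")))  # True
print(bool(re.search(url_filter, "https://notexample.com/")))  # False
```

The short filter pattern expands into a noticeably longer regular expression, which is the kind of overhead a native implementation of the filter syntax would be meant to avoid.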

We have a C++ implementation that we can try adapting for WebKit if we agree on details.

> Assuming there is a limit, what limit do you think would be reasonable?

300k
Comment 6 Maciej Stachowiak 2020-02-19 21:27:08 PST
A request to support an alternate syntax probably needs to be a separate bug. The alternate syntax does not look like it would be easier to compile or evaluate; among other things, it includes regexes as a subset. So it's probably a separate topic.
Comment 7 Maciej Stachowiak 2020-02-19 21:29:02 PST
(More efficient compilation and/or evaluation is welcome, of course, even if it uses different algorithms.)
Comment 8 Andrey Meshkov 2020-02-20 00:33:29 PST
(In reply to Maciej Stachowiak from comment #6)
> A request to support an alternate syntax probably needs to be a separate
> bug. The alternate syntax does not look like it would be easier to compile
> or evaluate; among other things, it includes regexes as a subset. So it's
> probably a separate topic.

Got it, I'll file a feature request and try to explain all the pros and cons.

We'll need some time to do that as I'd like to prepare a dirty patch with the proposed syntax to demonstrate the difference.
Comment 9 Maciej Stachowiak 2021-05-19 14:27:40 PDT
The current shipping limit has been increased from 50k to 150k. Not quite 300k but a lot more than it used to be.
Comment 10 Andrey Meshkov 2021-05-19 14:38:37 PDT
(In reply to Maciej Stachowiak from comment #9)
> The current shipping limit has been increased from 50k to 150k. Not quite
> 300k but a lot more than it used to be.

Awesome news, thank you!