<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>205719</bug_id>
          
          <creation_ts>2020-01-03 05:38:58 -0800</creation_ts>
          <short_desc>Raise WKContentRuleList limit to 300k rules</short_desc>
          <delta_ts>2021-05-28 07:41:11 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>WebKit Misc.</component>
          <version>Safari 13</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          <see_also>https://bugs.webkit.org/show_bug.cgi?id=219626</see_also>
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>InRadar</keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Krzysztof Jan Modras [:chrmod]">krzysztof.modras</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>achristensen</cc>
    
    <cc>am</cc>
    
    <cc>krzysztof.modras</cc>
    
    <cc>mcatanzaro</cc>
    
    <cc>mjs</cc>
    
    <cc>webkit-bug-importer</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>1602395</commentid>
    <comment_count>0</comment_count>
    <who name="Krzysztof Jan Modras [:chrmod]">krzysztof.modras</who>
    <bug_when>2020-01-03 05:38:58 -0800</bug_when>
    <thetext>The 50k rule limit, as a technical limitation, is mostly an inconvenience to developers. 50k is simply not enough to provide sufficient coverage, so in most cases developers simply add multiple lists of fewer than 50k rules each. AdGuard, for instance, uses 6 Safari extensions, and even this is not enough to include all block lists.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1602721</commentid>
    <comment_count>1</comment_count>
    <who name="Radar WebKit Bug Importer">webkit-bug-importer</who>
    <bug_when>2020-01-03 20:22:33 -0800</bug_when>
    <thetext>&lt;rdar://problem/58312879&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1620601</commentid>
    <comment_count>2</comment_count>
    <who name="Maciej Stachowiak">mjs</who>
    <bug_when>2020-02-19 02:44:18 -0800</bug_when>
    <thetext>The factors that affect the limit are compile speed for a rule list, memory use for a compiled rule list, and memory use while compiling (if too extreme; we don&apos;t want the compile step to result in a memory use kill).

And I guess match speed, but that doesn&apos;t impose as strict a constraint.

We could test on more recent hardware to find other reasonable values.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1620602</commentid>
    <comment_count>3</comment_count>
    <who name="Maciej Stachowiak">mjs</who>
    <bug_when>2020-02-19 02:45:19 -0800</bug_when>
    <thetext>Assuming there is a limit, what limit do you think would be reasonable?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1620603</commentid>
    <comment_count>4</comment_count>
    <who name="Maciej Stachowiak">mjs</who>
    <bug_when>2020-02-19 02:45:47 -0800</bug_when>
    <thetext>&lt;rdar://problem/36115489&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1620614</commentid>
    <comment_count>5</comment_count>
    <who name="Andrey Meshkov">am</who>
    <bug_when>2020-02-19 03:17:57 -0800</bug_when>
    <thetext>Well, compilation speed and memory usage are indeed an issue; they are quite problematic even with the current limit.

We may try looking into it, but with the current prefix-tree implementation I don&apos;t think this would be easy to solve.

Would you consider an alternative implementation of the `trigger.url-filter` syntax (maybe with a different name)? Instead of regular expressions, we could use a different syntax, the one that&apos;s supported by almost every content blocker: https://kb.adguard.com/en/general/how-to-create-your-own-ad-filters#examples

In Manifest V3 Google uses pretty much the same syntax.

We have a C++ implementation that we can try adapting for WebKit if we agree on details.

&gt; Assuming there is a limit, what limit do you think would be reasonable?

300k</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1621071</commentid>
    <comment_count>6</comment_count>
    <who name="Maciej Stachowiak">mjs</who>
    <bug_when>2020-02-19 21:27:08 -0800</bug_when>
    <thetext>A request to support an alternate syntax probably needs to be a separate bug. The alternate syntax does not look like it would be easier to compile or evaluate; among other things, it includes regexes as a subset. So it&apos;s probably a separate topic.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1621072</commentid>
    <comment_count>7</comment_count>
    <who name="Maciej Stachowiak">mjs</who>
    <bug_when>2020-02-19 21:29:02 -0800</bug_when>
    <thetext>(More efficient compilation and/or evaluation is welcome, of course, even if it uses different algorithms.)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1621129</commentid>
    <comment_count>8</comment_count>
    <who name="Andrey Meshkov">am</who>
    <bug_when>2020-02-20 00:33:29 -0800</bug_when>
    <thetext>(In reply to Maciej Stachowiak from comment #6)
&gt; Request to support an alternate syntax probably needs to be a separate bug.
&gt; The alternate syntax does not look like it would be easier to compile or
&gt; evaluate, among other things it includes regexes as a subset. So it&apos;s
&gt; probably a separate topic.

Got it, I&apos;ll file a feature request and try to explain all the pros and cons.

We&apos;ll need some time to do that as I&apos;d like to prepare a dirty patch with the proposed syntax to demonstrate the difference.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1761709</commentid>
    <comment_count>9</comment_count>
    <who name="Maciej Stachowiak">mjs</who>
    <bug_when>2021-05-19 14:27:40 -0700</bug_when>
    <thetext>The current shipping limit has been increased from 50k to 150k. Not quite 300k but a lot more than it used to be.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1761713</commentid>
    <comment_count>10</comment_count>
    <who name="Andrey Meshkov">am</who>
    <bug_when>2021-05-19 14:38:37 -0700</bug_when>
    <thetext>(In reply to Maciej Stachowiak from comment #9)
&gt; The current shipping limit has been increased from 50k to 150k. Not quite
&gt; 300k but a lot more than it used to be.

Awesome news, thank you!</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>