<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>210576</bug_id>
          
          <creation_ts>2020-04-15 15:58:10 -0700</creation_ts>
          <short_desc>Certain regexes with range-quantified groups fail to match</short_desc>
          <delta_ts>2020-04-19 16:47:16 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>JavaScriptCore</component>
          <version>WebKit Nightly Build</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          <see_also>https://bugs.webkit.org/show_bug.cgi?id=188407</see_also>
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Ross Kirsling">ross.kirsling</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>ashvayka</cc>
          <cc>hi</cc>
          <cc>msaboff</cc>

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>1642019</commentid>
    <comment_count>0</comment_count>
    <who name="Ross Kirsling">ross.kirsling</who>
    <bug_when>2020-04-15 15:58:10 -0700</bug_when>
    <thetext>I&apos;m not even sure what to title this, but I extracted it from test262/harness/testIntl.js.

The following is false for JSC but true for all other engines:
```
/(?:\w+-)+((\w){5,8})-\1/.test(&apos;de-gregory-gregory&apos;)
```

This was as far as I could manage to shrink the regex.
(The backreference can be inlined and the nested group can be made a non-capturing group, but everything else seems needed?)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1642021</commentid>
    <comment_count>1</comment_count>
    <who name="Devin Rousso">hi</who>
    <bug_when>2020-04-15 16:04:01 -0700</bug_when>
    <thetext>This works tho 🤔

```
/(?:\w+-)+(\w{5,8})-\1/.test(&apos;de-gregory-gregory&apos;)
```</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1642023</commentid>
    <comment_count>2</comment_count>
    <who name="Ross Kirsling">ross.kirsling</who>
    <bug_when>2020-04-15 16:05:14 -0700</bug_when>
    <thetext>(In reply to Devin Rousso from comment #1)
&gt; This works tho 🤔
&gt; 
&gt; ```
&gt; /(?:\w+-)+(\w{5,8})-\1/.test(&apos;de-gregory-gregory&apos;)
&gt; ```

Hence the title. :P</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1642026</commentid>
    <comment_count>3</comment_count>
    <who name="Alexey Shvayka">ashvayka</who>
    <bug_when>2020-04-15 16:06:36 -0700</bug_when>
    <thetext>(In reply to Ross Kirsling from comment #0)
&gt; The following is false for JSC but true for all other engines:

Same result in Safari 12.1. I wonder if it&apos;s the same issue as in https://bugs.webkit.org/show_bug.cgi?id=188407?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1642029</commentid>
    <comment_count>4</comment_count>
    <who name="Devin Rousso">hi</who>
    <bug_when>2020-04-15 16:07:52 -0700</bug_when>
    <thetext>(In reply to Ross Kirsling from comment #2)
&gt; (In reply to Devin Rousso from comment #1)
&gt; &gt; This works tho 🤔
&gt; &gt; 
&gt; &gt; ```
&gt; &gt; /(?:\w+-)+(\w{5,8})-\1/.test(&apos;de-gregory-gregory&apos;)
&gt; &gt; ```
&gt; 
&gt; Hence the title. :P

🤦‍♂️</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1643261</commentid>
    <comment_count>5</comment_count>
    <who name="Ross Kirsling">ross.kirsling</who>
    <bug_when>2020-04-19 16:42:16 -0700</bug_when>
    <thetext>Alexey noticed that my shrunken regex in comment 0 succeeds with a `u` flag, but the original regex does not. If we unshrink just a bit, this fails:

  /(?:\w+-)+((\w){5,8})-((\w){5,8}-)*\1/u.test(&apos;de-gregory-gregory&apos;)

...which may suggest multiple issues at play.
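
For convenience, the patterns mentioned so far can be checked side by side with a small script. This is only a sketch: the `cases` table and the `runCases` helper are illustrative names made up here, not part of JSC or test262. Going by the reports above, a spec-compliant engine should report true for every entry, while JSC is reported to return false for the comment 0 pattern and for the `u`-flag pattern.

```javascript
// Illustrative only: `cases` and `runCases` are made-up names for this sketch.
const input = 'de-gregory-gregory';

const cases = [
  // comment 0: capture group wrapping a range-quantified group (reported false in JSC)
  { label: 'comment 0', re: /(?:\w+-)+((\w){5,8})-\1/ },
  // comment 1: same shape, but the quantifier applies to \w directly (reported to work)
  { label: 'comment 1', re: /(?:\w+-)+(\w{5,8})-\1/ },
  // comment 5: slightly less reduced pattern with the `u` flag (reported false in JSC)
  { label: 'comment 5 (u flag)', re: /(?:\w+-)+((\w){5,8})-((\w){5,8}-)*\1/u },
];

// Run every pattern against the same string and collect the boolean results by label.
function runCases(list, str) {
  const results = {};
  for (const c of list) {
    results[c.label] = c.re.test(str);
  }
  return results;
}

const results = runCases(cases, input);
console.log(results);
```

Any entry that prints false in a given engine is a candidate for this bug.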


I also kept trying ways to further shrink/restrict the comment 0 regex and noticed that the following fails (with or without `u`):

  /^(?:aa~)+(?:a){2,3}~aa?a?a?$/.test(&apos;aa~aa~aaaa&apos;)

...so nested groups may not be necessary, but that (?:a){2,3} is really important. It needs to be a quantified group with a lower bound greater than 1 and an upper bound greater than the lower bound. (Presumably the bound restrictions are needed so that it doesn&apos;t get automatically simplified?)</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>