<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>122891</bug_id>
          
          <creation_ts>2013-10-16 05:27:03 -0700</creation_ts>
          <short_desc>Yarr does not compile Peacekeeper email validation regex</short_desc>
          <delta_ts>2013-10-21 05:16:41 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>JavaScriptCore</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>UNCONFIRMED</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>0</everconfirmed>
          <reporter name="Jan de Mooij">jdemooij</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>barraclough</cc>
    
    <cc>benjamin</cc>
    
    <cc>msaboff</cc>
    
    <cc>ossy</cc>
    
    <cc>pvarga</cc>
    
    <cc>sam</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>940356</commentid>
    <comment_count>0</comment_count>
      <attachid>214358</attachid>
    <who name="Jan de Mooij">jdemooij</who>
    <bug_when>2013-10-16 05:27:03 -0700</bug_when>
    <thetext>Created attachment 214358
Shell testcase

I&apos;m attaching a simple shell version of Peacekeeper&apos;s stringValidateForm test.

The good news is that Yarr is able to compile 4 of the 5 regular expressions. The bad news is that the remaining regex is interpreted, and this slows us down. If I run the test in Safari it takes 1200 ms; with the email regex commented out it&apos;s 460 ms, so the slow regex is where we spend most of our time on this test.

The email validation part is this:

    input = &quot;jaakko.alajoki@futuremark.com&quot;;
    result = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/.test(input);</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>940357</commentid>
    <comment_count>1</comment_count>
    <who name="Jan de Mooij">jdemooij</who>
    <bug_when>2013-10-16 05:29:47 -0700</bug_when>
    <thetext>I think Yarr JIT compilation is aborting here:

    // We can currently only compile quantity 1 subpatterns that are
    // not copies. We generate a copy in the case of a range quantifier,
    // e.g. /(?:x){3,9}/, or /(?:x)+/ (These are effectively expanded to
    // /(?:x){3,3}(?:x){0,6}/ and /(?:x)(?:x)*/ respectively). The problem
    // comes where the subpattern is capturing, in which case we would
    // need to restore the capture from the first subpattern upon a
    // failure in the second.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>940439</commentid>
    <comment_count>2</comment_count>
    <who name="Gavin Barraclough">barraclough</who>
    <bug_when>2013-10-16 10:52:09 -0700</bug_when>
    <thetext>Yarr JIT now compiles regular expressions twice, once in each of two modes: &apos;match-only&apos; and &apos;include-subpatterns&apos;.

Match-only is used when the regexp is first run, to scan for the match; at that point we either return a boolean result (in the case of &apos;test&apos; matches) or return a lazily populated regexp matches array object.

Include-subpatterns is used when a subpattern match is explicitly required, for example if an entry in the matches array is accessed.
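
As a rough illustration of when each kind of result is needed (a sketch only: the variable names are hypothetical, and which compile mode actually runs is inferred from the description above, not from the Yarr sources), using the email regex from this bug:

```javascript
// Hypothetical illustration; variable names are not from the bug or from Yarr.
const emailRe = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;
const input = "jaakko.alajoki@futuremark.com";

// A test() call only needs to know whether the pattern matches,
// so no subpattern data is required to produce this boolean.
const ok = emailRe.test(input);

// exec() returns a matches array; reading a numbered entry is the point
// at which the captured subpatterns are explicitly required.
const m = emailRe.exec(input);
const localTail = m[1];   // last iteration of the first quantified group
const tld = m[3];         // last iteration of the final quantified group

console.log(ok, localTail, tld);
```

Only the reads of m[1] and m[3] depend on subpattern data; the test() call needs just the boolean.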

The restriction referenced in the comment is that the JIT won&apos;t backtrack subpattern matches, so we can&apos;t compile quantified captures unless we can guarantee they won&apos;t backtrack. But this restriction doesn&apos;t apply to match-only compilations, which aren&apos;t recording the matched subpatterns anyway.
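
A small sketch of the capture-restore problem, using a hypothetical pattern that is not taken from the Yarr sources or from Peacekeeper. The quantified capture first takes a second iteration, then has to give it back, and the reported capture must revert to the value from the first iteration:

```javascript
// Hypothetical pattern chosen only to force backtracking out of a
// second iteration of a capturing group.
const re = /(\w){1,2}x/;

// On "ax": iteration 1 captures "a"; the greedy iteration 2 captures "x";
// the literal x then fails at end of input, so the engine backtracks to a
// single iteration and the capture has to read "a" again.
const hit = re.exec("ax");
const captured = hit[1];

// The boolean result does not depend on what the capture holds.
const matched = re.test("ax");

console.log(hit[0], captured, matched);
```

An engine that did not restore the capture on backtracking would report the value from the abandoned second iteration here, while the boolean from test() would be unaffected.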

We can probably say that if (compileMode == MatchOnly) then we can compile any parens. We&apos;d have to look a little more closely to check whether any of the restrictions still need to apply.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>940595</commentid>
    <comment_count>3</comment_count>
    <who name="Gavin Barraclough">barraclough</who>
    <bug_when>2013-10-16 17:13:45 -0700</bug_when>
    <thetext>Actually, no, I&apos;m way oversimplifying here. :-( Each iteration would also need separate backtracking state internally, so this does effectively demand stack allocation.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0"
          >
            <attachid>214358</attachid>
            <date>2013-10-16 05:27:03 -0700</date>
            <delta_ts>2013-10-16 05:27:03 -0700</delta_ts>
            <desc>Shell testcase</desc>
            <filename>test.js</filename>
            <type>application/x-javascript</type>
            <size>664</size>
            <attacher name="Jan de Mooij">jdemooij</attacher>
            
              <data encoding="base64">ZnVuY3Rpb24gcnVuKCkgewogICAgLy8gcHcgc3RyZW5ndGgKICAgIGlucHV0ID0gInBhc3N3b3Jk
MSI7CiAgICByZXN1bHQgPSAvXig/PS57OCx9KSg/PS4qW0EtWl0pKD89LipbYS16XSkoPz0uKlsw
LTldKSg/PS4qXFxXKS4qJC9nLnRlc3QoaW5wdXQpOwogICAgcmVzdWx0ID0gL14oPz0uezcsfSko
KCg/PS4qW0EtWl0pKD89LipbYS16XSkpfCgoPz0uKltBLVpdKSg/PS4qWzAtOV0pKXwoKD89Lipb
YS16XSkoPz0uKlswLTldKSkpLiokL2cudGVzdChpbnB1dCk7CiAgICByZXN1bHQgPSAvKD89Lns2
LH0pLiovZy50ZXN0KGlucHV0KTsKCiAgICAvLyBlbWFpbAogICAgaW5wdXQgPSAiamFha2tvLmFs
YWpva2lAZnV0dXJlbWFyay5jb20iOwogICAgcmVzdWx0ID0gL15cdysoW1wuLV0/XHcrKSpAXHcr
KFtcLi1dP1x3KykqKFwuXHd7MiwzfSkrJC8udGVzdChpbnB1dCk7CgogICAgLy8gcGhvbmUKICAg
IGlucHV0ID0gIjA1MCAzNDIgMTI1MiI7CiAgICByZXN1bHQgPSAvXlwoWzEtOV1cZHsyfVwpXHM/
XGR7M31cLVxkezR9JC8udGVzdChpbnB1dCk7Cn0KZnVuY3Rpb24gdGVzdCgpIHsKICAgIHZhciB0
ID0gbmV3IERhdGU7CiAgICBmb3IgKHZhciBpID0gMDsgaSA8IDEwMDAwMDA7IGkrKykKCXJ1bigp
OwogICAgcHJpbnQobmV3IERhdGUgLSB0KTsKfQp0ZXN0KCk7Cg==
</data>

          </attachment>
      

    </bug>

</bugzilla>