Bug 174044 - RegExps anchored with .* with the g flag can return wrong match start for strings with multiple matches
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Michael Saboff
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2017-06-30 14:48 PDT by Michael Saboff
Modified: 2017-10-11 10:19 PDT

Attachments
Patch (11.38 KB, patch)
2017-06-30 15:23 PDT, Michael Saboff
oliver: review+

Description Michael Saboff 2017-06-30 14:48:13 PDT
Consider the string:
    s = "\na\na\na\n";
along with the RegExp:
    r = new RegExp(".*\\s.*", "g");

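The result of s.match(r) should be an array with 4 entries: "\na", "\na", "\na" and "\n".
Instead we get "\na", "a\na", "a\na" and "a\n". Every match after the first starts one character too early, on the "a" that the previous match already consumed.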
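
A minimal script for checking this in a JavaScript shell or console. The strings and the RegExp are the ones from the description; the names `matches` and `expected` are only for illustration. On an engine without the bug it should report PASS.

```javascript
// Input and pattern taken from the bug description.
const s = "\na\na\na\n";
const r = new RegExp(".*\\s.*", "g");

// With the g flag, String.prototype.match returns an array of all matches.
const matches = s.match(r);

// Expected: each match begins where the previous one ended.
const expected = ["\na", "\na", "\na", "\n"];

console.log(JSON.stringify(matches));
console.log(JSON.stringify(matches) === JSON.stringify(expected) ? "PASS" : "FAIL");
```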
Comment 1 Michael Saboff 2017-06-30 15:11:02 PDT
<rdar://problem/33018426>
Comment 2 Michael Saboff 2017-06-30 15:23:07 PDT
Created attachment 314302
Patch
Comment 3 Oliver Hunt 2017-06-30 15:44:25 PDT
Comment on attachment 314302
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=314302&action=review

> Source/JavaScriptCore/yarr/YarrJIT.cpp:2699
> +        if (m_pattern.m_saveInitialStartValue) {
> +#ifdef HAVE_INITIAL_START_REG
> +            move(index, initialStart);
> +#else
> +            storeToFrame(index, m_pattern.m_initialStartValueFrameLocation);
> +#endif

I almost wish we could bludgeon templates into doing this for us. Almost. (I suspect it would turn into "can I implement register allocation with templates at compile time?" questions :D )
Comment 4 Michael Saboff 2017-06-30 18:17:01 PDT
Committed r219031: <http://trac.webkit.org/changeset/219031>
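
For readers without the patch at hand, here is a rough JavaScript model of the symptom. It is not WebKit code. It assumes the engine handles a ".*X.*" pattern by matching the inner term X and then widening the result to the surrounding line boundaries, and that the fix amounts to remembering the index where matching began (as the hunk quoted in Comment 3 suggests) so the start is never widened past it. The helper names `matchDotStarEnclosed` and `collectMatches` and the `clampToStart` switch are hypothetical.

```javascript
// Hypothetical model, not the YarrJIT implementation.
// "." does not match these characters, so they bound the widening.
function isLineTerminator(ch) {
    return ch === "\n" || ch === "\r" || ch === "\u2028" || ch === "\u2029";
}

// Match `inner` at or after startIndex, then widen to line boundaries.
// When clampToStart is true, the start is never moved before startIndex.
function matchDotStarEnclosed(str, inner, startIndex, clampToStart) {
    inner.lastIndex = startIndex; // inner must carry the g flag
    const m = inner.exec(str);
    if (m === null)
        return null;

    const floor = clampToStart ? startIndex : 0;
    let begin = m.index;
    while (begin > floor && !isLineTerminator(str[begin - 1]))
        begin--;

    let end = m.index + m[0].length;
    while (end < str.length && !isLineTerminator(str[end]))
        end++;

    return { begin, end, text: str.slice(begin, end) };
}

// Collect all matches the way a global match would, resuming at each match end.
function collectMatches(str, clampToStart) {
    const inner = /\s/g;
    const results = [];
    let index = 0;
    while (index <= str.length) {
        const m = matchDotStarEnclosed(str, inner, index, clampToStart);
        if (m === null)
            break;
        results.push(m.text);
        index = m.end > index ? m.end : index + 1; // guard against empty matches
    }
    return results;
}

const input = "\na\na\na\n";
console.log(JSON.stringify(collectMatches(input, false))); // unclamped start
console.log(JSON.stringify(collectMatches(input, true)));  // start clamped to resume index
```

With the string from the description, the unclamped run reproduces the reported results and the clamped run gives the expected ones. The model is a simplification: it only covers an inner term that consumes at least one character and ignores everything else the real matcher does.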