WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
176435
Add support for RegExp named capture groups
https://bugs.webkit.org/show_bug.cgi?id=176435
Summary
Add support for RegExp named capture groups
Michael Saboff
Reported
2017-09-05 18:10:10 PDT
This is to add support for the ECMAScript proposed Regular Expression named capture groups. See
https://github.com/tc39/proposal-regexp-named-groups
for details. For example, the RegExp /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/ when matched would create a match object result with properties group.year, group.month and group.day in addition to the array like capture results.
Attachments
Patch
(53.04 KB, patch)
2017-09-05 18:24 PDT
,
Michael Saboff
buildbot
: commit-queue-
Details
Formatted Diff
Diff
Archive of layout-test-results from ews107 for mac-elcapitan-wk2
(1.58 MB, application/zip)
2017-09-05 19:55 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews112 for mac-elcapitan
(2.38 MB, application/zip)
2017-09-05 19:58 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews101 for mac-elcapitan
(1.55 MB, application/zip)
2017-09-05 20:02 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews125 for ios-simulator-wk2
(1.41 MB, application/zip)
2017-09-05 20:16 PDT
,
Build Bot
no flags
Details
Updated patch addressing review comments and test failures.
(53.99 KB, patch)
2017-09-06 15:24 PDT
,
Michael Saboff
fpizlo
: review+
Details
Formatted Diff
Diff
Show Obsolete
(5)
View All
Add attachment
proposed patch, testcase, etc.
Michael Saboff
Comment 1
2017-09-05 18:10:34 PDT
<
rdar://problem/33183183
>
Michael Saboff
Comment 2
2017-09-05 18:24:37 PDT
Created
attachment 319965
[details]
Patch The style checker errors are due to the checker looking at code that didn't change or new code following existing formatting.
Build Bot
Comment 3
2017-09-05 18:26:34 PDT
Attachment 319965
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp:47: Missing space inside { }. [whitespace/braces] [5] ERROR: Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp:51: Missing space inside { }. [whitespace/braces] [5] ERROR: Source/JavaScriptCore/yarr/YarrPattern.cpp:528: Should have only a single space after a punctuation in a comment. [whitespace/comments] [5] ERROR: Source/JavaScriptCore/yarr/YarrPattern.cpp:957: One space before end of line comments [whitespace/comments] [5] ERROR: Source/JavaScriptCore/yarr/YarrPattern.cpp:958: One space before end of line comments [whitespace/comments] [5] Total errors found: 5 in 21 files If any of these errors are false positives, please file a bug against check-webkit-style.
Build Bot
Comment 4
2017-09-05 19:14:28 PDT
Comment on
attachment 319965
[details]
Patch
Attachment 319965
[details]
did not pass jsc-ews (mac): Output:
http://webkit-queues.webkit.org/results/4458091
New failing tests: jsc-layout-tests.yaml/js/script-tests/string_replace_function.js.layout-ftl-no-cjit mozilla-tests.yaml/ecma_2/RegExp/hex-001.js.mozilla-ftl-eager-no-cjit-validate-phases mozilla-tests.yaml/ecma_2/RegExp/unicode-001.js.mozilla-ftl-eager-no-cjit-validate-phases jsc-layout-tests.yaml/js/script-tests/regexp-in-and-foreach-handling.js.layout-ftl-eager-no-cjit jsc-layout-tests.yaml/js/script-tests/string_replace_function.js.layout-no-llint mozilla-tests.yaml/ecma_2/RegExp/hex-001.js.mozilla-dfg-eager-no-cjit-validate-phases jsc-layout-tests.yaml/js/script-tests/string_replace_function.js.layout-ftl-eager-no-cjit jsc-layout-tests.yaml/js/script-tests/regexp-in-and-foreach-handling.js.layout-dfg-eager-no-cjit jsc-layout-tests.yaml/js/script-tests/string_replace_function.js.layout-dfg-eager-no-cjit microbenchmarks/emscripten-cube2hash.js.ftl-eager-no-cjit-b3o1 mozilla-tests.yaml/ecma_2/RegExp/unicode-001.js.mozilla-dfg-eager-no-cjit-validate-phases jsc-layout-tests.yaml/js/script-tests/string_replace_function.js.layout stress/regexp-replace-proxy.js.ftl-eager-no-cjit mozilla-tests.yaml/ecma_2/RegExp/exec-002.js.mozilla-ftl-eager-no-cjit-validate-phases jsc-layout-tests.yaml/js/script-tests/string_replace_function.js.layout-no-cjit jsc-layout-tests.yaml/js/script-tests/string_replace_function.js.layout-no-ftl
Build Bot
Comment 5
2017-09-05 19:55:26 PDT
Comment on
attachment 319965
[details]
Patch
Attachment 319965
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/4458230
New failing tests: js/string_replace_function.html js/dom/string-replace-3.html
Build Bot
Comment 6
2017-09-05 19:55:27 PDT
Created
attachment 319971
[details]
Archive of layout-test-results from ews107 for mac-elcapitan-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-elcapitan-wk2 Platform: Mac OS X 10.11.6
Build Bot
Comment 7
2017-09-05 19:58:20 PDT
Comment on
attachment 319965
[details]
Patch
Attachment 319965
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/4458210
New failing tests: js/string_replace_function.html js/dom/string-replace-3.html
Build Bot
Comment 8
2017-09-05 19:58:21 PDT
Created
attachment 319972
[details]
Archive of layout-test-results from ews112 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews112 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Build Bot
Comment 9
2017-09-05 20:02:21 PDT
Comment on
attachment 319965
[details]
Patch
Attachment 319965
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.webkit.org/results/4458273
New failing tests: js/string_replace_function.html js/dom/string-replace-3.html
Build Bot
Comment 10
2017-09-05 20:02:23 PDT
Created
attachment 319973
[details]
Archive of layout-test-results from ews101 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews101 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Build Bot
Comment 11
2017-09-05 20:16:38 PDT
Comment on
attachment 319965
[details]
Patch
Attachment 319965
[details]
did not pass ios-sim-ews (ios-simulator-wk2): Output:
http://webkit-queues.webkit.org/results/4458260
New failing tests: js/string_replace_function.html js/dom/string-replace-3.html
Build Bot
Comment 12
2017-09-05 20:16:39 PDT
Created
attachment 319974
[details]
Archive of layout-test-results from ews125 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews125 Port: ios-simulator-wk2 Platform: Mac OS X 10.12.5
JF Bastien
Comment 13
2017-09-05 21:26:16 PDT
Comment on
attachment 319965
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=319965&action=review
Not sure I'm the best person to review this, but here's a first pass.
> Source/JavaScriptCore/ChangeLog:12 > + cases of malformed back references in String.prototype.replace() as I beleive that it
"believe"
> Source/JavaScriptCore/ChangeLog:18 > + groups.
Bug to do so?
> Source/JavaScriptCore/dfg/DFGStrengthReductionPhase.cpp:517 > + dataLog("Giving up because of named capture groups.\n");
Link to bug. Does it just print out the message when this is reached, but otherwise bails out of the optimization?
> Source/JavaScriptCore/runtime/RegExp.cpp:239 > + m_namedsGroupToParenIndex.swap(pattern.m_namedsGroupToParenIndex);
"nameds" ?
> Source/JavaScriptCore/runtime/RegExp.h:89 > + if (!i || m_captureGroupNames.size() <= i)
Why isn't index 0 valid?
> Source/JavaScriptCore/runtime/RegExp.h:99 > + return m_namedsGroupToParenIndex.get(groupName);
Can't you do *it instead to avoid re-searching?
> Source/JavaScriptCore/runtime/RegExpMatchesArray.cpp:74 > +enum ShouldCreateGroups { DontCreateGroups, CreateGroups };
enum class
> Source/JavaScriptCore/runtime/RegExpMatchesArray.cpp:76 > +static Structure* createStructureImpl(VM& vm, JSGlobalObject* globalObject, IndexingType indexingType, ShouldCreateGroups shouldCreateGroups= DontCreateGroups)
Missing space before equal.
> Source/JavaScriptCore/runtime/RegExpMatchesArray.cpp:87 > + }
Maybe it's weird here, but I like having a switch that covers both cases of the enum, without a default. That way you get a warning if you forget a case, and as a reader I know without looking up the enum that you've covered all cases. Even here, if DontCreateGroups is just "break;" I feel like it's kinda nice.
> Source/JavaScriptCore/runtime/RegExpMatchesArray.h:106 > + }
Does this need to be duplicated for the good / bad times sides of the branch? Seems like you don't need the `hasNamedCaptures` bool, and `JSObject* groups` could instead be that "bool" (by being non nullptr if it has named captures). Then it's not duplicating the code too?
> Source/JavaScriptCore/runtime/RegExpMatchesArray.h:149 > + if (!groupName.isEmpty())
Empty group name is just a regular capture?
> Source/JavaScriptCore/runtime/StringPrototype.cpp:213 > + // Named back reference
Don't you need a check for `replacement.length() > i + 2` somewhere in here?
> Source/JavaScriptCore/yarr/YarrParser.h:867 > + int tryConsumeUnicodeEscape()
Is this unicode escape stuff related to named capture?
Michael Saboff
Comment 14
2017-09-06 11:33:18 PDT
Comment on
attachment 319965
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=319965&action=review
>> Source/JavaScriptCore/ChangeLog:12 >> + cases of malformed back references in String.prototype.replace() as I beleive that it > > "believe"
Fixed.
>> Source/JavaScriptCore/ChangeLog:18 >> + groups. > > Bug to do so?
I filed
https://bugs.webkit.org/show_bug.cgi?id=176464
.
>> Source/JavaScriptCore/dfg/DFGStrengthReductionPhase.cpp:517 >> + dataLog("Giving up because of named capture groups.\n"); > > Link to bug. > > Does it just print out the message when this is reached, but otherwise bails out of the optimization?
I filed
https://bugs.webkit.org/show_bug.cgi?id=176464
. Yes it logs and bails. The logging matches existing code.
>> Source/JavaScriptCore/runtime/RegExp.cpp:239 >> + m_namedsGroupToParenIndex.swap(pattern.m_namedsGroupToParenIndex); > > "nameds" ?
Changed to m_namedGroupToParenIndex.
>> Source/JavaScriptCore/runtime/RegExp.h:89 >> + if (!i || m_captureGroupNames.size() <= i) > > Why isn't index 0 valid?
Because that is the whole match and not a sub-group.
>> Source/JavaScriptCore/runtime/RegExp.h:99 >> + return m_namedsGroupToParenIndex.get(groupName); > > Can't you do *it instead to avoid re-searching?
it->value is what I want.
>> Source/JavaScriptCore/runtime/RegExpMatchesArray.cpp:74 >> +enum ShouldCreateGroups { DontCreateGroups, CreateGroups }; > > enum class
Fixed.
>> Source/JavaScriptCore/runtime/RegExpMatchesArray.cpp:76 >> +static Structure* createStructureImpl(VM& vm, JSGlobalObject* globalObject, IndexingType indexingType, ShouldCreateGroups shouldCreateGroups= DontCreateGroups) > > Missing space before equal.
Fixed.
>> Source/JavaScriptCore/runtime/RegExpMatchesArray.cpp:87 >> + } > > Maybe it's weird here, but I like having a switch that covers both cases of the enum, without a default. That way you get a warning if you forget a case, and as a reader I know without looking up the enum that you've covered all cases. Even here, if DontCreateGroups is just "break;" I feel like it's kinda nice.
I originally followed existing similar patterns, but I made the change you suggested.
>> Source/JavaScriptCore/runtime/RegExpMatchesArray.h:106 >> + } > > Does this need to be duplicated for the good / bad times sides of the branch? Seems like you don't need the `hasNamedCaptures` bool, and `JSObject* groups` could instead be that "bool" (by being non nullptr if it has named captures). Then it's not duplicating the code too?
I restructured things some and put the creation of the "groups" object inside of the setProperties lambda. That reduced some of the duplication. I kept hasNamedCaptures as it is used to select the structure for the array as well as whether or not "groups" gets created.
>> Source/JavaScriptCore/runtime/RegExpMatchesArray.h:149 >> + if (!groupName.isEmpty()) > > Empty group name is just a regular capture?
Yes. If a capture group name was not provided, then groupName will be empty.
>> Source/JavaScriptCore/runtime/StringPrototype.cpp:213 >> + // Named back reference > > Don't you need a check for `replacement.length() > i + 2` somewhere in here?
No, because it was checked up above at line 188 with "if (i + 1 == replacement.length()) break;".
>> Source/JavaScriptCore/yarr/YarrParser.h:867 >> + int tryConsumeUnicodeEscape() > > Is this unicode escape stuff related to named capture?
Yes, because the identifiers can contain unicode escapes.
Michael Saboff
Comment 15
2017-09-06 15:24:33 PDT
Created
attachment 320067
[details]
Updated patch addressing review comments and test failures.
Build Bot
Comment 16
2017-09-06 15:26:16 PDT
Attachment 320067
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp:47: Missing space inside { }. [whitespace/braces] [5] ERROR: Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp:51: Missing space inside { }. [whitespace/braces] [5] ERROR: Source/JavaScriptCore/yarr/YarrPattern.cpp:528: Should have only a single space after a punctuation in a comment. [whitespace/comments] [5] ERROR: Source/JavaScriptCore/yarr/YarrPattern.cpp:957: One space before end of line comments [whitespace/comments] [5] ERROR: Source/JavaScriptCore/yarr/YarrPattern.cpp:958: One space before end of line comments [whitespace/comments] [5] Total errors found: 5 in 21 files If any of these errors are false positives, please file a bug against check-webkit-style.
JF Bastien
Comment 17
2017-09-06 15:32:46 PDT
Comment on
attachment 320067
[details]
Updated patch addressing review comments and test failures. View in context:
https://bugs.webkit.org/attachment.cgi?id=320067&action=review
r=me though it would be good if someone else took a look too since I'm not familiar with the code.
> Source/JavaScriptCore/runtime/RegExpMatchesArray.cpp:74 > +enum class ShouldCreateGroups { DontCreateGroups, CreateGroups };
With enum class I wouldn't repeat "Group" in the enumerated names. Maybe even just ShouldCreateGroup::{No,Yes} ?
Michael Saboff
Comment 18
2017-09-06 15:40:21 PDT
(In reply to JF Bastien from
comment #17
)
> Comment on
attachment 320067
[details]
> Updated patch addressing review comments and test failures. > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=320067&action=review
> > r=me though it would be good if someone else took a look too since I'm not > familiar with the code.
I'll set review back to ? to get someone else to look.
> > Source/JavaScriptCore/runtime/RegExpMatchesArray.cpp:74 > > +enum class ShouldCreateGroups { DontCreateGroups, CreateGroups }; > > With enum class I wouldn't repeat "Group" in the enumerated names. Maybe > even just ShouldCreateGroup::{No,Yes} ?
I changed it to No / Yes.
Filip Pizlo
Comment 19
2017-09-07 12:54:52 PDT
Comment on
attachment 320067
[details]
Updated patch addressing review comments and test failures. View in context:
https://bugs.webkit.org/attachment.cgi?id=320067&action=review
> Source/JavaScriptCore/runtime/RegExpMatchesArray.h:107 > - > +
Revert
> Source/JavaScriptCore/runtime/RegExpMatchesArray.h:130 > - > +
Revert
Michael Saboff
Comment 20
2017-09-07 16:13:40 PDT
Committed
r221769
: <
http://trac.webkit.org/changeset/221769
>
Darin Adler
Comment 21
2017-09-10 14:32:11 PDT
Comment on
attachment 320067
[details]
Updated patch addressing review comments and test failures. View in context:
https://bugs.webkit.org/attachment.cgi?id=320067&action=review
> Source/JavaScriptCore/yarr/YarrParser.h:203 > + NO_RETURN_DUE_TO_ASSERT void atomNamedBackReference(String) { RELEASE_ASSERT_NOT_REACHED(); }
This is unconventional. Since String is just an object that holds a RefPtr, we normally pass "const String&" or "String&&", and very rarely pass "String".
> Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp:51 > + void atomNamedBackReference(String) {}
This is unconventional. Since String is just an object that holds a RefPtr, we normally pass "const String&" or "String&&", and very rarely pass "String".
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug