WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED CONFIGURATION CHANGED
146052
`new RegExp(/a/i, 'g')` should create a new RegExp with the original source, and provided flags
https://bugs.webkit.org/show_bug.cgi?id=146052
Summary
`new RegExp(/a/i, 'g')` should create a new RegExp with the original source, ...
Jordan Harband
Reported
2015-06-17 01:40:28 PDT
As of ES6 -
https://people.mozilla.org/~jorendorff/es6-draft.html#sec-regexp-pattern-flags
step 6.d. is the point where ES5 and below would throw. This bug is to remove the exception from the RegExp constructor (note: it remains in RegExp#compile)
Attachments
Patch
(7.90 KB, patch)
2015-06-17 18:06 PDT
,
Jordan Harband
no flags
Details
Formatted Diff
Diff
Patch
(9.04 KB, patch)
2015-06-17 22:45 PDT
,
Jordan Harband
darin
: review-
buildbot
: commit-queue-
Details
Formatted Diff
Diff
Archive of layout-test-results from ews100 for mac-mavericks
(561.36 KB, application/zip)
2015-06-17 23:18 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews104 for mac-mavericks-wk2
(616.34 KB, application/zip)
2015-06-17 23:21 PDT
,
Build Bot
no flags
Details
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Jordan Harband
Comment 1
2015-06-17 18:06:44 PDT
Created
attachment 255054
[details]
Patch
Darin Adler
Comment 2
2015-06-17 18:35:43 PDT
Comment on
attachment 255054
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=255054&action=review
Tentatively saying review- because the testing doesn’t seem to cover all the new code.
> Source/JavaScriptCore/runtime/RegExpConstructor.cpp:261 > + JSValue sourceValue = arg0.get(exec, exec->propertyNames().source);
This code change seems nothing like the bug title. The bug title says that something should no longer throw, but the code here is looking at the "source" property, something the old code didn’t do at all!
> Source/JavaScriptCore/runtime/RegExpConstructor.cpp:263 > + return 0;
Should be nullptr, not 0. In new code, and ideally in existing code as well.
> Source/JavaScriptCore/runtime/RegExpConstructor.cpp:270 > return 0;
Another case that should be nullptr.
> LayoutTests/fast/regex/script-tests/constructor.js:8 > +shouldBeTrue("re !== RegExp(re, 'i')"); > +shouldBeTrue("String(RegExp(re, 'i')) === '/abc/i'");
This doesn’t seem to be sufficient test coverage. In particular, this doesn’t test whether the existing flags overwrite all the old flags, or simply add to them. I also don’t see any coverage of the new "source" property behavior. Do we have enough tests that any change to the code makes a test fail?
Darin Adler
Comment 3
2015-06-17 18:36:19 PDT
Comment on
attachment 255054
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=255054&action=review
>> LayoutTests/fast/regex/script-tests/constructor.js:8 >> +shouldBeTrue("String(RegExp(re, 'i')) === '/abc/i'"); > > This doesn’t seem to be sufficient test coverage. In particular, this doesn’t test whether the existing flags overwrite all the old flags, or simply add to them. > > I also don’t see any coverage of the new "source" property behavior. Do we have enough tests that any change to the code makes a test fail?
In particular, if the old regular expression can be any object with a "source" property, then we should test that, and not only test with an actual RegExp object.
Darin Adler
Comment 4
2015-06-17 18:40:16 PDT
Comment on
attachment 255054
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=255054&action=review
>> Source/JavaScriptCore/runtime/RegExpConstructor.cpp:261 >> + JSValue sourceValue = arg0.get(exec, exec->propertyNames().source); > > This code change seems nothing like the bug title. The bug title says that something should no longer throw, but the code here is looking at the "source" property, something the old code didn’t do at all!
Hmm, I’m thinking that what we should really do here is extract the old string more directly, with code more like the "arg1.isUndefined()" code path, casting to RegExpObject, getting the RegExp from there and then getting the source string from that. I guess we should also assert !callAsConstructor here; I guess that’s guaranteed because arg1 will always be undefined if we are called as a constructor?
Darin Adler
Comment 5
2015-06-17 18:41:01 PDT
Comment on
attachment 255054
[details]
Patch So it’s possible the tests are OK, and what we should fix here is get the old regular expression in a more straightforward way. In fact, I think there might even be a more efficient way to change the flags on a regular expression rather than recompiling the whole thing.
Jordan Harband
Comment 6
2015-06-17 18:59:08 PDT
I'm open to all your suggestions! I'll try to think over them tonight. Definitely more test coverage is always a good thing; I'll add as many as I can think of.
Jordan Harband
Comment 7
2015-06-17 19:31:45 PDT
(In reply to
comment #4
)
> Comment on
attachment 255054
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=255054&action=review
> > >> Source/JavaScriptCore/runtime/RegExpConstructor.cpp:261 > >> + JSValue sourceValue = arg0.get(exec, exec->propertyNames().source); > > > > This code change seems nothing like the bug title. The bug title says that something should no longer throw, but the code here is looking at the "source" property, something the old code didn’t do at all! > > Hmm, I’m thinking that what we should really do here is extract the old > string more directly, with code more like the "arg1.isUndefined()" code > path, casting to RegExpObject, getting the RegExp from there and then > getting the source string from that. > > I guess we should also assert !callAsConstructor here; I guess that’s > guaranteed because arg1 will always be undefined if we are called as a > constructor?
`new RegExp(/a/g, 'i')` should still work, and `arg1` will not be undefined - that's the new behavior, whether called or constructed :-)
Jordan Harband
Comment 8
2015-06-17 19:32:42 PDT
(In reply to
comment #5
)
> Comment on
attachment 255054
[details]
> Patch > > So it’s possible the tests are OK, and what we should fix here is get the > old regular expression in a more straightforward way. In fact, I think there > might even be a more efficient way to change the flags on a regular > expression rather than recompiling the whole thing.
The spec requires that it be a brand new regular expression in JS-land, not simply a mutation (a la RegExp#compile, which sadly mutates regexes in-place). Is there a more efficient way to do this in C++ without creating an observable mutation in JS?
Jordan Harband
Comment 9
2015-06-17 22:22:58 PDT
(In reply to
comment #3
)
> Comment on
attachment 255054
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=255054&action=review
> > >> LayoutTests/fast/regex/script-tests/constructor.js:8 > >> +shouldBeTrue("String(RegExp(re, 'i')) === '/abc/i'"); > > > > This doesn’t seem to be sufficient test coverage. In particular, this doesn’t test whether the existing flags overwrite all the old flags, or simply add to them. > > > > I also don’t see any coverage of the new "source" property behavior. Do we have enough tests that any change to the code makes a test fail? > > In particular, if the old regular expression can be any object with a > "source" property, then we should test that, and not only test with an > actual RegExp object.
According to
http://www.ecma-international.org/ecma-262/6.0/#sec-regexp-pattern-flags
steps 5 and 6, the "source" and "flags" properties are only used when the provided arg0 has a [[RegExpMatcher]] internal slot - which means, a RegExp or an ES6 subclass. So, it shouldn't work on any old object (other than stringifying it in
http://www.ecma-international.org/ecma-262/6.0/#sec-regexpinitialize
step 2.
Jordan Harband
Comment 10
2015-06-17 22:45:06 PDT
Created
attachment 255085
[details]
Patch
Jordan Harband
Comment 11
2015-06-17 22:46:10 PDT
Next I'll explore (and am anxiously open to suggestions) ways to improve the C++ code for RegExpConstructor, but I've fixed the rest of the comments you've made I think.
Build Bot
Comment 12
2015-06-17 23:18:17 PDT
Comment on
attachment 255085
[details]
Patch
Attachment 255085
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.appspot.com/results/5552113341956096
New failing tests: sputnik/Conformance/15_Native_Objects/15.10_RegExp/15.10.4/S15.10.4.1_A2_T1.html sputnik/Conformance/15_Native_Objects/15.10_RegExp/15.10.3/S15.10.3.1_A2_T2.html sputnik/Conformance/15_Native_Objects/15.10_RegExp/15.10.3/S15.10.3.1_A2_T1.html sputnik/Conformance/15_Native_Objects/15.10_RegExp/15.10.4/S15.10.4.1_A2_T2.html
Build Bot
Comment 13
2015-06-17 23:18:23 PDT
Created
attachment 255088
[details]
Archive of layout-test-results from ews100 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews100 Port: mac-mavericks Platform: Mac OS X 10.9.5
Build Bot
Comment 14
2015-06-17 23:21:49 PDT
Comment on
attachment 255085
[details]
Patch
Attachment 255085
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.appspot.com/results/5717704195440640
New failing tests: sputnik/Conformance/15_Native_Objects/15.10_RegExp/15.10.4/S15.10.4.1_A2_T1.html sputnik/Conformance/15_Native_Objects/15.10_RegExp/15.10.3/S15.10.3.1_A2_T2.html sputnik/Conformance/15_Native_Objects/15.10_RegExp/15.10.3/S15.10.3.1_A2_T1.html sputnik/Conformance/15_Native_Objects/15.10_RegExp/15.10.4/S15.10.4.1_A2_T2.html
Build Bot
Comment 15
2015-06-17 23:21:52 PDT
Created
attachment 255089
[details]
Archive of layout-test-results from ews104 for mac-mavericks-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Darin Adler
Comment 16
2015-06-18 10:02:11 PDT
Comment on
attachment 255085
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=255085&action=review
Looks good. review- because tests are still failing. Test coverage still seems quite light to me. I know for some old features we don’t have the coverage we should, but no reason to skimp for this since we are changing it. Our basic strategy is to beef up the testing when we make bug fixes like this one. As far as I can tell, after the patch testing doesn’t cover anything other than a single character string for the second argument. Cases I think we should test for that argument: 1) empty string 2) multiple valid flags characters 3) explicitly passing undefined rather than just omitting an argument 4) null, since it does not behave like undefined although in many other contexts it often does 5) consider at least one other type like true, false, floating point number, integer 6) an object with a toString that yields a valid flags string of some sort 7) an object with a toString that raises an exception Also should test that when the toString on the first argument raises an exception that the second argument’s side effects are not observed; this is what ends up being wrong if we write bindings that evaluate arguments in the wrong order and we need a test to show we didn’t make that mistake. Glad that the actual binding is pretty simple; need to make sure these things work, though.
> Source/JavaScriptCore/runtime/RegExpConstructor.cpp:252 > if (arg0.inherits(RegExpObject::info())) {
Since we are using inherits here, it’s not just RegExpObject itself that will work; there could be other internal subclasses that might have different code paths. To ensure good test coverage we should look over the code and see what other cases exist. If there are no other cases, then we can make this code faster by using == rather than inherits to check if this is a regular expression. If there are other cases, then we need to add test cases for them. I didn’t do the research myself but it should be relatively quick.
> Source/JavaScriptCore/runtime/RegExpConstructor.cpp:264 > + JSValue sourceValue = arg0.get(exec, exec->propertyNames().source); > + if (exec->hadException()) > + return nullptr; > + pattern = sourceValue.toString(exec)->value(exec);
It’s unnecessarily inefficient to use the arbitrary get function here and it also creates the false idea that there are two possible exceptions here, one getting the argument and another converting it to a string, causing us to write the dead code to handle these two impossible exceptions. Let me write out for you explicitly what I was trying to hint at before. This is the one line that should replace the four lines above: pattern = static_cast<RegExpObject*>(asObject(arg0))->regExp()->pattern(); I also mentioned that it was possible to do something even more efficient. The way to do that would be to create a new constructor for RegExp that takes an existing RegExp& and a set of flags. But I understand if you don’t have the expertise to figure out if that can be done more efficiently than the “create an entirely new RegExp” code path. I am not sure if regular expression compilation has to be entirely redone in such cases or not. We can lave that thing out for now.
> Source/JavaScriptCore/runtime/RegExpConstructor.cpp:270 > if (exec->hadException()) > - return 0; > + return nullptr;
Once you do what I have suggested, the exception check is only needed in the non-RegExpObject code path, so please move it inside the else clause.
Jordan Harband
Comment 17
2015-07-24 20:09:02 PDT
(In reply to
comment #16
)
> It’s unnecessarily inefficient to use the arbitrary get function here and it > also creates the false idea that there are two possible exceptions here, one > getting the argument and another converting it to a string, causing us to > write the dead code to handle these two impossible exceptions.
Why is this a false idea? It's *absolutely possible* using defineProperty to make the `get` throw, and separately, the `toString` function could throw. The spec requires that *both* of these possible exception points be triggered, and cause an exception - and if the `get` throws, `toString` must not be observable.
Jordan Harband
Comment 18
2015-07-24 20:32:54 PDT
Thanks for your thorough review - sorry for the delay in responding, i've been out of the country. I'll add the test cases you suggest and in general beef up the test coverage. I'll also fix the (very good catch) point that the function is generic, and doesn't require a RegExp object. I'll wait til we figure out the get/toString thing before making changes there.
Darin Adler
Comment 19
2015-07-26 20:28:41 PDT
(In reply to
comment #17
)
> (In reply to
comment #16
) > > > It’s unnecessarily inefficient to use the arbitrary get function here and it > > also creates the false idea that there are two possible exceptions here, one > > getting the argument and another converting it to a string, causing us to > > write the dead code to handle these two impossible exceptions. > > Why is this a false idea? It's *absolutely possible* using defineProperty to > make the `get` throw, and separately, the `toString` function could throw. > The spec requires that *both* of these possible exception points be > triggered, and cause an exception - and if the `get` throws, `toString` must > not be observable.
My understanding is that once we know that the object we are dealing with is a RegExpObject, in that faster optimized code path we know that toString can’t throw, and in fact we don’t need to call toString, but can instead get directly at the internal value with a faster method. That’s what I meant by "false idea".
Jordan Harband
Comment 20
2015-07-26 23:34:13 PDT
(In reply to
comment #19
)
> (In reply to
comment #17
) > > (In reply to
comment #16
) > > > > > It’s unnecessarily inefficient to use the arbitrary get function here and it > > > also creates the false idea that there are two possible exceptions here, one > > > getting the argument and another converting it to a string, causing us to > > > write the dead code to handle these two impossible exceptions. > > > > Why is this a false idea? It's *absolutely possible* using defineProperty to > > make the `get` throw, and separately, the `toString` function could throw. > > The spec requires that *both* of these possible exception points be > > triggered, and cause an exception - and if the `get` throws, `toString` must > > not be observable. > > My understanding is that once we know that the object we are dealing with is > a RegExpObject, in that faster optimized code path we know that toString > can’t throw, and in fact we don’t need to call toString, but can instead get > directly at the internal value with a faster method. > > That’s what I meant by "false idea".
`var a = /a/g; Object.defineProperty(a, 'toString', { get: function () { throw new TypeError(); } }); new RegExp(a, 'i')` In no case, ever, can we know for sure that toString can't throw.
Darin Adler
Comment 21
2015-07-27 14:43:49 PDT
(In reply to
comment #20
)
> `var a = /a/g; > Object.defineProperty(a, 'toString', { get: function () { throw new > TypeError(); } }); > new RegExp(a, 'i')` > > In no case, ever, can we know for sure that toString can't throw.
There is still an optimization possible for the incredibly common case where such a function hasn’t been supplied. In any case, please do make appropriate new test cases so we can verify we have written the correct code for this.
Ross Kirsling
Comment 22
2021-02-21 00:56:30 PST
Closing, looks like this was fixed in
r197962
(
bug 155315
).
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug