RESOLVED FIXED 71572
String new RegExp('\n').toString() returns is invalid RegularExpressionLiteral
https://bugs.webkit.org/show_bug.cgi?id=71572
Summary String new RegExp('\n').toString() returns is invalid RegularExpressionLiteral
Yusuke Suzuki
Reported 2011-11-04 10:55:38 PDT
new RegExp('\n').toString() returns / / , so this result is including LineTerminator. And, according to ECMA262 section 15.10.6.4 "The returned String has the form of a RegularExpressionLiteral that evaluates to another RegExp object with the same behaviour as this object." http://es5.github.com/#x15.10.6.4 And, because of RegularExpressionNonTerminator, LineTerminator is not allowed in RegularExpressionLiteral. So RegExp.source should be escaped. Furthermore, new RegExp('\\\n').toString() should provide /\n/, because /\\\n/ doesn't have the same behaviour to new RegExp('\\\n').
Attachments
Patch (6.87 KB, patch)
2011-11-04 11:04 PDT, Yusuke Suzuki
no flags
Patch (7.87 KB, patch)
2011-11-15 13:33 PST, Yusuke Suzuki
no flags
Patch (8.11 KB, patch)
2011-11-15 14:24 PST, Yusuke Suzuki
no flags
Yusuke Suzuki
Comment 1 2011-11-04 11:04:32 PDT
Created attachment 113682 [details] Patch attached patch
Yusuke Suzuki
Comment 2 2011-11-05 04:47:49 PDT
reported issue and sent patch to V8 http://code.google.com/p/v8/issues/detail?id=1813
Gavin Barraclough
Comment 3 2011-11-13 14:09:36 PST
Comment on attachment 113682 [details] Patch Good catch & nice fix. But I think we may want to make one tweak to this patch before landing. Most regular expressions don't contain forwards slashes or newlines, but this patch will always produce a fresh copy of the string every time source is called. Currently the code has an early return for strings that don't contain a forwards slash. We could expand this to also check for newline characters, and early return if the string contains neither forward slashes or newlines. r- for the loss of this optimization, but r+ in principle, we should definitely take this fix & the code looks great otherwise. :-)
Yusuke Suzuki
Comment 4 2011-11-15 12:18:16 PST
(In reply to comment #3) > (From update of attachment 113682 [details]) > Good catch & nice fix. But I think we may want to make one tweak to this patch before landing. Most regular expressions don't contain forwards slashes or newlines, but this patch will always produce a fresh copy of the string every time source is called. Currently the code has an early return for strings that don't contain a forwards slash. We could expand this to also check for newline characters, and early return if the string contains neither forward slashes or newlines. r- for the loss of this optimization, but r+ in principle, we should definitely take this fix & the code looks great otherwise. :-) Thanks for your review. You're right. I'll add early return to this patch.
Yusuke Suzuki
Comment 5 2011-11-15 13:33:02 PST
Created attachment 115232 [details] Patch attached patch v2. First, checking this pattern should be escaped, and if it is false return early
Darin Adler
Comment 6 2011-11-15 13:44:40 PST
Comment on attachment 115232 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=115232&action=review Looks OK. Needs a couple small changes. > Source/JavaScriptCore/runtime/RegExpObject.cpp:124 > + bool previousIsBackSlash = false; backslash is one word, not two, so this should be named previousCharacterWasBackslash > Source/JavaScriptCore/runtime/RegExpObject.cpp:145 > + if (ch == '\n' || ch == '\r' || (ch & ~1) == 0x2028) { This should use the inline isLineTerminator function instead of repeating its code. > Source/JavaScriptCore/runtime/RegExpObject.cpp:157 > + if (!shouldEscape) > + return jsString(exec, pattern); This is indented wrong. Should be indented four spaces. > Source/JavaScriptCore/runtime/RegExpObject.cpp:177 > + if (ch == '\n' || ch == '\r' || (ch & ~1) == 0x2028) { This should use the inline isLineTerminator function instead of repeating its code.
Yusuke Suzuki
Comment 7 2011-11-15 14:21:29 PST
(In reply to comment #6) Thank you for your review. > backslash is one word, not two, so this should be named previousCharacterWasBackslash Yes, I'll rename it. > > Source/JavaScriptCore/runtime/RegExpObject.cpp:145 > > + if (ch == '\n' || ch == '\r' || (ch & ~1) == 0x2028) { > This should use the inline isLineTerminator function instead of repeating its code. Right, I'll change it to using Lexer<UChar>::isLineTerminator. > This is indented wrong. Should be indented four spaces. Oops. This is mistake. I'll upload revised patch soon.
Yusuke Suzuki
Comment 8 2011-11-15 14:24:10 PST
Created attachment 115244 [details] Patch attached patch v3
Gavin Barraclough
Comment 9 2011-11-16 10:08:56 PST
Comment on attachment 115244 [details] Patch Looks great.
WebKit Review Bot
Comment 10 2011-11-16 10:44:35 PST
Comment on attachment 115244 [details] Patch Clearing flags on attachment: 115244 Committed r100469: <http://trac.webkit.org/changeset/100469>
WebKit Review Bot
Comment 11 2011-11-16 10:44:40 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.