RESOLVED FIXED 50554
RegExp.prototype.toString does not escape slashes
https://bugs.webkit.org/show_bug.cgi?id=50554
Summary RegExp.prototype.toString does not escape slashes
Jan de Mooij
Reported 2010-12-06 00:49:46 PST
Consider this: RegExp("abc/def").toString() Result in FF and Opera: /abc\/def/ Result in Chrome and Safari: /abc/def/ It does work correctly for literal regexps like /abc\/def/.toString() ES5 15.10.6.4 mentions this: === NOTE The returned String has the form of a RegularExpressionLiteral that evaluates to another RegExp object with the same behaviour as this object. === /abc/def/ is invalid as regexp literal, so the behavior is different in Safari/Chrome. FWIW, here is the V8 bug: http://code.google.com/p/v8/issues/detail?id=956
Attachments
The patch (8.70 KB, patch)
2011-06-18 01:38 PDT, Gavin Barraclough
darin: review-
webkit.review.bot: commit-queue-
Archive of layout-test-results from ec2-cr-linux-03 (1.54 MB, application/zip)
2011-06-18 02:10 PDT, WebKit Review Bot
no flags
New patch (8.87 KB, patch)
2011-06-18 19:19 PDT, Gavin Barraclough
oliver: review+
webkit.review.bot: commit-queue-
Archive of layout-test-results from ec2-cr-linux-01 (1.43 MB, application/zip)
2011-06-18 19:37 PDT, WebKit Review Bot
no flags
Mark Hahnenberg
Comment 1 2011-01-13 18:13:37 PST
There also appears to be a related discrepancy between the behaviors of the source property of RegExp objects in Firefox and Chrome/Safari. RegExp("abc/def").source Result in FF: abc\/def Result in Chrome/Safari: abc/def I have a patch that fixes the problem with toString() without modifying the behavior of the source property, but it would be much simpler if the behavior of the source property needed to change as well. Is this what needs to happen? From ES5 15.10.7.1: "The value of the source property is a String in the form of a Pattern representing the current regular expression." The grammar for Patterns (15.10.1) seems to indicate that the way Firefox does it is incorrect.
Gavin Barraclough
Comment 2 2011-01-14 11:40:29 PST
(In reply to comment #1) Hi Mark, > From ES5 15.10.7.1: > "The value of the source property is a String in the form of a Pattern representing the current regular expression." The term 'representing' seems ambiguous to me, but I guess we can safely assume a pattern represents a regular expression if it would compile to have the same [[Match]] behavior. I'd suggest the use of the indefinite article is probably key in interpreting this sentence. The spec refers to 'a' pattern rather than requiring a particular pattern - if the spec is intended to require 'the' pattern from which it was compiled I think it would say so. As such, given the RegExp /ABC/ I'd suggest that an implementation could return "ABC", "\x61\x62\x63", or any other pattern that would compile to produce the same expression, and still be considered to be in compliance with the spec. > The grammar for Patterns (15.10.1) seems to indicate that the way Firefox does it is incorrect. Sorry - I don't follow, what's your reasoning here? - based on the interpretation above, FF's pattern does represent the source, and "abc\/def" is a valid pattern per section 15.10.1 of the spec ("\/" is a valid IdentityEscape). Are you assuming a more strict definition of 'source'? - that it must match the pattern compiled from? > I have a patch that fixes the problem with toString() without modifying the behavior of the source property, but it would be much simpler if the behavior of the source property needed to change as well. Is this what needs to happen? Great! If FireFox's behavior is incorrect, then we should stick to producing "abc/def" for source, & just change toString. But assuming my interpretation of the spec is correct, I'd say changing 'source' too to match FireFox make sense. pros to switching: * it's good from a web compatibility perspective to match FireFox. * from a developer's perspective, the escaped form is more useful (can be concatenated into a string & eval'ed as a valid RegularExpressionLiteral). * from a developer's perspective, I'd argue having two string representations for a RegExp would be potentially confusing & could be problematic, likely better to have a single canonical string representation associated with a given RegExp. cons to switching: * intuitively it seems like 'source' should actually be the source the expression was compiled from. On balance, I'd certainly favour following FF on this one. cheers, G. [ I think there is probably a cleaner and simpler solution, that is also likely completely impractical. The fundamental problem here is that there are valid patterns that are not a valid RegularExpressionLiteral. If we could just amend the spec to make an unescaped forwards slash an invalid pattern character, then this issue should go away. :-P ]
Mark Hahnenberg
Comment 3 2011-01-14 12:29:08 PST
(In reply to comment #2) > Sorry - I don't follow, what's your reasoning here? - based on the interpretation above, FF's pattern does represent the source, and "abc\/def" is a valid pattern per section 15.10.1 of the spec ("\/" is a valid IdentityEscape). Are you assuming a more strict definition of 'source'? - that it must match the pattern compiled from? I misread the grammar for Patterns. You are correct. > pros to switching: > * it's good from a web compatibility perspective to match FireFox. > * from a developer's perspective, the escaped form is more useful (can be concatenated into a string & eval'ed as a valid RegularExpressionLiteral). > * from a developer's perspective, I'd argue having two string representations for a RegExp would be potentially confusing & could be problematic, likely better to have a single canonical string representation associated with a given RegExp. > > cons to switching: > * intuitively it seems like 'source' should actually be the source the expression was compiled from. > > On balance, I'd certainly favour following FF on this one. I agree with this :-) I'll make the necessary changes to make source and toString() match Firefox's behavior (interestingly, RegExp objects created using the literal notation already match Firefox's behavior).
Gavin Barraclough
Comment 4 2011-06-18 01:38:21 PDT
Created attachment 97691 [details] The patch
WebKit Review Bot
Comment 5 2011-06-18 01:40:08 PDT
Attachment 97691 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/fast..." exit_code: 1 Source/JavaScriptCore/runtime/RegExpObject.cpp:130: Missing space before ( in while( [whitespace/parens] [5] Source/JavaScriptCore/runtime/RegExpObject.cpp:138: Missing space before { [whitespace/braces] [5] Total errors found: 2 in 7 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Review Bot
Comment 6 2011-06-18 02:10:28 PDT
Comment on attachment 97691 [details] The patch Attachment 97691 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/8874811 New failing tests: fast/regex/toString.html
WebKit Review Bot
Comment 7 2011-06-18 02:10:33 PDT
Created attachment 97694 [details] Archive of layout-test-results from ec2-cr-linux-03 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: ec2-cr-linux-03 Port: Chromium Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Darin Adler
Comment 8 2011-06-18 10:01:36 PDT
Comment on attachment 97691 [details] The patch View in context: https://bugs.webkit.org/attachment.cgi?id=97691&action=review Looks like this caused some test failures" jquery/manipulation.html = TIMEOUT Regressions: Unexpected text diff mismatch : (1) fast/regex/toString.html = TEXT Could you check into those? > Source/JavaScriptCore/runtime/RegExpObject.cpp:117 > + size_t fwdSlash = pattern.find('/'); This local variable does not contain a forward slash. I think a better name would be forwardSlashLocation or forwardSlashPosition. > Source/JavaScriptCore/runtime/RegExpObject.cpp:129 > + size_t slashes = fwdSlash; I would call this slashesPosition instead of slashes. Again, it's a number, not slashes, which is why. >> Source/JavaScriptCore/runtime/RegExpObject.cpp:130 >> + while(slashes && pattern[slashes - 1] == '\\') > > Missing space before ( in while( [whitespace/parens] [5] The style bot is right about this missing space. > Source/JavaScriptCore/runtime/RegExpObject.cpp:135 > + // doubel escape it. Typo: doubel > Source/JavaScriptCore/runtime/RegExpObject.cpp:137 > + result.append(pattern.substringSharingImpl(completed, fwdSlash + 1)); This tells me that substringSharingImpl needs a better name; we need a name that makes it clearer when you need to use it and when you must not use it. > LayoutTests/fast/regex/script-tests/toString.js:11 > + shouldBeTrue("re1.test(string)"); > + shouldBeTrue("re2.test(string)"); Can you write this so that the patterns are visible in the shouldBe expressions, making the test output clearer? I think you should. > LayoutTests/fast/regex/script-tests/toString.js:22 > +// These strings are equilvalent, since the '\' is identity escaping the '/' at the string level. Typo: equivalent
Gavin Barraclough
Comment 9 2011-06-18 12:20:44 PDT
> Looks like this caused some test failures" > > jquery/manipulation.html = TIMEOUT This passes for me locally, and I think it's unlikely I introduced any regression here; my patch only effect JSC, and this appears to be the cr-linux bot failing? > Regressions: Unexpected text diff mismatch : (1) > fast/regex/toString.html = TEXT Again this looks like the cr-linux bot failing? - so I'm assuming this is v8 failing the new test cases (per comments below, their behavior matches our old one & so they will have this bug to fix too).
Gavin Barraclough
Comment 10 2011-06-18 19:19:23 PDT
Created attachment 97712 [details] New patch
WebKit Review Bot
Comment 11 2011-06-18 19:37:40 PDT
Comment on attachment 97712 [details] New patch Attachment 97712 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/8913006 New failing tests: fast/regex/toString.html
WebKit Review Bot
Comment 12 2011-06-18 19:37:45 PDT
Created attachment 97714 [details] Archive of layout-test-results from ec2-cr-linux-01 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: ec2-cr-linux-01 Port: Chromium Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Gavin Barraclough
Comment 13 2011-06-27 22:56:05 PDT
Fixed in r89895
Note You need to log in before you can comment on or make changes to this bug.