WebKit Bugzilla
New
Browse
Search+
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED INVALID
45959
String.prototype.replace passes undefined rather than empty string for groups that don't match
https://bugs.webkit.org/show_bug.cgi?id=45959
Summary
String.prototype.replace passes undefined rather than empty string for groups...
Mike Kamermans
Reported
2010-09-17 06:03:05 PDT
Created
attachment 67903
[details]
demonstrator file To illustrate the problem, all that is needed is to create a page with the following javascript: var s = "aaaccc", c2Type = "n/a"; var ret = s.replace(/(a+)(b+)?/g, function(all, c1, c2, i, str) { c2Type = typeof c2; return ""; }); alert(s+"\n"+c2Type+"\n"+ret); and then run this via the browser and note the text in the alert. It should generate an alert with the following text: aaaccc string ccc Instead, safari generates an alert with the following text: aaccc undefined ccc Basically, the String.prototype.replace function as described in the ECMA-262 standard is not properly implemented. The problem was analysed on
https://bugzilla.mozilla.org/show_bug.cgi?id=597035
and concluded by Brendan Eich. The idea is that if the search value is a regexp, and the replacement is a function, then that function is called with a fluid number of arguments. If the regexp has n capture groups, the function will be called with the following arguments: 1) The full regexp match 1+0) The string caught by the region covered by the first left-parens in the regexp, to its closing parens ... 1+n+1) The string caught by the region covered by the last left-parens in the regexp, to its closing parens 1+n+2) the position in the original string that the full regexp matched on 1+n+3) the original string except for the position in the original string, all of these arguments are strings, with empty matches represented by the empty string "". However, in the implementation used by Webkit (as well as Opera and Chromium, it should be noted), the empty captures are left undefined, instead of defined as "". This is incorrect, and should be fixed.
Attachments
demonstrator file
(490 bytes, text/html)
2010-09-17 06:03 PDT
,
Mike Kamermans
no flags
Details
Proposed patch
(2.11 KB, patch)
2010-09-18 01:22 PDT
,
Mark Hahnenberg
oliver
: review-
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Oliver Hunt
Comment 1
2010-09-17 15:40:59 PDT
Retitled to state the actual bug.
Mark Hahnenberg
Comment 2
2010-09-18 01:22:33 PDT
Created
attachment 68002
[details]
Proposed patch This seemed like a good starter bug for a new contributor such as myself. I wasn't sure if I needed to add a regression test and if so, where to put it. As far as the patch goes, I just replaced the jsUndefined() call with an empty string retrieved from the JSGlobalData object.
Oliver Hunt
Comment 3
2010-09-18 09:38:49 PDT
Comment on
attachment 68002
[details]
Proposed patch View in context:
https://bugs.webkit.org/attachment.cgi?id=68002&action=prettypatch
Hi mark, The logic of the patch is correct, only the style issues i mentioned stand in the way of that. Before we can land the patch though we'll need to have a testcase for this new behaviour (so we don't regress). I'm actually surprised we don't already have a test for this (checking for the wrong behaviour) have you run the full webkit test suite (it's somewhat unfortunate but the majority of our js tests are part of the webkit test harness these days so require a full build of webkit :-/) Assuming you've built webkit using build-webkit you should be able to do run-webkit-tests fast/js to run the bulk of the js tests (which will show whether any existing test covers this case). If a test already covers this but expects the wrong behaviour, just correct the test to check for the right thing. Otherwise just add a few tests to the end of LayoutTests/fast/js/script-tests/string-replace-3.js and regenerate the expected output.
> JavaScriptCore/runtime/StringPrototype.cpp:296 > + JSGlobalData* globalData;
Don't declare globalData here, as you're not initialising it, and it's only used in one placed anyway
> JavaScriptCore/runtime/StringPrototype.cpp:344 > + globalData = &exec->globalData();
Make this globalData actually be the declaration. It is worth considering not even having the temporary -- i'm not sure it really gains anything over simply having cachedCall.setArgument(i, exec->globalData().smallStrings.emptyString(globalData));
Mark Hahnenberg
Comment 4
2010-09-18 20:47:09 PDT
Hmm, when I ran the js tests, there was a regression. Upon investigating further into the file that contained the failing tests, I ran across this bug report from 2007:
https://bugs.webkit.org/show_bug.cgi?id=14931
. If you look at the blog post it references, along with some of the proposed patches, it appears as if this bug were undoing what they that bug supposedly fixed. I'm not really sure which way is the correct way, and the parts of the ECMA-262 spec quoted by the Mozilla bug report above seem to leave this particular detail undefined. Thoughts?
Oliver Hunt
Comment 5
2010-09-18 21:32:14 PDT
(In reply to
comment #4
)
> Hmm, when I ran the js tests, there was a regression. Upon investigating further into the file that contained the failing tests, I ran across this bug report from 2007:
https://bugs.webkit.org/show_bug.cgi?id=14931
. If you look at the blog post it references, along with some of the proposed patches, it appears as if this bug were undoing what they that bug supposedly fixed. > > I'm not really sure which way is the correct way, and the parts of the ECMA-262 spec quoted by the Mozilla bug report above seem to leave this particular detail undefined. Thoughts?
Ah the hilarity. Our normal approach in cases like this is to look at firefox and IE behaviour and see what they do, if they do the same thing we can consider changing to match their behaviour. We should also contact the ecmascript mailing list to get them to clarify the expected behaviour.
Mark Hahnenberg
Comment 6
2010-09-19 01:07:05 PDT
Unfortunately Firefox and IE appear to do different things: Firefox 4.0b6 - passes empty string IE8/9 - passes undefined Opera 10.62 - passes undefined I looked a little more closely at the ECMA-262 spec. When it describes how RegExp.prototype.exec should behave, it explicitly states than any non-matched capture groups should be considered undefined. It seems intuitive that when String.prototype.replace is passed a function for replaceValue, it would do something along the lines of: 1) Run the regex specified in the searchValue argument on the receiver of the method call ("this") by calling exec(), e.g. "searchValue.exec(this);" 2) Take the elements of the resulting array from the exec() call and pass them into replaceValue along with the other required arguments. 3) Replace the matched substrings with the return value of the function, converting that returned value to a String if it's not already. Since the behavior of replace() seems to align with the behavior of exec(), perhaps they should work similarly to avoid cognitive dissonance between the two. Obviously, the implementation doesn't have to actually work like this, but it seems like a fairly intuitive way of thinking about it (if I understand the spec correctly, that is :-) From the ECMA-262 spec for RegExp.prototype.exec: "If choices in the left Alternative are exhausted, the right Disjunction is tried instead of the left Alternative. Any capturing parentheses inside a portion of the pattern skipped by | produce undefined values instead of Strings. Thus, for example, /a|ab/.exec("abc") returns the result "a" and not "ab". Moreover, /((a)|(ab))((c)|(bc))/.exec("abc") returns the array ["abc", "a", "a", undefined, "bc", undefined, "bc"] and not ["abc", "ab", undefined, "ab", "c", "c", undefined]"
Mike Kamermans
Comment 7
2010-09-20 22:40:34 PDT
It would probably be better to consult with the ECMA mailing list on this, rather than relying on how exec() behaves to determine how replace() should resolve. According to Eich the captures should be empty strings once the pattern matching is done, with the only way to get an undefined capture is during back-reference in the regexp itself. It is entirely possible that the IE9 team implemented replace() incorrectly as well, and they have something to fix, too. They did write a new JS engine pretty much from scratch.
Mark Hahnenberg
Comment 8
2010-09-24 21:16:05 PDT
So it appears that the Mozilla folks realized that they had misinterpreted the spec:
https://bugzilla.mozilla.org/show_bug.cgi?id=597035#c7
I think this bug should be RESOLVED INVALID.
Geoffrey Garen
Comment 9
2010-09-28 11:40:49 PDT
Bummer! Oh well, it's good that you tracked this down for the record.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug