Bug 31349 - String to number coercion is not spec compliant
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: 528+ (Nightly build)
Hardware: Macintosh Intel OS X 10.6
Importance: P2 Normal
Assignee: Darin Adler
Depends on:
Reported: 2009-11-11 05:29 PST by kangax
Modified: 2010-07-12 14:32 PDT

Show parseFloat failure with non-CString characters (1.19 KB, text/html)
2010-01-16 13:15 PST, Brian Foley
no flags
Patch (23.73 KB, patch)
2010-07-09 18:35 PDT, Darin Adler
ggaren: review+

Description kangax 2009-11-11 05:29:17 PST
As explained here (http://thinkweb2.com/projects/prototype/sputniktests-web-runner/#number-u00A0), `Number('\u00A0')` returns `NaN` in WebKit but should return `0`. Similarly, `parseFloat("\u205F -1.1")` returns `NaN` but should return `-1.1`.

See section 9.3.1 of the ECMAScript specification (ToNumber Applied to the String Type), in particular the StrWhiteSpaceChar production.
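
The two cases from the report can be written as a small check. This is only a sketch: the helper name `coercionCases` is made up for illustration, and the "expected" column is the value the report says the spec requires.

```javascript
// Each entry: [label, actual value in this engine, value the report expects].
// coercionCases is a hypothetical helper, not part of any test suite.
function coercionCases() {
  return [
    ["Number('\\u00A0')", Number("\u00A0"), 0],
    ["parseFloat('\\u205F -1.1')", parseFloat("\u205F -1.1"), -1.1],
  ];
}

for (const [label, actual, expected] of coercionCases()) {
  const verdict = actual === expected ? "PASS" : "FAIL";
  console.log(`${label}: ${verdict} (got ${actual}, expected ${expected})`);
}
```

An engine with the bug described here would report FAIL with `NaN` for both lines.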
Comment 1 Alexey Proskuryakov 2009-11-11 19:19:34 PST
We already have bug 25490 for nbsp. Maybe we should make this a duplicate, and extend the scope of the former to cover U+205F MEDIUM MATHEMATICAL SPACE.
Comment 2 kangax 2009-11-11 19:34:51 PST
Sure, marking as duplicate of 25490 would make sense. 

Speaking of separate tickets, would you rather have all Sputniktests failures (http://kangax.github.com/sputniktests-webrunner/) summed up in one ticket, or should I file them separately?
Comment 3 kangax 2009-11-11 20:00:24 PST
I would also like to add that not only U+00A0 and U+205F fail, but practically the entire Zs whitespace category (U+2000-U+200A and others), as well as the line terminators U+2028 and U+2029.
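
A rough way to enumerate the failures is to wrap a digit in each code point and see whether `Number` still returns `1`. The helper name `failingCodePoints` and the exact list below are assumptions for illustration; the list only covers the code points named in this thread, not every character the spec treats as whitespace.

```javascript
// Returns the code points that Number() does NOT skip as whitespace.
// failingCodePoints is a hypothetical helper written for this comment.
function failingCodePoints(codePoints) {
  return codePoints.filter((cp) => {
    const ch = String.fromCharCode(cp);
    // Leading and trailing whitespace should both be ignored by ToNumber.
    return Number(ch + "1" + ch) !== 1;
  });
}

// Code points mentioned in this bug.
const reported = [0x00a0, 0x205f, 0x2028, 0x2029];
for (let cp = 0x2000; cp <= 0x200a; cp++) reported.push(cp);

const failures = failingCodePoints(reported).map(
  (cp) => "U+" + cp.toString(16).toUpperCase().padStart(4, "0")
);
console.log(failures.length ? "Not skipped: " + failures.join(", ") : "All skipped");
```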
Comment 4 Brian Foley 2010-01-16 13:15:05 PST
Created attachment 46747
Show parseFloat failure with non-CString characters
Comment 5 Brian Foley 2010-01-16 13:16:20 PST
This problem still exists with Safari 4.0.4 and r53317. It appears to be a duplicate of bug 16717.
Comment 6 Derk-Jan Hartman 2010-02-22 12:49:49 PST
Just ran into this issue on Wikipedia where we were trying to discover why a range (with the en dash) was not supported in a sortable table.

Bug 16717 indeed looks like a dupe of this. Bdash notes:

From ustring.cpp:

    // FIXME: If tolerateTrailingJunk is true, then we want to tolerate non-8-bit junk
    // after the number, so is8Bit is too strict a check.
    if (!is8Bit())
        return NaN;
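
To make the effect of that check concrete, here is a sketch of the sortable-table case. The cell text and the helper name `leadingNumber` are assumptions, not the actual Wikipedia code; the point is only that `parseFloat` is supposed to ignore trailing junk, including non-8-bit characters such as the en dash (U+2013).

```javascript
// Hypothetical sort-key helper: use the number at the start of a cell's text.
function leadingNumber(text) {
  return parseFloat(text);
}

// A range written with an en dash, as described in the comment above.
const cell = "1990\u20132000";

// parseFloat should stop at the en dash and return the leading number.
// With the is8Bit() early return, the whole string was rejected as NaN.
console.log(leadingNumber(cell));
```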
Comment 7 Darin Adler 2010-06-16 22:20:06 PDT

*** This bug has been marked as a duplicate of bug 16717 ***
Comment 8 Darin Adler 2010-06-16 22:42:41 PDT
Some aspects of this bug have been fixed, but others have not.

Specifically, there was a bug where all non-ASCII values would cause the conversion to fail. That was fixed. Here are the remaining problems that I know of:

    1) Whitespace other than U+0020 is not correctly skipped.

    2) Illegal UTF-16 sequences will cause parseFloat to fail.

There are comments about both of these problems in UString::toDouble that I added a while back when I noticed the mistakes.
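
For reference, the two remaining problems can be sketched in JavaScript terms. This is not the WebKit patch and not the code in UString::toDouble; `STR_WHITE_SPACE` and `skipLeadingWhiteSpace` are names invented here, and the Zs set comes from whatever Unicode version the host engine's `\p{Zs}` implements.

```javascript
// Characters skipped before a number: the ES5 WhiteSpace set (tab, VT, FF,
// BOM, and every Unicode Zs character, which covers U+0020 and U+00A0) plus
// the LineTerminator set (LF, CR, U+2028, U+2029).
const STR_WHITE_SPACE = /^[\t\n\v\f\r\u2028\u2029\uFEFF\p{Zs}]+/u;

// Problem 1: skip all leading whitespace, not just U+0020.
function skipLeadingWhiteSpace(s) {
  const match = STR_WHITE_SPACE.exec(s);
  return match ? s.slice(match[0].length) : s;
}

console.log(JSON.stringify(skipLeadingWhiteSpace("\u205F -1.1")));

// Problem 2: an unpaired surrogate after the number is just trailing junk,
// so parseFloat should still return the leading number.
console.log(parseFloat("1.5\uD800"));
```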
Comment 9 Darin Adler 2010-06-16 22:43:00 PDT
I’d like to fix these some time soon.
Comment 10 Darin Adler 2010-07-09 18:35:17 PDT
Created attachment 61138
Patch
Comment 11 Geoffrey Garen 2010-07-12 13:54:25 PDT
Comment on attachment 61138

Comment 12 Darin Adler 2010-07-12 14:32:53 PDT
Committed r63120: <http://trac.webkit.org/changeset/63120>