Bug 89071

Summary: JavaScript: Invalid date parse for ISO 8601 strings when no timezone given
Product: WebKit Reporter: Alexey Androsov <doochik>
Component: JavaScriptCoreAssignee: Ross Kirsling <ross.kirsling>
Status: RESOLVED FIXED    
Severity: Normal CC: alecflett, barraclough, beidson, benjamin, cdumez, cmarcelo, commit-queue, darin, dbates, ews-watchlist, florian, haydenbraxton, jsbell, keith_miller, mark.lam, msaboff, othree, ross.kirsling, saam, tj, tzagallo, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
See Also: https://bugs.webkit.org/show_bug.cgi?id=206560
Attachments:
Description Flags
Patch
none
Patch
none
Patch for landing none

Description Alexey Androsov 2012-06-14 01:03:44 PDT
Standard: http://dotat.at/tmp/ISO_8601-2004_E.pdf
4.3.2: The zone designator is empty if use is made of local time in accordance with 4.2.2.2 through 4.2.2.4, it is the UTC designator [Z] if use is made of UTC of day in accordance with 4.2.4 and it is the difference-component if use is made of local time and the difference from UTC in accordance with 4.2.5.2.

My timezone is UTC+4, so date without timezone must be parsed as localtime.
Example: 2012-06-14T05:00:00 -> 2012-06-14T05:00:00+04:00 -> 2012-06-14T01:00:00Z

valid result (Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:13.0) Gecko/20100101 Firefox/13.0):
new Date('2012-06-14T05:00:00').toISOString() -> "2012-06-14T01:00:00.000Z"

invalid result (Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.56 Safari/536.5):
new Date('2012-06-14T05:00:00').toISOString() -> "2012-06-14T05:00:00.000Z"
Comment 1 Gavin Barraclough 2012-09-25 13:52:45 PDT
OOI ecmascript date parsing is not ISO 8601 compliant, but we may be able to make this case work.
Comment 2 othree 2019-12-19 07:15:57 PST
In latest ECMAScript spec 20.4.3.2 <https://tc39.es/ecma262/#sec-date.parse>

There is a line talked about date time string without timezone should be local time:

"When the UTC offset representation is absent, date-only forms are interpreted as a UTC time and date-time forms are interpreted as a local time."
Comment 3 Ross Kirsling 2019-12-19 10:49:53 PST
Created a test262 issue as well, since there's currently no conformance test coverage for this case:
https://github.com/tc39/test262/issues/2450
Comment 4 Ross Kirsling 2020-01-03 15:17:06 PST
*** Bug 205652 has been marked as a duplicate of this bug. ***
Comment 5 Ross Kirsling 2020-01-03 15:19:23 PST
*** Bug 196894 has been marked as a duplicate of this bug. ***
Comment 6 Ross Kirsling 2020-01-03 21:19:28 PST
Created attachment 386753 [details]
Patch
Comment 7 Ross Kirsling 2020-01-04 00:24:37 PST
Created attachment 386760 [details]
Patch
Comment 8 Ross Kirsling 2020-01-04 12:59:43 PST Comment hidden (obsolete)
Comment 9 Darin Adler 2020-01-04 20:33:26 PST
Comment on attachment 386760 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=386760&action=review

Not a huge fan of "out arguments" but we should respect the style of this old-ish code, I suppose.

> Source/JavaScriptCore/runtime/JSDateMath.cpp:218
> +    int offset = isLocalTime ? localTimeOffset(vm, ms, WTF::LocalTime).offset : 0;
> +    return ms - offset;

I don't think it’s great to use the trinary operator here. How about this?

    if (isLocalTime)
        ms -= localTimeOffset(vm, ms, WTF::LocalTime).offset;

    return ms;

Also, maybe instead of "ms" we could name the local variable "value", or "date", or "time". I think a word is better than an SI unit for a variable name.

> Source/WTF/wtf/DateMath.cpp:575
> +    isLocalTime = false;

Why not do this at the very top of the function instead of down here?

> Source/WTF/wtf/DateMath.cpp:653
> +        isLocalTime = false;

Why not do this at the start of the function rather than here in this else?
Comment 10 Ross Kirsling 2020-01-04 23:59:30 PST
(In reply to Darin Adler from comment #9)
> Comment on attachment 386760 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=386760&action=review
> 
> Not a huge fan of "out arguments" but we should respect the style of this
> old-ish code, I suppose.

Agreed.

> > Source/WTF/wtf/DateMath.cpp:575
> > +    isLocalTime = false;
> 
> Why not do this at the very top of the function instead of down here?
> 
> > Source/WTF/wtf/DateMath.cpp:653
> > +        isLocalTime = false;
> 
> Why not do this at the start of the function rather than here in this else?

I thought the logic was easier to follow by keeping it more local, but I can move these up.
Comment 11 Ross Kirsling 2020-01-05 00:11:17 PST
Created attachment 386783 [details]
Patch for landing
Comment 12 WebKit Commit Bot 2020-01-05 00:55:39 PST
Comment on attachment 386783 [details]
Patch for landing

Clearing flags on attachment: 386783

Committed r254038: <https://trac.webkit.org/changeset/254038>
Comment 13 WebKit Commit Bot 2020-01-05 00:55:41 PST
All reviewed patches have been landed.  Closing bug.
Comment 14 Radar WebKit Bug Importer 2020-01-05 00:56:20 PST
<rdar://problem/58321992>
Comment 15 Darin Adler 2020-01-20 09:52:07 PST
Comment on attachment 386783 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=386783&action=review

> Source/JavaScriptCore/runtime/JSDateMath.cpp:220
> +    bool isLocalTime;
> +    double value = WTF::parseES5DateFromNullTerminatedCharacters(dateString, isLocalTime);
> +    if (std::isnan(value))
> +        return std::numeric_limits<double>::quiet_NaN();
> +
> +    if (isLocalTime)
> +        value -= localTimeOffset(vm, value, WTF::LocalTime).offset;
> +
> +    return value;

Had a small thought for refinement:

- When parseES5DateFromNullTerminatedCharacters returns a NaN, there’s no need to "sanitize" it.
- Calling localTimeOffset has no side effects.
- Subtracting a number from a NaN has no effect on the NaN.

Assuming I am correct, and all three of these are in fact true, we can remove the special case if statement for NaN with these effects:

- Slightly smaller code.
- Slightly faster for the non-NaN case.
- Slightly slower for the NaN case.

So, I think we should remove it.
Comment 16 Ross Kirsling 2020-01-21 10:15:38 PST
(In reply to Darin Adler from comment #15)
> Comment on attachment 386783 [details]
> Patch for landing
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=386783&action=review
> 
> > Source/JavaScriptCore/runtime/JSDateMath.cpp:220
> > +    bool isLocalTime;
> > +    double value = WTF::parseES5DateFromNullTerminatedCharacters(dateString, isLocalTime);
> > +    if (std::isnan(value))
> > +        return std::numeric_limits<double>::quiet_NaN();
> > +
> > +    if (isLocalTime)
> > +        value -= localTimeOffset(vm, value, WTF::LocalTime).offset;
> > +
> > +    return value;
> 
> Had a small thought for refinement:
> 
> - When parseES5DateFromNullTerminatedCharacters returns a NaN, there’s no
> need to "sanitize" it.
> - Calling localTimeOffset has no side effects.
> - Subtracting a number from a NaN has no effect on the NaN.
> 
> Assuming I am correct, and all three of these are in fact true, we can
> remove the special case if statement for NaN with these effects:
> 
> - Slightly smaller code.
> - Slightly faster for the non-NaN case.
> - Slightly slower for the NaN case.
> 
> So, I think we should remove it.

Yeah, the NaN paths in both of those static functions are a bit sad since we call each from one place, and in particular, the latter only gets called if the former returned NaN.

I didn't realize that "[s]ubtracting a number from a NaN has no effect on the NaN" in C++, that's good to know. Calling `localTimeOffset` does have a caching side effect; not sure whether that changes your mind at all.

What I had considered myself here was merging the two static functions into one (and just returning the unsanitized NaN since we already know it's quiet_NaN), to avoid the icky redundancy of rechecking whether we took the NaN path on the first *after* returning from it. I could still do that.
Comment 17 Keith Miller 2020-01-21 10:21:40 PST
Comment on attachment 386783 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=386783&action=review

>> Source/JavaScriptCore/runtime/JSDateMath.cpp:220
>> +    return value;
> 
> Had a small thought for refinement:
> 
> - When parseES5DateFromNullTerminatedCharacters returns a NaN, there’s no need to "sanitize" it.
> - Calling localTimeOffset has no side effects.
> - Subtracting a number from a NaN has no effect on the NaN.
> 
> Assuming I am correct, and all three of these are in fact true, we can remove the special case if statement for NaN with these effects:
> 
> - Slightly smaller code.
> - Slightly faster for the non-NaN case.
> - Slightly slower for the NaN case.
> 
> So, I think we should remove it.

We do care about the type of NaN we pass to jsNumber because it doesn't check that the NaN is pure when boxing (other than a debug assert). Although, in theory the C++ spec doesn't guarantee quient_NaN is also a pure NaN, so it might be reasonable to change the `std::numeric_limits<double>::quiet_NaN()` to `PNaN`.
Comment 18 Keith Miller 2020-01-21 10:22:59 PST
Comment on attachment 386783 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=386783&action=review

>>>> Source/JavaScriptCore/runtime/JSDateMath.cpp:220
>>>> +    return value;
>>> 
>>> Had a small thought for refinement:
>>> 
>>> - When parseES5DateFromNullTerminatedCharacters returns a NaN, there’s no need to "sanitize" it.
>>> - Calling localTimeOffset has no side effects.
>>> - Subtracting a number from a NaN has no effect on the NaN.
>>> 
>>> Assuming I am correct, and all three of these are in fact true, we can remove the special case if statement for NaN with these effects:
>>> 
>>> - Slightly smaller code.
>>> - Slightly faster for the non-NaN case.
>>> - Slightly slower for the NaN case.
>>> 
>>> So, I think we should remove it.
>> 
>> Yeah, the NaN paths in both of those static functions are a bit sad since we call each from one place, and in particular, the latter only gets called if the former returned NaN.
>> 
>> I didn't realize that "[s]ubtracting a number from a NaN has no effect on the NaN" in C++, that's good to know. Calling `localTimeOffset` does have a caching side effect; not sure whether that changes your mind at all.
>> 
>> What I had considered myself here was merging the two static functions into one (and just returning the unsanitized NaN since we already know it's quiet_NaN), to avoid the icky redundancy of rechecking whether we took the NaN path on the first *after* returning from it. I could still do that.
> 
> We do care about the type of NaN we pass to jsNumber because it doesn't check that the NaN is pure when boxing (other than a debug assert). Although, in theory the C++ spec doesn't guarantee quient_NaN is also a pure NaN, so it might be reasonable to change the `std::numeric_limits<double>::quiet_NaN()` to `PNaN`.

I guess I should also mention that the result of this function is indirectly passed to jsNumber, in case that wasn't clear.
Comment 19 Darin Adler 2020-01-21 10:30:35 PST
That sounds wrong to me. I think we are guaranteed this is a pure NaN.
Comment 20 Keith Miller 2020-01-21 12:03:48 PST
(In reply to Darin Adler from comment #19)
> That sounds wrong to me. I think we are guaranteed this is a pure NaN.

I believe on MIPS, for instance, QNaN has the sign bit set so it will be impure. At least, that's my understanding from: https://github.com/WebAssembly/design/issues/976.
Comment 21 Keith Miller 2020-01-21 12:05:20 PST
(In reply to Keith Miller from comment #20)
> (In reply to Darin Adler from comment #19)
> > That sounds wrong to me. I think we are guaranteed this is a pure NaN.
> 
> I believe on MIPS, for instance, QNaN has the sign bit set so it will be
> impure. At least, that's my understanding from:
> https://github.com/WebAssembly/design/issues/976.

We could add a static_assert somewhere that quiet_NaN is pure though.
Comment 22 Ross Kirsling 2020-01-21 16:22:25 PST
Created bug 206560 to address the "excessive NaN checking" concern. I'm hoping the refactor presented there makes the direct issue moot.
Comment 23 Yusuke Suzuki 2020-01-21 22:41:47 PST
*** Bug 160287 has been marked as a duplicate of this bug. ***