RESOLVED FIXED Bug 170720
[JSC] Date.parse should accept wider range of representation
https://bugs.webkit.org/show_bug.cgi?id=170720
Summary [JSC] Date.parse should accept wider range of representation
Yusuke Suzuki
Reported 2017-04-11 02:12:43 PDT
Date.parse now conforms the spec. But it is a bit strict and users want to parse some more relaxed ones[1]. What we would like is adding a fallback path to try to parse it if the current Date.parse fails to parse. [1]: https://github.com/ariya/phantomjs/issues/11151
Attachments
Patch (8.07 KB, patch)
2017-04-13 05:23 PDT, Yusuke Suzuki
darin: review+
Yusuke Suzuki
Comment 1 2017-04-13 05:23:11 PDT
Darin Adler
Comment 2 2017-04-13 11:27:16 PDT
Comment on attachment 306985 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=306985&action=review > Source/WTF/wtf/DateMath.cpp:897 > + std::optional<int> year = std::nullopt; No need for "= std::nullopt". That’s what optionals do when not specified. > Source/WTF/wtf/DateMath.cpp:1132 > + // 1. According to the V8, this default value is compatible to the old KJS. While current V8 uses 2001, > + // SpiderMonkey does not have such a fallback path. So this reason is not big deal. I am a bit confused by this comment. V8 is the name of a JavaScript engine, not a person, so I don’t know what "according to the V8" means; maybe someone on the V8 team did some research about the behavior of older versions of WebKit? I don’t know what "compatible to the old KJS" means or why it’s relevant to our decision of what behavior is best. “The old KJS” seems quite unimportant; very few websites were tailored to assume the behavior of the KJS engine since it wasn’t used by all that many people. So maybe it’s simply a peculiar way to talk about older versions of WebKit and JavaScriptCore. The term “current V8” is not good since the comment is likely to last a long time and this may no longer be true in the future, so we should probably say this in a different way. Sometimes I use the phrase "as of this writing". Here’s a comment I might write, but I don’t know if it’s accurate: // 1. Year 2000 was used in older versions of JavaScriptCore, thus some older content // tested only with WebKit might rely on this behavior (according to research done by the V8 team). // (As of April 2017, V8 is using the year 2001 and Spider Monkey is not doing this kind of fallback.) But when written this way it becomes clear that it’s a little strange to rely on research done by the V8 team to know what the behavior of old versions of WebKit is. Can’t we just check this rather than relying on what they say? Also, it would be nice to be precise rather than just "older versions of JavaScriptCore".
Yusuke Suzuki
Comment 3 2017-04-14 00:56:14 PDT
Comment on attachment 306985 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=306985&action=review >> Source/WTF/wtf/DateMath.cpp:1132 >> + // SpiderMonkey does not have such a fallback path. So this reason is not big deal. > > I am a bit confused by this comment. V8 is the name of a JavaScript engine, not a person, so I don’t know what "according to the V8" means; maybe someone on the V8 team did some research about the behavior of older versions of WebKit? I don’t know what "compatible to the old KJS" means or why it’s relevant to our decision of what behavior is best. “The old KJS” seems quite unimportant; very few websites were tailored to assume the behavior of the KJS engine since it wasn’t used by all that many people. So maybe it’s simply a peculiar way to talk about older versions of WebKit and JavaScriptCore. The term “current V8” is not good since the comment is likely to last a long time and this may no longer be true in the future, so we should probably say this in a different way. Sometimes I use the phrase "as of this writing". Here’s a comment I might write, but I don’t know if it’s accurate: > > // 1. Year 2000 was used in older versions of JavaScriptCore, thus some older content > // tested only with WebKit might rely on this behavior (according to research done by the V8 team). > // (As of April 2017, V8 is using the year 2001 and Spider Monkey is not doing this kind of fallback.) > > But when written this way it becomes clear that it’s a little strange to rely on research done by the V8 team to know what the behavior of old versions of WebKit is. Can’t we just check this rather than relying on what they say? > > Also, it would be nice to be precise rather than just "older versions of JavaScriptCore". I've dived into old logs... This function is added at r34872 and not changed so much. While year is first initialized 0 and later it will be converted to 2000, we do not produce any values without parsing year before this patch. So, users won't see 2000 default value I think.
Yusuke Suzuki
Comment 4 2017-04-14 01:00:04 PDT
Simon Fraser (smfr)
Comment 5 2017-07-24 18:51:07 PDT
Darin Adler
Comment 6 2017-07-24 22:24:49 PDT
Do we know why it caused that, specifically what date strings were being passed?
Yusuke Suzuki
Comment 7 2017-07-25 05:04:34 PDT
Oh, that's very interesting b/c it means some libs would depend on super tricky behavior. I'll setup the environment to check what is going.
Simon Fraser (smfr)
Comment 8 2017-07-25 08:35:22 PDT
I didn't have time to do any investigation other than autospading.
Yusuke Suzuki
Comment 9 2017-07-25 11:35:59 PDT
I've investigated it. ## Situation This web site passes a string like "Jun. ‘15" to `Date.parse`. To make this site work correctly, we need to return NaN for this value. This patch relaxes Date parsing, then, for the above case, we return 2000/Jun/15 Date. This is expected behavior. As you can see the phantomjs issue in this bug, webdevs want to generate ????/Jun/15 for a string like "Jun 15". Why Chrome works fine is that it returns NaN for "Jun. ‘15". But it returns 2001/Jun/15 for "Jun. 15", "Jun 15" etc. ## In terms of the spec ECMA262's guarantee about Date.parse is not-strict. ECMA262 requires that some descriptive format can be parsed by `Date.parse` correctly. In the case of the other values, Date.parse may either succeed or fail. It's implementation dependent heuristic. So, with regard to the spec, our current behavior is correct. And ideally, users should not assume that the value must be rejected by Date.parse because the spec allows us to parse it with some heuristic. But the real world is not ideal....... :( https://tc39.github.io/ecma262/#sec-date.parse ## Possible solutions 1. Revert this patch By reverting this patch, we return NaN for "Jun. ‘15". But instead, we lose the ability to return 2001/Jun/15 for "Jun. 15", "Jun 15" etc. which is requested in phantomjs's issue (And sometimes we see this request in twitter). 2. Add a hack for "Jun. ‘15" case Add an additional heuristic for "Jun. ‘15" not to parse it as Date. Possible thing is that, if we see some leading text before the date (15), we do not recognize it as date. But it would break some existing heuristic. 3. Implement the heuristic completely same to the V8's one Strictly speaking, these heuristics are not specified in the spec. So, the request in the phantomjs issue is basically saying "We want to parse Date as if V8 parses it". The simplest solution for that is importing the heuristics in V8. 4. Request the site to fix it Ideally, anybody should not assume that Date.parse should fail with a specific format. The heuristics could recognize it. To me, for the short term solution, (1) should be appropriate. (2) is a bit too dangerous. For the long term solution, I think we have (1), (3), and (4). If we do not want to relax it, (1) and leave the current heuristic as is is the solution. If we would like to accept some form of Date which is requested in the original issue, we have (3) or (4). If we need to fix it quickly right now, we should take (1).
Simon Fraser (smfr)
Comment 10 2017-07-25 11:39:58 PDT
How do Gecko and Edge's Date parsing compare?
Yusuke Suzuki
Comment 11 2017-07-25 11:48:39 PDT
(In reply to Simon Fraser (smfr) from comment #10) > How do Gecko and Edge's Date parsing compare? Because I don't have Windows machine in my office, I cannot check it in Edge right now (anyone has Edge machine?). Firefox has different heuristic mechanism from V8. And it returns NaN for "Jun. ‘15", "Jun 15", and "Jun. 15".
Simon Fraser (smfr)
Comment 12 2017-07-25 11:52:19 PDT
Yusuke Suzuki
Comment 13 2017-07-25 12:46:45 PDT
(In reply to Simon Fraser (smfr) from comment #12) > You can get an Edge VM at > https://developer.microsoft.com/en-us/microsoft-edge/tools/vms/ Thank you! I've just downloaded and tested. Edge shows the same behavior to Firefox. And it returns NaN for "Jun. ‘15", "Jun 15", and "Jun. 15". So, not relaxing our heuristic is one idea. Basically, phantom.js's issue is raised because it is different from V8.
Note You need to log in before you can comment on or make changes to this bug.