Bug 173506 - ArrayPrototype methods should use JSValue::toLength for non-Arrays.
Summary: ArrayPrototype methods should use JSValue::toLength for non-Arrays.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Keith Miller
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-06-16 19:26 PDT by Keith Miller
Modified: 2017-06-20 16:04 PDT (History)
7 users (show)

See Also:


Attachments
Patch (27.65 KB, patch)
2017-06-16 20:23 PDT, Keith Miller
no flags Details | Formatted Diff | Diff
benchmark results (95.14 KB, text/plain)
2017-06-16 20:45 PDT, Keith Miller
no flags Details
Patch for landing (27.77 KB, patch)
2017-06-16 21:03 PDT, Keith Miller
no flags Details | Formatted Diff | Diff
Patch for landing (31.37 KB, patch)
2017-06-17 00:49 PDT, Keith Miller
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Keith Miller 2017-06-16 19:26:49 PDT
ArrayPrototype getLength method use toLength for non-Arrays.
Comment 1 Keith Miller 2017-06-16 20:23:36 PDT
Created attachment 313176 [details]
Patch
Comment 2 Keith Miller 2017-06-16 20:45:17 PDT
Created attachment 313177 [details]
benchmark results
Comment 3 Ryosuke Niwa 2017-06-16 20:54:26 PDT
Comment on attachment 313176 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=313176&action=review

> Source/JavaScriptCore/runtime/ArrayPrototype.cpp:1087
> +        if (UNLIKELY(doubleLength + static_cast<double>(nrArgs) > maxSafeInteger()))
> +            return throwVMTypeError(exec, scope, ASCIILiteral("Cannot shift to offset greater than (2 ** 53) - 1"));

It's probably worth pointing out in the change log that this is what you're fixing.

> Source/JavaScriptCore/runtime/JSCJSValue.cpp:65
> +        return maxSafeInteger(); // 2 ** 53 - 1

Seems like this comment is useless now. Remove?
Comment 4 Keith Miller 2017-06-16 21:00:00 PDT
Comment on attachment 313176 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=313176&action=review

>> Source/JavaScriptCore/runtime/ArrayPrototype.cpp:1087
>> +            return throwVMTypeError(exec, scope, ASCIILiteral("Cannot shift to offset greater than (2 ** 53) - 1"));
> 
> It's probably worth pointing out in the change log that this is what you're fixing.

Done.

>> Source/JavaScriptCore/runtime/JSCJSValue.cpp:65
>> +        return maxSafeInteger(); // 2 ** 53 - 1
> 
> Seems like this comment is useless now. Remove?

Removed.
Comment 5 Keith Miller 2017-06-16 21:03:23 PDT
Created attachment 313180 [details]
Patch for landing
Comment 6 Keith Miller 2017-06-16 21:06:57 PDT
Comment on attachment 313180 [details]
Patch for landing

ehh, gotta fix a bunch of sputnik tests :/
Comment 7 Keith Miller 2017-06-17 00:49:06 PDT
Created attachment 313190 [details]
Patch for landing
Comment 8 WebKit Commit Bot 2017-06-17 03:52:37 PDT
Comment on attachment 313190 [details]
Patch for landing

Clearing flags on attachment: 313190

Committed r218449: <http://trac.webkit.org/changeset/218449>
Comment 9 WebKit Commit Bot 2017-06-17 03:52:39 PDT
All reviewed patches have been landed.  Closing bug.
Comment 10 Darin Adler 2017-06-17 18:41:30 PDT
It’s peculiar that toLength has a return type of double, but it is guaranteed to return an integer. Almost all these call sites immediately put the result into an unsigned. Is it better to put the conversion from double to integer at all these call sites rather than inside the function? Are the some callers that are more efficient because the return type is double instead of unsigned or int?
Comment 11 Saam Barati 2017-06-19 09:15:06 PDT
Comment on attachment 313190 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=313190&action=review

> Source/JavaScriptCore/ChangeLog:10
> +        the getLength function, which was always incorrect to use, has

Why was this always incorrect?
Comment 12 Darin Adler 2017-06-19 21:40:20 PDT
Comment on attachment 313190 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=313190&action=review

Is

>> Source/JavaScriptCore/ChangeLog:10
>> +        the getLength function, which was always incorrect to use, has
> 
> Why was this always incorrect?

Presumably because the getLength function used toUInt32, which had the wrong semantic.
Comment 13 Saam Barati 2017-06-20 14:21:32 PDT
Comment on attachment 313190 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=313190&action=review

>>> Source/JavaScriptCore/ChangeLog:10
>>> +        the getLength function, which was always incorrect to use, has
>> 
>> Why was this always incorrect?
> 
> Presumably because the getLength function used toUInt32, which had the wrong semantic.

Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch.
Comment 14 Darin Adler 2017-06-20 15:13:51 PDT
Comment on attachment 313190 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=313190&action=review

>>>> Source/JavaScriptCore/ChangeLog:10
>>>> +        the getLength function, which was always incorrect to use, has
>>> 
>>> Why was this always incorrect?
>> 
>> Presumably because the getLength function used toUInt32, which had the wrong semantic.
> 
> Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch.

I, in turn, am still concerned that we have potentially lost a fast code path that does not convert the array length to floating point and then back to an integer.
Comment 15 Saam Barati 2017-06-20 15:42:15 PDT
Comment on attachment 313190 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=313190&action=review

>>>>> Source/JavaScriptCore/ChangeLog:10
>>>>> +        the getLength function, which was always incorrect to use, has
>>>> 
>>>> Why was this always incorrect?
>>> 
>>> Presumably because the getLength function used toUInt32, which had the wrong semantic.
>> 
>> Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch.
> 
> I, in turn, am still concerned that we have potentially lost a fast code path that does not convert the array length to floating point and then back to an integer.

I think you're correct. I wrote a test program that compiles with double->int and int->double conversions (that said, to be sure, I'd need to look at the disassembly in the various JS runtime functions).

```
#include <stdio.h>
#include <stdlib.h>

class C {
public:
    C(int x, double y)
        : m_bar(x)
        , m_foo(y)
    { }
    unsigned m_bar;
    double m_foo;
};

inline double toLength(C& test, bool b)
{
    if (b)
        return test.m_bar;
    return test.m_foo;
}

int main(int argc, char**argv)
{
    bool cond = !!atoi(argv[1]);
    unsigned i = atoi(argv[2]);
    C c(i, i + 0.5);
    unsigned test = toLength(c, cond);
    printf("%u\n", test);
    return 0;
}
```

and it generates this code under O2:
```
(__TEXT,__text) section
_main:
0000000100000f20	pushq	%rbp
0000000100000f21	movq	%rsp, %rbp
0000000100000f24	pushq	%r14
0000000100000f26	pushq	%rbx
0000000100000f27	movq	%rsi, %rbx
0000000100000f2a	movq	0x8(%rbx), %rdi
0000000100000f2e	callq	0x100000f6e
0000000100000f33	movl	%eax, %r14d
0000000100000f36	movq	0x10(%rbx), %rdi
0000000100000f3a	callq	0x100000f6e
0000000100000f3f	testl	%r14d, %r14d
0000000100000f42	movl	%eax, %eax
0000000100000f44	cvtsi2sdq	%rax, %xmm0
0000000100000f49	jne	0x100000f53
0000000100000f4b	addsd	0x4d(%rip), %xmm0
0000000100000f53	cvttsd2si	%xmm0, %rsi
0000000100000f58	leaq	0x49(%rip), %rdi
0000000100000f5f	xorl	%eax, %eax
0000000100000f61	callq	0x100000f74
0000000100000f66	xorl	%eax, %eax
0000000100000f68	popq	%rbx
0000000100000f69	popq	%r14
0000000100000f6b	popq	%rbp
0000000100000f6c	retq
```

A few thoughts:
- Can we write a microbenchmark that gets slower with the new code?
- What would the code look like if we wanted the correct semantics while avoiding the unnecessary conversions on the JSArray fast path. I haven't given it much thought, but nothing comes to mind to make it cleaner.
- My test program, and the actual toLength function, seem like the kind of thing LLVM should be smarter about avoiding the conversions.
Comment 16 Keith Miller 2017-06-20 15:44:30 PDT
(In reply to Darin Adler from comment #14)
> Comment on attachment 313190 [details]
> Patch for landing
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=313190&action=review
> 
> >>>> Source/JavaScriptCore/ChangeLog:10
> >>>> +        the getLength function, which was always incorrect to use, has
> >>> 
> >>> Why was this always incorrect?
> >> 
> >> Presumably because the getLength function used toUInt32, which had the wrong semantic.
> > 
> > Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch.
> 
> I, in turn, am still concerned that we have potentially lost a fast code
> path that does not convert the array length to floating point and then back
> to an integer.

Whoops, I meant to comment on this but I forgot. I think it would probably be fine to just return a uint64_t.
Comment 17 Keith Miller 2017-06-20 15:46:47 PDT
(In reply to Saam Barati from comment #15)
> Comment on attachment 313190 [details]
> Patch for landing
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=313190&action=review
> 
> >>>>> Source/JavaScriptCore/ChangeLog:10
> >>>>> +        the getLength function, which was always incorrect to use, has
> >>>> 
> >>>> Why was this always incorrect?
> >>> 
> >>> Presumably because the getLength function used toUInt32, which had the wrong semantic.
> >> 
> >> Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch.
> > 
> > I, in turn, am still concerned that we have potentially lost a fast code path that does not convert the array length to floating point and then back to an integer.
> 
> I think you're correct. I wrote a test program that compiles with
> double->int and int->double conversions (that said, to be sure, I'd need to
> look at the disassembly in the various JS runtime functions).
> 
> ```
> #include <stdio.h>
> #include <stdlib.h>
> 
> class C {
> public:
>     C(int x, double y)
>         : m_bar(x)
>         , m_foo(y)
>     { }
>     unsigned m_bar;
>     double m_foo;
> };
> 
> inline double toLength(C& test, bool b)
> {
>     if (b)
>         return test.m_bar;
>     return test.m_foo;
> }
> 
> int main(int argc, char**argv)
> {
>     bool cond = !!atoi(argv[1]);
>     unsigned i = atoi(argv[2]);
>     C c(i, i + 0.5);
>     unsigned test = toLength(c, cond);
>     printf("%u\n", test);
>     return 0;
> }
> ```
> 
> and it generates this code under O2:
> ```
> (__TEXT,__text) section
> _main:
> 0000000100000f20	pushq	%rbp
> 0000000100000f21	movq	%rsp, %rbp
> 0000000100000f24	pushq	%r14
> 0000000100000f26	pushq	%rbx
> 0000000100000f27	movq	%rsi, %rbx
> 0000000100000f2a	movq	0x8(%rbx), %rdi
> 0000000100000f2e	callq	0x100000f6e
> 0000000100000f33	movl	%eax, %r14d
> 0000000100000f36	movq	0x10(%rbx), %rdi
> 0000000100000f3a	callq	0x100000f6e
> 0000000100000f3f	testl	%r14d, %r14d
> 0000000100000f42	movl	%eax, %eax
> 0000000100000f44	cvtsi2sdq	%rax, %xmm0
> 0000000100000f49	jne	0x100000f53
> 0000000100000f4b	addsd	0x4d(%rip), %xmm0
> 0000000100000f53	cvttsd2si	%xmm0, %rsi
> 0000000100000f58	leaq	0x49(%rip), %rdi
> 0000000100000f5f	xorl	%eax, %eax
> 0000000100000f61	callq	0x100000f74
> 0000000100000f66	xorl	%eax, %eax
> 0000000100000f68	popq	%rbx
> 0000000100000f69	popq	%r14
> 0000000100000f6b	popq	%rbp
> 0000000100000f6c	retq
> ```
> 
> A few thoughts:
> - Can we write a microbenchmark that gets slower with the new code?
> - What would the code look like if we wanted the correct semantics while
> avoiding the unnecessary conversions on the JSArray fast path. I haven't
> given it much thought, but nothing comes to mind to make it cleaner.
> - My test program, and the actual toLength function, seem like the kind of
> thing LLVM should be smarter about avoiding the conversions.

Does llvm do the smart thing if you use a variant with an int and a double? 

I'm also not sure that it matters here. I don't think a caller will ever do anything interesting with a value > 2 ** 53 - 1.
Comment 18 Saam Barati 2017-06-20 16:04:57 PDT
Comment on attachment 313190 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=313190&action=review

>>>>>>>> Source/JavaScriptCore/ChangeLog:10
>>>>>>>> +        the getLength function, which was always incorrect to use, has
>>>>>>> 
>>>>>>> Why was this always incorrect?
>>>>>> 
>>>>>> Presumably because the getLength function used toUInt32, which had the wrong semantic.
>>>>> 
>>>>> Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch.
>>>> 
>>>> I, in turn, am still concerned that we have potentially lost a fast code path that does not convert the array length to floating point and then back to an integer.
>>> 
>>> I think you're correct. I wrote a test program that compiles with double->int and int->double conversions (that said, to be sure, I'd need to look at the disassembly in the various JS runtime functions).
>>> 
>>> ```
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> 
>>> class C {
>>> public:
>>>     C(int x, double y)
>>>         : m_bar(x)
>>>         , m_foo(y)
>>>     { }
>>>     unsigned m_bar;
>>>     double m_foo;
>>> };
>>> 
>>> inline double toLength(C& test, bool b)
>>> {
>>>     if (b)
>>>         return test.m_bar;
>>>     return test.m_foo;
>>> }
>>> 
>>> int main(int argc, char**argv)
>>> {
>>>     bool cond = !!atoi(argv[1]);
>>>     unsigned i = atoi(argv[2]);
>>>     C c(i, i + 0.5);
>>>     unsigned test = toLength(c, cond);
>>>     printf("%u\n", test);
>>>     return 0;
>>> }
>>> ```
>>> 
>>> and it generates this code under O2:
>>> ```
>>> (__TEXT,__text) section
>>> _main:
>>> 0000000100000f20	pushq	%rbp
>>> 0000000100000f21	movq	%rsp, %rbp
>>> 0000000100000f24	pushq	%r14
>>> 0000000100000f26	pushq	%rbx
>>> 0000000100000f27	movq	%rsi, %rbx
>>> 0000000100000f2a	movq	0x8(%rbx), %rdi
>>> 0000000100000f2e	callq	0x100000f6e
>>> 0000000100000f33	movl	%eax, %r14d
>>> 0000000100000f36	movq	0x10(%rbx), %rdi
>>> 0000000100000f3a	callq	0x100000f6e
>>> 0000000100000f3f	testl	%r14d, %r14d
>>> 0000000100000f42	movl	%eax, %eax
>>> 0000000100000f44	cvtsi2sdq	%rax, %xmm0
>>> 0000000100000f49	jne	0x100000f53
>>> 0000000100000f4b	addsd	0x4d(%rip), %xmm0
>>> 0000000100000f53	cvttsd2si	%xmm0, %rsi
>>> 0000000100000f58	leaq	0x49(%rip), %rdi
>>> 0000000100000f5f	xorl	%eax, %eax
>>> 0000000100000f61	callq	0x100000f74
>>> 0000000100000f66	xorl	%eax, %eax
>>> 0000000100000f68	popq	%rbx
>>> 0000000100000f69	popq	%r14
>>> 0000000100000f6b	popq	%rbp
>>> 0000000100000f6c	retq
>>> ```
>>> 
>>> A few thoughts:
>>> - Can we write a microbenchmark that gets slower with the new code?
>>> - What would the code look like if we wanted the correct semantics while avoiding the unnecessary conversions on the JSArray fast path. I haven't given it much thought, but nothing comes to mind to make it cleaner.
>>> - My test program, and the actual toLength function, seem like the kind of thing LLVM should be smarter about avoiding the conversions.
>> 
>> Whoops, I meant to comment on this but I forgot. I think it would probably be fine to just return a uint64_t.
> 
> Does llvm do the smart thing if you use a variant with an int and a double? 
> 
> I'm also not sure that it matters here. I don't think a caller will ever do anything interesting with a value > 2 ** 53 - 1.

After reading the spec, I'm not wondering if it's spec compliant for our code to just cast the toLength result to uint32_t. Why is this allowed?
This seems observable. e.g, if https://tc39.github.io/ecma262/#sec-array.prototype.includes is called with a Proxy |this|.