RESOLVED FIXED Bug 173506
ArrayPrototype methods should use JSValue::toLength for non-Arrays.
https://bugs.webkit.org/show_bug.cgi?id=173506
Summary ArrayPrototype methods should use JSValue::toLength for non-Arrays.
Keith Miller
Reported 2017-06-16 19:26:49 PDT
ArrayPrototype getLength method use toLength for non-Arrays.
Attachments
Patch (27.65 KB, patch)
2017-06-16 20:23 PDT, Keith Miller
no flags
benchmark results (95.14 KB, text/plain)
2017-06-16 20:45 PDT, Keith Miller
no flags
Patch for landing (27.77 KB, patch)
2017-06-16 21:03 PDT, Keith Miller
no flags
Patch for landing (31.37 KB, patch)
2017-06-17 00:49 PDT, Keith Miller
no flags
Keith Miller
Comment 1 2017-06-16 20:23:36 PDT
Keith Miller
Comment 2 2017-06-16 20:45:17 PDT
Created attachment 313177 [details] benchmark results
Ryosuke Niwa
Comment 3 2017-06-16 20:54:26 PDT
Comment on attachment 313176 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=313176&action=review > Source/JavaScriptCore/runtime/ArrayPrototype.cpp:1087 > + if (UNLIKELY(doubleLength + static_cast<double>(nrArgs) > maxSafeInteger())) > + return throwVMTypeError(exec, scope, ASCIILiteral("Cannot shift to offset greater than (2 ** 53) - 1")); It's probably worth pointing out in the change log that this is what you're fixing. > Source/JavaScriptCore/runtime/JSCJSValue.cpp:65 > + return maxSafeInteger(); // 2 ** 53 - 1 Seems like this comment is useless now. Remove?
Keith Miller
Comment 4 2017-06-16 21:00:00 PDT
Comment on attachment 313176 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=313176&action=review >> Source/JavaScriptCore/runtime/ArrayPrototype.cpp:1087 >> + return throwVMTypeError(exec, scope, ASCIILiteral("Cannot shift to offset greater than (2 ** 53) - 1")); > > It's probably worth pointing out in the change log that this is what you're fixing. Done. >> Source/JavaScriptCore/runtime/JSCJSValue.cpp:65 >> + return maxSafeInteger(); // 2 ** 53 - 1 > > Seems like this comment is useless now. Remove? Removed.
Keith Miller
Comment 5 2017-06-16 21:03:23 PDT
Created attachment 313180 [details] Patch for landing
Keith Miller
Comment 6 2017-06-16 21:06:57 PDT
Comment on attachment 313180 [details] Patch for landing ehh, gotta fix a bunch of sputnik tests :/
Keith Miller
Comment 7 2017-06-17 00:49:06 PDT
Created attachment 313190 [details] Patch for landing
WebKit Commit Bot
Comment 8 2017-06-17 03:52:37 PDT
Comment on attachment 313190 [details] Patch for landing Clearing flags on attachment: 313190 Committed r218449: <http://trac.webkit.org/changeset/218449>
WebKit Commit Bot
Comment 9 2017-06-17 03:52:39 PDT
All reviewed patches have been landed. Closing bug.
Darin Adler
Comment 10 2017-06-17 18:41:30 PDT
It’s peculiar that toLength has a return type of double, but it is guaranteed to return an integer. Almost all these call sites immediately put the result into an unsigned. Is it better to put the conversion from double to integer at all these call sites rather than inside the function? Are the some callers that are more efficient because the return type is double instead of unsigned or int?
Saam Barati
Comment 11 2017-06-19 09:15:06 PDT
Comment on attachment 313190 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=313190&action=review > Source/JavaScriptCore/ChangeLog:10 > + the getLength function, which was always incorrect to use, has Why was this always incorrect?
Darin Adler
Comment 12 2017-06-19 21:40:20 PDT
Comment on attachment 313190 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=313190&action=review Is >> Source/JavaScriptCore/ChangeLog:10 >> + the getLength function, which was always incorrect to use, has > > Why was this always incorrect? Presumably because the getLength function used toUInt32, which had the wrong semantic.
Saam Barati
Comment 13 2017-06-20 14:21:32 PDT
Comment on attachment 313190 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=313190&action=review >>> Source/JavaScriptCore/ChangeLog:10 >>> + the getLength function, which was always incorrect to use, has >> >> Why was this always incorrect? > > Presumably because the getLength function used toUInt32, which had the wrong semantic. Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch.
Darin Adler
Comment 14 2017-06-20 15:13:51 PDT
Comment on attachment 313190 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=313190&action=review >>>> Source/JavaScriptCore/ChangeLog:10 >>>> + the getLength function, which was always incorrect to use, has >>> >>> Why was this always incorrect? >> >> Presumably because the getLength function used toUInt32, which had the wrong semantic. > > Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch. I, in turn, am still concerned that we have potentially lost a fast code path that does not convert the array length to floating point and then back to an integer.
Saam Barati
Comment 15 2017-06-20 15:42:15 PDT
Comment on attachment 313190 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=313190&action=review >>>>> Source/JavaScriptCore/ChangeLog:10 >>>>> + the getLength function, which was always incorrect to use, has >>>> >>>> Why was this always incorrect? >>> >>> Presumably because the getLength function used toUInt32, which had the wrong semantic. >> >> Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch. > > I, in turn, am still concerned that we have potentially lost a fast code path that does not convert the array length to floating point and then back to an integer. I think you're correct. I wrote a test program that compiles with double->int and int->double conversions (that said, to be sure, I'd need to look at the disassembly in the various JS runtime functions). ``` #include <stdio.h> #include <stdlib.h> class C { public: C(int x, double y) : m_bar(x) , m_foo(y) { } unsigned m_bar; double m_foo; }; inline double toLength(C& test, bool b) { if (b) return test.m_bar; return test.m_foo; } int main(int argc, char**argv) { bool cond = !!atoi(argv[1]); unsigned i = atoi(argv[2]); C c(i, i + 0.5); unsigned test = toLength(c, cond); printf("%u\n", test); return 0; } ``` and it generates this code under O2: ``` (__TEXT,__text) section _main: 0000000100000f20 pushq %rbp 0000000100000f21 movq %rsp, %rbp 0000000100000f24 pushq %r14 0000000100000f26 pushq %rbx 0000000100000f27 movq %rsi, %rbx 0000000100000f2a movq 0x8(%rbx), %rdi 0000000100000f2e callq 0x100000f6e 0000000100000f33 movl %eax, %r14d 0000000100000f36 movq 0x10(%rbx), %rdi 0000000100000f3a callq 0x100000f6e 0000000100000f3f testl %r14d, %r14d 0000000100000f42 movl %eax, %eax 0000000100000f44 cvtsi2sdq %rax, %xmm0 0000000100000f49 jne 0x100000f53 0000000100000f4b addsd 0x4d(%rip), %xmm0 0000000100000f53 cvttsd2si %xmm0, %rsi 0000000100000f58 leaq 0x49(%rip), %rdi 0000000100000f5f xorl %eax, %eax 0000000100000f61 callq 0x100000f74 0000000100000f66 xorl %eax, %eax 0000000100000f68 popq %rbx 0000000100000f69 popq %r14 0000000100000f6b popq %rbp 0000000100000f6c retq ``` A few thoughts: - Can we write a microbenchmark that gets slower with the new code? - What would the code look like if we wanted the correct semantics while avoiding the unnecessary conversions on the JSArray fast path. I haven't given it much thought, but nothing comes to mind to make it cleaner. - My test program, and the actual toLength function, seem like the kind of thing LLVM should be smarter about avoiding the conversions.
Keith Miller
Comment 16 2017-06-20 15:44:30 PDT
(In reply to Darin Adler from comment #14) > Comment on attachment 313190 [details] > Patch for landing > > View in context: > https://bugs.webkit.org/attachment.cgi?id=313190&action=review > > >>>> Source/JavaScriptCore/ChangeLog:10 > >>>> + the getLength function, which was always incorrect to use, has > >>> > >>> Why was this always incorrect? > >> > >> Presumably because the getLength function used toUInt32, which had the wrong semantic. > > > > Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch. > > I, in turn, am still concerned that we have potentially lost a fast code > path that does not convert the array length to floating point and then back > to an integer. Whoops, I meant to comment on this but I forgot. I think it would probably be fine to just return a uint64_t.
Keith Miller
Comment 17 2017-06-20 15:46:47 PDT
(In reply to Saam Barati from comment #15) > Comment on attachment 313190 [details] > Patch for landing > > View in context: > https://bugs.webkit.org/attachment.cgi?id=313190&action=review > > >>>>> Source/JavaScriptCore/ChangeLog:10 > >>>>> + the getLength function, which was always incorrect to use, has > >>>> > >>>> Why was this always incorrect? > >>> > >>> Presumably because the getLength function used toUInt32, which had the wrong semantic. > >> > >> Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch. > > > > I, in turn, am still concerned that we have potentially lost a fast code path that does not convert the array length to floating point and then back to an integer. > > I think you're correct. I wrote a test program that compiles with > double->int and int->double conversions (that said, to be sure, I'd need to > look at the disassembly in the various JS runtime functions). > > ``` > #include <stdio.h> > #include <stdlib.h> > > class C { > public: > C(int x, double y) > : m_bar(x) > , m_foo(y) > { } > unsigned m_bar; > double m_foo; > }; > > inline double toLength(C& test, bool b) > { > if (b) > return test.m_bar; > return test.m_foo; > } > > int main(int argc, char**argv) > { > bool cond = !!atoi(argv[1]); > unsigned i = atoi(argv[2]); > C c(i, i + 0.5); > unsigned test = toLength(c, cond); > printf("%u\n", test); > return 0; > } > ``` > > and it generates this code under O2: > ``` > (__TEXT,__text) section > _main: > 0000000100000f20 pushq %rbp > 0000000100000f21 movq %rsp, %rbp > 0000000100000f24 pushq %r14 > 0000000100000f26 pushq %rbx > 0000000100000f27 movq %rsi, %rbx > 0000000100000f2a movq 0x8(%rbx), %rdi > 0000000100000f2e callq 0x100000f6e > 0000000100000f33 movl %eax, %r14d > 0000000100000f36 movq 0x10(%rbx), %rdi > 0000000100000f3a callq 0x100000f6e > 0000000100000f3f testl %r14d, %r14d > 0000000100000f42 movl %eax, %eax > 0000000100000f44 cvtsi2sdq %rax, %xmm0 > 0000000100000f49 jne 0x100000f53 > 0000000100000f4b addsd 0x4d(%rip), %xmm0 > 0000000100000f53 cvttsd2si %xmm0, %rsi > 0000000100000f58 leaq 0x49(%rip), %rdi > 0000000100000f5f xorl %eax, %eax > 0000000100000f61 callq 0x100000f74 > 0000000100000f66 xorl %eax, %eax > 0000000100000f68 popq %rbx > 0000000100000f69 popq %r14 > 0000000100000f6b popq %rbp > 0000000100000f6c retq > ``` > > A few thoughts: > - Can we write a microbenchmark that gets slower with the new code? > - What would the code look like if we wanted the correct semantics while > avoiding the unnecessary conversions on the JSArray fast path. I haven't > given it much thought, but nothing comes to mind to make it cleaner. > - My test program, and the actual toLength function, seem like the kind of > thing LLVM should be smarter about avoiding the conversions. Does llvm do the smart thing if you use a variant with an int and a double? I'm also not sure that it matters here. I don't think a caller will ever do anything interesting with a value > 2 ** 53 - 1.
Saam Barati
Comment 18 2017-06-20 16:04:57 PDT
Comment on attachment 313190 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=313190&action=review >>>>>>>> Source/JavaScriptCore/ChangeLog:10 >>>>>>>> + the getLength function, which was always incorrect to use, has >>>>>>> >>>>>>> Why was this always incorrect? >>>>>> >>>>>> Presumably because the getLength function used toUInt32, which had the wrong semantic. >>>>> >>>>> Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch. >>>> >>>> I, in turn, am still concerned that we have potentially lost a fast code path that does not convert the array length to floating point and then back to an integer. >>> >>> I think you're correct. I wrote a test program that compiles with double->int and int->double conversions (that said, to be sure, I'd need to look at the disassembly in the various JS runtime functions). >>> >>> ``` >>> #include <stdio.h> >>> #include <stdlib.h> >>> >>> class C { >>> public: >>> C(int x, double y) >>> : m_bar(x) >>> , m_foo(y) >>> { } >>> unsigned m_bar; >>> double m_foo; >>> }; >>> >>> inline double toLength(C& test, bool b) >>> { >>> if (b) >>> return test.m_bar; >>> return test.m_foo; >>> } >>> >>> int main(int argc, char**argv) >>> { >>> bool cond = !!atoi(argv[1]); >>> unsigned i = atoi(argv[2]); >>> C c(i, i + 0.5); >>> unsigned test = toLength(c, cond); >>> printf("%u\n", test); >>> return 0; >>> } >>> ``` >>> >>> and it generates this code under O2: >>> ``` >>> (__TEXT,__text) section >>> _main: >>> 0000000100000f20 pushq %rbp >>> 0000000100000f21 movq %rsp, %rbp >>> 0000000100000f24 pushq %r14 >>> 0000000100000f26 pushq %rbx >>> 0000000100000f27 movq %rsi, %rbx >>> 0000000100000f2a movq 0x8(%rbx), %rdi >>> 0000000100000f2e callq 0x100000f6e >>> 0000000100000f33 movl %eax, %r14d >>> 0000000100000f36 movq 0x10(%rbx), %rdi >>> 0000000100000f3a callq 0x100000f6e >>> 0000000100000f3f testl %r14d, %r14d >>> 0000000100000f42 movl %eax, %eax >>> 0000000100000f44 cvtsi2sdq %rax, %xmm0 >>> 0000000100000f49 jne 0x100000f53 >>> 0000000100000f4b addsd 0x4d(%rip), %xmm0 >>> 0000000100000f53 cvttsd2si %xmm0, %rsi >>> 0000000100000f58 leaq 0x49(%rip), %rdi >>> 0000000100000f5f xorl %eax, %eax >>> 0000000100000f61 callq 0x100000f74 >>> 0000000100000f66 xorl %eax, %eax >>> 0000000100000f68 popq %rbx >>> 0000000100000f69 popq %r14 >>> 0000000100000f6b popq %rbp >>> 0000000100000f6c retq >>> ``` >>> >>> A few thoughts: >>> - Can we write a microbenchmark that gets slower with the new code? >>> - What would the code look like if we wanted the correct semantics while avoiding the unnecessary conversions on the JSArray fast path. I haven't given it much thought, but nothing comes to mind to make it cleaner. >>> - My test program, and the actual toLength function, seem like the kind of thing LLVM should be smarter about avoiding the conversions. >> >> Whoops, I meant to comment on this but I forgot. I think it would probably be fine to just return a uint64_t. > > Does llvm do the smart thing if you use a variant with an int and a double? > > I'm also not sure that it matters here. I don't think a caller will ever do anything interesting with a value > 2 ** 53 - 1. After reading the spec, I'm not wondering if it's spec compliant for our code to just cast the toLength result to uint32_t. Why is this allowed? This seems observable. e.g, if https://tc39.github.io/ecma262/#sec-array.prototype.includes is called with a Proxy |this|.
Note You need to log in before you can comment on or make changes to this bug.