WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 173506
ArrayPrototype methods should use JSValue::toLength for non-Arrays.
https://bugs.webkit.org/show_bug.cgi?id=173506
Summary
ArrayPrototype methods should use JSValue::toLength for non-Arrays.
Keith Miller
Reported
2017-06-16 19:26:49 PDT
ArrayPrototype getLength method use toLength for non-Arrays.
Attachments
Patch
(27.65 KB, patch)
2017-06-16 20:23 PDT
,
Keith Miller
no flags
Details
Formatted Diff
Diff
benchmark results
(95.14 KB, text/plain)
2017-06-16 20:45 PDT
,
Keith Miller
no flags
Details
Patch for landing
(27.77 KB, patch)
2017-06-16 21:03 PDT
,
Keith Miller
no flags
Details
Formatted Diff
Diff
Patch for landing
(31.37 KB, patch)
2017-06-17 00:49 PDT
,
Keith Miller
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Keith Miller
Comment 1
2017-06-16 20:23:36 PDT
Created
attachment 313176
[details]
Patch
Keith Miller
Comment 2
2017-06-16 20:45:17 PDT
Created
attachment 313177
[details]
benchmark results
Ryosuke Niwa
Comment 3
2017-06-16 20:54:26 PDT
Comment on
attachment 313176
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=313176&action=review
> Source/JavaScriptCore/runtime/ArrayPrototype.cpp:1087 > + if (UNLIKELY(doubleLength + static_cast<double>(nrArgs) > maxSafeInteger())) > + return throwVMTypeError(exec, scope, ASCIILiteral("Cannot shift to offset greater than (2 ** 53) - 1"));
It's probably worth pointing out in the change log that this is what you're fixing.
> Source/JavaScriptCore/runtime/JSCJSValue.cpp:65 > + return maxSafeInteger(); // 2 ** 53 - 1
Seems like this comment is useless now. Remove?
Keith Miller
Comment 4
2017-06-16 21:00:00 PDT
Comment on
attachment 313176
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=313176&action=review
>> Source/JavaScriptCore/runtime/ArrayPrototype.cpp:1087 >> + return throwVMTypeError(exec, scope, ASCIILiteral("Cannot shift to offset greater than (2 ** 53) - 1")); > > It's probably worth pointing out in the change log that this is what you're fixing.
Done.
>> Source/JavaScriptCore/runtime/JSCJSValue.cpp:65 >> + return maxSafeInteger(); // 2 ** 53 - 1 > > Seems like this comment is useless now. Remove?
Removed.
Keith Miller
Comment 5
2017-06-16 21:03:23 PDT
Created
attachment 313180
[details]
Patch for landing
Keith Miller
Comment 6
2017-06-16 21:06:57 PDT
Comment on
attachment 313180
[details]
Patch for landing ehh, gotta fix a bunch of sputnik tests :/
Keith Miller
Comment 7
2017-06-17 00:49:06 PDT
Created
attachment 313190
[details]
Patch for landing
WebKit Commit Bot
Comment 8
2017-06-17 03:52:37 PDT
Comment on
attachment 313190
[details]
Patch for landing Clearing flags on attachment: 313190 Committed
r218449
: <
http://trac.webkit.org/changeset/218449
>
WebKit Commit Bot
Comment 9
2017-06-17 03:52:39 PDT
All reviewed patches have been landed. Closing bug.
Darin Adler
Comment 10
2017-06-17 18:41:30 PDT
Itβs peculiar that toLength has a return type of double, but it is guaranteed to return an integer. Almost all these call sites immediately put the result into an unsigned. Is it better to put the conversion from double to integer at all these call sites rather than inside the function? Are the some callers that are more efficient because the return type is double instead of unsigned or int?
Saam Barati
Comment 11
2017-06-19 09:15:06 PDT
Comment on
attachment 313190
[details]
Patch for landing View in context:
https://bugs.webkit.org/attachment.cgi?id=313190&action=review
> Source/JavaScriptCore/ChangeLog:10 > + the getLength function, which was always incorrect to use, has
Why was this always incorrect?
Darin Adler
Comment 12
2017-06-19 21:40:20 PDT
Comment on
attachment 313190
[details]
Patch for landing View in context:
https://bugs.webkit.org/attachment.cgi?id=313190&action=review
Is
>> Source/JavaScriptCore/ChangeLog:10 >> + the getLength function, which was always incorrect to use, has > > Why was this always incorrect?
Presumably because the getLength function used toUInt32, which had the wrong semantic.
Saam Barati
Comment 13
2017-06-20 14:21:32 PDT
Comment on
attachment 313190
[details]
Patch for landing View in context:
https://bugs.webkit.org/attachment.cgi?id=313190&action=review
>>> Source/JavaScriptCore/ChangeLog:10 >>> + the getLength function, which was always incorrect to use, has >> >> Why was this always incorrect? > > Presumably because the getLength function used toUInt32, which had the wrong semantic.
Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch.
Darin Adler
Comment 14
2017-06-20 15:13:51 PDT
Comment on
attachment 313190
[details]
Patch for landing View in context:
https://bugs.webkit.org/attachment.cgi?id=313190&action=review
>>>> Source/JavaScriptCore/ChangeLog:10 >>>> + the getLength function, which was always incorrect to use, has >>> >>> Why was this always incorrect? >> >> Presumably because the getLength function used toUInt32, which had the wrong semantic. > > Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch.
I, in turn, am still concerned that we have potentially lost a fast code path that does not convert the array length to floating point and then back to an integer.
Saam Barati
Comment 15
2017-06-20 15:42:15 PDT
Comment on
attachment 313190
[details]
Patch for landing View in context:
https://bugs.webkit.org/attachment.cgi?id=313190&action=review
>>>>> Source/JavaScriptCore/ChangeLog:10 >>>>> + the getLength function, which was always incorrect to use, has >>>> >>>> Why was this always incorrect? >>> >>> Presumably because the getLength function used toUInt32, which had the wrong semantic. >> >> Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch. > > I, in turn, am still concerned that we have potentially lost a fast code path that does not convert the array length to floating point and then back to an integer.
I think you're correct. I wrote a test program that compiles with double->int and int->double conversions (that said, to be sure, I'd need to look at the disassembly in the various JS runtime functions). ``` #include <stdio.h> #include <stdlib.h> class C { public: C(int x, double y) : m_bar(x) , m_foo(y) { } unsigned m_bar; double m_foo; }; inline double toLength(C& test, bool b) { if (b) return test.m_bar; return test.m_foo; } int main(int argc, char**argv) { bool cond = !!atoi(argv[1]); unsigned i = atoi(argv[2]); C c(i, i + 0.5); unsigned test = toLength(c, cond); printf("%u\n", test); return 0; } ``` and it generates this code under O2: ``` (__TEXT,__text) section _main: 0000000100000f20 pushq %rbp 0000000100000f21 movq %rsp, %rbp 0000000100000f24 pushq %r14 0000000100000f26 pushq %rbx 0000000100000f27 movq %rsi, %rbx 0000000100000f2a movq 0x8(%rbx), %rdi 0000000100000f2e callq 0x100000f6e 0000000100000f33 movl %eax, %r14d 0000000100000f36 movq 0x10(%rbx), %rdi 0000000100000f3a callq 0x100000f6e 0000000100000f3f testl %r14d, %r14d 0000000100000f42 movl %eax, %eax 0000000100000f44 cvtsi2sdq %rax, %xmm0 0000000100000f49 jne 0x100000f53 0000000100000f4b addsd 0x4d(%rip), %xmm0 0000000100000f53 cvttsd2si %xmm0, %rsi 0000000100000f58 leaq 0x49(%rip), %rdi 0000000100000f5f xorl %eax, %eax 0000000100000f61 callq 0x100000f74 0000000100000f66 xorl %eax, %eax 0000000100000f68 popq %rbx 0000000100000f69 popq %r14 0000000100000f6b popq %rbp 0000000100000f6c retq ``` A few thoughts: - Can we write a microbenchmark that gets slower with the new code? - What would the code look like if we wanted the correct semantics while avoiding the unnecessary conversions on the JSArray fast path. I haven't given it much thought, but nothing comes to mind to make it cleaner. - My test program, and the actual toLength function, seem like the kind of thing LLVM should be smarter about avoiding the conversions.
Keith Miller
Comment 16
2017-06-20 15:44:30 PDT
(In reply to Darin Adler from
comment #14
)
> Comment on
attachment 313190
[details]
> Patch for landing > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=313190&action=review
> > >>>> Source/JavaScriptCore/ChangeLog:10 > >>>> + the getLength function, which was always incorrect to use, has > >>> > >>> Why was this always incorrect? > >> > >> Presumably because the getLength function used toUInt32, which had the wrong semantic. > > > > Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch. > > I, in turn, am still concerned that we have potentially lost a fast code > path that does not convert the array length to floating point and then back > to an integer.
Whoops, I meant to comment on this but I forgot. I think it would probably be fine to just return a uint64_t.
Keith Miller
Comment 17
2017-06-20 15:46:47 PDT
(In reply to Saam Barati from
comment #15
)
> Comment on
attachment 313190
[details]
> Patch for landing > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=313190&action=review
> > >>>>> Source/JavaScriptCore/ChangeLog:10 > >>>>> + the getLength function, which was always incorrect to use, has > >>>> > >>>> Why was this always incorrect? > >>> > >>> Presumably because the getLength function used toUInt32, which had the wrong semantic. > >> > >> Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch. > > > > I, in turn, am still concerned that we have potentially lost a fast code path that does not convert the array length to floating point and then back to an integer. > > I think you're correct. I wrote a test program that compiles with > double->int and int->double conversions (that said, to be sure, I'd need to > look at the disassembly in the various JS runtime functions). > > ``` > #include <stdio.h> > #include <stdlib.h> > > class C { > public: > C(int x, double y) > : m_bar(x) > , m_foo(y) > { } > unsigned m_bar; > double m_foo; > }; > > inline double toLength(C& test, bool b) > { > if (b) > return test.m_bar; > return test.m_foo; > } > > int main(int argc, char**argv) > { > bool cond = !!atoi(argv[1]); > unsigned i = atoi(argv[2]); > C c(i, i + 0.5); > unsigned test = toLength(c, cond); > printf("%u\n", test); > return 0; > } > ``` > > and it generates this code under O2: > ``` > (__TEXT,__text) section > _main: > 0000000100000f20 pushq %rbp > 0000000100000f21 movq %rsp, %rbp > 0000000100000f24 pushq %r14 > 0000000100000f26 pushq %rbx > 0000000100000f27 movq %rsi, %rbx > 0000000100000f2a movq 0x8(%rbx), %rdi > 0000000100000f2e callq 0x100000f6e > 0000000100000f33 movl %eax, %r14d > 0000000100000f36 movq 0x10(%rbx), %rdi > 0000000100000f3a callq 0x100000f6e > 0000000100000f3f testl %r14d, %r14d > 0000000100000f42 movl %eax, %eax > 0000000100000f44 cvtsi2sdq %rax, %xmm0 > 0000000100000f49 jne 0x100000f53 > 0000000100000f4b addsd 0x4d(%rip), %xmm0 > 0000000100000f53 cvttsd2si %xmm0, %rsi > 0000000100000f58 leaq 0x49(%rip), %rdi > 0000000100000f5f xorl %eax, %eax > 0000000100000f61 callq 0x100000f74 > 0000000100000f66 xorl %eax, %eax > 0000000100000f68 popq %rbx > 0000000100000f69 popq %r14 > 0000000100000f6b popq %rbp > 0000000100000f6c retq > ``` > > A few thoughts: > - Can we write a microbenchmark that gets slower with the new code? > - What would the code look like if we wanted the correct semantics while > avoiding the unnecessary conversions on the JSArray fast path. I haven't > given it much thought, but nothing comes to mind to make it cleaner. > - My test program, and the actual toLength function, seem like the kind of > thing LLVM should be smarter about avoiding the conversions.
Does llvm do the smart thing if you use a variant with an int and a double? I'm also not sure that it matters here. I don't think a caller will ever do anything interesting with a value > 2 ** 53 - 1.
Saam Barati
Comment 18
2017-06-20 16:04:57 PDT
Comment on
attachment 313190
[details]
Patch for landing View in context:
https://bugs.webkit.org/attachment.cgi?id=313190&action=review
>>>>>>>> Source/JavaScriptCore/ChangeLog:10 >>>>>>>> + the getLength function, which was always incorrect to use, has >>>>>>> >>>>>>> Why was this always incorrect? >>>>>> >>>>>> Presumably because the getLength function used toUInt32, which had the wrong semantic. >>>>> >>>>> Right, this makes sense. When I posted this comment, I must've read the code for toLength(ExecState* exec, JSObject* obj) incorrectly. I thought it always performed a get("length". However, it also has a fast path for JSArray, which is what I was concerned we were loosing in the code switch. >>>> >>>> I, in turn, am still concerned that we have potentially lost a fast code path that does not convert the array length to floating point and then back to an integer. >>> >>> I think you're correct. I wrote a test program that compiles with double->int and int->double conversions (that said, to be sure, I'd need to look at the disassembly in the various JS runtime functions). >>> >>> ``` >>> #include <stdio.h> >>> #include <stdlib.h> >>> >>> class C { >>> public: >>> C(int x, double y) >>> : m_bar(x) >>> , m_foo(y) >>> { } >>> unsigned m_bar; >>> double m_foo; >>> }; >>> >>> inline double toLength(C& test, bool b) >>> { >>> if (b) >>> return test.m_bar; >>> return test.m_foo; >>> } >>> >>> int main(int argc, char**argv) >>> { >>> bool cond = !!atoi(argv[1]); >>> unsigned i = atoi(argv[2]); >>> C c(i, i + 0.5); >>> unsigned test = toLength(c, cond); >>> printf("%u\n", test); >>> return 0; >>> } >>> ``` >>> >>> and it generates this code under O2: >>> ``` >>> (__TEXT,__text) section >>> _main: >>> 0000000100000f20 pushq %rbp >>> 0000000100000f21 movq %rsp, %rbp >>> 0000000100000f24 pushq %r14 >>> 0000000100000f26 pushq %rbx >>> 0000000100000f27 movq %rsi, %rbx >>> 0000000100000f2a movq 0x8(%rbx), %rdi >>> 0000000100000f2e callq 0x100000f6e >>> 0000000100000f33 movl %eax, %r14d >>> 0000000100000f36 movq 0x10(%rbx), %rdi >>> 0000000100000f3a callq 0x100000f6e >>> 0000000100000f3f testl %r14d, %r14d >>> 0000000100000f42 movl %eax, %eax >>> 0000000100000f44 cvtsi2sdq %rax, %xmm0 >>> 0000000100000f49 jne 0x100000f53 >>> 0000000100000f4b addsd 0x4d(%rip), %xmm0 >>> 0000000100000f53 cvttsd2si %xmm0, %rsi >>> 0000000100000f58 leaq 0x49(%rip), %rdi >>> 0000000100000f5f xorl %eax, %eax >>> 0000000100000f61 callq 0x100000f74 >>> 0000000100000f66 xorl %eax, %eax >>> 0000000100000f68 popq %rbx >>> 0000000100000f69 popq %r14 >>> 0000000100000f6b popq %rbp >>> 0000000100000f6c retq >>> ``` >>> >>> A few thoughts: >>> - Can we write a microbenchmark that gets slower with the new code? >>> - What would the code look like if we wanted the correct semantics while avoiding the unnecessary conversions on the JSArray fast path. I haven't given it much thought, but nothing comes to mind to make it cleaner. >>> - My test program, and the actual toLength function, seem like the kind of thing LLVM should be smarter about avoiding the conversions. >> >> Whoops, I meant to comment on this but I forgot. I think it would probably be fine to just return a uint64_t. > > Does llvm do the smart thing if you use a variant with an int and a double? > > I'm also not sure that it matters here. I don't think a caller will ever do anything interesting with a value > 2 ** 53 - 1.
After reading the spec, I'm not wondering if it's spec compliant for our code to just cast the toLength result to uint32_t. Why is this allowed? This seems observable. e.g, if
https://tc39.github.io/ecma262/#sec-array.prototype.includes
is called with a Proxy |this|.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug