| Summary: | Allow WASM to use up to 4GB | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | WebKit | Reporter: | Robin Morisset <rmorisset> | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Component: | JavaScriptCore | Assignee: | Robin Morisset <rmorisset> | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Status: | RESOLVED FIXED | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Severity: | Normal | CC: | alecflett, beidson, benjamin, cdumez, cmarcelo, eric.carlson, ews-watchlist, glenn, jer.noble, jiewen_tan, jonlee, jsbell, keith_miller, mark.lam, mcatanzaro, msaboff, philipj, saam, sergio, tzagallo, webkit-bug-importer, ysuzuki | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Priority: | P1 | Keywords: | InRadar | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Version: | WebKit Nightly Build | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Hardware: | Unspecified | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| OS: | Unspecified | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| See Also: |
https://bugs.webkit.org/show_bug.cgi?id=231795 https://bugs.webkit.org/show_bug.cgi?id=231797 https://bugs.webkit.org/show_bug.cgi?id=232244 |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Bug Depends on: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Bug Blocks: | 231795, 231797, 231935, 231975 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Attachments: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Description
Robin Morisset
2021-08-20 15:01:19 PDT
Created attachment 438994 [details]
WIP
Created attachment 439209 [details]
WIP
Created attachment 439218 [details]
WIP
Created attachment 439369 [details]
WIP
Created attachment 439798 [details]
WIP
Created attachment 439880 [details]
WIP
Created attachment 439884 [details]
WIP
Created attachment 439902 [details]
WIP
Created attachment 439926 [details]
WIP
Created attachment 439932 [details]
WIP
Created attachment 439945 [details]
WIP
Created attachment 439967 [details]
WIP
Created attachment 440112 [details]
WIP
Created attachment 440136 [details]
WIP
Created attachment 440258 [details]
Patch
Still missing a bunch of tests, but should otherwise be ready for review.
Comment on attachment 440258 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440258&action=review Drive-by comment. > Source/JavaScriptCore/ChangeLog:10 > + - 4GB is not representable in a uint32_t, so I changed all length of ArrayBuffer/TypedArray/etc.. to being size_t I wonder if we should use either UCPURegister or uintptr_t. If what we need to know is sizeof CPU register, then use UCPURegister. If what we need to know is sizeof points, then use uintptr_t. size_t sounds platform dependent. The size definition of it sounds vague. (for example, ARM64_32 / x32, what is sizeof(size_t)?) Here is information. If it is ARM64_32, then sizeof(UCPURegister) is 8, while sizeof(uintptr_t) is 4. If it is ARM64, then sizeof(UCPURegister) is 8, while sizeof(uintptr_t) is 8. If it is X64, then sizeof(UCPURegister) is 8, while sizeof(uintptr_t) is 8. BTW, USE(JSVALUE64) == CPU(REGISTER64). So in ARM64_32 environment, USE(JSVALUE64) is true (while sizeof(uintptr_t) is 4). > Source/JavaScriptCore/bytecode/AccessCase.cpp:1197 > +#if USE(JSVALUE64) How about using USE(LARGE_TYPED_ARRAYS)? > Source/JavaScriptCore/bytecode/AccessCase.cpp:1199 > + jit.load64(CCallHelpers::Address(baseGPR, JSArrayBufferView::offsetOfLength()), scratchGPR); What is the sizeof(size_t) in ARM64_32 environment? If it is 4, then let's not use USE(JSVALUE64) since ARM64_32 is using JSVALUE64 while it is using 4-byte pointers. While ARM64_32 is not using JIT yet, we had several refactoring for the preparation of that, so we should not break things. I think we should use either UCPURegister or uintptr_t. If we want to tie USE(JSVALUE64) to USE(LARGE_TYPED_ARRAYS), then we should use UCPURegister. > Source/JavaScriptCore/bytecode/AccessCase.cpp:1638 > +#if USE(JSVALUE64) Ditto. Comment on attachment 440258 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440258&action=review Post the current comments. > Source/WTF/wtf/CheckedArithmetic.h:1001 > +template<typename T> bool isSumSmallerThanOrEqual(T a, T b, T bound) > +{ > + Checked<T> sum = a; Should we take OverflowHandler like safeAdd and set that? Currently, Checked<T> is always using CrashOnOverflow. So, `return !sum.hasOverflowed() && sum.value() <= bound;` path is not used much (when overflow happens, we crash at `sum += b` place. > Source/WebCore/bindings/js/SerializedScriptValue.cpp:1268 > + Checked<uint32_t> dataLength = data->data().length(); > + write(dataLength); > + write(data->data().data(), dataLength); Can you make it DataClone failure instead of crash when transferring 4GB ArrayBuffer? > Source/WebCore/bindings/js/SerializedScriptValue.cpp:1317 > + Checked<uint32_t> byteLength = arrayBuffer->byteLength(); > + write(byteLength); > + write(static_cast<const uint8_t*>(arrayBuffer->data()), byteLength); Can you make it DataClone failure instead of crash when transferring 4GB ArrayBuffer? > Source/WebCore/fileapi/NetworkSendQueue.cpp:71 > + enqueue(JSC::ArrayBuffer::create(static_cast<size_t>(0U), 1), 0, 0); Probably this change not necessary. Thanks for your comment. I agree that UCPURegister makes more sense, I'll move to it. (In reply to Yusuke Suzuki from comment #17) > Comment on attachment 440258 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=440258&action=review > > Drive-by comment. > > > Source/JavaScriptCore/ChangeLog:10 > > + - 4GB is not representable in a uint32_t, so I changed all length of ArrayBuffer/TypedArray/etc.. to being size_t > > I wonder if we should use either UCPURegister or uintptr_t. > If what we need to know is sizeof CPU register, then use UCPURegister. > If what we need to know is sizeof points, then use uintptr_t. > size_t sounds platform dependent. The size definition of it sounds vague. > (for example, ARM64_32 / x32, what is sizeof(size_t)?) > > Here is information. > If it is ARM64_32, then sizeof(UCPURegister) is 8, while sizeof(uintptr_t) > is 4. > If it is ARM64, then sizeof(UCPURegister) is 8, while sizeof(uintptr_t) is 8. > If it is X64, then sizeof(UCPURegister) is 8, while sizeof(uintptr_t) is 8. > > BTW, USE(JSVALUE64) == CPU(REGISTER64). So in ARM64_32 environment, > USE(JSVALUE64) is true (while sizeof(uintptr_t) is 4). > > > Source/JavaScriptCore/bytecode/AccessCase.cpp:1197 > > +#if USE(JSVALUE64) > > How about using USE(LARGE_TYPED_ARRAYS)? > > > Source/JavaScriptCore/bytecode/AccessCase.cpp:1199 > > + jit.load64(CCallHelpers::Address(baseGPR, JSArrayBufferView::offsetOfLength()), scratchGPR); > > What is the sizeof(size_t) in ARM64_32 environment? > If it is 4, then let's not use USE(JSVALUE64) since ARM64_32 is using > JSVALUE64 while it is using 4-byte pointers. > While ARM64_32 is not using JIT yet, we had several refactoring for the > preparation of that, so we should not break things. > > I think we should use either UCPURegister or uintptr_t. If we want to tie > USE(JSVALUE64) to USE(LARGE_TYPED_ARRAYS), then we should use UCPURegister. > > > Source/JavaScriptCore/bytecode/AccessCase.cpp:1638 > > +#if USE(JSVALUE64) > > Ditto. > > Source/WTF/wtf/CheckedArithmetic.h:1001 > > +template<typename T> bool isSumSmallerThanOrEqual(T a, T b, T bound) > > +{ > > + Checked<T> sum = a; > > Should we take OverflowHandler like safeAdd and set that? > Currently, Checked<T> is always using CrashOnOverflow. > So, `return !sum.hasOverflowed() && sum.value() <= bound;` path is not used > much (when overflow happens, we crash at `sum += b` place. Oops, good catch, thank you. I'll replace Checked<T> by Checked<T, RecordOnOverflow>. > > Source/WebCore/bindings/js/SerializedScriptValue.cpp:1268 > > + Checked<uint32_t> dataLength = data->data().length(); > > + write(dataLength); > > + write(data->data().data(), dataLength); > > Can you make it DataClone failure instead of crash when transferring 4GB > ArrayBuffer? > > > Source/WebCore/bindings/js/SerializedScriptValue.cpp:1317 > > + Checked<uint32_t> byteLength = arrayBuffer->byteLength(); > > + write(byteLength); > > + write(static_cast<const uint8_t*>(arrayBuffer->data()), byteLength); > > Can you make it DataClone failure instead of crash when transferring 4GB > ArrayBuffer? will do. Comment on attachment 440258 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440258&action=review More drive-by comments. > Source/JavaScriptCore/runtime/JSGenericTypedArrayViewInlines.h:320 > + for (size_t i = 0; i < length; ++i) { > + // get() for now probably does not do the right thing for indices above UINT_MAX. > + RELEASE_ASSERT(isInBounds<unsigned>(i + objectOffset)); > + JSValue value = object->get(globalObject, static_cast<unsigned>(i + objectOffset)); Let's split this loop into two loops. One for unsigned region (up to UINT32_MAX - 1 since UINT32_MAX is not a right index), and one for non-unsigned region. Currently, we will crash if we have 4GB < index. And I think slow behavior is OK, but crash is not OK. JSObject::get is designed for JS's indexes. And 4GB region is not included in JS's index. So for typed-array, we should have special handling. By splitting the loop into two, we can avoid running the above RELEASE_ASSERT for each iteration in the first loop. And let's just perform super slow code, like, generating PropertyName via Identifier::from. >> Source/WTF/wtf/CheckedArithmetic.h:1001 >> + Checked<T> sum = a; > > Should we take OverflowHandler like safeAdd and set that? > Currently, Checked<T> is always using CrashOnOverflow. > So, `return !sum.hasOverflowed() && sum.value() <= bound;` path is not used much (when overflow happens, we crash at `sum += b` place. Probably, you need to specify RecordOverflow Comment on attachment 440258 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440258&action=review I think this is important question, so I'll post it quickly. > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:1322 > - loadi JSArrayBufferView::m_length[base], length > - biaeq index, length, slowPath > + loadq JSArrayBufferView::m_length[base], length > + bqaeq index, length, slowPath > + elsif JSVALUE64 > + # length and scratch are intentionally undefined on this branch because they are not used on other platforms. > + bqaeq index, JSArrayBufferView::m_length[base], slowPath Question: do we already filter out negative int32 indexes before here? Previously, we didn't because we know that size limit is INT32_MAX: biaeq sees negative index as larger one than INT32_MAX. So we can filter negative indexes with this biaeq. But now, we are using bqaeq. So we need to ensure that index is either negative / positive 64bit integer, or we already filtered out negative int32. > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:1332 > + if LARGE_TYPED_ARRAYS > + bqbeq index, SmallTypedArrayMaxLength, .smallTypedArray > + setLargeTypedArray() > +.smallTypedArray: > + end Ditto. Comment on attachment 440258 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440258&action=review Some comments on Checked. > Source/JavaScriptCore/runtime/ArrayBufferView.h:140 > + Checked<size_t> adjustedOffset = *offset; Use Checked<size_t, RecordOverflow>, or CheckedSize since default Checked<size_t> crashes when overflow happens. > Source/JavaScriptCore/runtime/JSArrayBufferView.cpp:87 > + Checked<size_t> size = length; > + size *= elementSize; > + if (size > MAX_ARRAY_BUFFER_SIZE) > return; Use Checked<size_t, RecordOverflow>, or CheckedSize since default Checked<size_t> crashes when overflow happens. And let's use if (size.hasOverflowed() || size.value() > MAX_ARRAY_BUFFER_SIZE) return; (In reply to Yusuke Suzuki from comment #22) > Comment on attachment 440258 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=440258&action=review > > I think this is important question, so I'll post it quickly. > > > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:1322 > > - loadi JSArrayBufferView::m_length[base], length > > - biaeq index, length, slowPath > > + loadq JSArrayBufferView::m_length[base], length > > + bqaeq index, length, slowPath > > + elsif JSVALUE64 > > + # length and scratch are intentionally undefined on this branch because they are not used on other platforms. > > + bqaeq index, JSArrayBufferView::m_length[base], slowPath > > Question: do we already filter out negative int32 indexes before here? > Previously, we didn't because we know that size limit is INT32_MAX: biaeq > sees negative index as larger one than INT32_MAX. So we can filter negative > indexes with this biaeq. > But now, we are using bqaeq. So we need to ensure that index is either > negative / positive 64bit integer, or we already filtered out negative int32. Good catch, I've fixed it with a sign-extension as we've discussed. > > Source/JavaScriptCore/runtime/ArrayBufferView.h:140 > > + Checked<size_t> adjustedOffset = *offset; > > Use Checked<size_t, RecordOverflow>, or CheckedSize since default > Checked<size_t> crashes when overflow happens. Done. > > Source/JavaScriptCore/runtime/JSArrayBufferView.cpp:87 > > + Checked<size_t> size = length; > > + size *= elementSize; > > + if (size > MAX_ARRAY_BUFFER_SIZE) > > return; > > Use Checked<size_t, RecordOverflow>, or CheckedSize since default > Checked<size_t> crashes when overflow happens. > And let's use > > if (size.hasOverflowed() || size.value() > MAX_ARRAY_BUFFER_SIZE) > return; Done. Comment on attachment 440258 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440258&action=review >> Source/JavaScriptCore/llint/LowLevelInterpreter.asm:1322 >> + bqaeq index, JSArrayBufferView::m_length[base], slowPath > > Question: do we already filter out negative int32 indexes before here? > Previously, we didn't because we know that size limit is INT32_MAX: biaeq sees negative index as larger one than INT32_MAX. So we can filter negative indexes with this biaeq. > But now, we are using bqaeq. So we need to ensure that index is either negative / positive 64bit integer, or we already filtered out negative int32. If index is already sign-extended, then we can just use it. But if not, let's sign-extend it. Created attachment 440292 [details]
Patch
Adds a couple of new tests, and starts addressing Yusuke's comments.
Comment on attachment 440292 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440292&action=review OK, I think I reviewed runtime code. Will look into LLInt / Baseline / DFG / FTL and compilers next. > Source/JavaScriptCore/jit/JITOperations.cpp:2355 > We don't need i >= 0 below. > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:1322 > + elsif JSVALUE64 Is using LARGE_TYPED_ARRAYS better? > Source/JavaScriptCore/runtime/JSObjectInlines.h:474 > + (void)arrayProfile; Use `UNUSED_PARAM(arrayProfile);` > > Source/JavaScriptCore/jit/JITOperations.cpp:2355 > > > > We don't need i >= 0 below. Fixed. > > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:1322 > > + elsif JSVALUE64 > > Is using LARGE_TYPED_ARRAYS better? I don't think so: the change to the type of length is not gated by LARGE_TYPED_ARRAYS. What I think I should be using here is REGISTER64 or something like that. It does not seem to be available in the LLInt, but I can probably add it. I'll do so right now. > > Source/JavaScriptCore/runtime/JSObjectInlines.h:474 > > + (void)arrayProfile; > > Use `UNUSED_PARAM(arrayProfile);` Fixed. Comment on attachment 440292 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440292&action=review I've posted BaselineJIT related comments. > Source/JavaScriptCore/bytecode/AccessCase.cpp:1197 > +#if USE(JSVALUE64) Let's use USE(LARGE_TYPED_ARRAYS). > Source/JavaScriptCore/bytecode/AccessCase.cpp:1200 > + state.failAndRepatch.append(jit.branch64(CCallHelpers::AboveOrEqual, propertyGPR, scratchGPR)); The question is whether this propertyGPR is sign-extended int64_t or JSValue Int32. If it is JSValue Int32, then this branching is not correct since propertyGPR's upper 32bit can be JSValue Int32's tag. And I think this is probably not sign-extended since we are doing `jit.signExtend32ToPtr(propertyGPR, scratchGPR); in the following code. But be careful about sign-extension in this IC: IIUC, propertyGPR here needs to be unmodified in this IC since DFG / FTL are seeing this unmodified. So, if we would like to get sign-extended index, then we need to use scratch GPR. > Source/JavaScriptCore/bytecode/AccessCase.cpp:1638 > +#if USE(JSVALUE64) Let's use USE(LARGE_TYPED_ARRAYS). > Source/JavaScriptCore/bytecode/AccessCase.cpp:1642 > + state.failAndRepatch.append(jit.branch64(CCallHelpers::AboveOrEqual, propertyGPR, scratchGPR)); Ditto about sign-extension. Probably, this propertyGPR is not sign-extended since `jit.signExtend32ToPtr(propertyGPR, scratchGPR);` exists in the subsequent code. And please ensure that we are not modifying propertyGPR since DFG / FTL assumes that it is unmodified. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3252 > +#if USE(JSVALUE64) Let's use USE(LARGE_TYPED_ARRAYS). > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3261 > +#if USE(JSVALUE64) Let's use USE(LARGE_TYPED_ARRAYS). > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:8357 > +#if USE(JSVALUE64) Let's use USE(LARGE_TYPED_ARRAYS). > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:11136 > +#if USE(JSVALUE64) Let's use USE(LARGE_TYPED_ARRAYS). > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:11218 > +#if USE(JSVALUE64) Let's use USE(LARGE_TYPED_ARRAYS). > Source/JavaScriptCore/jit/IntrinsicEmitter.cpp:86 > +#if USE(JSVALUE64) Let's use USE(LARGE_TYPED_ARRAYS). > Source/JavaScriptCore/jit/IntrinsicEmitter.cpp:88 > + jit.load64(MacroAssembler::Address(state.baseGPR, JSArrayBufferView::offsetOfLength()), valueGPR); > + jit.boxInt52(valueGPR, valueGPR, state.scratchGPR, state.scratchFPR); This is not correct. Int52 is only available in DFG and FTL. So generating it in IC is not right since IC can be used in Baseline JIT too. > Source/JavaScriptCore/jit/IntrinsicEmitter.cpp:99 > +#if USE(JSVALUE64) Let's use USE(LARGE_TYPED_ARRAYS). > Source/JavaScriptCore/jit/IntrinsicEmitter.cpp:103 > + jit.boxInt52(valueGPR, valueGPR, state.scratchGPR, state.scratchFPR); Ditto. IC can be used in BaselineJIT. And Int52 is only available in DFG and FTL. > Source/JavaScriptCore/jit/IntrinsicEmitter.cpp:143 > +#if USE(JSVALUE64) Let's use USE(LARGE_TYPED_ARRAYS). > Source/JavaScriptCore/jit/IntrinsicEmitter.cpp:144 > + jit.boxInt52(valueGPR, valueGPR, state.scratchGPR, state.scratchFPR); Ditto. IC can be used in BaselineJIT. So always using Int52 here is not correct since Int52 is only available in DFG and FTL. Comment on attachment 440292 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440292&action=review >> Source/JavaScriptCore/jit/IntrinsicEmitter.cpp:88 >> + jit.boxInt52(valueGPR, valueGPR, state.scratchGPR, state.scratchFPR); > > This is not correct. Int52 is only available in DFG and FTL. So generating it in IC is not right since IC can be used in Baseline JIT too. Ignore this. BoxInt52 is boxing Int52 values into Double-encoded format. >> Source/JavaScriptCore/jit/IntrinsicEmitter.cpp:103 >> + jit.boxInt52(valueGPR, valueGPR, state.scratchGPR, state.scratchFPR); > > Ditto. IC can be used in BaselineJIT. And Int52 is only available in DFG and FTL. Ignore this. >> Source/JavaScriptCore/jit/IntrinsicEmitter.cpp:144 >> + jit.boxInt52(valueGPR, valueGPR, state.scratchGPR, state.scratchFPR); > > Ditto. IC can be used in BaselineJIT. So always using Int52 here is not correct since Int52 is only available in DFG and FTL. Ignore this. Comment on attachment 440292 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440292&action=review I've posted some more comments about DFG and FTL. I think we have several issues about type consistency between AI, constant-folding, and generated code. > Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:3486 > break; In GetArrayLength, we should constant-fold only when it is Int32 range. Let's consider about this. We have a code and we already removed `length > 0xfffffff0` since we said length is Int32. Then after that, if we convert it to non-Int32, then we break the invariant used in the previous optimization. > Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:3489 > + case GetArrayLengthAsInt52: { We should name it GetTypedArrayLengthAsInt52 since it only handles TypedArray. > Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:3832 > break; We should constant-fold only when it is in Int32 range. > Source/JavaScriptCore/dfg/DFGArrayMode.cpp:297 > + // FIXME: no idea why we only preserve this out-of-bounds information for PutByVal and not GetByVal as well. Can you file a bugzilla and paste a URL here? > Source/JavaScriptCore/dfg/DFGArrayMode.h:153 > + ArrayMode(Array::Type type, Array::Class arrayClass, Array::Speculation speculation, Array::Conversion conversion, Array::Action action, bool mayBeLargeTypedArray = false) Is there any code not passing `bool mayBeLargeTypedArray`? If there is no code doing that, can you change it to non-default parameter to avoid accidentally losing that information? > Source/JavaScriptCore/dfg/DFGClobberize.h:1339 > case GetTypedArrayByteOffset: > + case GetTypedArrayByteOffsetAsInt52: > read(MiscFields); > def(HeapLocation(TypedArrayByteOffsetLoc, MiscFields, node->child1()), LazyNode(node)); We should have different heap location for different types. See https://bugs.webkit.org/show_bug.cgi?id=170990. > Source/JavaScriptCore/dfg/DFGClobberize.h:1453 > + def(HeapLocation(ArrayLengthLoc, MiscFields, node->child1()), LazyNode(node)); We should have different heap location for different types. See https://bugs.webkit.org/show_bug.cgi?id=170990. > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1297 > + //if (child2->shouldSpeculateInt32() || !node->arrayMode().mayBeLargeTypedArray()) > + fixEdge<Int32Use>(child2); > + //else > + // fixEdge<Int52RepUse>(child2); Please remove comment-out part. > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:2806 > + //if (m_graph.varArgChild(node, 1)->shouldSpeculateInt32() || !node->arrayMode().mayBeLargeTypedArray()) > + fixEdge<Int32Use>(m_graph.varArgChild(node, 1)); > + //else > + // fixEdge<Int52RepUse>(m_graph.varArgChild(node, 1)); Please remove comment-out part. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:8354 > GPRTemporary result(this, Reuse, base); And compileGetTypedArrayByteOffset needs to perform OSR exit if it is out-of-range from Int32 since it is an invariant declared in AI. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:8359 > + speculationCheck(Overflow, JSValueSource(), nullptr, m_jit.branch64(MacroAssembler::Above, resultGPR, TrustedImm64(std::numeric_limits<uint32_t>::max()))); This is right. We need this check otherwise, it breaks AI's invariant that GetArrayLength returns Int32. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4887 > + LValue result = emitGetTypedArrayByteOffsetExceptCastingResult(); > + setInt32(m_out.castToInt32(result)); This is not correct. GetTypedArrayByteOffset must return Int32 valid value because AI says it returns SpecInt32Only. If the value is outside of Int32 range, then we should OSR exit because Int32 is an invariant (not prediction) used in an optimization. And we should check this OSR exit, and next time, we should emit AsInt52 version. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5083 > + speculate(Overflow, noValue(), nullptr, m_out.aboveOrEqual(length, m_out.constInt64(std::numeric_limits<int32_t>::max()))); Yes, this is correct. We need this Overflow check. And in DFGByteCodeParser, we should check this Overflow exit to emit GetTypedArrayLengthAsInt52 instead. > > Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:3486 > > break; > > In GetArrayLength, we should constant-fold only when it is Int32 range. > Let's consider about this. We have a code and we already removed `length > > 0xfffffff0` since we said length is Int32. Then after that, if we convert it > to non-Int32, then we break the invariant used in the previous optimization. Good point, fixed. > > Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:3489 > > + case GetArrayLengthAsInt52: { > > We should name it GetTypedArrayLengthAsInt52 since it only handles > TypedArray. Done. > > Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:3832 > > break; > > We should constant-fold only when it is in Int32 range. Fixed. > > Source/JavaScriptCore/dfg/DFGArrayMode.cpp:297 > > + // FIXME: no idea why we only preserve this out-of-bounds information for PutByVal and not GetByVal as well. > > Can you file a bugzilla and paste a URL here? Done: https://bugs.webkit.org/show_bug.cgi?id=231276 > > Source/JavaScriptCore/dfg/DFGArrayMode.h:153 > > + ArrayMode(Array::Type type, Array::Class arrayClass, Array::Speculation speculation, Array::Conversion conversion, Array::Action action, bool mayBeLargeTypedArray = false) > > Is there any code not passing `bool mayBeLargeTypedArray`? If there is no > code doing that, can you change it to non-default parameter to avoid > accidentally losing that information? Lots of code in DFG::ArrayMode::fromObserved does not pass mayBeLargeTypedArray. The flag is later taken from ArrayProfile when fromObserved calls withProfile. > > Source/JavaScriptCore/dfg/DFGClobberize.h:1339 > > case GetTypedArrayByteOffset: > > + case GetTypedArrayByteOffsetAsInt52: > > read(MiscFields); > > def(HeapLocation(TypedArrayByteOffsetLoc, MiscFields, node->child1()), LazyNode(node)); > > We should have different heap location for different types. See > https://bugs.webkit.org/show_bug.cgi?id=170990. Oh, I did not know that HeapLocation worked that way. Fixed. > > Source/JavaScriptCore/dfg/DFGClobberize.h:1453 > > + def(HeapLocation(ArrayLengthLoc, MiscFields, node->child1()), LazyNode(node)); > > We should have different heap location for different types. See > https://bugs.webkit.org/show_bug.cgi?id=170990. Fixed. > > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1297 > > + //if (child2->shouldSpeculateInt32() || !node->arrayMode().mayBeLargeTypedArray()) > > + fixEdge<Int32Use>(child2); > > + //else > > + // fixEdge<Int52RepUse>(child2); > > Please remove comment-out part. Oops, I thought I already did. Fixed. > > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:8354 > > GPRTemporary result(this, Reuse, base); > > And compileGetTypedArrayByteOffset needs to perform OSR exit if it is > out-of-range from Int32 since it is an invariant declared in AI. Good catch, I thought of it for GetArrayLength, but forgot it for GetTypedArrayByteOffset. Fixed. > > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4887 > > + LValue result = emitGetTypedArrayByteOffsetExceptCastingResult(); > > + setInt32(m_out.castToInt32(result)); > > This is not correct. GetTypedArrayByteOffset must return Int32 valid value > because AI says it returns SpecInt32Only. > If the value is outside of Int32 range, then we should OSR exit because > Int32 is an invariant (not prediction) used in an optimization. > And we should check this OSR exit, and next time, we should emit AsInt52 > version. Fixed (both here, and taking it into account in the ByteCodeParser). > > Source/JavaScriptCore/bytecode/AccessCase.cpp:1200 > > + state.failAndRepatch.append(jit.branch64(CCallHelpers::AboveOrEqual, propertyGPR, scratchGPR)); > > The question is whether this propertyGPR is sign-extended int64_t or JSValue > Int32. > If it is JSValue Int32, then this branching is not correct since > propertyGPR's upper 32bit can be JSValue Int32's tag. > And I think this is probably not sign-extended since we are doing > `jit.signExtend32ToPtr(propertyGPR, scratchGPR); in the following code. > > But be careful about sign-extension in this IC: IIUC, propertyGPR here needs > to be unmodified in this IC since DFG / FTL are seeing this unmodified. > So, if we would like to get sign-extended index, then we need to use scratch > GPR. Ah, this is tricky. I've fixed it, by moving the allocation of scratch2GPR a bit. It results in doing the sign-extension twice on 64-bit platforms, but I don't see how to avoid that without allocating a third scratch register. Do you think it would be better? (or is there a simpler way that I am missing?) > > Source/JavaScriptCore/bytecode/AccessCase.cpp:1642 > > + state.failAndRepatch.append(jit.branch64(CCallHelpers::AboveOrEqual, propertyGPR, scratchGPR)); > > Ditto about sign-extension. Probably, this propertyGPR is not sign-extended > since `jit.signExtend32ToPtr(propertyGPR, scratchGPR);` exists in the > subsequent code. > And please ensure that we are not modifying propertyGPR since DFG / FTL > assumes that it is unmodified. ditto. Created attachment 440336 [details]
Patch
Comment on attachment 440336 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440336&action=review > Source/JavaScriptCore/bytecode/AccessCase.cpp:1211 > +#if CPU(REGISTER64) Hmm, this sounds strange. Length is 64bit only when USE(LARGE_TYPED_ARRAYS) is true, correct? I think using USE(LARGE_TYPED_ARRAYS) is better. Anyway, we should not use different macro in different places. Let's use one of USE(LARGE_TYPED_ARRAYS), CPU(REGISTER64), or USE(JSVALUE64) exclusively. > Source/JavaScriptCore/bytecode/AccessCase.cpp:1219 > +#endif Ah, unfortunately, this does not work. This does not work. If we call preserveReusedRegistersByPushing thing, then when jumping to the different place (e.g. state.failAndRepatch.append()), we need to restore registers. See the case like IndexedScopedArgumentsLoad. > Source/JavaScriptCore/bytecode/AccessCase.cpp:1664 > +#endif Ditto. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3818 > +#if !USE(LARGE_TYPED_ARRAYS) > + if (!isInt32Speculation(prediction)) > + return false; > +#endif I think we should align this condition to `!isInt32Speculation(prediction) || m_inlineStackTop->m_exitProfile.hasExitSite(m_currentIndex, Overflow)`, or > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3854 > + if (!isInt32Speculation(prediction)) Ditto. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3879 > + if (!isInt32Speculation(prediction)) Ditto. > Source/JavaScriptCore/dfg/DFGOperations.cpp:1911 > +JSC_DEFINE_JIT_OPERATION(operationNewInt8ArrayWithSize, char*, (JSGlobalObject* globalObject, Structure* structure, UCPURegister length, char* vector)) Is UCPURegister correct? Isn't it CPURegister? (since newTypedArrayWithSize is using CPURegister and the original code was int32_t). > Source/JavaScriptCore/dfg/DFGOperations.cpp:1927 > +JSC_DEFINE_JIT_OPERATION(operationNewInt16ArrayWithSize, char*, (JSGlobalObject* globalObject, Structure* structure, UCPURegister length, char* vector)) Is UCPURegister correct? Isn't it CPURegister? (since newTypedArrayWithSize is using CPURegister and the original code was int32_t). > Source/JavaScriptCore/dfg/DFGOperations.cpp:1959 > +JSC_DEFINE_JIT_OPERATION(operationNewUint8ArrayWithSize, char*, (JSGlobalObject* globalObject, Structure* structure, UCPURegister length, char* vector)) Is UCPURegister correct? Isn't it CPURegister? (since newTypedArrayWithSize is using CPURegister and the original code was int32_t). > Source/JavaScriptCore/dfg/DFGOperations.cpp:1975 > +JSC_DEFINE_JIT_OPERATION(operationNewUint8ClampedArrayWithSize, char*, (JSGlobalObject* globalObject, Structure* structure, UCPURegister length, char* vector)) Is UCPURegister correct? Isn't it CPURegister? (since newTypedArrayWithSize is using CPURegister and the original code was int32_t). > Source/JavaScriptCore/dfg/DFGOperations.cpp:1991 > +JSC_DEFINE_JIT_OPERATION(operationNewUint16ArrayWithSize, char*, (JSGlobalObject* globalObject, Structure* structure, UCPURegister length, char* vector)) Is UCPURegister correct? Isn't it CPURegister? (since newTypedArrayWithSize is using CPURegister and the original code was int32_t). > Source/JavaScriptCore/dfg/DFGOperations.cpp:2007 > +JSC_DEFINE_JIT_OPERATION(operationNewUint32ArrayWithSize, char*, (JSGlobalObject* globalObject, Structure* structure, UCPURegister length, char* vector)) Is UCPURegister correct? Isn't it CPURegister? (since newTypedArrayWithSize is using CPURegister and the original code was int32_t). > Source/JavaScriptCore/dfg/DFGOperations.cpp:2023 > +JSC_DEFINE_JIT_OPERATION(operationNewFloat32ArrayWithSize, char*, (JSGlobalObject* globalObject, Structure* structure, UCPURegister length, char* vector)) Is UCPURegister correct? Isn't it CPURegister? (since newTypedArrayWithSize is using CPURegister and the original code was int32_t). > Source/JavaScriptCore/dfg/DFGOperations.cpp:2039 > +JSC_DEFINE_JIT_OPERATION(operationNewFloat64ArrayWithSize, char*, (JSGlobalObject* globalObject, Structure* structure, UCPURegister length, char* vector)) Is UCPURegister correct? Isn't it CPURegister? (since newTypedArrayWithSize is using CPURegister and the original code was int32_t). > Source/JavaScriptCore/dfg/DFGOperations.cpp:2055 > +JSC_DEFINE_JIT_OPERATION(operationNewBigInt64ArrayWithSize, char*, (JSGlobalObject* globalObject, Structure* structure, UCPURegister length, char* vector)) Is UCPURegister correct? Isn't it CPURegister? (since newTypedArrayWithSize is using CPURegister and the original code was int32_t). > Source/JavaScriptCore/dfg/DFGOperations.cpp:2071 > +JSC_DEFINE_JIT_OPERATION(operationNewBigUint64ArrayWithSize, char*, (JSGlobalObject* globalObject, Structure* structure, UCPURegister length, char* vector)) Is UCPURegister correct? Isn't it CPURegister? (since newTypedArrayWithSize is using CPURegister and the original code was int32_t). > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3252 > +#if CPU(REGISTER64) Let's use one of USE(LARGE_TYPED_ARRAYS), CPU(REGISTER64), or USE(JSVALUE64) exclusively. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3254 > + return m_jit.branch64( > + MacroAssembler::AboveOrEqual, indexGPR, MacroAssembler::Imm64(length)); I think this is strict-int32_t (please double-check). And if it is true, then this above-or-equal does not work since the upper bits are 0. So, negative int32_t will be interpret as large INT32_MAX < x <= UINT32_MAX range value incorrectly. See fillSpeculateInt32Internal for example. ``` 1108 case DataFormatJSInt32: { 1109 // In a strict fill we need to strip off the value tag. 1110 if (strict) { 1111 GPRReg gpr = info.gpr(); 1112 GPRReg result; 1113 // If the register has already been locked we need to take a copy. 1114 // If not, we'll zero extend in place, so mark on the info that this is now type DataFormatInt32, not DataFormatJSInt32. 1115 if (m_gprs.isLocked(gpr)) 1116 result = allocate(); 1117 else { 1118 m_gprs.lock(gpr); 1119 info.fillInt32(*m_stream, gpr); 1120 result = gpr; 1121 } 1122 m_jit.zeroExtend32ToWord(gpr, result); ``` StrictInt32 is int32_t, and it is zero-extended. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3261 > +#if CPU(REGISTER64) Ditto. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3264 > + return m_jit.branch64( > + MacroAssembler::AboveOrEqual, indexGPR, > + MacroAssembler::Address(baseGPR, JSArrayBufferView::offsetOfLength())); Ditto. This indexGPR is not correct. And please ensure that sign-extend indexGPR would change GPR which should not be changed. (since indexGPR is derived from some edge). > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:8362 > +#if CPU(REGISTER64) Ditto. We should use one macro exclusively. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:11141 > +#if CPU(REGISTER64) Ditto. We should use one macro exclusively. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:11143 > + m_jit.signExtend32ToPtr(sizeGPR, sizeGPR); This is not correct since sizeGPR is tied to node->child1() edge. So we cannot change it. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:11165 > slowCases.append(m_jit.branch32( > MacroAssembler::Above, sizeGPR, TrustedImm32(JSArrayBufferView::fastSizeLimit))); This is not correct since sizeGPR can include Int52 integer. If we use branch32, we only check slice of 32bit. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:11223 > +#if CPU(REGISTER64) Ditto. We should use one macro exclusively. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:2734 > + ASSERT(baseGPR != vectorGPR); > + ASSERT(baseGPR != dataGPR); > + ASSERT(vectorGPR != dataGPR); This is fundamental invariant in DFG. So we do not need ASSERT here. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:5688 > m_jit.zeroExtend32ToWord(indexGPR, t2); > if (data.byteSize > 1) > m_jit.add64(TrustedImm32(data.byteSize - 1), t2); > - m_jit.load32(MacroAssembler::Address(dataViewGPR, JSArrayBufferView::offsetOfLength()), t1); > + m_jit.load64(MacroAssembler::Address(dataViewGPR, JSArrayBufferView::offsetOfLength()), t1); > speculationCheck(OutOfBounds, JSValueRegs(), node, > m_jit.branch64(MacroAssembler::AboveOrEqual, t2, t1)); I think this is not correct. Previously, we are ensuring that t1's max value is INT32_MAX. Thus, if indexGPR is negative value, then t2 is always larger if upper 32bit is zero-extended. So that was fine. But now, t1 can be larger than INT32_MAX. So this speculation check is not correct if indexGPR is negative int32. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:5889 > - m_jit.load32(MacroAssembler::Address(dataViewGPR, JSArrayBufferView::offsetOfLength()), t1); > + m_jit.load64(MacroAssembler::Address(dataViewGPR, JSArrayBufferView::offsetOfLength()), t1); > speculationCheck(OutOfBounds, JSValueRegs(), node, > m_jit.branch64(MacroAssembler::AboveOrEqual, t2, t1)); Ditto. > Source/JavaScriptCore/jit/IntrinsicEmitter.cpp:86 > +#if CPU(REGISTER64) Ditto. We should use one macro exclusively. > Source/JavaScriptCore/jit/IntrinsicEmitter.cpp:99 > +#if CPU(REGISTER64) Ditto. We should use one macro exclusively. Comment on attachment 440336 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440336&action=review I think currently bound-check-lowering support in FTL is missing for TypedArray. While DFG performs bound check in GetByVal etc. itself, FTL decouples it from that node, and perform bound checking in CheckInBounds. And currently, CheckInBounds only accepts Int32. And we exclusively use GetArrayLength / GetVectorLength. See DFGSSALoweringPhase.cpp for more information. So, for larger TypedArray, I think these CheckInBounds always fail. So right now, probably, 4GB TypedArray always OSR exits on FTL. We also need to check whether the current (exiting) CheckInBounds is correctly implemented for new 4GB TypedArray. And if we implement Int52 version of CheckInBounds, then we also need to ensure that the other phase (e.g. DFGIntegerRangeOptimizationPhase) works correctly with that. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:15651 > LValue indexToCheck = m_out.zeroExtPtr(index); > if (data.byteSize > 1) I looks not correct. index is zero-extended. So if it is negative integer, it just becomes large index. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:15795 > - LValue length = m_out.zeroExtPtr(m_out.load32NonNegative(dataView, m_heaps.JSArrayBufferView_length)); > + LValue length = m_out.load64NonNegative(dataView, m_heaps.JSArrayBufferView_length); > LValue indexToCheck = m_out.zeroExtPtr(index); > if (data.byteSize > 1) > indexToCheck = m_out.add(indexToCheck, m_out.constInt64(data.byteSize - 1)); Ditto. These code is not correct. Created attachment 440817 [details]
Patch
WIP
> > Source/JavaScriptCore/bytecode/AccessCase.cpp:1211 > > +#if CPU(REGISTER64) > > Hmm, this sounds strange. Length is 64bit only when USE(LARGE_TYPED_ARRAYS) > is true, correct? > I think using USE(LARGE_TYPED_ARRAYS) is better. Anyway, we should not use > different macro in different places. > Let's use one of USE(LARGE_TYPED_ARRAYS), CPU(REGISTER64), or USE(JSVALUE64) > exclusively. Since using LARGE_TYPED_ARRAYS throughout was not practical (as the types change based on CPU(REGISTER64, not LARGE_TYPED_ARRAYS), and JSVALUE64 is not appropriate as we discussed earlier, I removed LARGE_TYPED_ARRAYS and switched everything to CPU(REGISTER64). > > Source/JavaScriptCore/bytecode/AccessCase.cpp:1219 > > +#endif > > Ah, unfortunately, this does not work. This does not work. If we call > preserveReusedRegistersByPushing thing, then when jumping to the different > place (e.g. state.failAndRepatch.append()), we need to restore registers. > See the case like IndexedScopedArgumentsLoad. I fixed this by moving the scratch register allocation back to where it was and instead relying on the fact that branch64/branch32 can apparently accept an Address. It is a bit messy (scratchGPR contains first the property then the length then the property again), but should work unless I missed something else. > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3818 > > +#if !USE(LARGE_TYPED_ARRAYS) > > + if (!isInt32Speculation(prediction)) > > + return false; > > +#endif > > I think we should align this condition to `!isInt32Speculation(prediction) > || m_inlineStackTop->m_exitProfile.hasExitSite(m_currentIndex, Overflow)` Good catch, I fixed this in all three places. > > Source/JavaScriptCore/dfg/DFGOperations.cpp:1911 > > +JSC_DEFINE_JIT_OPERATION(operationNewInt8ArrayWithSize, char*, (JSGlobalObject* globalObject, Structure* structure, UCPURegister length, char* vector)) > > Is UCPURegister correct? Isn't it CPURegister? (since newTypedArrayWithSize > is using CPURegister and the original code was int32_t). You are correct, I thought that the check for >= 0 was before that point, but it is after. Fixed. > > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3254 > > + return m_jit.branch64( > > + MacroAssembler::AboveOrEqual, indexGPR, MacroAssembler::Imm64(length)); > > I think this is strict-int32_t (please double-check). And if it is true, > then this above-or-equal does not work since the upper bits are 0. > So, negative int32_t will be interpret as large INT32_MAX < x <= UINT32_MAX > range value incorrectly. > See fillSpeculateInt32Internal for example. I thought that strictInt32 led to sign-extension. I fixed this by passing a scratch register and doing a sign-extension into it, then doing the test against it. > > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:11143 > > + m_jit.signExtend32ToPtr(sizeGPR, sizeGPR); > > This is not correct since sizeGPR is tied to node->child1() edge. So we > cannot change it. Fixed (using a scratch register). > > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:11165 > > slowCases.append(m_jit.branch32( > > MacroAssembler::Above, sizeGPR, TrustedImm32(JSArrayBufferView::fastSizeLimit))); > > This is not correct since sizeGPR can include Int52 integer. If we use > branch32, we only check slice of 32bit. Good catch, fixed. > > Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:2734 > > + ASSERT(baseGPR != vectorGPR); > > + ASSERT(baseGPR != dataGPR); > > + ASSERT(vectorGPR != dataGPR); > > This is fundamental invariant in DFG. So we do not need ASSERT here. I simply copied it from the original (non-Int52) version of the code. I removed it from both versions. > > Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:5688 > > m_jit.zeroExtend32ToWord(indexGPR, t2); > > if (data.byteSize > 1) > > m_jit.add64(TrustedImm32(data.byteSize - 1), t2); > > - m_jit.load32(MacroAssembler::Address(dataViewGPR, JSArrayBufferView::offsetOfLength()), t1); > > + m_jit.load64(MacroAssembler::Address(dataViewGPR, JSArrayBufferView::offsetOfLength()), t1); > > speculationCheck(OutOfBounds, JSValueRegs(), node, > > m_jit.branch64(MacroAssembler::AboveOrEqual, t2, t1)); > > I think this is not correct. Previously, we are ensuring that t1's max value > is INT32_MAX. Thus, if indexGPR is negative value, then t2 is always larger > if upper 32bit is zero-extended. So that was fine. > But now, t1 can be larger than INT32_MAX. So this speculation check is not > correct if indexGPR is negative int32. Fixed by using a sign-extension instead of a zero-extension. > > Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:5889 > > - m_jit.load32(MacroAssembler::Address(dataViewGPR, JSArrayBufferView::offsetOfLength()), t1); > > + m_jit.load64(MacroAssembler::Address(dataViewGPR, JSArrayBufferView::offsetOfLength()), t1); > > speculationCheck(OutOfBounds, JSValueRegs(), node, > > m_jit.branch64(MacroAssembler::AboveOrEqual, t2, t1)); > > Ditto. Ditto. Created attachment 440828 [details]
Patch
Comment on attachment 440828 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440828&action=review > Source/JavaScriptCore/bytecode/AccessCase.cpp:1198 > + // All of this code is a bit messy with length and property taking turns being in scratchGPR, but I don't see any simpler solution without requiring some extra scratch registers. I don't think this comment is necessary. > Source/JavaScriptCore/bytecode/AccessCase.cpp:1644 > + // All of this code is a bit messy with length and property taking turns being in scratchGPR, but I don't see any simpler solution without requiring some extra scratch registers. I don't think this comment is necessary. > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:1320 > + # We have to sign-extend index here, so that the comparison with length will also catch any negative value > + sxi2q index, index I looked the code in LLInt and ensured that this is not necessary because this is already done by the caller of getByValTypedArray. Can you remove this, and add comment about it in getByValTypedArray's prologue as my earlier change did (https://bugs.webkit.org/attachment.cgi?id=440817&action=diff)? > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:1325 > + # We have to sign-extend index here, so that the comparison with length will also catch any negative value > + sxi2q index, index Ditto. Created attachment 440832 [details]
Patch
rebased, fix missing include.
Created attachment 440836 [details]
Patch
Applying Yusuke's last comments
Comment on attachment 440836 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440836&action=review Can you check https://bugs.webkit.org/show_bug.cgi?id=229353#c37 's comment about FTL & CheckInBounds? > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1147 > + fixEdge<KnownCellUse>(m_graph.varArgChild(node, 0)); > + fixEdge<Int32Use>(m_graph.varArgChild(node, 1)); Let's add FIXME+bugzilla URL since currently we cannot handle Int52 index while it can be useful. > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:2170 > +#if CPU(REGISTER64) Let's remove this if-def switch and always compile GetTypedArrayLengthAsInt52 / GetTypedArrayByteOffsetAsInt52. This is cleaner, and they will not appear if `!CPU(REGISTER64)` > Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:5671 > +#if USE(REGISTER64) We will never use JSVALUE64 for non REGISTER64. So let's remove this if-def. We should always use `m_jit.signExtend32ToPtr(indexGPR, t2);` > Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:5876 > +#if USE(REGISTER64) Ditto. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1028 > +#if CPU(REGISTER64) Let's remove this if-def. GetTypedArrayLengthAsInt52 is here only when CPU(REGISTER64) is true. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1184 > +#if CPU(REGISTER64) Let's remove this if-def. GetTypedArrayByteOffsetAsInt52 is here only when CPU(REGISTER64) is true. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4887 > +#if CPU(REGISTER64) Let's remove this if-def. We already have code like `m_out.load64NonNegative(lowCell(m_node->child1()), m_heaps.JSArrayBufferView_length);` without this check. FTL is currently 64bit only. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4889 > + speculate(Overflow, noValue(), nullptr, m_out.aboveOrEqual(result, m_out.constInt64(std::numeric_limits<int32_t>::max()))); Why does it include Equal? > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4893 > +#if CPU(REGISTER64) Let's remove this if-def. GetTypedArrayByteOffsetAsInt52 is here only when CPU(REGISTER64) is true. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5087 > + speculate(Overflow, noValue(), nullptr, m_out.aboveOrEqual(length, m_out.constInt64(std::numeric_limits<int32_t>::max()))); Why does it include Equal? > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:8214 > +#if CPU(REGISTER64) Let's remove this if-def. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:15650 > LValue indexToCheck = m_out.zeroExtPtr(index); I think this needs to be sign extension, correct? > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:15793 > LValue indexToCheck = m_out.zeroExtPtr(index); Ditto. (In reply to Yusuke Suzuki from comment #37) > Comment on attachment 440336 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=440336&action=review > > I think currently bound-check-lowering support in FTL is missing for > TypedArray. While DFG performs bound check in GetByVal etc. itself, FTL > decouples it from that node, and perform bound checking in CheckInBounds. > And currently, CheckInBounds only accepts Int32. And we exclusively use > GetArrayLength / GetVectorLength. See DFGSSALoweringPhase.cpp for more > information. > So, for larger TypedArray, I think these CheckInBounds always fail. So right > now, probably, 4GB TypedArray always OSR exits on FTL. > > We also need to check whether the current (exiting) CheckInBounds is > correctly implemented for new 4GB TypedArray. > And if we implement Int52 version of CheckInBounds, then we also need to > ensure that the other phase (e.g. DFGIntegerRangeOptimizationPhase) works > correctly with that. Fixed, I added CheckInBoundsInt52. > > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:15651 > > LValue indexToCheck = m_out.zeroExtPtr(index); > > if (data.byteSize > 1) > > I looks not correct. index is zero-extended. So if it is negative integer, > it just becomes large index. Fixed, replaced by a sign extension. > > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:15795 > > - LValue length = m_out.zeroExtPtr(m_out.load32NonNegative(dataView, m_heaps.JSArrayBufferView_length)); > > + LValue length = m_out.load64NonNegative(dataView, m_heaps.JSArrayBufferView_length); > > LValue indexToCheck = m_out.zeroExtPtr(index); > > if (data.byteSize > 1) > > indexToCheck = m_out.add(indexToCheck, m_out.constInt64(data.byteSize - 1)); > > Ditto. These code is not correct. Fixed, replaced by a sign extension. > > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1147 > > + fixEdge<KnownCellUse>(m_graph.varArgChild(node, 0)); > > + fixEdge<Int32Use>(m_graph.varArgChild(node, 1)); > > Let's add FIXME+bugzilla URL since currently we cannot handle Int52 index > while it can be useful. I have a pretty long TODO-list of such things to improve in this patch, I'll file the bugzilla as soon as I get this patch across the finish line. > > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:2170 > > +#if CPU(REGISTER64) > > Let's remove this if-def switch and always compile > GetTypedArrayLengthAsInt52 / GetTypedArrayByteOffsetAsInt52. > This is cleaner, and they will not appear if `!CPU(REGISTER64)` Done. > > Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:5671 > > +#if USE(REGISTER64) > > We will never use JSVALUE64 for non REGISTER64. > So let's remove this if-def. We should always use > `m_jit.signExtend32ToPtr(indexGPR, t2);` Done. > > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1028 > > +#if CPU(REGISTER64) > > Let's remove this if-def. GetTypedArrayLengthAsInt52 is here only when > CPU(REGISTER64) is true. Done. > > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1184 > > +#if CPU(REGISTER64) > > Let's remove this if-def. GetTypedArrayByteOffsetAsInt52 is here only when > CPU(REGISTER64) is true. Done. > > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4887 > > +#if CPU(REGISTER64) > > Let's remove this if-def. We already have code like > `m_out.load64NonNegative(lowCell(m_node->child1()), > m_heaps.JSArrayBufferView_length);` without this check. > FTL is currently 64bit only. Done. > > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4889 > > + speculate(Overflow, noValue(), nullptr, m_out.aboveOrEqual(result, m_out.constInt64(std::numeric_limits<int32_t>::max()))); > > Why does it include Equal? I don't remember why I made that choice. Removing it. > > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4893 > > +#if CPU(REGISTER64) > > Let's remove this if-def. GetTypedArrayByteOffsetAsInt52 is here only when > CPU(REGISTER64) is true. Done. > > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5087 > > + speculate(Overflow, noValue(), nullptr, m_out.aboveOrEqual(length, m_out.constInt64(std::numeric_limits<int32_t>::max()))); > > Why does it include Equal? Removed. > > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:8214 > > +#if CPU(REGISTER64) > > Let's remove this if-def. Done. Created attachment 440916 [details]
Patch
Applied the last of Yusuke's comments, but there are still some frustrating LayoutTests failures for me to finish debugging.
Created attachment 440966 [details]
Patch
rebased, still has the LayoutTests failures and too few tests.
Comment on attachment 440966 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440966&action=review > Source/WebCore/bindings/js/SerializedScriptValue.cpp:1266 > + Checked<uint32_t, RecordOverflow> dataLength = data->data().length(); CheckedUint32 > Source/WebCore/bindings/js/SerializedScriptValue.cpp:1319 > + Checked<uint32_t, RecordOverflow> byteLength = arrayBuffer->byteLength(); CheckedUint32 Comment on attachment 440966 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=440966&action=review r=me. When landing, please ensure the following things too. 1. Land tests that exercise new paths. The test we discussed on slack should be included. 2. Make sure all JSC tests are passed. Currently, some tests are failing. > Source/JavaScriptCore/dfg/DFGIntegerRangeOptimizationPhase.cpp:1312 > + case CheckInBoundsInt52: { We should not enable range optimization for Int52 since currently range optimization is based on Int32. > Source/JavaScriptCore/dfg/DFGIntegerRangeOptimizationPhase.cpp:1387 > + case CheckInBoundsInt52: { We should not enable range optimization for Int52 since currently range optimization is based on Int32. > Source/JavaScriptCore/dfg/DFGIntegerRangeOptimizationPhase.cpp:1504 > + case GetTypedArrayLengthAsInt52: We should not enable range optimization for Int52 since currently range optimization is based on Int32. > Source/JavaScriptCore/dfg/DFGSSALoweringPhase.cpp:107 > + Node* length = m_insertionSet.insertNode( > + m_nodeIndex, SpecInt52Any, GetTypedArrayLengthAsInt52, m_node->origin, > + OpInfo(m_node->arrayMode().asWord()), base, storage); > + m_graph.varArgChild(m_node, 4) = Edge(length, Int52RepUse); Just changing this is not correct. It is modifying PutByVal's 5th child. So all the code using PutByVal / PutByValDirect needs to be revisited and ensured that length can be Int52. I've checked briefly and FTLLowerDFGToB3 is not handling this change. Please review all code using PutByVal / PutByValDirect to ensure that this change is correctly accepted. 6012 m_out.branch( 6013 m_out.aboveOrEqual(index, lowInt32(child5)), 6014 unsure(isOutOfBounds), unsure(isInBounds)); > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5149 > + speculate( > + OutOfBounds, noValue(), nullptr, > + m_out.aboveOrEqual(lowStrictInt52(m_node->child1()), lowStrictInt52(m_node->child2()))); I think using lowStrictInt52 for child1 is not correct because I don't see any conversion from Int32 to Int52. (And I think it should be Int32 right now since we do not have GetByVal etc. with Int52 index). I think it is Int32. And if it is Int32, please remember that we need to have sign-extension after lowInt32 to extend it to 64bit. And can you add a test to exercise this path? Created attachment 441120 [details]
Patch
Fixed both the LayoutTests and the failed jsc-tests.
Applied the last of Yusuke's comments, and fixed the style nits.
Still missing:
- Extra tests requested by Yusuke
- Updating the Changelogs
I am uploading this patch so that EWS can double-check that I fixed every outstanding bug while I take care of these two last things.
Comment on attachment 441120 [details]
Patch
Changes to SerializedScriptValue look good to me.
Created attachment 441154 [details]
Patch
Fix various issues found at the last minute by EWS.
Also updated the Changelogs, and added a test requested by Yusuke. I believe the other tests he asked for are already covered by the two tests I added earlier.
I'm going to wait for EWS before sending this to the commit queue, but hopefully the patch is now ready-to-land.
(In reply to Chris Dumez from comment #53) > Comment on attachment 441120 [details] > Patch > > Changes to SerializedScriptValue look good to me. Many thanks for your help in debugging that part! Created attachment 441157 [details]
Patch
Fix watch build issue.
Created attachment 441168 [details]
Patch
Attempt to fix build on 32-bit
Created attachment 441275 [details]
Patch
Fixes a couple LayoutTests failures which I recently introduced, fix a build issue on watch, and tries to fix 32-bit builds.
Created attachment 441303 [details]
Patch
The last patch did not seem to have any test failure, but it still had build issues on 32-bit and on watch.
In this patch, I used the more semantically correct size_t instead of UCPURegister, and ADDRESS64 instead of REGISTER64, I believe it should fix the problem at least on the watch.
Created attachment 441310 [details]
Patch
Fix the issue with builds on watch, and moves from using CPU(ADDRESS64) to a a proper USE(LARGE_TYPED_ARRAYS) relying on the size of size_t.
Created attachment 441311 [details]
Patch
Fix yet another build issue on watch.
Created attachment 441315 [details]
Patch
Fix yet another build issue on watch (caused by a remaining UCPURegister this time).
Comment on attachment 441315 [details]
Patch
It finally compiled fine on watch, sending it to the commit queue.
Committed r284230 (243039@main): <https://commits.webkit.org/243039@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 441315 [details]. This was reverted by https://bugs.webkit.org/show_bug.cgi?id=231797 Created attachment 441344 [details] Patch Another try, fixing the issue described in https://bugs.webkit.org/show_bug.cgi?id=231795. Created attachment 441392 [details]
Patch
Same as the last one, let's hope that EWS does not crash this time.
Reopened, I forgot to do it when I reverted this patch Comment on attachment 441392 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=441392&action=review > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4894 > +#pragma clang diagnostic push > +#pragma clang diagnostic ignored "-Wmissing-noreturn" These trigger -Wunknown-pragmas from GCC so can't do it unless you're sure you're building with clang. Use IGNORE_CLANG_WARNINGS_BEGIN/END instead it will take care of this for you. Created attachment 441431 [details]
Patch
Patch with a slight tweak in setLargeTypedArray in LowLevelInterpreter64.asm, in the hopes of fixing the GTK/WPE EWS build failures.
Created attachment 441435 [details]
Patch
Replacing "#pragma clang diagnostics ..." by "IGNORE_CLANG_WARNINGS..." as suggested by Michael Catanzaro to avoid causing issues on GCC builds.
Created attachment 441453 [details]
Patch
Now with uint64_t instead of unsigned long long in SerializedScriptValue.cpp to fix some build issues for the GTK EWS bot.
Created attachment 441454 [details]
patch
Now with uint64_t instead of unsigned long long in SerializedScriptValue.cpp to fix some build issues of the GTK EWS bot.
Created attachment 441504 [details]
Patch for landing
The only difference from the last patch I sent through EWS is fixing a missing initialization of a scratch register in the DFG (which was caught by the bots as a flakey/crashing test).
ChangeLog entry in LayoutTests/ChangeLog is not at the top of the file. Created attachment 441513 [details]
patch for landing
Fix changelogs which had been badly merged when I rebased the patch.
Committed r284330 (243125@main): <https://commits.webkit.org/243125@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 441513 [details]. |