Summary: | FTL is not working on Windows. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | peavo | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Component: | JavaScriptCore | Assignee: | Nobody <webkit-unassigned> | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Status: | NEW --- | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Severity: | Normal | CC: | achristensen, basile_clement, beidson, benjamin, bfulgham, cmarcelo, commit-queue, fpizlo, ggaren, gyuyoung.kim, Hironori.Fujii, ian.grunert, mark.lam, msaboff, oliver, ossy, sanxiyn | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Priority: | P2 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Attachments: |
|
Description
peavo
2015-05-25 02:25:14 PDT
Created attachment 253682 [details]
Patch
Created attachment 253683 [details]
Patch
Comment on attachment 253683 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=253683&action=review > Source/JavaScriptCore/ChangeLog:29 > + (JSC::FTL::canCompile): llvm does not support ArithPow on Windows. This is the wrong approach. You're preventing the compilation of an entire function because of an intrinsic. Instead you should implement ArithPow on Windows using a function call to a helper. (In reply to comment #3) > Comment on attachment 253683 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=253683&action=review > > > Source/JavaScriptCore/ChangeLog:29 > > + (JSC::FTL::canCompile): llvm does not support ArithPow on Windows. > > This is the wrong approach. You're preventing the compilation of an entire > function because of an intrinsic. Instead you should implement ArithPow on > Windows using a function call to a helper. Ok, thanks for reviewing :) I will look into this. (In reply to comment #4) > (In reply to comment #3) > > Comment on attachment 253683 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=253683&action=review > > > > > Source/JavaScriptCore/ChangeLog:29 > > > + (JSC::FTL::canCompile): llvm does not support ArithPow on Windows. > > > > This is the wrong approach. You're preventing the compilation of an entire > > function because of an intrinsic. Instead you should implement ArithPow on > > Windows using a function call to a helper. > > Ok, thanks for reviewing :) I will look into this. Note that you could literally call the pow() function. It's possible that if you emit a call to an extern function named "pow" then LLVM might try to turn this into an intrinsic and then fail in the backend if the intrinsic is unsupported (I don't know if this is happening here, but I remember such a LLVM JIT bug with other math functions). One way to avoid this bug is to grab a pointer to the pow() function, and then emit a call to that pointer constant. We've done this a lot in other such thorny cases. (In reply to comment #5) > (In reply to comment #4) > > (In reply to comment #3) > > > Comment on attachment 253683 [details] > > > Patch > > > > > > View in context: > > > https://bugs.webkit.org/attachment.cgi?id=253683&action=review > > > > > > > Source/JavaScriptCore/ChangeLog:29 > > > > + (JSC::FTL::canCompile): llvm does not support ArithPow on Windows. > > > > > > This is the wrong approach. You're preventing the compilation of an entire > > > function because of an intrinsic. Instead you should implement ArithPow on > > > Windows using a function call to a helper. > > > > Ok, thanks for reviewing :) I will look into this. > > Note that you could literally call the pow() function. It's possible that > if you emit a call to an extern function named "pow" then LLVM might try to > turn this into an intrinsic and then fail in the backend if the intrinsic is > unsupported (I don't know if this is happening here, but I remember such a > LLVM JIT bug with other math functions). One way to avoid this bug is to > grab a pointer to the pow() function, and then emit a call to that pointer > constant. We've done this a lot in other such thorny cases. Ok, thanks for the tip :) I also found this llvm bug: https://llvm.org/bugs/show_bug.cgi?id=8379 Be careful: _snprintf does not always null terminate (if the output is the exact length of the buffer), snprintf does. See https://bugzilla.mozilla.org/show_bug.cgi?id=821003 for more discussion. Created attachment 253732 [details]
Patch
(In reply to comment #7) > Be careful: _snprintf does not always null terminate (if the output is the > exact length of the buffer), snprintf does. > See https://bugzilla.mozilla.org/show_bug.cgi?id=821003 for more discussion. Thanks for the heads-up :) I will check these, and see if they're safe. (In reply to comment #5) > > Note that you could literally call the pow() function. It's possible that > if you emit a call to an extern function named "pow" then LLVM might try to > turn this into an intrinsic and then fail in the backend if the intrinsic is > unsupported (I don't know if this is happening here, but I remember such a > LLVM JIT bug with other math functions). One way to avoid this bug is to > grab a pointer to the pow() function, and then emit a call to that pointer > constant. We've done this a lot in other such thorny cases. I tried to fix this in llvm. You can see the proposed change in diff.txt. I'm not sure that it is correct, though. SunSpider runs fine. Does "__powidf2" and "pow" have the same signature? (In reply to comment #10) > (In reply to comment #5) > > > > Note that you could literally call the pow() function. It's possible that > > if you emit a call to an extern function named "pow" then LLVM might try to > > turn this into an intrinsic and then fail in the backend if the intrinsic is > > unsupported (I don't know if this is happening here, but I remember such a > > LLVM JIT bug with other math functions). One way to avoid this bug is to > > grab a pointer to the pow() function, and then emit a call to that pointer > > constant. We've done this a lot in other such thorny cases. > > I tried to fix this in llvm. You can see the proposed change in diff.txt. > I'm not sure that it is correct, though. SunSpider runs fine. Does > "__powidf2" and "pow" have the same signature? You'll need to verify much more than SunSpider. ;-) Can you run JSC stress tests? I don't know what __powidf2 is. (In reply to comment #11) > > You'll need to verify much more than SunSpider. ;-) > > Can you run JSC stress tests? > I get the same stress test results with and without this patch: ** The following JSC stress test failures have been introduced: jsc-layout-tests.yaml/js/script-tests/function-apply-aliased.js.layout-no-cjit jsc-layout-tests.yaml/js/script-tests/math.js.layout jsc-layout-tests.yaml/js/script-tests/math.js.layout-dfg-eager-no-cjit jsc-layout-tests.yaml/js/script-tests/math.js.layout-no-cjit jsc-layout-tests.yaml/js/script-tests/math.js.layout-no-llint Results for JSC stress tests: 5 failures found. How do I know that FTL kicks in as often as it should? > I don't know what __powidf2 is. I'm not sure, but I think __powidf2 is a function in the gcc c runtime lib, which llvm is using to implement the powi intrinsic. I tried to replace this with pow on Windows. (In reply to comment #12) > (In reply to comment #11) > > > > You'll need to verify much more than SunSpider. ;-) > > > > Can you run JSC stress tests? > > > > I get the same stress test results with and without this patch: > > ** The following JSC stress test failures have been introduced: > > jsc-layout-tests.yaml/js/script-tests/function-apply-aliased.js.layout-no- > cjit > jsc-layout-tests.yaml/js/script-tests/math.js.layout > > jsc-layout-tests.yaml/js/script-tests/math.js.layout-dfg-eager-no-cjit > jsc-layout-tests.yaml/js/script-tests/math.js.layout-no-cjit > jsc-layout-tests.yaml/js/script-tests/math.js.layout-no-llint > > Results for JSC stress tests: > 5 failures found. > > How do I know that FTL kicks in as often as it should? We have FTL-specific tests that follow this pattern: function foo() { test content here } noInline(foo) for (var i = 0; i < 10000; ++i) foo(); The test harness ensures that foo will be FTLed. > > > I don't know what __powidf2 is. > > I'm not sure, but I think __powidf2 is a function in the gcc c runtime lib, > which llvm is using to implement the powi intrinsic. I tried to replace this > with pow on Windows. OK. (In reply to comment #12) > (In reply to comment #11) > > > > You'll need to verify much more than SunSpider. ;-) > > > > Can you run JSC stress tests? > > > > I get the same stress test results with and without this patch: > > ** The following JSC stress test failures have been introduced: > > jsc-layout-tests.yaml/js/script-tests/function-apply-aliased.js.layout-no- > cjit > jsc-layout-tests.yaml/js/script-tests/math.js.layout > > jsc-layout-tests.yaml/js/script-tests/math.js.layout-dfg-eager-no-cjit > jsc-layout-tests.yaml/js/script-tests/math.js.layout-no-cjit > jsc-layout-tests.yaml/js/script-tests/math.js.layout-no-llint > > Results for JSC stress tests: > 5 failures found. > > How do I know that FTL kicks in as often as it should? > > > I don't know what __powidf2 is. > > I'm not sure, but I think __powidf2 is a function in the gcc c runtime lib, > which llvm is using to implement the powi intrinsic. I tried to replace this > with pow on Windows. Have you passed --ftl-jit to run-javascriptcore-tests? (Without this option FTL JIT is disabled on non Apple Mac platforms.) You can set JSC_ftlCrashes=1 environment variable to check if a test triggers FTL compilation or not. (In reply to comment #14) > > Have you passed --ftl-jit to run-javascriptcore-tests? > (Without this option FTL JIT is disabled on non Apple Mac platforms.) > > You can set JSC_ftlCrashes=1 environment variable > to check if a test triggers FTL compilation or not. Ah, ok, thanks, this explains a bit :) I did not pass this option. When I run the stress tests now, it only makes it to test number ~350 (v8) before it gets 5 failures, and then starts using excessive amounts of memory which stalls the test ... It looks like more work is required here. I will start looking into it :) Created attachment 254498 [details]
Patch
I've made a little progress here. The v8 crash is fixed, the next failing test is the v8 raytrace test, which renders incorrectly. I will have a look at that :) Created attachment 254499 [details]
Patch
Created attachment 254949 [details]
Patch
The v8 raytrace test is passing now. The next failing test is throw-from-ftl-call-ic-slow-path-undefined.js. I will try to debug that next :) Created attachment 255287 [details]
Patch
Created attachment 255288 [details]
Patch
Created attachment 255293 [details]
Patch
With the latest patch, I don't get any stress test regressions :) 20743/20743 (failed 7) ** The following JSC stress test failures have been introduced: jsc-layout-tests.yaml/js/script-tests/math.js.layout jsc-layout-tests.yaml/js/script-tests/math.js.layout-dfg-eager-no-cjit jsc-layout-tests.yaml/js/script-tests/math.js.layout-ftl jsc-layout-tests.yaml/js/script-tests/math.js.layout-ftl-eager-no-cjit jsc-layout-tests.yaml/js/script-tests/math.js.layout-ftl-no-cjit jsc-layout-tests.yaml/js/script-tests/math.js.layout-no-cjit jsc-layout-tests.yaml/js/script-tests/math.js.layout-no-llint Results for JSC stress tests: 7 failures found. (In reply to comment #24) > With the latest patch, I don't get any stress test regressions :) > > 20743/20743 (failed 7) > > ** The following JSC stress test failures have been introduced: > jsc-layout-tests.yaml/js/script-tests/math.js.layout > > jsc-layout-tests.yaml/js/script-tests/math.js.layout-dfg-eager-no-cjit > jsc-layout-tests.yaml/js/script-tests/math.js.layout-ftl > > jsc-layout-tests.yaml/js/script-tests/math.js.layout-ftl-eager-no-cjit > jsc-layout-tests.yaml/js/script-tests/math.js.layout-ftl-no-cjit > jsc-layout-tests.yaml/js/script-tests/math.js.layout-no-cjit > jsc-layout-tests.yaml/js/script-tests/math.js.layout-no-llint > > Results for JSC stress tests: > 7 failures found. What is the output of these tests? It could be the rounding mode you have set. Ah, I'm not sure what rounding mode I am using, how do I set it? --- ../.tests/jsc-layout-tests.yaml/js/math-expected.txt 2015-06-20 10:18:23.243854000 -0700 +++ ../jsc-layout-tests.yaml/js/script-tests/math.js.layout.out 2015-06-20 10:40:02.156854000 -0700 @@ -198,7 +198,7 @@ PASS Math.sign(-Infinity) is -1 PASS Math.sin(NaN) is NaN PASS Math.sin(0) is 0 -PASS Math.sin(-0) is -0 +FAIL Math.sin(-0) should be -0. Was 0. PASS Math.sin(Infinity) is NaN PASS Math.sin(-Infinity) is NaN PASS Math.sqrt(NaN) is NaN I can see that Visual Studio has a 'Floating Point Model' setting, which can be set to 'precise', 'strict', or 'fast'. It is currently set to 'precise' for JSC. I also found a function called _controlfp (https://msdn.microsoft.com/en-us/library/e9b52ceh.aspx), which seem to be able to set the rounding mode. I'm not sure what the default rounding mode is. I tried to set the rounding mode with _controlfp and the parameters _RC_CHOP, _RC_UP, _RC_DOWN, and _RC_NEAR. All other settings than _RC_NEAR resulted in more regressions. Setting rounding mode to _RC_NEAR gave the same results as when not setting the rounding mode, and did not fix the math.js.layout test. So I assume the default setting is _RC_NEAR. (In reply to comment #28) > I tried to set the rounding mode with _controlfp and the parameters > _RC_CHOP, _RC_UP, _RC_DOWN, and _RC_NEAR. All other settings than _RC_NEAR > resulted in more regressions. Setting rounding mode to _RC_NEAR gave the > same results as when not setting the rounding mode, and did not fix the > math.js.layout test. So I assume the default setting is _RC_NEAR. Looks like the default rounding mode is fine. If your only failure is sin(-0), I'd look at that path and see if there is some windows specific handling of sin that is causing the issue. (In reply to comment #29) > > Looks like the default rounding mode is fine. If your only failure is > sin(-0), I'd look at that path and see if there is some windows specific > handling of sin that is causing the issue. Yes, the only failure is sin(-0) (which also fails with FTL disabled). I investigated the path, and found nothing Windows specific, except for the sin() call itself from the function mathProtoFuncSin(). This makes me think that the issue is caused by differences in the MSVC and GCC C run time libraries. If MSVC runtime says sin(-0) = 0, this is incorrect (standard enforces sin(-0) = -0). According to http://stackoverflow.com/questions/30392832/sinminus-zero-does-not-return-the-expected-result-on-visual-studio-2013-64bi , this is the case at least for MSVC 2013 on Windows 7. Is this the MSVC/Windows version combination you are using? Can you confirm it is a runtime problem with a simple C program computing sin(-0)? (In reply to comment #30) > (In reply to comment #29) > > > > Looks like the default rounding mode is fine. If your only failure is > > sin(-0), I'd look at that path and see if there is some windows specific > > handling of sin that is causing the issue. > > Yes, the only failure is sin(-0) (which also fails with FTL disabled). > I investigated the path, and found nothing Windows specific, except for the > sin() call itself from the function mathProtoFuncSin(). > This makes me think that the issue is caused by differences in the MSVC and > GCC C run time libraries. (In reply to comment #31) > If MSVC runtime says sin(-0) = 0, this is incorrect (standard enforces > sin(-0) = -0). > According to > http://stackoverflow.com/questions/30392832/sinminus-zero-does-not-return- > the-expected-result-on-visual-studio-2013-64bi , this is the case at least > for MSVC 2013 on Windows 7. Is this the MSVC/Windows version combination you > are using? Can you confirm it is a runtime problem with a simple C program > computing sin(-0)? > Thanks for looking into this :) Yes, I'm using MSVC 2013 on Windows 7. I created a Win64 test application with the code from your stackoverflow link, and it does give the incorrect result sin(-0) = 0. The corresponding Win32 application gives the correct result, so it seems to be a problem with the Win64 C runtime libraries. Created attachment 255630 [details]
Patch
(In reply to comment #33) > Created attachment 255630 [details] > Patch Rebased. Created attachment 256090 [details]
Patch
(In reply to comment #35) > Created attachment 256090 [details] > Patch Rebased. For snprintf, can't one #include <wtf/StringExtras.h>? (In reply to comment #37) > For snprintf, can't one #include <wtf/StringExtras.h>? Thanks, this is a good suggestion, I will update the patch soon :) Created attachment 260695 [details]
Patch
Thanks for reviewing! I will upload a rebased patch to check that no builds are broken. Created attachment 260699 [details]
Patch
(In reply to comment #41) > Created attachment 260699 [details] > Patch Rebased patch. The prebuilt libllvmForJSC.dll is updated with LLVM revision 241791. For FTL to work properly on Windows, LLVM revision 241725 or later is required. Created attachment 260703 [details]
Patch
(In reply to comment #43) > Created attachment 260703 [details] > Patch To make the patch build on all platforms, I deleted the llvm api function BuildLandingPad from the list of api functions, since it has changed signature in recent llvm revisions. I am not sure just deleting it is the best approach, though? (In reply to comment #44) > (In reply to comment #43) > > Created attachment 260703 [details] > > Patch > > To make the patch build on all platforms, I deleted the llvm api function > BuildLandingPad from the list of api functions, since it has changed > signature in recent llvm revisions. I am not sure just deleting it is the > best approach, though? Yes that's the right approach. (In reply to comment #45) > (In reply to comment #44) > > (In reply to comment #43) > > > Created attachment 260703 [details] > > > Patch > > > > To make the patch build on all platforms, I deleted the llvm api function > > BuildLandingPad from the list of api functions, since it has changed > > signature in recent llvm revisions. I am not sure just deleting it is the > > best approach, though? > > Yes that's the right approach. Ok, thanks :) Do I need another r+, or should I go on and commit? When rebasing I got a couple of conflicts, you may want to do an additional check of those changes. Created attachment 260864 [details]
Patch
(In reply to comment #48) > Created attachment 260864 [details] > Patch Rebased, and removed the prebuilt libllvmForJSC.dll. I think it is better to manually build llvm, and avoid adding ~10MB to the source tree :) Created attachment 261971 [details]
Patch
Created attachment 261972 [details]
Patch
Created attachment 261978 [details]
Patch
(In reply to comment #52) > Created attachment 261978 [details] > Patch Rebased. I also had to make a few changes to the patch due to recent JSC changes. Created attachment 262334 [details]
Patch
(In reply to comment #54) > Created attachment 262334 [details] > Patch Rebased, and simplified a little bit :) Comment on attachment 262334 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=262334&action=review I have a few nits about the build system changes. > Source/JavaScriptCore/CMakeLists.txt:846 > + if (NOT LLVM_STATIC_LIBRARIES AND NOT MSVC) Please use WIN32 instead of MSVC. It causes fewer warnings when running cmake and until someone builds webkit with mingw or clang, they're the same thing. > Source/JavaScriptCore/CMakeLists.txt:876 > + target_link_libraries(llvmForJSC "LLVMAnalysis.lib;LLVMAsmParser.lib;LLVMAsmPrinter.lib;LLVMBitReader.lib;LLVMBitWriter.lib;LLVMCodeGen.lib;LLVMCore.lib;LLVMDebugInfo.lib;LLVMExecutionEngine.lib;LLVMInstCombine.lib;LLVMInstrumentation.lib;LLVMInterpreter.lib;LLVMipa.lib;LLVMipo.lib;LLVMIRReader.lib;LLVMJIT.lib;LLVMLinker.lib;LLVMLTO.lib;LLVMMC.lib;LLVMMCDisassembler.lib;LLVMMCJIT.lib;LLVMMCParser.lib;LLVMObjCARCOpts.lib;LLVMObject.lib;LLVMOption.lib;LLVMRuntimeDyld.lib;LLVMScalarOpts.lib;LLVMSelectionDAG.lib;LLVMSupport.lib;LLVMTableGen.lib;LLVMTarget.lib;LLVMTransformUtils.lib;LLVMVectorize.lib;LLVMX86AsmParser.lib;LLVMX86AsmPrinter.lib;LLVMX86CodeGen.lib;LLVMX86Desc.lib;LLVMX86Disassembler.lib;LLVMX86Info.lib;LLVMX86Utils.lib;LLVMProfileData.lib") These should be in LLVM_STATIC_LIBRARIES. > Source/cmake/OptionsWin.cmake:39 > +WEBKIT_OPTION_DEFAULT_PORT_VALUE(ENABLE_FTL_JIT PRIVATE ON) Shouldn't this only be enabled for the 64-bit build? > Tools/Scripts/webkitdirs.pm:1888 > + if (isWin64()) { && !canUseNinja() Created attachment 262379 [details]
Patch
(In reply to comment #56) > Comment on attachment 262334 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=262334&action=review > Thanks for reviewing! Updated patch :) Created attachment 262404 [details]
Patch
Created attachment 262630 [details]
Patch
(In reply to comment #60) > Created attachment 262630 [details] > Patch Rebased and removed a compile fix for VS2013 which is no longer needed for VS2015. Created attachment 263482 [details]
Patch
Comment on attachment 263482 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=263482&action=review > Source/JavaScriptCore/CMakeLists.txt:952 > + llvm/LLVMAPI.cpp Since this file is common to WIN32 and other ports, let's keep it in the common list above. > Source/JavaScriptCore/CMakeLists.txt:958 > + llvm/LLVMAPI.cpp Ditto. > Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h:722 > +#if CPU(X86_64) > + void loadDouble128(ImplicitAddress address, FPRegisterID src) > + { > + ASSERT(isSSE2Present()); > + m_assembler.movups_mr(src, address.offset, address.base); > + } > +#endif Why is this added? I don't see it used anywhere in this patch. > Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h:747 > +#if CPU(X86_64) > + void storeDouble128(FPRegisterID src, ImplicitAddress address) > + { > + ASSERT(isSSE2Present()); > + m_assembler.movups_rm(src, address.offset, address.base); > + } > +#endif Why is this added? I don't see it used anywhere in this patch. > Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h:197 > + Call callJIT() > + { > + DataLabelPtr label = moveWithPatch(TrustedImmPtr(0), scratchRegister); > + Call result = Call(m_assembler.call(scratchRegister), Call::Linkable); > + > + ASSERT_UNUSED(label, differenceBetween(label, result) == REPTACH_OFFSET_CALL_R11); > + return result; > + } Why is this added? I don't see it used anywhere in this patch. > Source/JavaScriptCore/assembler/X86Assembler.h:1848 > + void movups_rm(XMMRegisterID src, int offset, RegisterID base) > + { > + m_formatter.twoByteOp(OP2_MOVSD_WsdVsd, (RegisterID)src, base, offset); > + } > + > + void movups_mr(XMMRegisterID src, int offset, RegisterID base) > + { > + m_formatter.twoByteOp(OP2_MOVSD_VsdWsd, (RegisterID)src, base, offset); > + } Are these really needed since their only caller in this patch does not appear to be called by anyone. > Source/JavaScriptCore/ftl/FTLCapabilities.cpp:213 > + case ArithPow: > +#if OS(WINDOWS) > + // LLVM does not support powi on Windows. > + if (node->child2().useKind() == Int32Use) > + return CannotCompile; > +#endif > + break; Is this still necessary? In compileArithPow() below, you seem to have implemented support for (m_node->child2().useKind() == Int32Use) already (line 1843). Or am I misreading it? > Source/JavaScriptCore/ftl/FTLCompile.cpp:680 > + auto iter = recordMap.find(call.stackmapID()); > + if (iter != recordMap.end()) { > + for (unsigned i = 0; i < iter->value.size(); ++i) { > + StackMaps::Record& record = iter->value[i]; > + RegisterSet usedRegisters = usedRegistersFor(record); > + > + dataLog("Used registers = ", usedRegisters, "\n"); > + } > + } I think this code will clash with https://bugs.webkit.org/show_bug.cgi?id=149970 (which is fixing a bug in this area). We shouldn't land a change like this while the Windows 64bit build is still crashing in JavaScriptCore regression tests. Created attachment 263517 [details]
Patch
(In reply to comment #63) > Comment on attachment 263482 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=263482&action=review > Thanks for reviewing :) > > > Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h:722 > > +#if CPU(X86_64) > > + void loadDouble128(ImplicitAddress address, FPRegisterID src) > > + { > > + ASSERT(isSSE2Present()); > > + m_assembler.movups_mr(src, address.offset, address.base); > > + } > > +#endif > > Why is this added? I don't see it used anywhere in this patch. > Removed. This was in preparation for saving and restoring all 16 bytes of callee saved xmm registers. I'll create another patch for this soon. > > > Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h:197 > > + Call callJIT() > > + { > > + DataLabelPtr label = moveWithPatch(TrustedImmPtr(0), scratchRegister); > > + Call result = Call(m_assembler.call(scratchRegister), Call::Linkable); > > + > > + ASSERT_UNUSED(label, differenceBetween(label, result) == REPTACH_OFFSET_CALL_R11); > > + return result; > > + } > > Why is this added? I don't see it used anywhere in this patch. > Removed, no longer used. > > > Source/JavaScriptCore/ftl/FTLCapabilities.cpp:213 > > + case ArithPow: > > +#if OS(WINDOWS) > > + // LLVM does not support powi on Windows. > > + if (node->child2().useKind() == Int32Use) > > + return CannotCompile; > > +#endif > > + break; > > Is this still necessary? In compileArithPow() below, you seem to have > implemented support for (m_node->child2().useKind() == Int32Use) already > (line 1843). Or am I misreading it? > You're right. Both are not needed. I kept this, since implementing support for powi using doublePow did not produce the exact correct result, causing a stress test failure. (In reply to comment #64) > We shouldn't land a change like this while the Windows 64bit build is still > crashing in JavaScriptCore regression tests. I was not aware of any crashes in the JSC tests. Which tests are crashing? Do we have a stacktrace? I recently ran the JSC stress tests with FTL disabled without any failures. Maybe it has been introduced recently? I'm seeing 64-bit JSC (Release) crashes on the Apple Windows port. If you are not seeing them in the WinCairo build, it points another finger at an internal library that Mark and I were discussing as the likely culprit. (In reply to comment #68) > I'm seeing 64-bit JSC (Release) crashes on the Apple Windows port. If you > are not seeing them in the WinCairo build, it points another finger at an > internal library that Mark and I were discussing as the likely culprit. Ah, ok, I have only been testing WinCairo. Has it been introduced recently? (In reply to comment #69) > (In reply to comment #68) > > I'm seeing 64-bit JSC (Release) crashes on the Apple Windows port. If you > > are not seeing them in the WinCairo build, it points another finger at an > > internal library that Mark and I were discussing as the likely culprit. > > Ah, ok, I have only been testing WinCairo. Has it been introduced recently? I'm not sure. I'm trying to get our 64-bit testing infrastructure squared away so we can answer questions like these in the future. :-\ Also: I don't think we should land this until we can prove that the Windows 64-bit no-FTL configuration is as fast as - or almost as fast as - the OSX or Linux 64-bit no-FTL configuration. (In reply to comment #71) > Also: I don't think we should land this until we can prove that the Windows > 64-bit no-FTL configuration is as fast as - or almost as fast as - the OSX > or Linux 64-bit no-FTL configuration. Unfortunately, I don't currently have access to a Mac or a Linux machine, but maybe someone else has the opportunity to test this? :) What is the best way to compare two platforms? Created attachment 264235 [details]
Patch
Created attachment 264249 [details]
Patch
(In reply to comment #74) > Created attachment 264249 [details] > Patch Rebased, and resolved conflicts. I did not enable FTL in the latest patch, since we haven't compared DFG performance with other platforms yet. Created attachment 265292 [details]
Patch
Created attachment 265295 [details]
Patch
(In reply to comment #77) > Created attachment 265295 [details] > Patch Rebased and fixed a couple of recent stress test regressions. There are zero jsc stress test failures now. Still on LLVM revision 241791. 28632/28632 Results for JSC stress tests: 0 failures found. OK. Comment on attachment 265295 [details]
Patch
Let's revisit this in a few weeks/months once b3 is developed enough. Most of this will still be needed, but the llvm parts will not.
(In reply to comment #79) > Comment on attachment 265295 [details] > Patch > > Let's revisit this in a few weeks/months once b3 is developed enough. Most > of this will still be needed, but the llvm parts will not. Ok, sounds good, I might rebase from time to time, to avoid the relevant parts getting too stale :) Created attachment 267558 [details]
Patch
Created attachment 267560 [details]
Patch
Comment on attachment 267560 [details]
Patch
It is outdated, now LVM is gone. Could you update it to B3?
(In reply to comment #83) > Comment on attachment 267560 [details] > Patch > > It is outdated, now LVM is gone. Could you update it to B3? Thanks, I hope to look more into B3 on Windows soon. Created attachment 279685 [details]
Patch
Comment on attachment 279685 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=279685&action=review There is no need for X86_64 guards in FTL files, because FTL is 64 bit only by design. > Source/cmake/OptionsWin.cmake:33 > +WEBKIT_OPTION_DEFAULT_PORT_VALUE(ENABLE_FTL_JIT PRIVATE ON) Don't Windows ports support 32-bit x86 anymore? What about the conformance and performance results? (In reply to comment #86) > Comment on attachment 279685 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=279685&action=review > > > Source/cmake/OptionsWin.cmake:33 > > +WEBKIT_OPTION_DEFAULT_PORT_VALUE(ENABLE_FTL_JIT PRIVATE ON) > > Don't Windows ports support 32-bit x86 anymore? Based on https://bugs.webkit.org/show_bug.cgi?id=157931#c8 , it seems Apple Windows bots build x86 target, so enabling FTL JIT unconditionally isn't too good. But fortunately Platform.h disable FTL JIT on 32 bit platforms even if it is enabled by someone. /* The FTL *does not* work on 32-bit platforms. Disable it even if someone asked us to enable it. */ #if USE(JSVALUE32_64) #undef ENABLE_FTL_JIT #define ENABLE_FTL_JIT 0 #endif Ah, I got the goal of this ugly hack. :) It is used by Apple Mac build too. Apple Mac port enables FTL_JIT unconditionally in Source/JavaScriptCore/Configurations/FeatureDefines.xcconfig And then force disables in Platform.h for 32 bit platforms. I assume because xcconfig doesn't know if the current build is 32 or 64 bit. Am I right? Created attachment 279774 [details]
Patch
(In reply to comment #86) > Comment on attachment 279685 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=279685&action=review > > There is no need for X86_64 guards in FTL files, because FTL is 64 bit only > by design. > Thanks for reviewing :) Updated patch. (In reply to comment #87) > What about the conformance and performance results? There are no stress test regressions (I also applied the patch in https://bugs.webkit.org/show_bug.cgi?id=158073 when running stress tests). I will also run some performance tests. Benchmark report for Kraken on \c. VMs tested: "DFG" at /cygdrive/c/WebKit/WebKit3/WebKitBuild/ReleaseDFG/bin64/jsc "FTL" at /cygdrive/c/WebKit/WebKit3/WebKitBuild/Release/bin64/jsc Collected 4 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. DFG FTL ai-astar 366.796+-140.328 ^ 162.320+-12.679 ^ definitely 2.2597x faster audio-beat-detection 117.666+-5.419 ^ 89.257+-7.523 ^ definitely 1.3183x faster audio-dft 189.045+-11.426 ^ 145.779+-17.178 ^ definitely 1.2968x faster audio-fft 97.183+-7.117 ^ 65.513+-5.417 ^ definitely 1.4834x faster audio-oscillator 149.853+-55.328 94.431+-8.717 might be 1.5869x faster imaging-darkroom 189.104+-9.191 ^ 118.680+-1.601 ^ definitely 1.5934x faster imaging-desaturate 153.005+-4.605 ^ 102.285+-6.300 ^ definitely 1.4959x faster imaging-gaussian-blur 254.369+-5.462 ^ 131.986+-13.159 ^ definitely 1.9272x faster json-parse-financial 87.245+-4.712 ? 91.573+-1.767 ? might be 1.0496x slower json-stringify-tinderbox 67.304+-4.712 67.024+-2.299 stanford-crypto-aes 102.341+-9.824 ^ 81.158+-3.177 ^ definitely 1.2610x faster stanford-crypto-ccm 75.854+-7.986 ? 81.478+-2.980 ? might be 1.0741x slower stanford-crypto-pbkdf2 190.708+-7.233 ? 204.435+-44.800 ? might be 1.0720x slower stanford-crypto-sha256-iterative 76.860+-9.469 70.705+-3.160 might be 1.0871x faster <arithmetic> 151.238+-10.660 ^ 107.616+-4.342 ^ definitely 1.4053x faster Benchmark report for Octane on \c. VMs tested: "DFG" at /cygdrive/c/WebKit/WebKit3/WebKitBuild/ReleaseDFG/bin64/jsc "FTL" at /cygdrive/c/WebKit/WebKit3/WebKitBuild/Release/bin64/jsc Collected 4 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. DFG FTL encrypt 0.40643+-0.00614 ^ 0.30742+-0.00217 ^ definitely 1.3221x faster decrypt 7.42668+-0.06367 ^ 5.60406+-0.13478 ^ definitely 1.3252x faster deltablue x2 0.58301+-0.01605 ^ 0.28249+-0.01155 ^ definitely 2.0638x faster earley 0.89649+-0.01630 ^ 0.58405+-0.00759 ^ definitely 1.5350x faster boyer 12.80865+-0.48265 ^ 9.97758+-0.06177 ^ definitely 1.2837x faster navier-stokes x2 9.60640+-0.27730 ^ 7.11029+-0.21570 ^ definitely 1.3511x faster raytrace x2 3.86840+-0.15073 ^ 1.54362+-0.03002 ^ definitely 2.5061x faster richards x2 0.29908+-0.01145 ^ 0.16272+-0.00477 ^ definitely 1.8380x faster splay x2 0.89996+-0.04894 0.81134+-0.05202 might be 1.1092x faster regexp x2 35.52942+-1.02941 ! 39.02314+-1.40534 ! definitely 1.0983x slower pdfjs x2 77.58769+-5.01494 ? 79.13132+-2.39736 ? might be 1.0199x slower mandreel x2 138.25426+-8.46641 ^ 113.13855+-1.95979 ^ definitely 1.2220x faster gbemu x2 89.33162+-8.51151 ^ 77.26824+-0.84239 ^ definitely 1.1561x faster closure 1.03592+-0.05506 ? 1.03644+-0.01954 ? jquery 13.46860+-0.42885 13.33433+-0.27978 might be 1.0101x faster box2d x2 21.44274+-0.59958 ? 22.37898+-1.28378 ? might be 1.0437x slower zlib x2 783.26776+-42.17878 ^ 648.50476+-33.98770 ^ definitely 1.2078x faster typescript x2 1326.95959+-62.95861 ? 1377.46051+-25.29425 ? might be 1.0381x slower <geometric> 13.56436+-0.08183 ^ 10.65648+-0.06557 ^ definitely 1.2729x faster Created attachment 281349 [details]
Patch
What's the status of this bug? Can a JS person please approve/reject it? Comment on attachment 281349 [details]
Patch
r-, as this has been pending review for over a year now. It is near-impossible that this patch still applies to trunk and unlikely to still be relevant in its current form.
Created attachment 468205 [details] Patch Uploading a WIP patch for enabling FTL on Windows. Ran into a couple of problems which I'm currently stuck on. The patch enables webassembly at build time and disabling at runtime, as FTL makes assumptions that webassembly is enabled. Should be able to make a cleaner patch once https://bugs.webkit.org/show_bug.cgi?id=222315 lands (awaiting PR review). The BBQ JIT won't work on Windows even after the changes for enabling webassembly on Windows land - the BBQ JIT requires changes to GPR assignment so it can use callee saves. With two fewer GPRs to work with it runs out of GPRs to assign when running benchmarks. This should get separated out to a separate bug. The two problems I'm stuck on: Segfaults during JetStream2 when running stanford-crypto-aes (on most but not all runs, it occasionally works). I haven't tracked down the cause, it's a strange one. FTL is disabled on Windows for the EnumeratorNextUpdateIndexAndMode DFG operation. It crashes on Windows with a segfault on a simple test with JSC_useConcurrentJIT=false. If converted from UGPRPair return type back to JSValue it works - so either something is wrong in the FTL lowering, or the register allocator is having problems due to the extra return register required. Here's the test, the call to EnumeratorNextUpdateIndexAndMode ends up with 0 in the passed baseValue so it fails after attempting to decode and use it: ``` function getMappedArguments() { return arguments; } function forIn(object) { var keys = []; for (var key in object) keys.push(key); return keys; } noInline(forIn); (function() { var mappedArguments = getMappedArguments(0, 1, 2); for (var i = 0; i < 1e4; ++i) { forIn(mappedArguments); } print('done'); })(); ``` Draft PR: https://github.com/WebKit/WebKit/pull/19073 For those who prefer to look at a PR instead of a patch. |