Bug 160588 - Refactor MathIC compilation process in Baseline and DFG to turn temporary registers usage more flexible
Summary: Refactor MathIC compilation process in Baseline and DFG to turn temporary reg...
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 160012
  Show dependency treegraph
 
Reported: 2016-08-04 22:25 PDT by Caio Lima
Modified: 2018-04-09 17:42 PDT (History)
11 users (show)

See Also:


Attachments
WIP refactoring MathIC scratch rules (13.25 KB, patch)
2016-09-07 09:40 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (13.66 KB, patch)
2016-09-08 23:46 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (15.69 KB, patch)
2016-09-22 00:26 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (15.71 KB, patch)
2016-09-22 00:39 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews112 for mac-yosemite (2.07 MB, application/zip)
2016-09-22 01:37 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews104 for mac-yosemite-wk2 (1.80 MB, application/zip)
2016-09-22 01:53 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews102 for mac-yosemite (1.86 MB, application/zip)
2016-09-22 01:53 PDT, Build Bot
no flags Details
Patch (15.79 KB, patch)
2016-09-22 09:27 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (19.74 KB, patch)
2016-09-25 19:15 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (19.88 KB, patch)
2016-09-25 20:06 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (19.88 KB, patch)
2016-09-25 20:38 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (21.16 KB, patch)
2016-10-03 21:17 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (21.16 KB, patch)
2016-10-10 09:02 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (24.91 KB, patch)
2016-10-22 11:25 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (24.83 KB, patch)
2016-11-06 11:44 PST, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (27.24 KB, patch)
2016-12-01 14:19 PST, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (26.68 KB, patch)
2016-12-28 18:06 PST, Caio Lima
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews112 for mac-elcapitan (1.78 MB, application/zip)
2016-12-28 19:20 PST, Build Bot
no flags Details
Patch (26.49 KB, patch)
2016-12-28 20:07 PST, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (26.64 KB, patch)
2017-01-02 20:45 PST, Caio Lima
ticaiolima: review?
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Caio Lima 2016-08-04 22:25:59 PDT
Refactor MathIC compilation in LowerDFGToB3::compileMathIC and JIT::emitMathICFast to make temporary registers usage more flexible. In current implementation is it possible use one scratchGPR and one scratchFPR.
Comment 1 Caio Lima 2016-09-07 09:40:50 PDT
Created attachment 288148 [details]
WIP refactoring MathIC scratch rules

Tests are ok in x86_64 but lets see what EWS thinks about it.
Comment 2 Caio Lima 2016-09-08 23:46:01 PDT
Created attachment 288391 [details]
Patch
Comment 3 Caio Lima 2016-09-14 23:28:34 PDT
ping review
Comment 4 Saam Barati 2016-09-20 15:39:27 PDT
Comment on attachment 288391 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=288391&action=review

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3457
> +    for (int i = 0; i < MATH_IC_MAX_GPR; i++)

should be uint32_t or size_t. (For all loops below, too.)

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3482
> +    if (scratchGPRRegCount == 1)

Hmm, this doesn't like it'd scale well.

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1583
> +        patchpoint->numGPScratchRegisters = numGPScratchRegisters;
> +        patchpoint->numFPScratchRegisters = 2 + numFPScratchRegisters;

The idea of doing this is so that we don't require the 2 extra FP registers when not needed. So the caller should be responsible of providing these numbers.
Comment 5 Caio Lima 2016-09-21 00:16:39 PDT
Comment on attachment 288391 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=288391&action=review

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3457
>> +    for (int i = 0; i < MATH_IC_MAX_GPR; i++)
> 
> should be uint32_t or size_t. (For all loops below, too.)

Done.

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3482
>> +    if (scratchGPRRegCount == 1)
> 
> Hmm, this doesn't like it'd scale well.

I agree, however I am getting the following error because there is no sufficient register:

sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: ASSERTION FAILED: currentLowest != NUM_REGS && currentSpillOrder != SpillHintInvalid
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: /Users/caiolima/open_projects/WebKit/Source/JavaScriptCore/dfg/DFGRegisterBank.h(138) : RegID JSC::DFG::RegisterBank<JSC::GPRInfo>::allocate(JSC::VirtualRegister &) [BankInfo = JSC::GPRInfo]
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 1   0xe96ffa WTFCrash
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 2   0x6cf606 JSC::DFG::RegisterBank<JSC::GPRInfo>::allocate(JSC::VirtualRegister&)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 3   0x6b45f5 JSC::DFG::SpeculativeJIT::allocate()
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 4   0x6fe7cd JSC::DFG::SpeculativeJIT::fillJSValue(JSC::DFG::Edge, JSC::X86Registers::RegisterID&, JSC::X86Registers::RegisterID&, JSC::X86Registers::XMMRegisterID&)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 5   0x6d18c2 JSC::DFG::JSValueOperand::fill()
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 6   0x6b9c35 JSC::DFG::JSValueOperand::tagGPR()
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 7   0x6b4cda JSC::DFG::JSValueOperand::jsValueRegs()
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 8   0x6c4764 void JSC::DFG::SpeculativeJIT::compileMathIC<JSC::JITMulGenerator, long long (*)(JSC::ExecState*, long long, long long, JSC::JITMathIC<JSC::JITMulGenerator>*), long long (*)(JSC::ExecState*, long long, long long)>(JSC::DFG::Node*, JSC::JITMathIC<JSC::JITMulGenerator>*, unsigned char, unsigned char, long long (*)(JSC::ExecState*, long long, long long, JSC::JITMathIC<JSC::JITMulGenerator>*), long long (*)(JSC::ExecState*, long long, long long))
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 9   0x6926f2 JSC::DFG::SpeculativeJIT::compileArithMul(JSC::DFG::Node*)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 10  0x70ea82 JSC::DFG::SpeculativeJIT::compile(JSC::DFG::Node*)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 11  0x68130f JSC::DFG::SpeculativeJIT::compileCurrentBlock()
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 12  0x681c9f JSC::DFG::SpeculativeJIT::compile()
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 13  0x5264bc JSC::DFG::JITCompiler::compileBody()
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 14  0x52a08a JSC::DFG::JITCompiler::compileFunction()
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 15  0x631f70 JSC::DFG::Plan::compileInThreadImpl(JSC::DFG::LongLivedState&)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 16  0x630a8a JSC::DFG::Plan::compileInThread(JSC::DFG::LongLivedState&, JSC::DFG::ThreadData*)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 17  0x4a6280 JSC::DFG::compileImpl(JSC::VM&, JSC::CodeBlock*, JSC::CodeBlock*, JSC::DFG::CompilationMode, unsigned int, JSC::Operands<JSC::JSValue> const&, WTF::PassRefPtr<JSC::DeferredCompilationCallback>)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 18  0x4a5c62 JSC::DFG::compile(JSC::VM&, JSC::CodeBlock*, JSC::CodeBlock*, JSC::DFG::CompilationMode, unsigned int, JSC::Operands<JSC::JSValue> const&, WTF::PassRefPtr<JSC::DeferredCompilationCallback>)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 19  0x973336 operationOptimize
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 20  0x2b69d27
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 21  0xb63964 llint_entry
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 22  0xb639bb llint_entry
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 23  0x2b5f118
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 24  0xb639bb llint_entry
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 25  0xb5e41c vmEntryToJavaScript
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 26  0x95888f JSC::JITCode::execute(JSC::VM*, JSC::ProtoCallFrame*)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 27  0x8f5943 JSC::Interpreter::execute(JSC::ProgramExecutable*, JSC::ExecState*, JSC::JSObject*)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 28  0x2babb5 JSC::evaluate(JSC::ExecState*, JSC::SourceCode const&, JSC::JSValue, WTF::NakedPtr<JSC::Exception>&)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 29  0x54ae8 runWithScripts(GlobalObject*, WTF::Vector<Script, 0ul, WTF::CrashOnOverflow, 16ul> const&, WTF::String const&, bool, bool, bool)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 30  0x53c14 runJSC(JSC::VM*, CommandLine)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: 31  0x52a69 jscmain(int, char**)
sunspider-1.0/3d-raytrace.js.no-cjit-validate-phases: test_script_19: line 2: 78419 Segmentation fault: 11  ( "$@" ../../.vm/JavaScriptCore.framework/Resources/jsc --useFTLJIT\=false --useFunctionDotArguments\=true --maxPerThreadStackUsage\=1572864 --validateBytecode\=true --validateGraphAtEachPhase\=true --useSourceProviderCache\=false --useConcurrentJIT\=false --thresholdForJITAfterWarmUp\=100 --scribbleFreeCells\=true 3d-raytrace.js )

One way I am thinking now is ASSERT(MATH_IC_MAX_GPR <= 2) for x86_32 cases, However it still doesn't solve our limitation on https://bugs.webkit.org/show_bug.cgi?id=160012 because we need 3 scratchGPRs there. Do you think spill the registers makes sense here?

>> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1583
>> +        patchpoint->numFPScratchRegisters = 2 + numFPScratchRegisters;
> 
> The idea of doing this is so that we don't require the 2 extra FP registers when not needed. So the caller should be responsible of providing these numbers.

Done.
Comment 6 Caio Lima 2016-09-22 00:26:50 PDT
Created attachment 289528 [details]
Patch
Comment 7 WebKit Commit Bot 2016-09-22 00:29:14 PDT
Attachment 289528 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3506:  Should have a space between // and comment  [whitespace/comments] [4]
ERROR: Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3509:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3511:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3518:  Use std::min() or std::min<type>() instead of the MIN() macro.  [runtime/max_min_macros] [4]
Total errors found: 4 in 5 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 8 Caio Lima 2016-09-22 00:32:20 PDT
Since we have asserts on JIT*Generators that scratchGPR != (left/right).tagGPR, I didn't change the Generators because it is expected to allocate just 1 scratchGPR on current MathGenerators. I added the case to use left/right tagGPRs on 32-bits, because there are some potentials Generators that can use them in the future (it is the case of JITPowGenerator that is being developed on https://bugs.webkit.org/show_bug.cgi?id=160012).
Comment 9 Caio Lima 2016-09-22 00:39:21 PDT
Created attachment 289532 [details]
Patch
Comment 10 Build Bot 2016-09-22 01:37:21 PDT
Comment on attachment 289532 [details]
Patch

Attachment 289532 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/2123955

New failing tests:
fast/canvas/webgl/premultiplyalpha-test.html
fast/canvas/canvas-longlived-context.html
fast/forms/input-maxlength.html
fast/workers/dedicated-worker-lifecycle.html
js/for-in-modify-in-loop.html
fast/forms/input-implicit-length-limit.html
Comment 11 Build Bot 2016-09-22 01:37:24 PDT
Created attachment 289538 [details]
Archive of layout-test-results from ews112 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews112  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 12 Build Bot 2016-09-22 01:53:08 PDT
Comment on attachment 289532 [details]
Patch

Attachment 289532 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/2124024

New failing tests:
imported/w3c/web-platform-tests/html/dom/reflection-grouping.html
fast/canvas/webgl/premultiplyalpha-test.html
imported/w3c/web-platform-tests/html/dom/reflection-text.html
fast/canvas/canvas-longlived-context.html
imported/w3c/web-platform-tests/html/dom/reflection-misc.html
fast/forms/input-maxlength.html
imported/w3c/web-platform-tests/html/dom/reflection-tabular.html
imported/w3c/web-platform-tests/html/dom/reflection-sections.html
fast/frames/take-focus-from-iframe.html
imported/w3c/web-platform-tests/html/dom/reflection-embedded.html
fast/workers/dedicated-worker-lifecycle.html
js/for-in-modify-in-loop.html
fast/canvas/webgl/object-deletion-behaviour.html
fast/forms/input-implicit-length-limit.html
imported/w3c/web-platform-tests/html/dom/reflection-forms.html
Comment 13 Build Bot 2016-09-22 01:53:12 PDT
Created attachment 289540 [details]
Archive of layout-test-results from ews104 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 14 Build Bot 2016-09-22 01:53:13 PDT
Comment on attachment 289532 [details]
Patch

Attachment 289532 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/2124028

New failing tests:
imported/w3c/web-platform-tests/html/dom/reflection-grouping.html
fast/canvas/webgl/premultiplyalpha-test.html
imported/w3c/web-platform-tests/html/dom/reflection-text.html
fast/canvas/canvas-longlived-context.html
imported/w3c/web-platform-tests/html/dom/reflection-misc.html
fast/forms/input-maxlength.html
imported/w3c/web-platform-tests/html/dom/reflection-tabular.html
imported/w3c/web-platform-tests/html/dom/reflection-sections.html
imported/w3c/web-platform-tests/html/dom/reflection-embedded.html
fast/workers/dedicated-worker-lifecycle.html
js/for-in-modify-in-loop.html
fast/forms/input-implicit-length-limit.html
imported/w3c/web-platform-tests/html/dom/reflection-forms.html
Comment 15 Build Bot 2016-09-22 01:53:16 PDT
Created attachment 289541 [details]
Archive of layout-test-results from ews102 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews102  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 16 Caio Lima 2016-09-22 09:27:22 PDT
Created attachment 289565 [details]
Patch
Comment 17 Saam Barati 2016-09-24 15:55:22 PDT
Comment on attachment 289565 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=289565&action=review

Mostly LGTM, but I think the patch can become better.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3525
> +    // If there is no more GPR available to allocate, we get tagGPR from
> +    // result, right and left operands. It is possible because they are speculated
> +    // and generators know how to restore the tags after operations using these GPRs.

This seems sketchy and probably a bad API. If the 32-bit implementations already do this, why do they even need us to pass in a scratch? They should just do it and have it be opaque to the thing allocating the scratches.

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1559
> +        compileMathIC(addIC, 1, 2, repatchingFunction, nonRepatchingFunction);

Style: Please give these names as local variables and pass in the variables to the function.

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1598
> +                for (int i = 0; i < MATH_IC_MAX_GPR; i++) {

Don't use "int"

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1600
> +                        scratchGPRReg[i] = params.gpScratch(0);

This is wrong.

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1606
> +                for (int i = 0; i < MATH_IC_MAX_GPR; i++) {

Don't use "int"

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1748
> +            compileMathIC(subIC, 1, 2, repatchingFunction, nonRepatchingFunction);

Ditto w.r.t names as above.
Also, why does this need 2 scratch FPRs, but the DFG version needs 1. They should be consistent. However, reading this code, I think the numberOfGPRScratches/numberOfFPRScratches should just be static fields on the various Generator classes, and not on the creators of the ICs.
This seems like a cleaner approach because each place a MulIC (for example) is created, the creator doesn't have to always specify 1 and 1. The Generator will just know.

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1842
> +            compileMathIC(mulIC, 1, 2, repatchingFunction, nonRepatchingFunction);

Ditto w.r.t names and using Generator to dictate how many scratches are needed.

> Source/JavaScriptCore/jit/JITMathIC.h:55
> +#define MATH_IC_MAX_FPR 3

Which IC uses 3?
Comment 18 Caio Lima 2016-09-25 14:56:31 PDT
Comment on attachment 289565 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=289565&action=review

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3525
>> +    // and generators know how to restore the tags after operations using these GPRs.
> 
> This seems sketchy and probably a bad API. If the 32-bit implementations already do this, why do they even need us to pass in a scratch? They should just do it and have it be opaque to the thing allocating the scratches.

Nice point. I totally agree with that. The current implementation already reuses the result.tagGPR as scratchGPR. I am going to move it to Generators and remove this allocation here.

>> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1748
>> +            compileMathIC(subIC, 1, 2, repatchingFunction, nonRepatchingFunction);
> 
> Ditto w.r.t names as above.
> Also, why does this need 2 scratch FPRs, but the DFG version needs 1. They should be consistent. However, reading this code, I think the numberOfGPRScratches/numberOfFPRScratches should just be static fields on the various Generator classes, and not on the creators of the ICs.
> This seems like a cleaner approach because each place a MulIC (for example) is created, the creator doesn't have to always specify 1 and 1. The Generator will just know.

Actually, the correct semantic behavior is 0. If you check current implementation "patchpoint->numFPScratchRegisters = 2" because leftFPR and rightFPR are allocated using "patchpoint->numFPScratchRegisters". That was the reason that in my last Patch I was using "patchpoint->numFPScratchRegisters = 2 + numFPScratchRegisters;". I think that is better the way that I implemented before, because it is going to keep numFPScratchRegisters consistent between DFG and FTL layers.

>> Source/JavaScriptCore/jit/JITMathIC.h:55
>> +#define MATH_IC_MAX_FPR 3
> 
> Which IC uses 3?

It is because the MathIC for B3 allocates right, left and scratch FPRs using "patchpoint->numFPScratchRegisters = numFPScratchRegisters". For all generators scratchFPR are always InvalidFPRReg.
Comment 19 Caio Lima 2016-09-25 19:15:02 PDT
Created attachment 289792 [details]
Patch
Comment 20 Caio Lima 2016-09-25 20:06:46 PDT
Created attachment 289795 [details]
Patch
Comment 21 Caio Lima 2016-09-25 20:38:54 PDT
Created attachment 289796 [details]
Patch
Comment 22 Saam Barati 2016-10-03 18:16:10 PDT
Comment on attachment 289796 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=289796&action=review

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3434
> +    ASSERT(scratchFPRRegCount <= MATH_IC_MAX_GPR);
> +    ASSERT(scratchFPRRegCount <= MATH_IC_MAX_FPR);

Please make this a static_assert

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3512
> +    // Since registers on 32-bit architectures like x86_32 are scarce
> +    // we need to check if it is possible allocate new registers

It seems like all 32-bit snippets are able to use tags as their scratch, can we just remove this code?

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:1569
> +        ASSERT(numGPScratchRegisters <= MATH_IC_MAX_GPR);
> +        ASSERT(numFPScratchRegisters <= MATH_IC_MAX_FPR);

Please make this a static_assert

> Source/JavaScriptCore/jit/JITMathIC.h:56
> +#define MATH_IC_MAX_GPR 1
> +#define MATH_IC_MAX_FPR 1

I would make these static fields inside JITMathIC instead of a define.

> Source/JavaScriptCore/jit/JITMulGenerator.cpp:41
> +    if (m_scratchGPR == InvalidGPRReg)
> +        m_scratchGPR = m_result.tagGPR();

Have you made sure it's always valid to do this?
Comment 23 Caio Lima 2016-10-03 18:36:35 PDT
Comment on attachment 289796 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=289796&action=review

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3434
>> +    ASSERT(scratchFPRRegCount <= MATH_IC_MAX_FPR);
> 
> Please make this a static_assert

Ok

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3512
>> +    // we need to check if it is possible allocate new registers
> 
> It seems like all 32-bit snippets are able to use tags as their scratch, can we just remove this code?

My intention is to enable allocate scratches and avoid use tagGPRs to avoid instructions to restore the tags. Current Snnipets uses just one scratch that is mapped to result .tagGPR , but others like JITPowGenerator can use more. If scratches are already allocated, we don't need to use x/y.tagGPR and consequently, we don't need to restore their tags on snnipets.

>> Source/JavaScriptCore/jit/JITMathIC.h:56
>> +#define MATH_IC_MAX_FPR 1
> 
> I would make these static fields inside JITMathIC instead of a define.

Good. I am going to do that.

>> Source/JavaScriptCore/jit/JITMulGenerator.cpp:41
>> +        m_scratchGPR = m_result.tagGPR();
> 
> Have you made sure it's always valid to do this?

The current code always  use result.tagGPR as scratch on 32bits. So I suppose that it is valid. Also, AFIK all results assign are boxed, so the tag is going to be restored properly. I can double check that.
Comment 24 Caio Lima 2016-10-03 21:17:04 PDT
Created attachment 290564 [details]
Patch
Comment 25 Caio Lima 2016-10-10 09:02:24 PDT
Created attachment 291111 [details]
Patch
Comment 26 Saam Barati 2016-10-11 21:09:54 PDT
Comment on attachment 291111 [details]
Patch

Something about this patch doesn't sit right with me, and I think I've finally figured out what it is.
This API doesn't seem scalable to adding more math ICs that need more scratch registers. I think we need
to move to a more dynamic API, by either passing in a vector of scratch registers, or a RegisterSet of used registers.
The reason this code feels so weird to me is that you've started to make this API more dynamic, yet you're still calling
the exact same constructors. So it's only pretending to be more dynamic, but in reality, it's not.
Also, the code is just much harder to follow now. 
I'm not sure what the best API is. Here are two options:
1. If we pass in a RegisterSet of used registers, each snippet will now be responsible for allocating scratch
registers using the scratch register allocator.
2. Alternatively, we can keep most of your patch like it is now, but pass in a vector of scratch registers.
Each snippet constructor can deconstruct the vector into member variables and assert that none are InvalidGPR/FPRReg.
The vector can have an inlinecapacity of scratchGPRRegCount/scratchFPRRegCount and just be the vector of registers.

I think I prefer 2. Maybe you have other ideas?
Comment 27 Caio Lima 2016-10-21 17:18:33 PDT
(In reply to comment #26)
> Comment on attachment 291111 [details]
> Patch
> 
> Something about this patch doesn't sit right with me, and I think I've
> finally figured out what it is.
> This API doesn't seem scalable to adding more math ICs that need more
> scratch registers. I think we need
> to move to a more dynamic API, by either passing in a vector of scratch
> registers, or a RegisterSet of used registers.
> The reason this code feels so weird to me is that you've started to make
> this API more dynamic, yet you're still calling
> the exact same constructors. So it's only pretending to be more dynamic, but
> in reality, it's not.
> Also, the code is just much harder to follow now. 
> I'm not sure what the best API is. Here are two options:
> 1. If we pass in a RegisterSet of used registers, each snippet will now be
> responsible for allocating scratch
> registers using the scratch register allocator.
> 2. Alternatively, we can keep most of your patch like it is now, but pass in
> a vector of scratch registers.
> Each snippet constructor can deconstruct the vector into member variables
> and assert that none are InvalidGPR/FPRReg.
> The vector can have an inlinecapacity of
> scratchGPRRegCount/scratchFPRRegCount and just be the vector of registers.
> 
> I think I prefer 2. Maybe you have other ideas?

I remember discussing with you a long time ago that I proposed to you use a vector, however you gave me an answer that convinced me. The problem is that the vector is going to inflate all ICs, mainly the ones that use just 1 scratch.

If I understood your idea, we could use GPRReg allocatedRegs[scratchGPRRegCount] and pass it to JIT*Generator and the Generator just validate if they are invalid or not. This implementation is going to use sizeof(GPRReg * scratchGPRRegCount) memory, which means sizeof(GPRReg) if scratchGPRRegCount = 1. Is it right? If yes, I can't think in a better solution right now and the idea of Generators being responsible to allocate its scratches doesn't sound right IMO.
Comment 28 Caio Lima 2016-10-22 11:25:47 PDT
Created attachment 292502 [details]
Patch
Comment 29 WebKit Commit Bot 2016-10-22 11:28:22 PDT
Attachment 292502 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/jit/JITMulGenerator.h:53:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITMulGenerator.h:54:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:52:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:53:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITAddGenerator.h:53:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITAddGenerator.h:54:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
Total errors found: 6 in 12 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 30 Caio Lima 2016-11-06 11:44:16 PST
Created attachment 294030 [details]
Patch
Comment 31 WebKit Commit Bot 2016-11-06 11:46:03 PST
Attachment 294030 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/jit/JITMulGenerator.h:53:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITMulGenerator.h:54:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:52:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:53:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITAddGenerator.h:53:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITAddGenerator.h:54:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
Total errors found: 6 in 12 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 32 Saam Barati 2016-11-06 14:49:12 PST
Comment on attachment 294030 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=294030&action=review

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3563
> +    GPRReg scratchGPR[Generator::scratchGPRRegCount];
> +    FPRReg scratchFPR[Generator::scratchFPRRegCount];

I'm more of a fan of using Vector<> with an inline capacity for this API. You'll still get the same property where you don't end up mallocing any memory.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3600
> +    for (size_t i = 0; i < std::min<unsigned>(availableGPRs, scratchGPRRegCount); i++) {

It seems wrong if the min() expression ever returns availableGPRs instead of scratchGPRRegCount. I would just turn this into an assert that we have enough registers.

> Source/JavaScriptCore/jit/JITMulGenerator.h:45
> +        FPRReg leftFPR, FPRReg rightFPR, GPRReg *scratchGPRs, FPRReg *scratchFPRs)

See my comment about using Vector here instead. You could even templatize it over Vector type to not have to worry about what the inline capacity of it is.
Comment 33 Caio Lima 2016-11-06 15:19:10 PST
Comment on attachment 294030 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=294030&action=review

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3563
>> +    FPRReg scratchFPR[Generator::scratchFPRRegCount];
> 
> I'm more of a fan of using Vector<> with an inline capacity for this API. You'll still get the same property where you don't end up mallocing any memory.

Sorry for my ignorance here. I used Vector with inline capacity in another Patch this weekend, however I didn't totally understand it .Do you mind explain me what is the difference of use a Vector<int>, for example?

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3600
>> +    for (size_t i = 0; i < std::min<unsigned>(availableGPRs, scratchGPRRegCount); i++) {
> 
> It seems wrong if the min() expression ever returns availableGPRs instead of scratchGPRRegCount. I would just turn this into an assert that we have enough registers.

hum...We can reproduce it on x86_32, for sure, because old implementation uses result.tagGPR as scratchGPR because of limited registers. I am going to double check it.
Comment 34 Saam Barati 2016-11-06 15:45:28 PST
(In reply to comment #33)
> Comment on attachment 294030 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=294030&action=review
> 
> >> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3563
> >> +    FPRReg scratchFPR[Generator::scratchFPRRegCount];
> > 
> > I'm more of a fan of using Vector<> with an inline capacity for this API. You'll still get the same property where you don't end up mallocing any memory.
> 
> Sorry for my ignorance here. I used Vector with inline capacity in another
> Patch this weekend, however I didn't totally understand it .Do you mind
> explain me what is the difference of use a Vector<int>, for example?
Using a Vector with an inline capacity as a stack allocated variable will prevent it from mallocing a backing store until you've exceeded InlineCapacity number of elements. You should read the entire vector class to get a better idea of what I'm saying, but at a high level, it has a field like this:

template <typename T, size_t InlineCapacity>
class Vector {
     T m_inlineVector[InlineCapacity];
}

> 
> >> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3600
> >> +    for (size_t i = 0; i < std::min<unsigned>(availableGPRs, scratchGPRRegCount); i++) {
> > 
> > It seems wrong if the min() expression ever returns availableGPRs instead of scratchGPRRegCount. I would just turn this into an assert that we have enough registers.
> 
> hum...We can reproduce it on x86_32, for sure, because old implementation
> uses result.tagGPR as scratchGPR because of limited registers. I am going to
> double check it.
Comment 35 Caio Lima 2016-12-01 14:19:17 PST
Created attachment 295895 [details]
Patch
Comment 36 WebKit Commit Bot 2016-12-01 14:21:30 PST
Attachment 295895 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/jit/JITMulGenerator.h:60:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:59:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITAddGenerator.h:60:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
Total errors found: 3 in 12 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 37 Caio Lima 2016-12-28 18:06:35 PST
Created attachment 297817 [details]
Patch
Comment 38 Caio Lima 2016-12-28 18:08:27 PST
(In reply to comment #37)
> Created attachment 297817 [details]
> Patch

Rebased with master and created guards for GPR/FPR Scratch Count = 0.

Let's see what EWS thinks about this Patch.
Comment 39 Build Bot 2016-12-28 19:20:23 PST
Comment on attachment 297817 [details]
Patch

Attachment 297817 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/2800668

New failing tests:
svg/in-html/sizing/svg-inline.html
webgl/1.0.2/conformance/uniforms/out-of-bounds-uniform-array-access.html
Comment 40 Build Bot 2016-12-28 19:20:29 PST
Created attachment 297818 [details]
Archive of layout-test-results from ews112 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews112  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 41 Caio Lima 2016-12-28 20:07:18 PST
Created attachment 297820 [details]
Patch
Comment 42 WebKit Commit Bot 2016-12-28 20:09:46 PST
Attachment 297820 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/jit/JITMulGenerator.h:60:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:59:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITAddGenerator.h:60:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
Total errors found: 3 in 11 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 43 Saam Barati 2017-01-02 16:42:20 PST
Comment on attachment 297820 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=297820&action=review

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3572
> +        } else 
> +            scratchGPR[i] = InvalidGPRReg;

Is this side of the branch ever taken? It seems like we should almost assert against it, and if a generator knows how to handle using the tags, it should just *always* do that and report that it needs fewer registers.

> Source/JavaScriptCore/jit/JITAddGenerator.h:52
> +        FPRReg leftFPR, FPRReg rightFPR, Vector<GPRReg, scratchGPRRegCount> scratchGPRs, Vector<FPRReg, scratchFPRRegCount> scratchFPRs)

You shouldn't be copying these vectors for these constructors. I'd use "const Vector<...>&" as the type here.

> Source/JavaScriptCore/jit/JITArithmetic.cpp:777
> +        else
> +            scratchGPR[i] = InvalidGPRReg;

Ditto as the DFG code that does this.

> Source/JavaScriptCore/jit/JITMulGenerator.cpp:41
> +    // Maybe it was not possible allocate scratchGPR. In this case, we use result.tagGPR
> +    if (m_scratchGPR == InvalidGPRReg)
> +        m_scratchGPR = m_result.tagGPR();

Ditto, we should just always use tag if we're 32-bit and we know we need a scratch and that the code is valid when using the tag as a scratch.

> Source/JavaScriptCore/jit/JITMulGenerator.h:52
> +        FPRReg leftFPR, FPRReg rightFPR, Vector<GPRReg, scratchGPRRegCount> scratchGPRs, Vector<FPRReg, scratchFPRRegCount> scratchFPRs)

Ditto about copying the vectors.

> Source/JavaScriptCore/jit/JITSubGenerator.h:51
> +        FPRReg leftFPR, FPRReg rightFPR, Vector<GPRReg, scratchGPRRegCount> scratchGPRs, Vector<FPRReg, scratchFPRRegCount> scratchFPRs)

Ditto about copying the vectors.
Comment 44 Caio Lima 2017-01-02 20:42:13 PST
Comment on attachment 297820 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=297820&action=review

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3572
>> +            scratchGPR[i] = InvalidGPRReg;
> 
> Is this side of the branch ever taken? It seems like we should almost assert against it, and if a generator knows how to handle using the tags, it should just *always* do that and report that it needs fewer registers.

The problem here actually isn't the DG version, but the Baseline version on JITArithmetic.cpp. Take a look at https://github.com/caiolima/webkit/blob/math-ic-scratch/Source/JavaScriptCore/jit/JITArithmetic.cpp#L761
In this case, resultRegs = leftRegs, which makes impossible use resultRegs.tagGPR in that case. This way, we really need this mechanism to make sure that Api works across DFG and Baseline.
Comment 45 Caio Lima 2017-01-02 20:42:21 PST
View in context: https://bugs.webkit.org/attachment.cgi?id=297820&action=review

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:3572
>> +            scratchGPR[i] = InvalidGPRReg;
> 
> Is this side of the branch ever taken? It seems like we should almost assert against it, and if a generator knows how to handle using the tags, it should just *always* do that and report that it needs fewer registers.

The problem here actually isn't the DG version, but the Baseline version on JITArithmetic.cpp. Take a look at https://github.com/caiolima/webkit/blob/math-ic-scratch/Source/JavaScriptCore/jit/JITArithmetic.cpp#L761
In this case, resultRegs = leftRegs, which makes impossible use resultRegs.tagGPR in that case. This way, we really need this mechanism to make sure that Api works across DFG and Baseline.
Comment 46 Caio Lima 2017-01-02 20:45:25 PST
Created attachment 297922 [details]
Patch
Comment 47 Caio Lima 2017-01-18 15:17:38 PST
ping review
Comment 48 zooz hada 2018-04-02 16:50:43 PDT
تصحيح
Comment 49 Michael Catanzaro 2018-04-09 17:42:33 PDT
Thanks for bumping this, mysterious spammer. Caio probably still wants a review!