Bug 188402 - [WHLSL] Local variables should be statically allocated
Summary: [WHLSL] Local variables should be statically allocated
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebGPU (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Thomas Denney
URL:
Keywords: InRadar
: 189107 (view as bug list)
Depends on: 188641 189326
Blocks: 176199 189202
  Show dependency treegraph
 
Reported: 2018-08-07 20:03 PDT by Thomas Denney
Modified: 2018-10-13 15:40 PDT (History)
8 users (show)

See Also:


Attachments
WIP (137.98 KB, patch)
2018-09-05 17:49 PDT, Thomas Denney
no flags Details | Formatted Diff | Diff
WIP (165.36 KB, patch)
2018-09-10 17:10 PDT, Thomas Denney
no flags Details | Formatted Diff | Diff
Patch (28.29 KB, patch)
2018-09-10 20:16 PDT, Thomas Denney
no flags Details | Formatted Diff | Diff
Patch (30.28 KB, patch)
2018-09-19 00:24 PDT, Thomas Denney
no flags Details | Formatted Diff | Diff
Patch (30.20 KB, patch)
2018-09-19 23:33 PDT, Thomas Denney
mmaxfield: review+
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (30.98 KB, patch)
2018-09-21 14:02 PDT, Thomas Denney
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (30.98 KB, patch)
2018-09-21 14:07 PDT, Thomas Denney
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Denney 2018-08-07 20:03:55 PDT
The interpreter conforms to the spec; the Metal compiler doesn't (necessarily). For example, a call to foo() should return 1:

    thread int* bar(bool flag)
    {
        int x = 0;
        if (flag)
            x = 1;
        return &x;
    }

    int foo()
    {
        thread int* x = bar(false);
        thread int* y = bar(true);
        return (*x) * (*y);
    }

The interpreter gets this right; when a VariableDecl is visited in Evaluator.js it only allocates a buffer for that variable if there wasn't one already. The compiler doesn't get it right; local variables are emitted in local scope and therefore the Metal compiler *can* choose to store them however it likes. They could be independent, or another function bar2() could alias its local variables with bar() if bar and bar2 are never both called.

Metal Shading Language only permits constant variables to be static if they are declared in global scope. It doesn't permit statically declared variables in local scope.

When functions are inlined this isn't a problem; all variables can be "statically" allocated by having them as local variables declared at the top of a shader function. However, not all functions can be (efficiently) inlined in the MSL output if they don't have reducible control flow. An inefficient solution would "statically" allocate all variables in the main shading functions and pass references to them to non-inlined functions. This awful approach could be mitigated by conservatively finding local variables that references are never created for, but it is far from ideal.

It is worth noting that successive executions of the same function cannot rely on previous values stored at the same local variable because local variables are *always* zero-initialized when they are declared. The compiler and the interpreter behave correctly on the following program:

    thread int* bar(bool flag)
    {
        int x; // x is zero initialized twice
        if (flag)
            x = 1;
        return &x;
    }

    int foo()
    {
        thread int* x = bar(true);
        thread int* y = bar(false);
        return (*x) * (*y);
    }

The result of foo() should be 0.

Given that local variable values do not persist between calls, the largest benefit of statically allocating local variables is that references to local variables can be returned from a function, or passed out via "out" parameters. Many programming languages do not permit this, so a possible mitigation would be to disallow it.
Comment 1 Thomas Denney 2018-09-05 17:49:45 PDT
Created attachment 348987 [details]
WIP
Comment 2 Thomas Denney 2018-09-05 17:50:56 PDT
(In reply to Thomas Denney from comment #1)
> Created attachment 348987 [details]
> WIP

This patch doesn’t yet support function arguments having their address taken, and I haven’t done any work on array references yet. It also contains my modified version of the standard library for faster parsing, so this is very much WIP.
Comment 3 Thomas Denney 2018-09-10 17:10:03 PDT
Created attachment 349358 [details]
WIP
Comment 4 Thomas Denney 2018-09-10 17:11:10 PDT
(In reply to Thomas Denney from comment #3)
> Created attachment 349358 [details]
> WIP

This most recent patch is basically complete, but I’m going to wait on the two dependencies of this bug to be resolved before I put this up for review.
Comment 5 Thomas Denney 2018-09-10 20:16:11 PDT
Created attachment 349372 [details]
Patch
Comment 6 Radar WebKit Bug Importer 2018-09-12 18:51:52 PDT
<rdar://problem/44403028>
Comment 7 Thomas Denney 2018-09-19 00:24:12 PDT
Created attachment 350096 [details]
Patch
Comment 8 Myles C. Maxfield 2018-09-19 14:25:39 PDT
Comment on attachment 350096 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=350096&action=review

Cool patch.

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:29
> +    const entryPoints = [];

This is a fairly self-contained block of code. I'd recommend moving it to its own function.

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:30
> +    class OnlyVisitFuncDefsThatAreEntryPoints extends Visitor {

How about "gatherEntryPointDefs" since visitors visit everything?

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:41
> +    const allVariablesAndFunctionParameters = new Set();
> +    const functionsThatAreCalledByEntryPoints = new Set();
> +    class FindAllVariablesAndFunctionParameters extends Visitor {

This is a fairly self-contained block of code. I'd recommend moving it to its own function.

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:53
> +                node.func.visit(new FindAllVariablesAndFunctionParameters(node.func));

Doesn't this have exponential runtime because it doesn't dedup functions?

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:59
> +            node._func = this._currentFunc;

A more descriptive name, please

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:68
> +            node._func = this._currentFunc;
> +            if (!this._currentFunc.isEntryPoint)
> +                allVariablesAndFunctionParameters.add(node);

Why does visitVariableDecl() have a super call but visitFuncParameter not?

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:83
> +    program.add(ptrToGlobalStructType);

This doesn't seem right, because the parser will never put a PtrType at the global level, so we probably shouldn't do that either.

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:100
> +    let counter = 0;
> +    const varToFieldMap = new Map();
> +    for (let varOrParam of allVariablesAndFunctionParameters) {
> +        const fieldName = `field${counter++}_${varOrParam._func.name}_${varOrParam.name}`;
> +        globalStructType.add(new Field(varOrParam.origin, fieldName, varOrParam.type));
> +        varToFieldMap.set(varOrParam, fieldName);
> +    }
> +
> +    for (let func of functionsThatAreCalledByEntryPoints) {
> +        if (func.returnType.name !== "void") {
> +            const fieldName = `field${counter++}_return_${func.name}`;
> +            globalStructType.add(new Field(func.origin, fieldName, func.returnType));
> +            func.returnFieldName = fieldName;
> +        }
> +    }

This is a fairly self-contained block of code. I'd recommend moving it to its own function.

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:139
> +            get func()
> +            {
> +                return this._func;
> +            }

Not sure if this is necessary if all callers are local to the class.

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:157
> +                const possibleAndOverloads = program.globalNameContext.get(Func, functionName);
> +                const callExpressionResolution = CallExpression.resolve(node.origin, possibleAndOverloads, functionName, [ this.globalStructVariableRef ], [ ptrToGlobalStructTypeRef ]);

It's kind of unfortunate we have to reverse-engineer what would have happened earlier in the compiler. Can we run this stage earlier so we don't have to do this?

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:171
> +                    return super.visitVariableRef(node);

Isn't this an error?

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:180
> +                    return new Assignment(node.origin, this._dereferencedCallExpressionForFieldName(node, node.type, varToFieldMap.get(node)), node.initializer.visit(this), node.type);

Nodes need to get assigned (zero-filled) even if they don't have an initializer. We should add a test for this.

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:183
> +                else if (node == this.variableDecl)
> +                    return node;

I'd move this first

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:199
> +                        const anonymousVariable = new AnonymousVariable(node.origin, type);

What is the purpose of the anonymous variables? Why not assign directly into the global struct?

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:208
> +                        exprs.push(this._dereferencedCallExpressionForFieldName(node.func, node.func.returnType, node.func.returnFieldName));

Neat.

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:210
> +                    node.argumentList = [ this.globalStructVariableRef ];

Are you sure it's wise for them all to be using the exact same VariableRef? Seems like we should store a creation lambda instead of the raw variable itself.

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:220
> +                if (node.value && this._func.returnFieldName)

If these don't match, seems like this should be an error.

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:223
> +                return new CommaExpression(node.origin, [
> +                    new Assignment(node.origin, this._dereferencedCallExpressionForFieldName(this._func, this._func.returnType, this._func.returnFieldName), node.value, this._func.returnType),
> +                    new Return(node.origin) ]);

Indentation

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:240
> +            if (node._newParameters)

This pollutes the FuncDef nodes. I'd prefer a side-table.

> Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:250
> +            if (node.func.returnFieldName)
> +                node._returnType = node.resultType = TypeRef.wrap(program.types.get("void"));

Cool.

> Tools/WebGPUShadingLanguageRI/EBufferBuilder.js:-33
> -    constructor(program)
> -    {
> -        super();
> -        this._program = program;
> -    }
> -    

What? What's the point of this class if you can never construct it? I don't see any other constructors or static functions.

> Tools/WebGPUShadingLanguageRI/Func.js:57
> +    set parameters(newValue)

Not a great variable name.

> Tools/WebGPUShadingLanguageRI/SynthesizeStructAccessors.js:51
> +        function createFieldType()
> +        {
> +            return field.type.visit(new Rewriter());
> +        }
> +
> +        function createTypeRef()
> +        {
> +            return TypeRef.wrap(type);
> +        }

Do we need these?

> Tools/WebGPUShadingLanguageRI/SynthesizeStructAccessors.js:101
>              nativeFunc = new NativeFunc(
> -                field.origin, "operator." + field.name + "=", createTypeRef(),
> +                field.origin, "operator&." + field.name, new PtrType(field.origin, addressSpace, createFieldType()),
>                  [
> -                    new FuncParameter(field.origin, null, createTypeRef()),
> -                    new FuncParameter(field.origin, null, createFieldType())
> +                    new FuncParameter(
> +                        field.origin, null,
> +                        new PtrType(field.origin, addressSpace, createTypeRef()))
>                  ],
>                  isCast, shaderType);
> -            setupImplementationData(nativeFunc, ([base, value], offset, structSize, fieldSize) => {
> -                let result = new EPtr(new EBuffer(structSize), 0);
> -                result.copyFrom(base, structSize);
> -                result.plus(offset).copyFrom(value, fieldSize);
> -                return result;
> +            setupImplementationData(nativeFunc, ([base], offset, structSize, fieldSize) => {
> +                base = base.loadValue();
> +                if (!base)
> +                    throw new WTrapError(field.origin.originString, "Null dereference");
> +                return EPtr.box(base.plus(offset));
>              });
>              program.add(nativeFunc);

diff really made a mess of things, didn't it
Comment 9 Thomas Denney 2018-09-19 19:49:41 PDT
(In reply to Myles C. Maxfield from comment #8)
> > Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:53
> > +                node.func.visit(new FindAllVariablesAndFunctionParameters(node.func));
> 
> Doesn't this have exponential runtime because it doesn't dedup functions?

Damn, good catch.

> > Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:157
> > +                const possibleAndOverloads = program.globalNameContext.get(Func, functionName);
> > +                const callExpressionResolution = CallExpression.resolve(node.origin, possibleAndOverloads, functionName, [ this.globalStructVariableRef ], [ ptrToGlobalStructTypeRef ]);
> 
> It's kind of unfortunate we have to reverse-engineer what would have
> happened earlier in the compiler. Can we run this stage earlier so we don't
> have to do this?
>

Annoyingly we need types for this stage, which are only fully annotated in the Checker stage. An earlier version of this patch tried doing this allocation but I couldn’t get it working reliably.
 
> > Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:171
> > +                    return super.visitVariableRef(node);
> 
> Isn't this an error?

No, anonymous variables can be wrapped in VariableRefs.

> > Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:199
> > +                        const anonymousVariable = new AnonymousVariable(node.origin, type);
> 
> What is the purpose of the anonymous variables? Why not assign directly into
> the global struct?

I’m going to add a comment into the code explaining why not, because it has now caught me out several times. Consider the case foo(foo(a, b), c). We initially evaluate c, and then evaluate foo(a, b) per the RTL calling convention. To evaluate foo(a, b) we evaluate b, then a, and place them in the global struct for the call to foo. However, this would mean that if c had previously been placed in the global struct then the outer foo wouldn’t see the value of evaluating c, but the value of evaluating b. Therefore *all* the arguments have to be evaluated into anonymous variables, then copied into the global struct, and then the function has to be called. The existing Metal code generator (MSLStatementEmitter.visitCallExpression) and interpreter (Evaluator._evaluateArguments) both respect this behavior and there are tests that catch this.

> > Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:210
> > +                    node.argumentList = [ this.globalStructVariableRef ];
> 
> Are you sure it's wise for them all to be using the exact same VariableRef?
> Seems like we should store a creation lambda instead of the raw variable
> itself.
>

There’s nothing in the compiler/interpreter at the moment that memoizes the evaluation or compilation of a node, so it would always be re-evaluated/re-compiled wherever it occurs, but this change seems harmless.
 
> > Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js:240
> > +            if (node._newParameters)
> 
> This pollutes the FuncDef nodes. I'd prefer a side-table.

Cool, will do.

> > Tools/WebGPUShadingLanguageRI/EBufferBuilder.js:-33
> > -    constructor(program)
> > -    {
> > -        super();
> > -        this._program = program;
> > -    }
> > -    
> 
> What? What's the point of this class if you can never construct it? I don't
> see any other constructors or static functions.
> 

There is a default constructor (equivalent to constructor () { super(); }), and there are no construction sites that actually passed in the program object any more, nor is this._program used anywhere in the class.

> > Tools/WebGPUShadingLanguageRI/SynthesizeStructAccessors.js:51
> > +        function createFieldType()
> > +        {
> > +            return field.type.visit(new Rewriter());
> > +        }
> > +
> > +        function createTypeRef()
> > +        {
> > +            return TypeRef.wrap(type);
> > +        }
> 
> Do we need these?

createFieldType() isn’t necessary (it works fine to continue using field.type) and createTypeRef is literally just used as a utility function 3 times (as you noticed, diff had a bad time with this file — I didn’t write these functions). I’ll get rid of them.
Comment 10 Thomas Denney 2018-09-19 23:33:31 PDT
Created attachment 350175 [details]
Patch
Comment 11 WebKit Commit Bot 2018-09-20 19:10:04 PDT
Comment on attachment 350175 [details]
Patch

Rejecting attachment 350175 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-02', 'land-attachment', '--force-clean', '--non-interactive', '--parent-command=commit-queue', 350175, '--port=mac']" exit_code: 2 cwd: /Volumes/Data/EWS/WebKit

Logging in as commit-queue@webkit.org...
Fetching: https://bugs.webkit.org/attachment.cgi?id=350175&action=edit
Fetching: https://bugs.webkit.org/show_bug.cgi?id=188402&ctype=xml&excludefield=attachmentdata
Processing 1 patch from 1 bug.
Updating working directory
Processing patch 350175 from bug 188402.
Fetching: https://bugs.webkit.org/attachment.cgi?id=350175
Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force', '--reviewer', u'Myles C. Maxfield']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

Parsed 14 diffs from patch file(s).
patching file Tools/ChangeLog
Hunk #1 succeeded at 1 with fuzz 3.
patching file Tools/WebGPUShadingLanguageRI/All.js
patching file Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js
patching file Tools/WebGPUShadingLanguageRI/CallExpression.js
patching file Tools/WebGPUShadingLanguageRI/EBufferBuilder.js
patching file Tools/WebGPUShadingLanguageRI/Func.js
patching file Tools/WebGPUShadingLanguageRI/FuncDef.js
patching file Tools/WebGPUShadingLanguageRI/Prepare.js
patching file Tools/WebGPUShadingLanguageRI/Rewriter.js
patching file Tools/WebGPUShadingLanguageRI/SPIRV.html
patching file Tools/WebGPUShadingLanguageRI/SynthesizeStructAccessors.js
Hunk #1 FAILED at 20.
1 out of 1 hunk FAILED -- saving rejects to file Tools/WebGPUShadingLanguageRI/SynthesizeStructAccessors.js.rej
patching file Tools/WebGPUShadingLanguageRI/Test.html
patching file Tools/WebGPUShadingLanguageRI/Test.js
Hunk #1 succeeded at 8081 (offset 164 lines).
patching file Tools/WebGPUShadingLanguageRI/index.html

Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force', '--reviewer', u'Myles C. Maxfield']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

Parsed 14 diffs from patch file(s).
patching file Tools/ChangeLog
Hunk #1 succeeded at 1 with fuzz 3.
patching file Tools/WebGPUShadingLanguageRI/All.js
patching file Tools/WebGPUShadingLanguageRI/AllocateAtEntryPoints.js
patching file Tools/WebGPUShadingLanguageRI/CallExpression.js
patching file Tools/WebGPUShadingLanguageRI/EBufferBuilder.js
patching file Tools/WebGPUShadingLanguageRI/Func.js
patching file Tools/WebGPUShadingLanguageRI/FuncDef.js
patching file Tools/WebGPUShadingLanguageRI/Prepare.js
patching file Tools/WebGPUShadingLanguageRI/Rewriter.js
patching file Tools/WebGPUShadingLanguageRI/SPIRV.html
patching file Tools/WebGPUShadingLanguageRI/SynthesizeStructAccessors.js
Hunk #1 FAILED at 20.
1 out of 1 hunk FAILED -- saving rejects to file Tools/WebGPUShadingLanguageRI/SynthesizeStructAccessors.js.rej
patching file Tools/WebGPUShadingLanguageRI/Test.html
patching file Tools/WebGPUShadingLanguageRI/Test.js
Hunk #1 succeeded at 8081 (offset 164 lines).
patching file Tools/WebGPUShadingLanguageRI/index.html

Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force', '--reviewer', u'Myles C. Maxfield']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit
Updating OpenSource
From https://git.webkit.org/git/WebKit
   2a2836e6631..dee36913aef  master     -> origin/master
Partial-rebuilding .git/svn/refs/remotes/origin/master/.rev_map.268f45cc-cd09-0410-ab3c-d52691b4dbfc ...
Currently at 236296 = 2a2836e6631fd50250fbec7774a49ee368daa97b
r236297 = 560dda40a46c0fea73db1cb6365debaee9273c3a
r236298 = 9ff9defcd0e9c7e3712bd2f28cc498e9e99f7902
r236299 = dee36913aefb932eb3d82e2b3510193dac0212ff
Done rebuilding .git/svn/refs/remotes/origin/master/.rev_map.268f45cc-cd09-0410-ab3c-d52691b4dbfc
First, rewinding head to replay your work on top of it...
Fast-forwarded master to refs/remotes/origin/master.

Full output: https://webkit-queues.webkit.org/results/9290392
Comment 12 EWS 2018-09-20 23:55:22 PDT
Comment on attachment 350175 [details]
Patch

Rejecting attachment 350175 [details] from commit-queue.

tdenney@apple.com does not have committer permissions according to https://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/config/contributors.json.

- If you do not have committer rights please read http://webkit.org/coding/contributing.html for instructions on how to use bugzilla flags.

- If you have committer rights please correct the error in Tools/Scripts/webkitpy/common/config/contributors.json by adding yourself to the file (no review needed).  The commit-queue restarts itself every 2 hours.  After restart the commit-queue will correctly respect your committer rights.
Comment 13 Thomas Denney 2018-09-21 14:02:29 PDT
Created attachment 350419 [details]
Patch
Comment 14 EWS 2018-09-21 14:03:16 PDT
Comment on attachment 350419 [details]
Patch

Rejecting attachment 350419 [details] from commit-queue.

tdenney@apple.com does not have committer permissions according to https://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/config/contributors.json.

- If you do not have committer rights please read http://webkit.org/coding/contributing.html for instructions on how to use bugzilla flags.

- If you have committer rights please correct the error in Tools/Scripts/webkitpy/common/config/contributors.json by adding yourself to the file (no review needed).  The commit-queue restarts itself every 2 hours.  After restart the commit-queue will correctly respect your committer rights.
Comment 15 Thomas Denney 2018-09-21 14:07:04 PDT
Created attachment 350421 [details]
Patch
Comment 16 WebKit Commit Bot 2018-09-21 14:46:10 PDT
Comment on attachment 350421 [details]
Patch

Clearing flags on attachment: 350421

Committed r236361: <https://trac.webkit.org/changeset/236361>
Comment 17 Myles C. Maxfield 2018-09-22 03:03:00 PDT
*** Bug 189107 has been marked as a duplicate of this bug. ***
Comment 18 Myles C. Maxfield 2018-10-13 15:40:01 PDT
Migrated to https://github.com/gpuweb/WHLSL/issues/105