Bug 160681 - Implement WASM Parser and B3 IR generator
Summary: Implement WASM Parser and B3 IR generator
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Keith Miller
URL:
Keywords:
Depends on:
Blocks: 159775
  Show dependency treegraph
 
Reported: 2016-08-08 17:23 PDT by Keith Miller
Modified: 2016-08-15 14:36 PDT (History)
10 users (show)

See Also:


Attachments
WIP (30.44 KB, patch)
2016-08-08 17:24 PDT, Keith Miller
no flags Details | Formatted Diff | Diff
WIP (54.25 KB, patch)
2016-08-10 17:13 PDT, Keith Miller
no flags Details | Formatted Diff | Diff
Patch (73.59 KB, patch)
2016-08-11 22:49 PDT, Keith Miller
no flags Details | Formatted Diff | Diff
Patch (73.41 KB, patch)
2016-08-15 09:24 PDT, Keith Miller
no flags Details | Formatted Diff | Diff
Patch (73.43 KB, patch)
2016-08-15 09:29 PDT, Keith Miller
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Keith Miller 2016-08-08 17:23:37 PDT
Implement WASM Parser and B3 IR generator
Comment 1 Keith Miller 2016-08-08 17:24:14 PDT
Created attachment 285616 [details]
WIP
Comment 2 Keith Miller 2016-08-10 17:13:10 PDT
Created attachment 285787 [details]
WIP
Comment 3 Keith Miller 2016-08-11 22:49:31 PDT
Created attachment 285898 [details]
Patch
Comment 4 WebKit Commit Bot 2016-08-11 22:50:49 PDT
Attachment 285898 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/wasm/WASMFunctionParser.h:118:  Non-label code inside switch statements should be indented.  [whitespace/indent] [4]
Total errors found: 1 in 19 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 5 Benjamin Poulain 2016-08-12 17:40:31 PDT
Comment on attachment 285898 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=285898&action=review

Quick comments:

> Source/JavaScriptCore/wasm/WASMModuleParser.h:50
> +    Vector<WASMFunctionInformation> functionInformation() { return m_functions; }

Copying a vector can be quite expensive.

"Vector<WASMFunctionInformation>" -> "const Vector<WASMFunctionInformation>&"?

> Source/JavaScriptCore/wasm/WASMParser.h:56
> +    bool WARN_UNUSED_RETURN matchCharacter(char);
> +    bool WARN_UNUSED_RETURN matchString(const char*);
> +
> +    bool WARN_UNUSED_RETURN parseVarUInt1(uint8_t& result);
> +    bool WARN_UNUSED_RETURN parseUInt7(uint8_t& result);
> +    bool WARN_UNUSED_RETURN parseUInt32(uint32_t& result);
> +    bool WARN_UNUSED_RETURN parseVarUInt32(uint32_t& result) { return decodeUInt32(m_source.data(), m_sourceLength, m_offset, result); }
> +
> +
> +    bool WARN_UNUSED_RETURN parseValueType(WASMValueType& result);

Have you considered returning Optional<type> instead of having each parse function a bool parse(type&)?

I have not yet checked the call sites to see if that would help...

> Source/JavaScriptCore/wasm/WASMPlan.cpp:55
> +    for (WASMFunctionInformation info : moduleParser.functionInformation()) {

const WASMFunctionInformation& info?

> Source/JavaScriptCore/wasm/WASMSections.cpp:56
> +    // I guess we could do something better than linear search here but I doubt it matters.

Let's remove the comment.
We can use gperf here if needed.

> Source/JavaScriptCore/wasm/WASMSections.h:39
> +macro(FunctionTypes, "type") \
> +macro(Signatures, "function") \
> +macro(Definitions, "code") \
> +macro(End, "end")

Indent them?

> Source/WTF/wtf/LEBDecoder.h:37
> +inline bool WARN_UNUSED_RETURN decodeUInt32(const uint8_t* bytes, const size_t& length, size_t& offset, uint32_t& result)

const size_t& -> size_t (since size_t is a single register already).

> Source/WTF/wtf/LEBDecoder.h:56
> -inline bool WARN_UNUSED_RETURN decodeInt32(const Vector<uint8_t>& bytes, size_t& offset, int32_t& result)
> +inline bool WARN_UNUSED_RETURN decodeInt32(const uint8_t* bytes, const size_t& length, size_t& offset, int32_t& result)

const size_t& -> size_t
Comment 6 Benjamin Poulain 2016-08-12 19:14:14 PDT
Comment on attachment 285898 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=285898&action=review

Some more comments:

> Source/JavaScriptCore/ChangeLog:50
> +        * wasm/WASMB3ParserContext.cpp: Added.

WASMB3ParserContext.h/WASMB3ParserContext.cpp are misnamed IMHO.
The ParserContext is an implementation detail.

> Source/JavaScriptCore/wasm/WASMB3ParserContext.cpp:85
> +
> +
> +

Empty lines here.

> Source/JavaScriptCore/wasm/WASMB3ParserContext.cpp:90
> +        result = m_currentBlock->appendNew<Value>(m_proc, Add, Origin(), left, right);

It would be easy to check the input type here.
You can return false if left->type() or right->type() are not Int32.

> Source/JavaScriptCore/wasm/WASMB3ParserContext.cpp:115
> +    m_controlStack.append(std::make_pair(m_proc.addBlock(), Nullopt));

Is it correct to create a new basic block here?

If I understand correctly, you cannot jump to the beginning of a block, you can only jump out of a block. The block split point would be every branch and the end of the block if there are any branch.

Say we have:
    foo;
    block {
        bar;
    }
    baz;
That's one basic block.

Now:
    foo;
    block {
       br_if
       bar
    }
    baz;
has basic blocks at foo, bar, and baz.

I suspect it works now in test because m_currentBlock is not changed and the values in the blocks are added to block#0.

> Source/JavaScriptCore/wasm/WASMB3ParserContext.cpp:133
> +bool B3ParserContext::addReturn(Vector<ExpressionType, 1>&& returnValues)

You could take a Vector& as argument to avoid copying the inline buffer.

> Source/JavaScriptCore/wasm/WASMFunctionParser.h:41
> +    typedef typename Context::BlockType BlockType;

BlockType -> BasicBlockType?

"Block" is an ambiguous concept that means different things at different places.
BasicBlock has a precise definition.

> Source/JavaScriptCore/wasm/WASMFunctionParser.h:62
> +WASMFunctionParser<Context>::WASMFunctionParser(Context& context, const Vector<uint8_t>& sourceBuffer, WASMFunctionInformation info)

const WASMFunctionInformation&
?

> Source/JavaScriptCore/wasm/WASMFunctionParser.h:67
> +    ASSERT(info.end <= sourceBuffer.size());
> +    ASSERT(info.start < info.end);

This looks like condition that could be always enforced by WASMParser.

> Source/JavaScriptCore/wasm/WASMFunctionParser.h:141
> +    case WASMOpType::Block: {
> +        if (!m_context.addBlock())
> +            return false;
> +        return parseBlock();
> +    }

No need for brackets here.

> Source/JavaScriptCore/wasm/WASMFunctionParser.h:143
> +        case WASMOpType::Return: {

The indentation of this case is wrong.

> Source/JavaScriptCore/wasm/WASMFunctionParser.h:147
> +            ASSERT(returnCount <= 1);

I think it's fine to assume parseVarUInt1() works.
You can even have it return a bool if you want the compiler to enforce 0/1 values.

> Source/JavaScriptCore/wasm/WASMFunctionParser.h:157
> +    case WASMOpType::End: {
> +        return m_context.endBlock(m_expressionStack);
> +    }

No need for brackets here.

> Source/JavaScriptCore/wasm/WASMParser.h:71
> +ALWAYS_INLINE bool WASMParser::matchCharacter(char c)

I find the name a bit odd because you increment the offset.
Would it be better to name this "consumeCharacter()" or something similar?
Comment 7 Benjamin Poulain 2016-08-12 21:06:05 PDT
Comment on attachment 285898 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=285898&action=review

Ok, I think I have been through everything now.
Can you check my comments about basic blocks earlier? That's the main doubt I have.

> Source/JavaScriptCore/wasm/WASMB3ParserContext.cpp:150
> +    Vector<Variable*> result;

+ result.reserveInitialCapacity(source.size(());

> Source/JavaScriptCore/wasm/WASMB3ParserContext.cpp:182
> +BasicBlock*& B3ParserContext::blockForControlLevel(unsigned level)

BasicBlock*& -> BasicBlock*?
Comment 8 Keith Miller 2016-08-15 09:24:58 PDT
Created attachment 286058 [details]
Patch
Comment 9 WebKit Commit Bot 2016-08-15 09:27:46 PDT
Attachment 286058 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/wasm/WASMFunctionParser.h:114:  Non-label code inside switch statements should be indented.  [whitespace/indent] [4]
Total errors found: 1 in 19 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 10 Keith Miller 2016-08-15 09:29:35 PDT
Created attachment 286059 [details]
Patch
Comment 11 WebKit Commit Bot 2016-08-15 09:32:31 PDT
Attachment 286059 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/wasm/WASMFunctionParser.h:114:  Non-label code inside switch statements should be indented.  [whitespace/indent] [4]
Total errors found: 1 in 19 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 12 Keith Miller 2016-08-15 09:44:48 PDT
(In reply to comment #6)
> > Source/JavaScriptCore/wasm/WASMB3ParserContext.cpp:115
> > +    m_controlStack.append(std::make_pair(m_proc.addBlock(), Nullopt));
> 
> Is it correct to create a new basic block here?
> 
> If I understand correctly, you cannot jump to the beginning of a block, you
> can only jump out of a block. The block split point would be every branch
> and the end of the block if there are any branch.
> 
> Say we have:
>     foo;
>     block {
>         bar;
>     }
>     baz;
> That's one basic block.
> 
> Now:
>     foo;
>     block {
>        br_if
>        bar
>     }
>     baz;
> has basic blocks at foo, bar, and baz.
> 
> I suspect it works now in test because m_currentBlock is not changed and the
> values in the blocks are added to block#0.
> 

The current code will create two basic blocks in either case. We create a root basic block for the function then for each `block` opcode we create a continuation basic block for a potential branch (which we don't support yet). In the future, we should create blocks lazily but for now I'm not too worried.
Comment 13 Keith Miller 2016-08-15 09:48:58 PDT
Comment on attachment 285898 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=285898&action=review

>> Source/JavaScriptCore/wasm/WASMB3ParserContext.cpp:90
>> +        result = m_currentBlock->appendNew<Value>(m_proc, Add, Origin(), left, right);
> 
> It would be easy to check the input type here.
> You can return false if left->type() or right->type() are not Int32.

In theory this should already be validated but I'll add an assert.

>> Source/JavaScriptCore/wasm/WASMB3ParserContext.cpp:150
>> +    Vector<Variable*> result;
> 
> + result.reserveInitialCapacity(source.size(());

fixed.

>> Source/JavaScriptCore/wasm/WASMFunctionParser.h:141
>> +    }
> 
> No need for brackets here.

It seems like WebKit style should favor having the braces; effectively, each case is it's own control clause. I also personally prefer it.

>> Source/JavaScriptCore/wasm/WASMParser.h:56
>> +    bool WARN_UNUSED_RETURN parseValueType(WASMValueType& result);
> 
> Have you considered returning Optional<type> instead of having each parse function a bool parse(type&)?
> 
> I have not yet checked the call sites to see if that would help...

This is something that ggaren and I talked about offline previously. My we ended up going with the current system because it adds a compile time check that the failure case has been managed. A runtime check might be acceptable but it doesn't do a good job when things are decoded only on some control flow paths.

>> Source/JavaScriptCore/wasm/WASMParser.h:71
>> +ALWAYS_INLINE bool WASMParser::matchCharacter(char c)
> 
> I find the name a bit odd because you increment the offset.
> Would it be better to name this "consumeCharacter()" or something similar?

switched.
Comment 14 Benjamin Poulain 2016-08-15 13:48:55 PDT
Comment on attachment 286059 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286059&action=review

LGTM. Let's do it.

> Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj:3381
> +                53F40E921D5A4AB30099A1B6 /* WASMB3IRGenerator.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WASMB3IRGenerator.h; sourceTree = "<group>"; };

It's bizarre. XCode indented 53F40E921D5A4AB30099A1B6 in all places.

> Source/JavaScriptCore/wasm/WASMB3IRGenerator.cpp:127
> +    m_currentBlock = m_controlStack.takeLast().first;

Shouldn't you link the m_currentBlock to the continuation through a jump?
Comment 15 Keith Miller 2016-08-15 14:17:27 PDT
(In reply to comment #14)
> > Source/JavaScriptCore/wasm/WASMB3IRGenerator.cpp:127
> > +    m_currentBlock = m_controlStack.takeLast().first;
> 
> Shouldn't you link the m_currentBlock to the continuation through a jump?

I'll fix that issue when we add real control flow. There are a couple of other issues that need to be handled with control flow as well. Off the top of my head, we need to parse but not validate everything in a block after a return. We also need to upgrade the control stack to handle loops.
Comment 16 WebKit Commit Bot 2016-08-15 14:36:41 PDT
Comment on attachment 286059 [details]
Patch

Clearing flags on attachment: 286059

Committed r204484: <http://trac.webkit.org/changeset/204484>
Comment 17 WebKit Commit Bot 2016-08-15 14:36:47 PDT
All reviewed patches have been landed.  Closing bug.