Bug 142514

Summary:

DFABytecode should have jump table

Product:

WebKit

Reporter:

Alex Christensen <achristensen>

Component:

WebCore Misc.

Assignee:

Alex Christensen <achristensen>

Status:

RESOLVED DUPLICATE

Severity:

Normal

CC:

benjamin, mark.lam

Priority:

Version:

528+ (Nightly build)

Hardware:

Unspecified

OS:

Unspecified

See Also:

https://bugs.webkit.org/show_bug.cgi?id=142977

Attachments:

Description	Flags
Patch	none
Patch	none
Patch	none
Patch	achristensen: review-

Alex Christensen

Reported 2015-03-09 16:27:36 PDT

Often when interpreting DFA byte code, we check many values sequentially. To speed up interpreting, let's implement a jump table if there are lots of values to add. This might make the byte code grow a lot, so let's not do this too much. If the node does not have a fallback transition, I jump to a Terminate instruction which uses an extra byte in the instruction. This byte could be removed by using -1 as a special jump value meaning terminate, or by having a byte code for a branch table with a fallback transition and a byte code for a branch table without a fallback transition. Well, this works. Let's hear comments.

Attachments
Patch (9.43 KB, patch) 2015-03-09 16:31 PDT, Alex Christensen	no flags	Details Formatted Diff Diff
Patch (9.61 KB, patch) 2015-03-09 16:38 PDT, Alex Christensen	no flags	Details Formatted Diff Diff
Patch (12.96 KB, patch) 2015-03-18 14:57 PDT, Alex Christensen	no flags	Details Formatted Diff Diff
Patch (7.42 KB, patch) 2015-04-02 14:58 PDT, Alex Christensen	achristensen: review-	Details Formatted Diff Diff
Show Obsolete (3) View All Add attachment proposed patch, testcase, etc.

Alex Christensen

Comment 1 2015-03-09 16:31:49 PDT

Created attachment 248287 [details] Patch

Alex Christensen

Comment 2 2015-03-09 16:38:18 PDT

Created attachment 248288 [details] Patch

Benjamin Poulain

Comment 3 2015-03-09 22:22:21 PDT

Comment on attachment 248288 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=248288&action=review I am 100% behind this but you'll have to tune it a bit :) We have a balance between binary size and runtime speed. My guess is only a few hundred nodes should use the new instruction. > Source/WebCore/contentextensions/DFABytecode.h:46 > + // The index to jump to if the current character is outside the bounds of the branch table (4 bytes), Do you actually need this? You could define BranchTable such that it does nothing if the character is out of bound. You can then add a unconditional Jump after the BranchTable. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:80 > + append<DFABytecodeInstruction>(m_bytecode, DFABytecodeInstruction::Terminate); That's not pretty but it works. Ideally you would link to the terminate instruction at the end of the node. To do that, each node would need to give some new info to the linker (or link internally). I am fine with the current approach though, that's just one byte. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:85 > + append<unsigned>(m_bytecode, m_bytecode.size() - 1); // Jump to the Terminate instruction if out of bounds. I would prefer something more explicit than m_bytecode.size() - 1. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:95 > + append<char>(m_bytecode, minValue); > + append<int8_t>(m_bytecode, maxValue - minValue + 1); First is char, next is int8_t. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:104 > + append<unsigned>(m_bytecode, m_bytecode.size() - 1); // Jump to the Terminate instruction if out of bounds. Explicit > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:122 > + const int maxSequentialCheckValueOps = 6; // Guess for a good optimal value based on no actual data. Should get some data :) > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:123 > + if (node.transitions.size() <= maxSequentialCheckValueOps) { You should capture ranges instead of just the size. If you are curious, LLVM has a bunch of range optimizations to look at.

Alex Christensen

Comment 4 2015-03-10 11:44:15 PDT

Comment on attachment 248288 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=248288&action=review I'll look into measuring performance and size, and tweaking this implementation a little. Probably next week. >> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:104 >> + append<unsigned>(m_bytecode, m_bytecode.size() - 1); // Jump to the Terminate instruction if out of bounds. > > Explicit This one is also incorrect. This will jump to some byte in the middle of the branch table. We need to save the index of the terminate instruction and use it instead of m_bytecode.size() - 1 in both places.

Alex Christensen

Comment 5 2015-03-18 14:57:41 PDT

Created attachment 248968 [details] Patch

Benjamin Poulain

Comment 6 2015-03-19 14:03:20 PDT

Comment on attachment 248968 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=248968&action=review We reaaally need data on performance. I strongly believe the initial conditions should be smarter. For example, if there is a dense area for transitions and a few outliers, there should be a branch table for the dense area and a few value checks for the outliers. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:150 > + // Lots of duplicate ranges are caused by upper case and lower case characters. > + // FIXME: Implement case-insensitive range checks. Oh, that's a good idea. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:184 > + append<unsigned>(m_bytecode, UINT_MAX); // Tell interpreter this value is not in the branch table. Let's use std::numeric_limits here too.

Alex Christensen

Comment 7 2015-04-02 14:58:34 PDT

Created attachment 250011 [details] Patch

Alex Christensen

Comment 8 2015-04-02 14:59:44 PDT

Comment on attachment 250011 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=250011&action=review > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:216 > + if (ranges.size() > 3) > + emitBranchTable(ranges); I'm looking into a better condition here. I just uploaded the patch to not lose my recent work.

Alex Christensen

Comment 9 2015-07-20 11:27:14 PDT

*** This bug has been marked as a duplicate of bug 147099 ***

Note You need to log in before you can comment on or make changes to this bug.