RESOLVED FIXED 145619
[Content Extensions] Implement branch compaction for DFA byte code
https://bugs.webkit.org/show_bug.cgi?id=145619
Summary [Content Extensions] Implement branch compaction for DFA byte code
Alex Christensen
Reported 2015-06-03 15:14:38 PDT
We use lots of 0's for the high bits of jumps in the DFA byte code. Let's compact this down. This patch reduced the total byte code size in a large test case from 15123003 bytes to 14085291.
Attachments
Patch (44.35 KB, patch)
2015-06-03 15:28 PDT, Alex Christensen
no flags
Patch (44.19 KB, patch)
2015-06-16 13:31 PDT, Alex Christensen
no flags
Patch (45.61 KB, patch)
2015-06-16 14:30 PDT, Alex Christensen
no flags
Alex Christensen
Comment 1 2015-06-03 15:28:27 PDT
Alex Christensen
Comment 2 2015-06-03 16:40:14 PDT
Int24 should be added in a following patch
Benjamin Poulain
Comment 3 2015-06-05 15:07:50 PDT
Comment on attachment 254217 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=254217&action=review > Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:237 > + unsigned totalBytecodeSize = 0; Don't forget to revisit this since I landed the same thing. > Source/WebCore/contentextensions/DFABytecode.h:83 > +enum DFABytecodeJumpSize { May I suggest using exclusive bits? ARM64 has a test-bit-and-branch instruction for those cases to avoid mask+jump. > Source/WebCore/contentextensions/DFABytecode.h:94 > + if (longestPossibleJump <= std::numeric_limits<int8_t>::max() && longestPossibleJump >= std::numeric_limits<int8_t>::min()) > + return Int8; > + if (longestPossibleJump <= std::numeric_limits<int16_t>::max() && longestPossibleJump >= std::numeric_limits<int16_t>::min()) > + return Int16; I was gonna complain about using masks instead...but actually this is way cleaner. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:64 > + *reinterpret_cast<IntType*>(&bytecode[index]) = value; Can we assert we have enough free space to write the value? Too slow? > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:69 > + if (action & 0xFFFF00000000) Assert the top 2 bytes are zero? Can you move 0xFFFF00000000 to a constant with a good name? > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:104 > + return m_maxNodeStartOffsets[destinationNodeIndex] - jumpLocation; This is very pessimistic. As the compaction progress, the jumpLocation drift further appart from the m_maxNodeStartOffsets of its own node. The jump length should be the worst distance between the two nodes as computed for m_maxNodeStartOffsets - the offset since the start of the current node. (m_maxNodeStartOffsets[destinationNodeIndex] - m_maxNodeStartOffsets[sourceNodeIndex] - (m_nodeStartOffsets[sourceNodeIndex] - jumpLocation)). > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:114 > + uint32_t jumpLocation = m_bytecode.size() + sizeof(uint8_t); Suggestion: -Have the instruction offset and jumpLocation in LinkRecord. -Do the relative jump from the instruction position instead of the relative address position. That should remove all the +sizeof(thingy) that are scaring the hell out of me. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:128 > + append<uint8_t>(m_bytecode, static_cast<uint8_t>(caseSensitive ? DFABytecodeInstruction::CheckValueCaseSensitive : DFABytecodeInstruction::CheckValueCaseInsensitive) | jumpSize); Can you move "caseSensitive ? DFABytecodeInstruction::CheckValueCaseSensitive : DFABytecodeInstruction::CheckValueCaseInsensitive" to its own temporary variable? That line does too much. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:138 > + uint32_t jumpLocation = m_bytecode.size() + 3 * sizeof(uint8_t); This is the kind of code I don't like regarding the base of the jumps. Too much magic. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:141 > + append<uint8_t>(m_bytecode, static_cast<uint8_t>(caseSensitive ? DFABytecodeInstruction::CheckValueRangeCaseSensitive : DFABytecodeInstruction::CheckValueRangeCaseInsensitive) | jumpSize); Temp variable with the instruction? > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:195 > - memset(destinations, 0xff, sizeof(destinations)); > + memset(destinations, 0xFF, sizeof(destinations)); WHYYYYYY. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:318 > + unsigned nextIndex = sizeof(DFAHeader) + compiledNodeMaxBytecodeSize(m_dfa.root); I don't understand this. Shouldn't the start offset of the root be at the offset of the header + the actions of the root? Isn't compiledNodeMaxBytecodeSize(m_dfa.root) too far? > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:319 > + m_maxNodeStartOffsets[m_dfa.root] = nextIndex; I have the feeling this case should be split for root: m_maxNodeStartOffsets[m_dfa.root] = offset of header + initial actions; nextIndex += size of actions; > Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp:53 > +// FIXME: Get rid of pagesUsed. We don't need to keep track of the pages used. But, but, I really like it. > Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp:205 > + uint32_t jumpLocation = programCounter + sizeof(DFABytecodeInstruction) + sizeof(uint8_t); jumpLocation -> jumpBase? To keep the pattern destination = base + offset.
Alex Christensen
Comment 4 2015-06-16 13:31:03 PDT
Alex Christensen
Comment 5 2015-06-16 13:39:19 PDT
I addressed everything with these notes: (In reply to comment #3) > > Source/WebCore/contentextensions/DFABytecode.h:83 > > +enum DFABytecodeJumpSize { > > May I suggest using exclusive bits? > ARM64 has a test-bit-and-branch instruction for those cases to avoid > mask+jump. Good idea, but the compiler cannot prove that this will always have a valid value in it, and it needs a default: RELEASE_ASSERT_NOT_REACHED() in case we are reading invalid byte code. This would not improve anything. > > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:318 > > + unsigned nextIndex = sizeof(DFAHeader) + compiledNodeMaxBytecodeSize(m_dfa.root); > > I don't understand this. > > Shouldn't the start offset of the root be at the offset of the header + the > actions of the root? > > Isn't compiledNodeMaxBytecodeSize(m_dfa.root) too far? That's right, this is an overestimate, but the root is always compiled first so this is never an issue. Even if it weren't compiled first, this would only be slightly pessimistic. I don't think it's worth making the code much more complicated just to make this a little more accurate when it doesn't matter. > > Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp:53 > > +// FIXME: Get rid of pagesUsed. We don't need to keep track of the pages used. > > But, but, I really like it. It's annoying and not even accurate ever since I made the jumps relative to the current DFA's byte code. > > > Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp:205 > > + uint32_t jumpLocation = programCounter + sizeof(DFABytecodeInstruction) + sizeof(uint8_t); > > jumpLocation -> jumpBase? To keep the pattern destination = base + offset. I made the jumps relative from the current program counter instead.
Alex Christensen
Comment 6 2015-06-16 14:30:43 PDT
WebKit Commit Bot
Comment 7 2015-06-16 15:29:59 PDT
Comment on attachment 254969 [details] Patch Clearing flags on attachment: 254969 Committed r185621: <http://trac.webkit.org/changeset/185621>
WebKit Commit Bot
Comment 8 2015-06-16 15:30:02 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.