WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
145619
[Content Extensions] Implement branch compaction for DFA byte code
https://bugs.webkit.org/show_bug.cgi?id=145619
Summary
[Content Extensions] Implement branch compaction for DFA byte code
Alex Christensen
Reported
2015-06-03 15:14:38 PDT
We use lots of 0's for the high bits of jumps in the DFA byte code. Let's compact this down. This patch reduced the total byte code size in a large test case from 15123003 bytes to 14085291.
Attachments
Patch
(44.35 KB, patch)
2015-06-03 15:28 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(44.19 KB, patch)
2015-06-16 13:31 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(45.61 KB, patch)
2015-06-16 14:30 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Alex Christensen
Comment 1
2015-06-03 15:28:27 PDT
Created
attachment 254217
[details]
Patch
Alex Christensen
Comment 2
2015-06-03 16:40:14 PDT
Int24 should be added in a following patch
Benjamin Poulain
Comment 3
2015-06-05 15:07:50 PDT
Comment on
attachment 254217
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=254217&action=review
> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:237 > + unsigned totalBytecodeSize = 0;
Don't forget to revisit this since I landed the same thing.
> Source/WebCore/contentextensions/DFABytecode.h:83 > +enum DFABytecodeJumpSize {
May I suggest using exclusive bits? ARM64 has a test-bit-and-branch instruction for those cases to avoid mask+jump.
> Source/WebCore/contentextensions/DFABytecode.h:94 > + if (longestPossibleJump <= std::numeric_limits<int8_t>::max() && longestPossibleJump >= std::numeric_limits<int8_t>::min()) > + return Int8; > + if (longestPossibleJump <= std::numeric_limits<int16_t>::max() && longestPossibleJump >= std::numeric_limits<int16_t>::min()) > + return Int16;
I was gonna complain about using masks instead...but actually this is way cleaner.
> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:64 > + *reinterpret_cast<IntType*>(&bytecode[index]) = value;
Can we assert we have enough free space to write the value? Too slow?
> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:69 > + if (action & 0xFFFF00000000)
Assert the top 2 bytes are zero? Can you move 0xFFFF00000000 to a constant with a good name?
> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:104 > + return m_maxNodeStartOffsets[destinationNodeIndex] - jumpLocation;
This is very pessimistic. As the compaction progress, the jumpLocation drift further appart from the m_maxNodeStartOffsets of its own node. The jump length should be the worst distance between the two nodes as computed for m_maxNodeStartOffsets - the offset since the start of the current node. (m_maxNodeStartOffsets[destinationNodeIndex] - m_maxNodeStartOffsets[sourceNodeIndex] - (m_nodeStartOffsets[sourceNodeIndex] - jumpLocation)).
> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:114 > + uint32_t jumpLocation = m_bytecode.size() + sizeof(uint8_t);
Suggestion: -Have the instruction offset and jumpLocation in LinkRecord. -Do the relative jump from the instruction position instead of the relative address position. That should remove all the +sizeof(thingy) that are scaring the hell out of me.
> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:128 > + append<uint8_t>(m_bytecode, static_cast<uint8_t>(caseSensitive ? DFABytecodeInstruction::CheckValueCaseSensitive : DFABytecodeInstruction::CheckValueCaseInsensitive) | jumpSize);
Can you move "caseSensitive ? DFABytecodeInstruction::CheckValueCaseSensitive : DFABytecodeInstruction::CheckValueCaseInsensitive" to its own temporary variable? That line does too much.
> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:138 > + uint32_t jumpLocation = m_bytecode.size() + 3 * sizeof(uint8_t);
This is the kind of code I don't like regarding the base of the jumps. Too much magic.
> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:141 > + append<uint8_t>(m_bytecode, static_cast<uint8_t>(caseSensitive ? DFABytecodeInstruction::CheckValueRangeCaseSensitive : DFABytecodeInstruction::CheckValueRangeCaseInsensitive) | jumpSize);
Temp variable with the instruction?
> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:195 > - memset(destinations, 0xff, sizeof(destinations)); > + memset(destinations, 0xFF, sizeof(destinations));
WHYYYYYY.
> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:318 > + unsigned nextIndex = sizeof(DFAHeader) + compiledNodeMaxBytecodeSize(m_dfa.root);
I don't understand this. Shouldn't the start offset of the root be at the offset of the header + the actions of the root? Isn't compiledNodeMaxBytecodeSize(m_dfa.root) too far?
> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:319 > + m_maxNodeStartOffsets[m_dfa.root] = nextIndex;
I have the feeling this case should be split for root: m_maxNodeStartOffsets[m_dfa.root] = offset of header + initial actions; nextIndex += size of actions;
> Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp:53 > +// FIXME: Get rid of pagesUsed. We don't need to keep track of the pages used.
But, but, I really like it.
> Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp:205 > + uint32_t jumpLocation = programCounter + sizeof(DFABytecodeInstruction) + sizeof(uint8_t);
jumpLocation -> jumpBase? To keep the pattern destination = base + offset.
Alex Christensen
Comment 4
2015-06-16 13:31:03 PDT
Created
attachment 254965
[details]
Patch
Alex Christensen
Comment 5
2015-06-16 13:39:19 PDT
I addressed everything with these notes: (In reply to
comment #3
)
> > Source/WebCore/contentextensions/DFABytecode.h:83 > > +enum DFABytecodeJumpSize { > > May I suggest using exclusive bits? > ARM64 has a test-bit-and-branch instruction for those cases to avoid > mask+jump.
Good idea, but the compiler cannot prove that this will always have a valid value in it, and it needs a default: RELEASE_ASSERT_NOT_REACHED() in case we are reading invalid byte code. This would not improve anything.
> > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:318 > > + unsigned nextIndex = sizeof(DFAHeader) + compiledNodeMaxBytecodeSize(m_dfa.root); > > I don't understand this. > > Shouldn't the start offset of the root be at the offset of the header + the > actions of the root? > > Isn't compiledNodeMaxBytecodeSize(m_dfa.root) too far?
That's right, this is an overestimate, but the root is always compiled first so this is never an issue. Even if it weren't compiled first, this would only be slightly pessimistic. I don't think it's worth making the code much more complicated just to make this a little more accurate when it doesn't matter.
> > Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp:53 > > +// FIXME: Get rid of pagesUsed. We don't need to keep track of the pages used. > > But, but, I really like it.
It's annoying and not even accurate ever since I made the jumps relative to the current DFA's byte code.
> > > Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp:205 > > + uint32_t jumpLocation = programCounter + sizeof(DFABytecodeInstruction) + sizeof(uint8_t); > > jumpLocation -> jumpBase? To keep the pattern destination = base + offset.
I made the jumps relative from the current program counter instead.
Alex Christensen
Comment 6
2015-06-16 14:30:43 PDT
Created
attachment 254969
[details]
Patch
WebKit Commit Bot
Comment 7
2015-06-16 15:29:59 PDT
Comment on
attachment 254969
[details]
Patch Clearing flags on attachment: 254969 Committed
r185621
: <
http://trac.webkit.org/changeset/185621
>
WebKit Commit Bot
Comment 8
2015-06-16 15:30:02 PDT
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug