WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
144485
[Content Extensions] Redo NFA generation
https://bugs.webkit.org/show_bug.cgi?id=144485
Summary
[Content Extensions] Redo NFA generation
Alex Christensen
Reported
2015-04-30 18:48:39 PDT
We want to delete the prefix tree as we generate the NFA to decrease the max memory usage, and we also want to combine all the regexes with the same prefix up to the last quantified term into one NFA to decrease the number of NFAs.
Attachments
Patch
(15.04 KB, patch)
2015-04-30 18:56 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(12.14 KB, patch)
2015-05-01 11:45 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(12.43 KB, patch)
2015-05-02 11:13 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(19.66 KB, patch)
2015-05-02 23:31 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(19.83 KB, patch)
2015-05-02 23:38 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(20.01 KB, patch)
2015-05-02 23:53 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(20.01 KB, patch)
2015-05-04 08:40 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(22.42 KB, patch)
2015-05-04 08:55 PDT
,
Alex Christensen
benjamin
: review+
Details
Formatted Diff
Diff
Show Obsolete
(7)
View All
Add attachment
proposed patch, testcase, etc.
Alex Christensen
Comment 1
2015-04-30 18:56:21 PDT
Created
attachment 252129
[details]
Patch
Alex Christensen
Comment 2
2015-04-30 18:58:56 PDT
Comment on
attachment 252129
[details]
Patch This patch doesn't quite pass all the api tests, and it uses recursion instead of a stack and a loop, but it's *pretty good* work in progress. Good enough to upload.
Alex Christensen
Comment 3
2015-05-01 11:45:33 PDT
Created
attachment 252160
[details]
Patch
Alex Christensen
Comment 4
2015-05-01 11:46:42 PDT
This last patch passes most of the tests, but when I make multiple NFAs it generates too many NFA nodes. I think something is wrong with my calls to generateGraph. Ben, could you take a look?
Benjamin Poulain
Comment 5
2015-05-01 14:12:02 PDT
Comment on
attachment 252160
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=252160&action=review
Some quick comments:
> Source/WebCore/contentextensions/CombinedURLFilters.cpp:171 > + if (edge.term.quantity() != AtomQuantifier::One) > + continue; // This will be a part of another NFA.
That does not look correct. You can have quantifier in subterm of a group. That's why hasFixedLength() exists.
> Source/WebCore/contentextensions/CombinedURLFilters.cpp:173 > + edge.term.destroy(); // Mark this leaf for deleting.
Instead of making destroy() public, can't you just have: edge.term = Term(DeletedValue); ?
> Source/WebCore/contentextensions/CombinedURLFilters.cpp:202 > + while (!stack.isEmpty() && stack.last().vertex->edges.last().term.quantity() == AtomQuantifier::One)
Please don't use quantifiers directly.
Alex Christensen
Comment 6
2015-05-02 11:13:45 PDT
Created
attachment 252240
[details]
Patch
Alex Christensen
Comment 7
2015-05-02 11:15:33 PDT
Comment on
attachment 252240
[details]
Patch This latest patch passes all the current tests (HOORAY!). I still need to add some tests, make it non-recursive, and address Ben's comments.
Alex Christensen
Comment 8
2015-05-02 12:20:40 PDT
We will still need to combine small DFAs even after this patch.
Alex Christensen
Comment 9
2015-05-02 23:31:42 PDT
Created
attachment 252260
[details]
Patch
Alex Christensen
Comment 10
2015-05-02 23:38:25 PDT
Created
attachment 252261
[details]
Patch
Alex Christensen
Comment 11
2015-05-02 23:38:52 PDT
This last patch should be pretty final and ready for real review.
Alex Christensen
Comment 12
2015-05-02 23:53:59 PDT
Created
attachment 252262
[details]
Patch
Alex Christensen
Comment 13
2015-05-04 08:40:52 PDT
Created
attachment 252312
[details]
Patch
Alex Christensen
Comment 14
2015-05-04 08:55:25 PDT
Created
attachment 252313
[details]
Patch
Simon Fraser (smfr)
Comment 15
2015-05-04 16:23:37 PDT
Comment on
attachment 252313
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=252313&action=review
> Source/WebCore/contentextensions/CombinedURLFilters.cpp:170 > + /* > + // Non-recursive form of this: > + for (auto& edge : root.edges) { > + if (!edge.term.hasFixedLength()) > + continue; // This will be a part of another NFA. > + unsigned subtreeRootId = edge.term.generateGraph(nfa, nfaRootId, edge.child->finalActions); > + generateNFAForSubtree(nfa, subtreeRootId, *edge.child.get()); > + if (edge.child->edges.isEmpty()) > + edge.term = Term(Term::DeletedValue); // Mark this leaf for deleting. > + } > + root.edges.removeAllMatching([](PrefixTreeEdge& edge) > + { > + return edge.term.isDeletedValue(); > + }); > + */
Commented-out code?
Benjamin Poulain
Comment 16
2015-05-04 22:10:02 PDT
Comment on
attachment 252313
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=252313&action=review
Looks great. Sorry for the review delay.
> Source/WebCore/contentextensions/CombinedURLFilters.cpp:157 > + /* > + // Non-recursive form of this:
Did you intend to leave this? I think a description of the algorithm in a few paragraph would be more valuable than the code.
> Source/WebCore/contentextensions/CombinedURLFilters.cpp:180 > + unsigned rootId;
nfaNodeId?
> Source/WebCore/contentextensions/CombinedURLFilters.cpp:192 > + if (edgeIndex >= vertex.edges.size()) {
I would inverse the two blocks of the if and the condition. The reason is readability. As you read, you can read them in order, the first case being the recursive step, the second being the edge case.
> Source/WebCore/contentextensions/CombinedURLFilters.cpp:200 > + auto& edge = stack.last().vertex.edges[stack.last().edgeIndex];
Let's put stack.last() in a temporary.
> Source/WebCore/contentextensions/CombinedURLFilters.cpp:262 > + // FIXME: Combine small NFAs to reduce the number of NFAs.
That will be done later, outside this class. That FIXME is at the wrong place.
> Source/WebCore/contentextensions/CombinedURLFilters.cpp:268 > + if (stack.last()->edges.isEmpty()) {
It might be clearer to use stack[stack.size() - 1] here to make the next line more obvious?
> Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:537 > + const unsigned size = 100000;
You went for really big :)
> Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:542 > + // FIXME: DFAToNFA::convert takes way too long on these deep NFAs. We should optimize for that case.
That's weird, that seems like an easy case.
Alex Christensen
Comment 17
2015-05-05 10:29:15 PDT
http://trac.webkit.org/changeset/183818
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug