Summary: | [Content Extensions] Redo NFA generation | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Alex Christensen <achristensen> | ||||||||||||||||||
Component: | WebCore Misc. | Assignee: | Alex Christensen <achristensen> | ||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||
Severity: | Normal | CC: | benjamin, sam | ||||||||||||||||||
Priority: | P2 | ||||||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||
Attachments: |
|
Description
Alex Christensen
2015-04-30 18:48:39 PDT
Created attachment 252129 [details]
Patch
Comment on attachment 252129 [details]
Patch
This patch doesn't quite pass all the api tests, and it uses recursion instead of a stack and a loop, but it's *pretty good* work in progress. Good enough to upload.
Created attachment 252160 [details]
Patch
This last patch passes most of the tests, but when I make multiple NFAs it generates too many NFA nodes. I think something is wrong with my calls to generateGraph. Ben, could you take a look? Comment on attachment 252160 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=252160&action=review Some quick comments: > Source/WebCore/contentextensions/CombinedURLFilters.cpp:171 > + if (edge.term.quantity() != AtomQuantifier::One) > + continue; // This will be a part of another NFA. That does not look correct. You can have quantifier in subterm of a group. That's why hasFixedLength() exists. > Source/WebCore/contentextensions/CombinedURLFilters.cpp:173 > + edge.term.destroy(); // Mark this leaf for deleting. Instead of making destroy() public, can't you just have: edge.term = Term(DeletedValue); ? > Source/WebCore/contentextensions/CombinedURLFilters.cpp:202 > + while (!stack.isEmpty() && stack.last().vertex->edges.last().term.quantity() == AtomQuantifier::One) Please don't use quantifiers directly. Created attachment 252240 [details]
Patch
Comment on attachment 252240 [details]
Patch
This latest patch passes all the current tests (HOORAY!). I still need to add some tests, make it non-recursive, and address Ben's comments.
We will still need to combine small DFAs even after this patch. Created attachment 252260 [details]
Patch
Created attachment 252261 [details]
Patch
This last patch should be pretty final and ready for real review. Created attachment 252262 [details]
Patch
Created attachment 252312 [details]
Patch
Created attachment 252313 [details]
Patch
Comment on attachment 252313 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=252313&action=review > Source/WebCore/contentextensions/CombinedURLFilters.cpp:170 > + /* > + // Non-recursive form of this: > + for (auto& edge : root.edges) { > + if (!edge.term.hasFixedLength()) > + continue; // This will be a part of another NFA. > + unsigned subtreeRootId = edge.term.generateGraph(nfa, nfaRootId, edge.child->finalActions); > + generateNFAForSubtree(nfa, subtreeRootId, *edge.child.get()); > + if (edge.child->edges.isEmpty()) > + edge.term = Term(Term::DeletedValue); // Mark this leaf for deleting. > + } > + root.edges.removeAllMatching([](PrefixTreeEdge& edge) > + { > + return edge.term.isDeletedValue(); > + }); > + */ Commented-out code? Comment on attachment 252313 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=252313&action=review Looks great. Sorry for the review delay. > Source/WebCore/contentextensions/CombinedURLFilters.cpp:157 > + /* > + // Non-recursive form of this: Did you intend to leave this? I think a description of the algorithm in a few paragraph would be more valuable than the code. > Source/WebCore/contentextensions/CombinedURLFilters.cpp:180 > + unsigned rootId; nfaNodeId? > Source/WebCore/contentextensions/CombinedURLFilters.cpp:192 > + if (edgeIndex >= vertex.edges.size()) { I would inverse the two blocks of the if and the condition. The reason is readability. As you read, you can read them in order, the first case being the recursive step, the second being the edge case. > Source/WebCore/contentextensions/CombinedURLFilters.cpp:200 > + auto& edge = stack.last().vertex.edges[stack.last().edgeIndex]; Let's put stack.last() in a temporary. > Source/WebCore/contentextensions/CombinedURLFilters.cpp:262 > + // FIXME: Combine small NFAs to reduce the number of NFAs. That will be done later, outside this class. That FIXME is at the wrong place. > Source/WebCore/contentextensions/CombinedURLFilters.cpp:268 > + if (stack.last()->edges.isEmpty()) { It might be clearer to use stack[stack.size() - 1] here to make the next line more obvious? > Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:537 > + const unsigned size = 100000; You went for really big :) > Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:542 > + // FIXME: DFAToNFA::convert takes way too long on these deep NFAs. We should optimize for that case. That's weird, that seems like an easy case. |