Bug 146961 - [Content extensions] Combine suffixes when generating NFAs
Summary: [Content extensions] Combine suffixes when generating NFAs
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Benjamin Poulain
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2015-07-15 00:06 PDT by Benjamin Poulain
Modified: 2015-07-16 14:52 PDT (History)
5 users (show)

See Also:


Attachments
Patch (53.47 KB, patch)
2015-07-15 00:38 PDT, Benjamin Poulain
no flags Details | Formatted Diff | Diff
Patch (54.07 KB, patch)
2015-07-15 13:54 PDT, Benjamin Poulain
no flags Details | Formatted Diff | Diff
Patch (60.23 KB, patch)
2015-07-15 15:43 PDT, Benjamin Poulain
achristensen: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Benjamin Poulain 2015-07-15 00:06:03 PDT
[Content extensions] Combine suffixes when generating NFAs
Comment 1 Benjamin Poulain 2015-07-15 00:38:03 PDT
Created attachment 256827 [details]
Patch
Comment 2 Alex Christensen 2015-07-15 11:37:15 PDT
Comment on attachment 256827 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=256827&action=review

> Source/WebCore/ChangeLog:14
> +        have the same trailing NFA nodes for both side of the disjunction.

sides

> Source/WebCore/ChangeLog:46
> +        Overall, I get the following gains:

Awesome.  Where are these gains coming from?  Is the DFA minimization faster, or NFA to DFA faster, or both?

> Source/WebCore/ChangeLog:50
> +        -Armand's test cse:

case

> Tools/TestWebKitAPI/Tests/WebCore/DFAMinimizer.cpp:70
> -    EXPECT_EQ(static_cast<size_t>(13), countLiveNodes(dfa));
> +    EXPECT_EQ(static_cast<size_t>(6), countLiveNodes(dfa));
>      dfa.minimize();
>      EXPECT_EQ(static_cast<size_t>(6), countLiveNodes(dfa));

A lot of these minimization tests don't do anything now.  Could you add some cases where the minimizer still does a lot of the work?
Comment 3 Benjamin Poulain 2015-07-15 12:41:14 PDT
Comment on attachment 256827 [details]
Patch

(In reply to comment #2)
> > Source/WebCore/ChangeLog:46
> > +        Overall, I get the following gains:
> 
> Awesome.  Where are these gains coming from?  Is the DFA minimization
> faster, or NFA to DFA faster, or both?

The compile time is improved because the DFA have less redundancy. This makes NFA-to-DFA and Minimization simpler.

The memory gains are due to combining more patterns into NFAs.

> > Tools/TestWebKitAPI/Tests/WebCore/DFAMinimizer.cpp:70
> > -    EXPECT_EQ(static_cast<size_t>(13), countLiveNodes(dfa));
> > +    EXPECT_EQ(static_cast<size_t>(6), countLiveNodes(dfa));
> >      dfa.minimize();
> >      EXPECT_EQ(static_cast<size_t>(6), countLiveNodes(dfa));
> 
> A lot of these minimization tests don't do anything now.  Could you add some
> cases where the minimizer still does a lot of the work?

That's a good point. I'll add new test cases.
Comment 4 Benjamin Poulain 2015-07-15 13:54:53 PDT
Created attachment 256859 [details]
Patch
Comment 5 Benjamin Poulain 2015-07-15 15:43:22 PDT
Created attachment 256869 [details]
Patch
Comment 6 Alex Christensen 2015-07-15 17:23:54 PDT
I measured a large test case on a device and this cut the compiling time in half, and the max memory usage stayed almost exactly the same.
Comment 7 Alex Christensen 2015-07-15 17:28:04 PDT
Comment on attachment 256869 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=256869&action=review

> Source/WebCore/contentextensions/Term.h:646
> -        for (unsigned i = 1; i < m_atomData.group.terms.size(); ++i) {
> +        for (unsigned i = 1; i < m_atomData.group.terms.size() - 1; ++i) {

Why is the last one a special case?
Comment 8 Benjamin Poulain 2015-07-15 17:42:17 PDT
Comment on attachment 256869 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=256869&action=review

>> Source/WebCore/contentextensions/Term.h:646
>> +        for (unsigned i = 1; i < m_atomData.group.terms.size() - 1; ++i) {
> 
> Why is the last one a special case?

The generateGraph() for the last one is not one that generate a new node, it is one that generate transitions to an existing node.
Comment 9 Radar WebKit Bug Importer 2015-07-16 14:46:27 PDT
<rdar://problem/21863296>
Comment 10 Benjamin Poulain 2015-07-16 14:52:18 PDT
Committed r186910: <http://trac.webkit.org/changeset/186910>