Bug 146961

Summary: [Content extensions] Combine suffixes when generating NFAs
Product: WebKit Reporter: Benjamin Poulain <benjamin>
Component: New BugsAssignee: Benjamin Poulain <benjamin>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, cmarcelo, commit-queue, sam, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch achristensen: review+

Description Benjamin Poulain 2015-07-15 00:06:03 PDT
[Content extensions] Combine suffixes when generating NFAs
Comment 1 Benjamin Poulain 2015-07-15 00:38:03 PDT
Created attachment 256827 [details]
Patch
Comment 2 Alex Christensen 2015-07-15 11:37:15 PDT
Comment on attachment 256827 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=256827&action=review

> Source/WebCore/ChangeLog:14
> +        have the same trailing NFA nodes for both side of the disjunction.

sides

> Source/WebCore/ChangeLog:46
> +        Overall, I get the following gains:

Awesome.  Where are these gains coming from?  Is the DFA minimization faster, or NFA to DFA faster, or both?

> Source/WebCore/ChangeLog:50
> +        -Armand's test cse:

case

> Tools/TestWebKitAPI/Tests/WebCore/DFAMinimizer.cpp:70
> -    EXPECT_EQ(static_cast<size_t>(13), countLiveNodes(dfa));
> +    EXPECT_EQ(static_cast<size_t>(6), countLiveNodes(dfa));
>      dfa.minimize();
>      EXPECT_EQ(static_cast<size_t>(6), countLiveNodes(dfa));

A lot of these minimization tests don't do anything now.  Could you add some cases where the minimizer still does a lot of the work?
Comment 3 Benjamin Poulain 2015-07-15 12:41:14 PDT
Comment on attachment 256827 [details]
Patch

(In reply to comment #2)
> > Source/WebCore/ChangeLog:46
> > +        Overall, I get the following gains:
> 
> Awesome.  Where are these gains coming from?  Is the DFA minimization
> faster, or NFA to DFA faster, or both?

The compile time is improved because the DFA have less redundancy. This makes NFA-to-DFA and Minimization simpler.

The memory gains are due to combining more patterns into NFAs.

> > Tools/TestWebKitAPI/Tests/WebCore/DFAMinimizer.cpp:70
> > -    EXPECT_EQ(static_cast<size_t>(13), countLiveNodes(dfa));
> > +    EXPECT_EQ(static_cast<size_t>(6), countLiveNodes(dfa));
> >      dfa.minimize();
> >      EXPECT_EQ(static_cast<size_t>(6), countLiveNodes(dfa));
> 
> A lot of these minimization tests don't do anything now.  Could you add some
> cases where the minimizer still does a lot of the work?

That's a good point. I'll add new test cases.
Comment 4 Benjamin Poulain 2015-07-15 13:54:53 PDT
Created attachment 256859 [details]
Patch
Comment 5 Benjamin Poulain 2015-07-15 15:43:22 PDT
Created attachment 256869 [details]
Patch
Comment 6 Alex Christensen 2015-07-15 17:23:54 PDT
I measured a large test case on a device and this cut the compiling time in half, and the max memory usage stayed almost exactly the same.
Comment 7 Alex Christensen 2015-07-15 17:28:04 PDT
Comment on attachment 256869 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=256869&action=review

> Source/WebCore/contentextensions/Term.h:646
> -        for (unsigned i = 1; i < m_atomData.group.terms.size(); ++i) {
> +        for (unsigned i = 1; i < m_atomData.group.terms.size() - 1; ++i) {

Why is the last one a special case?
Comment 8 Benjamin Poulain 2015-07-15 17:42:17 PDT
Comment on attachment 256869 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=256869&action=review

>> Source/WebCore/contentextensions/Term.h:646
>> +        for (unsigned i = 1; i < m_atomData.group.terms.size() - 1; ++i) {
> 
> Why is the last one a special case?

The generateGraph() for the last one is not one that generate a new node, it is one that generate transitions to an existing node.
Comment 9 Radar WebKit Bug Importer 2015-07-16 14:46:27 PDT
<rdar://problem/21863296>
Comment 10 Benjamin Poulain 2015-07-16 14:52:18 PDT
Committed r186910: <http://trac.webkit.org/changeset/186910>