Bug 143393

Summary: [Content Extensions] Sort DFANodes by importance
Product: WebKit Reporter: Alex Christensen <achristensen>
Component: WebCore Misc.Assignee: Alex Christensen <achristensen>
Status: NEW ---    
Severity: Normal CC: barraclough, benjamin
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch achristensen: review-

Description Alex Christensen 2015-04-03 16:46:28 PDT
The DFA nodes are currently compiled in the order they are added, which provides pretty-good memory locality compared to random ordering, but we could do better.
Comment 1 Alex Christensen 2015-04-03 16:52:42 PDT
Created attachment 250107 [details]
Patch
Comment 2 Geoffrey Garen 2015-04-06 11:13:50 PDT
Comment on attachment 250107 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=250107&action=review

> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:202
> +static Vector<unsigned> zeroes(size_t count)
> +{
> +    Vector<unsigned> vector;
> +    vector.reserveInitialCapacity(count);
> +    for (unsigned i = 0; i < count; ++i)
> +        vector.append(0);
> +    return vector;
> +}

Wat. Plz use Vector::fill, which uses memset, which uses SIMD, which makes me happy.

> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:240
> +static Vector<unsigned> incomingTransitions(const DFA& dfa)
> +{
> +    Vector<unsigned> transitionsIn = zeroes(dfa.size());
> +    for (unsigned i = 0; i < dfa.size(); ++i) {
> +        for (unsigned transition : dfa.nodeAt(i).transitions.values())
> +            transitionsIn[transition]++;
> +    }
> +    return transitionsIn;
> +}
> +
> +static Vector<unsigned> outgoingTransitions(const DFA& dfa)
> +{
> +    Vector<unsigned> transitionsOut;
> +    transitionsOut.reserveInitialCapacity(dfa.size());
> +    for (unsigned i = 0; i < dfa.size(); ++i)
> +        transitionsOut.append(dfa.nodeAt(i).transitions.size());
> +    return transitionsOut;
> +}

Can we compute these in the same loop, to avoid looping twice? Or is this loop really inexpensive?

> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:279
> +        // We should tune this objective function before landing, but this gives a slight improvement.

How much of an improvement is this?
Comment 3 Alex Christensen 2015-04-06 17:11:50 PDT
Comment on attachment 250107 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=250107&action=review

We should look into this and measure different objective functions' effects on memory usage after minimizing the size of the DFA bytecode.

>> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:202
>> +}
> 
> Wat. Plz use Vector::fill, which uses memset, which uses SIMD, which makes me happy.

I knew there should be something in Vector to do this.

>> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:240
>> +}
> 
> Can we compute these in the same loop, to avoid looping twice? Or is this loop really inexpensive?

Unless we do things terribly wrong, most of the time compiling is spent in NFAToDFA::convert.  This is really inexpensive, but we could do both in the same loop and organize the functions differently.

>> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:279
>> +        // We should tune this objective function before landing, but this gives a slight improvement.
> 
> How much of an improvement is this?

Not as much as I want.