Bug 147099

Summary: [Content Extensions] Use a jump table when consecutive transitions have different targets
Product: WebKit Reporter: Benjamin Poulain <benjamin>
Component: New BugsAssignee: Benjamin Poulain <benjamin>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, commit-queue, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch none

Description Benjamin Poulain 2015-07-19 18:23:39 PDT
[Content Extensions] Use a jump table when consecutive transitions have different targets
Comment 1 Benjamin Poulain 2015-07-19 18:26:45 PDT
Created attachment 257069 [details]
Patch
Comment 2 Alex Christensen 2015-07-20 11:27:14 PDT
*** Bug 142514 has been marked as a duplicate of this bug. ***
Comment 3 Alex Christensen 2015-07-20 11:31:23 PDT
Comment on attachment 257069 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=257069&action=review

> Source/WebCore/contentextensions/DFABytecode.h:60
> +    CheckValueRangeCaseInsensitive = 0x4,

I think it's strange to have a case insensitive jump table.

> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:213
> +    jumpTable.caseSensitive = ranges[lastRange].caseSensitive;

Why is the case sensitivity of the entire table determined by the case sensitivity of the last range?  I don't think this is correct.

> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:353
> +    ASSERT_WITH_MESSAGE(jumpTable.max < 128, "The DFA engine only supports the ASCII alphabet.");

This will make it harder to make this support unicode.
Comment 4 Alex Christensen 2015-07-20 23:25:02 PDT
Comment on attachment 257069 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=257069&action=review

Looks good, but I'm removing the r? because this needs rebasing and addressing a few things, mostly stylistic.  Consecutive unique destinations is a good way to determine if using branch tables is a good idea, but it could miss cases like if there are only two common destinations.  There's no perfect algorithm here.

> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:212
> +    jumpTable.max = ranges[lastRange].min;

min? I guess they're equal, but this looks confusing.

> Source/WebCore/contentextensions/DFABytecodeCompiler.h:65
> +    struct JumpTable {

I would add a destructor with ASSERT(min + destinations.size() == max + 1); and ASSERT(destinations.size() > 1);

> Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp:135
> +    if (urlIndexIsAfterEndOfString)
> +        return false;

I think this should be in DFABytecodeInterpreter::interpret to be consistent.  Then this function would return void.

> Source/WebCore/contentextensions/DFABytecodeInterpreter.h:61
>      void interpretAppendAction(unsigned& programCounter, Actions&, bool ifDomain);
>      void interpretTestFlagsAndAppendAction(unsigned& programCounter, uint16_t flags, Actions&, bool ifDomain);
> +
> +    template<bool caseSensitive>
> +    bool interpetJumpTable(const char* url, uint32_t& urlIndex, uint32_t& programCounter, bool& urlIndexIsAfterEndOfString);

I just used an inline function with a bool parameter, I think it's probably better to use a template for all of these.

> Source/WebKit2/UIProcess/API/APIUserContentExtensionStore.h:54
> -    const static uint32_t CurrentContentExtensionFileVersion = 5;
> +    const static uint32_t CurrentContentExtensionFileVersion = 6;

This needs to be rebased :)
Comment 5 Benjamin Poulain 2015-07-21 02:34:24 PDT
Created attachment 257176 [details]
Patch
Comment 6 Alex Christensen 2015-07-21 11:00:45 PDT
Comment on attachment 257176 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=257176&action=review

> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:209
> +    jumpTable.caseSensitive = ranges[lastRange].caseSensitive;

I'm still not convinced this is correct.  What if you have a range that is case insensitive and a range that is case sensitive?  In that case, you don't want the whole jumpTable to be case sensitive, right?
Comment 7 Benjamin Poulain 2015-07-21 15:20:10 PDT
(In reply to comment #6)
> Comment on attachment 257176 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=257176&action=review
> 
> > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:209
> > +    jumpTable.caseSensitive = ranges[lastRange].caseSensitive;
> 
> I'm still not convinced this is correct.  What if you have a range that is
> case insensitive and a range that is case sensitive?  In that case, you
> don't want the whole jumpTable to be case sensitive, right?

Yep, indeed.


When the case sensitivity changes between two ranges, a new JumpTable is created.

This condition "|| baseRange->caseSensitive != range.caseSensitive" when finding ranges causes us to enter extractJumpTable() when the case-sensitivity changes.
Comment 8 Radar WebKit Bug Importer 2015-07-21 15:28:50 PDT
<rdar://problem/21929554>
Comment 9 WebKit Commit Bot 2015-07-21 16:46:19 PDT
Comment on attachment 257176 [details]
Patch

Clearing flags on attachment: 257176

Committed r187137: <http://trac.webkit.org/changeset/187137>
Comment 10 WebKit Commit Bot 2015-07-21 16:46:23 PDT
All reviewed patches have been landed.  Closing bug.