WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
142799
content extensions should have special path for global selectors
https://bugs.webkit.org/show_bug.cgi?id=142799
Summary
content extensions should have special path for global selectors
Alex Christensen
Reported
2015-03-17 15:47:58 PDT
Lots of selectors will apply to every page, so we should have a special path to compile these once and use them automatically.
Attachments
Patch
(15.64 KB, patch)
2015-03-17 15:54 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(9.65 KB, patch)
2015-03-17 16:42 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(5.54 KB, patch)
2015-03-17 17:44 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(20.26 KB, patch)
2015-03-18 11:30 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(34.79 KB, patch)
2015-03-18 18:30 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(30.79 KB, patch)
2015-03-18 19:26 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Show Obsolete
(5)
View All
Add attachment
proposed patch, testcase, etc.
Alex Christensen
Comment 1
2015-03-17 15:54:31 PDT
Created
attachment 248880
[details]
Patch
Brady Eidson
Comment 2
2015-03-17 16:09:02 PDT
Comment on
attachment 248880
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=248880&action=review
> Source/WebCore/contentextensions/CompiledContentExtension.cpp:50 > + const SerializedActionByte* begin = actions(); > + unsigned length = actionsLength(); > + RELEASE_ASSERT(length >= sizeof(unsigned)); > + unsigned selectorCount = *reinterpret_cast<const unsigned*>(begin); > + Vector<String> selectors; > + selectors.reserveInitialCapacity(selectorCount); > + unsigned position = sizeof(unsigned); > + for (unsigned i = 0; i < selectorCount; i++) > + selectors.append(deserializeSelector(begin, length, position)); > + return selectors;
Could use more paragraphing (empty lines between obviously connected chunks of code) - This is dense to read through.
> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:60 > + unsigned stringStartIndex = location + sizeof(unsigned) + sizeof(bool); > + RELEASE_ASSERT(actionsLength >= stringStartIndex); > + unsigned selectorLength = *reinterpret_cast<const unsigned*>(&actions[location]); > + bool wideCharacters = actions[location + sizeof(ActionType) + sizeof(unsigned)]; > + if (wideCharacters) { > + RELEASE_ASSERT(actionsLength >= stringStartIndex + selectorLength * sizeof(UChar)); > + location += sizeof(unsigned) + sizeof(bool) + selectorLength * sizeof(UChar); > + return String(reinterpret_cast<const UChar*>(&actions[stringStartIndex]), selectorLength); > + } > + RELEASE_ASSERT(actionsLength >= stringStartIndex + selectorLength * sizeof(LChar)); > + location += sizeof(unsigned) + sizeof(bool) + selectorLength * sizeof(LChar); > + return String(reinterpret_cast<const LChar*>(&actions[stringStartIndex]), selectorLength);
Ditto
> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:79 > + unsigned selectorLength = selector.length(); > + actions.resize(actions.size() + sizeof(unsigned)); > + *reinterpret_cast<unsigned*>(&actions[actions.size() - sizeof(unsigned)]) = selectorLength; > + bool wideCharacters = !selector.is8Bit(); > + actions.append(wideCharacters); > + if (wideCharacters) { > + for (unsigned i = 0; i < selectorLength; i++) { > + actions.resize(actions.size() + sizeof(UChar)); > + *reinterpret_cast<UChar*>(&actions[actions.size() - sizeof(UChar)]) = selector[i]; > + } > + } else { > + for (unsigned i = 0; i < selectorLength; i++) > + actions.append(selector[i]); > + } > +}
Etc... etc...
> LayoutTests/http/tests/contentextensions/css-display-none.html.json:34 > + { > + "action": { > + "type": "css-display-none", > + "selector": ".always_hidden" > + }, > }
This patch adds a new option to the json file, which is an action without a trigger. But what if the json author writes the action with a trigger that always triggers? i.e. - url-filter: ".*" ? Shouldn't those go to the global selector collection as well? And if they do, do we even need the triggerlesss actions at all?
Alex Christensen
Comment 3
2015-03-17 16:16:56 PDT
(In reply to
comment #2
)
> But what if the json author writes the action with a trigger that always > triggers? i.e. - url-filter: ".*" ?
Ben, correct me if I'm wrong, but those would be actions on the root of the DFA, right?
> Shouldn't those go to the global selector collection as well?
Yes.
> And if they do, do we even need the triggerlesss actions at all?
Probably not. Just have .* mean triggerless and not change the parser. That's probably a cleaner solution.
Brady Eidson
Comment 4
2015-03-17 16:28:48 PDT
Comment on
attachment 248880
[details]
Patch Based on agreement that things should work a little differently from this, r-
Alex Christensen
Comment 5
2015-03-17 16:42:37 PDT
Created
attachment 248885
[details]
Patch
Alex Christensen
Comment 6
2015-03-17 17:20:41 PDT
Comment on
attachment 248885
[details]
Patch A new byte code is not necessary for this. We should just add a loop at the beginning of the interpreter. There are many optimizations we could do related to detecting if a regex matches everything.
Alex Christensen
Comment 7
2015-03-17 17:21:46 PDT
And ignore-previous-rules could cause problems. I think we should start by only allowing rules that match everything at the beginning, so that ignore-previous-rules would simply disable use of the whole compiled stylesheet.
Alex Christensen
Comment 8
2015-03-17 17:44:02 PDT
Created
attachment 248891
[details]
Patch
Brady Eidson
Comment 9
2015-03-18 08:38:59 PDT
Comment on
attachment 248891
[details]
Patch I don't see how the DFA code change above can possibly cause the layout test changes in this patch...?
Alex Christensen
Comment 10
2015-03-18 11:30:20 PDT
Created
attachment 248945
[details]
Patch
Brady Eidson
Comment 11
2015-03-18 15:42:55 PDT
Comment on
attachment 248945
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=248945&action=review
> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:127 > + WTFLogAlways("Trigger matching everything found not at beginning. This may cause incorrect behavior with ignore-previous-rules");
Is it enough to simply log an error for bad outside data? In general we understand that outside data cannot be trusted and might be malformed, and therefore we should handle it correctly.
> Source/WebCore/contentextensions/URLFilterParser.cpp:451 > + if (matchesEverything) > + fail(ASCIILiteral("Matches everything."));
It's unclear why "matches everything" is an error. Also, representing and error that you reference elsewhere (up above) by a non-shared ASCII literal string seems like a maintenance nightmare.
> Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:350 > + testPatternError("a*b?.*.?[a-z]?[a-z]*", "Matches everything.");
And is this relevant to the whole "matches everything" string up above?
Alex Christensen
Comment 12
2015-03-18 15:57:27 PDT
Comment on
attachment 248945
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=248945&action=review
>> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:127 >> + WTFLogAlways("Trigger matching everything found not at beginning. This may cause incorrect behavior with ignore-previous-rules"); > > Is it enough to simply log an error for bad outside data? In general we understand that outside data cannot be trusted and might be malformed, and therefore we should handle it correctly.
I'm not sure. I think we should fail completely, but Ben thinks we should succeed, but inform developers about semi-bad data like this.
>> Source/WebCore/contentextensions/URLFilterParser.cpp:451 >> + fail(ASCIILiteral("Matches everything.")); > > It's unclear why "matches everything" is an error. > > Also, representing and error that you reference elsewhere (up above) by a non-shared ASCII literal string seems like a maintenance nightmare.
It's not an error. If a regex matches everything we should not compile it in the NFA because we will put it directly in the root of the DFA later. I agree, failure cases should be an enum. There are a lot of them, and that might be a switch for another patch, though.
>> Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:350 >> + testPatternError("a*b?.*.?[a-z]?[a-z]*", "Matches everything."); > > And is this relevant to the whole "matches everything" string up above?
Yes. This is the failure string.
Brady Eidson
Comment 13
2015-03-18 16:07:54 PDT
(In reply to
comment #12
)
> Comment on
attachment 248945
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=248945&action=review
> > >> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:127 > >> + WTFLogAlways("Trigger matching everything found not at beginning. This may cause incorrect behavior with ignore-previous-rules"); > > > > Is it enough to simply log an error for bad outside data? In general we understand that outside data cannot be trusted and might be malformed, and therefore we should handle it correctly. > > I'm not sure. I think we should fail completely, but Ben thinks we should > succeed, but inform developers about semi-bad data like this.
I don't understand why this is not a case that we can make work correctly? But I would rather see failure with a log message than see incorrect success with an error message.
> > >> Source/WebCore/contentextensions/URLFilterParser.cpp:451 > >> + fail(ASCIILiteral("Matches everything.")); > > > > It's unclear why "matches everything" is an error. > > > > Also, representing and error that you reference elsewhere (up above) by a non-shared ASCII literal string seems like a maintenance nightmare. > > It's not an error. If a regex matches everything we should not compile it > in the NFA because we will put it directly in the root of the DFA later. > > I agree, failure cases should be an enum. There are a lot of them, and that > might be a switch for another patch, though.
Can we not add new ones in the meantime? Especially since it is apparently just to support a test...?
> > >> Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:350 > >> + testPatternError("a*b?.*.?[a-z]?[a-z]*", "Matches everything."); > > > > And is this relevant to the whole "matches everything" string up above? > > Yes. This is the failure string.
Having a string in a test hard coded in the code itself just to support the test is highly irregular. Please don't do that?
Alex Christensen
Comment 14
2015-03-18 16:16:59 PDT
(In reply to
comment #13
)
> (In reply to
comment #12
) > > Comment on
attachment 248945
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=248945&action=review
> > > > >> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:127 > > >> + WTFLogAlways("Trigger matching everything found not at beginning. This may cause incorrect behavior with ignore-previous-rules"); > > > > > > Is it enough to simply log an error for bad outside data? In general we understand that outside data cannot be trusted and might be malformed, and therefore we should handle it correctly. > > > > I'm not sure. I think we should fail completely, but Ben thinks we should > > succeed, but inform developers about semi-bad data like this. > > I don't understand why this is not a case that we can make work correctly?
If there is ignore-previous-rules between rules that match anything and we compiled a stylesheet of everything that matches everything, then we would not be able to use the compiled stylesheet or we would have incorrect behavior.
> But I would rather see failure with a log message than see incorrect success > with an error message.
I agree. I think it should fail completely.
> > > > > >> Source/WebCore/contentextensions/URLFilterParser.cpp:451 > > >> + fail(ASCIILiteral("Matches everything.")); > > > > > > It's unclear why "matches everything" is an error. > > > > > > Also, representing and error that you reference elsewhere (up above) by a non-shared ASCII literal string seems like a maintenance nightmare. > > > > It's not an error. If a regex matches everything we should not compile it > > in the NFA because we will put it directly in the root of the DFA later. > > > > I agree, failure cases should be an enum. There are a lot of them, and that > > might be a switch for another patch, though. > > Can we not add new ones in the meantime? Especially since it is apparently > just to support a test...?
It's not just to support a test. Regular expressions that match everything are not compiled through the NFA because that would just add a lot of transitions, slow down compiling, and end up having all those actions in the root anyways. We are shortcutting that process to speed up compiling. I added the test to make sure we are taking the shortcut at the correct times.
> > > > > >> Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:350 > > >> + testPatternError("a*b?.*.?[a-z]?[a-z]*", "Matches everything."); > > > > > > And is this relevant to the whole "matches everything" string up above? > > > > Yes. This is the failure string. > > Having a string in a test hard coded in the code itself just to support the > test is highly irregular. Please don't do that?
I'll replace fail(const String&) with fail(enum FailureReason)
Brady Eidson
Comment 15
2015-03-18 16:29:47 PDT
(In reply to
comment #14
)
> (In reply to
comment #13
) > > (In reply to
comment #12
) > > > Comment on
attachment 248945
[details]
> > > Patch > > > > > > View in context: > > >
https://bugs.webkit.org/attachment.cgi?id=248945&action=review
> > > > > > >> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:127 > > > >> + WTFLogAlways("Trigger matching everything found not at beginning. This may cause incorrect behavior with ignore-previous-rules"); > > > > > > > > Is it enough to simply log an error for bad outside data? In general we understand that outside data cannot be trusted and might be malformed, and therefore we should handle it correctly. > > > > > > I'm not sure. I think we should fail completely, but Ben thinks we should > > > succeed, but inform developers about semi-bad data like this. > > > > I don't understand why this is not a case that we can make work correctly? > If there is ignore-previous-rules between rules that match anything and we > compiled a stylesheet of everything that matches everything, then we would > not be able to use the compiled stylesheet or we would have incorrect > behavior.
The ContentExtensionCompiler is NOT compiling a stylesheet. It's simply collecting a set of selectors that will be used in the match-all case. So if the match-all case says "do these 20 selectors, then ignore previous rules, then do these other 30 selectors"... we could simply toss the first 20 and use only the following 30. The stylesheet will be put together later in the backend. Does this make sense?
Alex Christensen
Comment 16
2015-03-18 18:30:47 PDT
Created
attachment 249006
[details]
Patch
Brady Eidson
Comment 17
2015-03-18 18:48:55 PDT
Comment on
attachment 249006
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=249006&action=review
> Source/WebCore/contentextensions/CompiledContentExtension.cpp:41 > + // FIXME: This needs to be cleaned up, optimized, and moved to DFABytecodeInterpreter.
Why a FIXME, and not doing it now? Especially since a class friend'ing is required as-is
> Source/WebCore/contentextensions/CompiledContentExtension.cpp:58 > + // FIXME: Make sure only css selectors can get here.
Why a FIXME, and not doing it in this patch?
Brady Eidson
Comment 18
2015-03-18 18:49:26 PDT
Also, does this patch actually affect the CSS display none test that is changed?
Alex Christensen
Comment 19
2015-03-18 19:26:16 PDT
Created
attachment 249008
[details]
Patch
Alex Christensen
Comment 20
2015-03-18 19:29:45 PDT
(In reply to
comment #17
)
> Comment on
attachment 249006
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=249006&action=review
> > > Source/WebCore/contentextensions/CompiledContentExtension.cpp:41 > > + // FIXME: This needs to be cleaned up, optimized, and moved to DFABytecodeInterpreter.
Done.
> > Why a FIXME, and not doing it now? Especially since a class friend'ing is > required as-is > > > Source/WebCore/contentextensions/CompiledContentExtension.cpp:58 > > + // FIXME: Make sure only css selectors can get here. > > Why a FIXME, and not doing it in this patch?
Nothing terrible would happen if a non-css action matched everything. We could add a test for that later if we want to do something special. (In reply to
comment #17
)
> Also, does this patch actually affect the CSS display none test that is changed?
I removed the changes to LayoutTests. They were not really related to this patch.
WebKit Commit Bot
Comment 21
2015-03-18 21:57:57 PDT
Comment on
attachment 249008
[details]
Patch Clearing flags on attachment: 249008 Committed
r181726
: <
http://trac.webkit.org/changeset/181726
>
WebKit Commit Bot
Comment 22
2015-03-18 21:58:03 PDT
All reviewed patches have been landed. Closing bug.
Alex Christensen
Comment 23
2015-03-19 08:58:01 PDT
***
Bug 142298
has been marked as a duplicate of this bug. ***
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug