Bug 174966

Summary: Add WKContentRuleList action to add a header to a request
Product: WebKit Reporter: Alex Christensen <achristensen>
Component: New BugsAssignee: Alex Christensen <achristensen>
Status: NEW    
Severity: Normal CC: benjamin, buildbot, cdumez, darin, dbates, japhet, sam
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
buildbot: commit-queue-
Archive of layout-test-results from ews115 for mac-elcapitan none

Alex Christensen
Reported 2017-07-28 20:04:19 PDT
Add WKContentRuleList action to add a header to a request
Attachments
Patch (32.84 KB, patch)
2017-07-28 20:05 PDT, Alex Christensen
no flags
Patch (47.96 KB, patch)
2017-07-31 18:29 PDT, Alex Christensen
no flags
Patch (55.84 KB, patch)
2017-08-01 14:51 PDT, Alex Christensen
buildbot: commit-queue-
Archive of layout-test-results from ews115 for mac-elcapitan (2.03 MB, application/zip)
2017-08-02 04:14 PDT, Build Bot
no flags
Alex Christensen
Comment 1 2017-07-28 20:05:34 PDT
Alex Christensen
Comment 2 2017-07-28 20:07:02 PDT
Comment on attachment 316695 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=316695&action=review Tests need improvement, but the content of the patch is ready for review. > Tools/TestWebKitAPI/Tests/WebKit2Cocoa/AddHeader.mm:45 > + // FIXME: Test subresources, resources that don't match, etc. Also actually check for the existence of the test headers.
Sam Weinig
Comment 3 2017-07-29 20:06:46 PDT
I think we should make it really clear at the API layer that the intent is to add a rule list that contains rules that can modify requests. One way to do this would be to add a -[WKUserContentController addContentRuleList:options:] where you can pass in an option stating you want request / response modifying actions (and the existing -[WKUserContentController addContentRuleList:] would have a default options value that matches what we have today). Then, if it's easy to check that none of the actions are request modifying (which I think it should be, but could be wrong), we could throw, or maybe have an NSError out parameter. If it's not easy to check, we could pass the options down, and check when running the actions.
Sam Weinig
Comment 4 2017-07-29 20:07:14 PDT
Maybe instead of options, 'capabilities'? Not sure.
Darin Adler
Comment 5 2017-07-30 09:37:40 PDT
Comment on attachment 316695 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=316695&action=review > Source/WebCore/contentextensions/ContentExtensionActions.h:48 > + AddHeader, Terminology here isn’t great. This adds a header field value, possibly creating a new header field or possible appending to an existing header field. The header is the name for the thing that contains all the header fields plus a few other things. > Source/WebCore/contentextensions/ContentExtensionActions.h:56 > + Vector<std::pair<String, String>> headersToAdd; headerFieldValuesToAdd maybe > Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:59 > + actions.grow(actions.size() + sizeof(uint32_t)); > + *reinterpret_cast<uint32_t*>(&actions[actions.size() - sizeof(uint32_t)]) = stringLength; Do we really have to do it like this with a reinterpret_cast? That’s not normally how we serialize multiple bytes elsewhere. > Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:63 > + actions.uncheckedAppend(wideCharacters); The use of uncheckedAppend here looks incorrect and possibly dangerous. I don’t see any code above reserving the capacity so this can be unchecked. > Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:68 > + actions.grow(actions.size() + sizeof(UChar) * stringLength); > + for (unsigned i = 0; i < stringLength; ++i) > + *reinterpret_cast<UChar*>(&actions[startIndex + i * sizeof(UChar)]) = string[i]; Same question. > Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:170 > + case ActionType::IgnorePreviousRules: > + case ActionType::BlockLoad: > + case ActionType::BlockCookies: > + case ActionType::MakeHTTPS: > + actions.append(static_cast<SerializedActionByte>(actionType)); > + break; > + case ActionType::AddHeader: > + serializeStringAction(actions, actionType, rule.action().stringArgument()); > + break; I would just have a serializeString function. The action byte could be appended here for all cases unconditionally instead of inside the serializeStringAction. I think it would be cleaner separate of responsibilities in the code. > Source/WebCore/contentextensions/ContentExtensionParser.cpp:248 > + && header.startsWith("X-"); // Let's just pretend RFC6648 never happened. Is this intentionally only checking for capital "X"? Or should it be using two startsWith calls or startsWithLettersIgnoringASCIICase? > Source/WebCore/contentextensions/ContentExtensionParser.cpp:294 > + return {Action(ActionType::AddHeader, headerString)}; WebKit style puts spaces inside the braces. But I see that this file is not following that style. I’d also suggest braces after Action rather than parentheses. > Source/WebCore/contentextensions/ContentExtensionRule.cpp:69 > - uint32_t headerLength = sizeof(ActionType) + sizeof(uint32_t) + sizeof(bool); > - uint32_t stringStartIndex = location + headerLength; > + uint32_t prefixLength = sizeof(ActionType) + sizeof(uint32_t) + sizeof(bool); > + uint32_t stringStartIndex = location + prefixLength; > RELEASE_ASSERT(actionsLength >= stringStartIndex); > - uint32_t selectorLength = *reinterpret_cast<const unsigned*>(&actions[location + sizeof(ActionType)]); > + uint32_t stringLength = *reinterpret_cast<const unsigned*>(&actions[location + sizeof(ActionType)]); > bool wideCharacters = actions[location + sizeof(ActionType) + sizeof(uint32_t)]; > > if (wideCharacters) { > - RELEASE_ASSERT(actionsLength >= stringStartIndex + selectorLength * sizeof(UChar)); > - return Action(ActionType::CSSDisplayNoneSelector, String(reinterpret_cast<const UChar*>(&actions[stringStartIndex]), selectorLength), location); > + RELEASE_ASSERT(actionsLength >= stringStartIndex + stringLength * sizeof(UChar)); > + return Action(actionType, String(reinterpret_cast<const UChar*>(&actions[stringStartIndex]), stringLength), location); > } > - RELEASE_ASSERT(actionsLength >= stringStartIndex + selectorLength * sizeof(LChar)); > - return Action(ActionType::CSSDisplayNoneSelector, String(reinterpret_cast<const LChar*>(&actions[stringStartIndex]), selectorLength), location); > + RELEASE_ASSERT(actionsLength >= stringStartIndex + stringLength * sizeof(LChar)); > + return Action(actionType, String(reinterpret_cast<const LChar*>(&actions[stringStartIndex]), stringLength), location); Seems to me this would be cleaner with a deserializeString function that parallels the function used to serialize a string. Also seems like we should just consider using UTF-8 to serialize instead of having a separate UTF-16 case and Latin-1 case. Generally annoying to have to have yet another set of code that knows how to serialize strings with all this type casting and math. It’s a task we do elsewhere and annoying we could not reuse any serialization framework we already have elsewhere in WebKit. > Source/WebCore/contentextensions/ContentExtensionRule.h:151 > + ASSERT(type != ActionType::CSSDisplayNoneSelector && type != ActionType::CSSDisplayNoneStyleSheet && type != ActionType::AddHeader); Generally we try to have three assertions instead of one with && in it, so we can see which one failed. > Source/WebCore/contentextensions/ContentExtensionsBackend.cpp:180 > + size_t colonLocation = header.find(':'); > + ASSERT(colonLocation != notFound); Why can we assert this? Is there code somewhere that validates it? Seems like we need to check this rather than asserting it somewhere, since the data is coming from outside WebKit. Is it OK for the header field name or header field value to be empty string? If not, then we should check and assert that too. What about whitespace? Is it allowed? Should it be stripped?
Alex Christensen
Comment 6 2017-07-31 14:00:20 PDT
Thanks, Darin! I took your feedback into account with an upcoming patch. Sam, I'm not sure why you want a separate API for each ability of content rule lists, but if there is such a need, let's do it in a followup patch. (In reply to Darin Adler from comment #5) > Comment on attachment 316695 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=316695&action=review > > > Source/WebCore/contentextensions/ContentExtensionActions.h:48 > > + AddHeader, > > Terminology here isn’t great. This adds a header field value, possibly > creating a new header field or possible appending to an existing header > field. The header is the name for the thing that contains all the header > fields plus a few other things. I'm changing it to add-header-field, AddHeaderField, etc. > > Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:59 > > + actions.grow(actions.size() + sizeof(uint32_t)); > > + *reinterpret_cast<uint32_t*>(&actions[actions.size() - sizeof(uint32_t)]) = stringLength; > Do we really have to do it like this with a reinterpret_cast? That’s not > normally how we serialize multiple bytes elsewhere. I could rewrite it to make the format not dependent on the endianness of the machine, but I'd like to not change this in this patch. > > Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:63 > > + actions.uncheckedAppend(wideCharacters); > > The use of uncheckedAppend here looks incorrect and possibly dangerous. I > don’t see any code above reserving the capacity so this can be unchecked. Woah, you're right. This was mistakenly left in from an earlier version of the patch. > Also seems like we should just consider using UTF-8 to serialize instead of > having a separate UTF-16 case and Latin-1 case. I also think that would be excessive change to the file format for this patch. > > Source/WebCore/contentextensions/ContentExtensionsBackend.cpp:180 > > + size_t colonLocation = header.find(':'); > > + ASSERT(colonLocation != notFound); > > Why can we assert this? Is there code somewhere that validates it? Seems > like we need to check this rather than asserting it somewhere, since the > data is coming from outside WebKit. The parser fails if there is no colon in the header field. > Is it OK for the header field name or header field value to be empty string? > If not, then we should check and assert that too. What about whitespace? Is > it allowed? Should it be stripped? Whitespace is allowed and untouched. I will disallow \r and \n because they have special meaning in HTTP syntax.
Alex Christensen
Comment 7 2017-07-31 18:29:12 PDT
Darin Adler
Comment 8 2017-07-31 23:44:53 PDT
(In reply to Alex Christensen from comment #6) > > Is it OK for the header field name or header field value to be empty string? > > If not, then we should check and assert that too. What about whitespace? Is > > it allowed? Should it be stripped? > > Whitespace is allowed and untouched. I will disallow \r and \n because they > have special meaning in HTTP syntax. That doesn’t seem quite right or sufficient. I was reading the HTTP RFC 7230, and these are not allowed to contain arbitrary data. The field-name has to be a token and the only legal token characters are digits, letters, and these 15 characters: "!#$%&'*+-.^_`|~". The field-value is preceded and followed by optional whitespace (spaces and tabs only). And there are many special rules about whitespace and what characters are allowed. There is a lot relevant in the "Field Parsing" section of the RFC. Maybe your plan of allowing any characters is OK, but I am not sure it is.
Alex Christensen
Comment 9 2017-08-01 09:08:04 PDT
HTTPHeaderMap does not currently have the validity checks from RFC 7230, but maybe it should. I'll add them to my header validity checker.
Alex Christensen
Comment 10 2017-08-01 14:51:48 PDT
Alex Christensen
Comment 11 2017-08-01 14:58:29 PDT
Note: the fetch spec apparently intentionallyish doesn't check validity of HTTP headers https://fetch.spec.whatwg.org/#concept-header-value If they change their mind, we could use this parser.
Darin Adler
Comment 12 2017-08-01 18:12:12 PDT
(In reply to Alex Christensen from comment #11) > Note: the fetch spec apparently intentionallyish doesn't check validity of > HTTP headers > https://fetch.spec.whatwg.org/#concept-header-value What I read there says something different from "doesn't check validity": 1) checks validity of HTTP header field names 2) strips leading and trailing spaces from potential values for HTTP header field values 3) forbids \0, \r, and \n in HTTP header field values Should we also do these things?
Build Bot
Comment 13 2017-08-02 04:14:57 PDT
Comment on attachment 316898 [details] Patch Attachment 316898 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/4239181 New failing tests: http/tests/security/contentSecurityPolicy/audio-redirect-allowed2.html
Build Bot
Comment 14 2017-08-02 04:14:59 PDT
Created attachment 316952 [details] Archive of layout-test-results from ews115 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews115 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Alex Christensen
Comment 15 2017-08-02 09:24:39 PDT
(In reply to Darin Adler from comment #12) > (In reply to Alex Christensen from comment #11) > > Note: the fetch spec apparently intentionallyish doesn't check validity of > > HTTP headers > > https://fetch.spec.whatwg.org/#concept-header-value > > What I read there says something different from "doesn't check validity": > > 1) checks validity of HTTP header field names We do this. > 2) strips leading and trailing spaces from potential values for HTTP header > field values Based on https://tools.ietf.org/html/rfc7230#section-3.2.3 I think we should strip whitespace before and after everything and replace it with a single space before. > 3) forbids \0, \r, and \n in HTTP header field values We do this. > > Should we also do these things? We will have a stricter check than the fetch spec.
Alex Christensen
Comment 16 2017-08-02 11:14:47 PDT
After speaking with Sam, I agree that we need to have SPI for restricting abilities to prevent content blockers that add a unique identifier to every request. I also think we should consider preventing adding headers to insecure requests to prevent eavesdroppers from stealing possibly sensitive information added to the HTTP header.
Note You need to log in before you can comment on or make changes to this bug.