WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
NEW
172748
Consider blocking requests to HTTP(S) URLs that contain both `\n` and `<` characters.
https://bugs.webkit.org/show_bug.cgi?id=172748
Summary
Consider blocking requests to HTTP(S) URLs that contain both `\n` and `<` cha...
Mike West
Reported
2017-05-31 06:29:01 PDT
In the hopes of mitigating one form of dangling-markup-based exfiltration, Blink plans to block requests whose URLs contained both removable whitespace (`\n`, `\r`, `\t`) _and_ raw less-than (`<`) characters.
https://github.com/whatwg/fetch/issues/546
lays out the strategy and justification in more detail, proposed patches to URL and Fetch are up for review at
https://github.com/whatwg/url/pull/284
and
https://github.com/whatwg/fetch/pull/519
respectively, and Blink's "Intent to Remove" might be helpful:
https://groups.google.com/a/chromium.org/d/msg/blink-dev/KaA_YNOlTPk/VmmoV88xBgAJ
. CCing achristensen@ who's had helpful comments on the URL patch, though I don't think they're in favor of the exact implementation strategy outlined there. :) WDYT?
Attachments
Add attachment
proposed patch, testcase, etc.
Alex Christensen
Comment 1
2017-05-31 10:57:21 PDT
As outlined in
https://github.com/whatwg/url/pull/284
I am very opposed to this approach to mitigating the problem. Please don't do this in Chromium or the specifications.
Mike West
Comment 2
2017-05-31 22:57:21 PDT
Hey Alex! My understanding from your comments in the patch against URL (particularly
https://github.com/whatwg/url/pull/284#issuecomment-304087641
) is that you're not opposed to the behavior, but opposed to doing it by patching URL as opposed to HTML. Is that not accurate?
Alex Christensen
Comment 3
2019-09-13 08:57:20 PDT
URLs are used in a lot of places that aren't vulnerable to dangling markup attacks, so it definitely shouldn't go in the URL parser or specification. HTML is a more appropriate place because you're trying to avoid URLs that look like HTML, and URLs should not need to know anything about HTML. That said, I'm worried about compatibility. I'm under the impression that hand written URLs sometimes contain tabs, newlines, < and > for good reasons, but I have no data to back that up.
Mike West
Comment 4
2019-10-08 11:50:54 PDT
> URLs are used in a lot of places that aren't vulnerable to dangling markup attacks, > so it definitely shouldn't go in the URL parser or specification. HTML is a more > appropriate place because you're trying to avoid URLs that look like HTML, and URLs > should not need to know anything about HTML.
It's totally possible to implement this outside the URL parser. In Chromium, it's implemented as a flag that the URL parser sets during parsing (
https://cs.chromium.org/chromium/src/url/url_canon_etc.cc?rcl=2bd9bea1c6b9ace95707a0e8715f40793c9dc909&l=26
). We're scanning the URL anyway at that point to remove whitespace, and scanning the string prior to canonicalizing it turned out to show up in benchmarks. There is likely a clever way to avoid that performance impact, but it's what Chromium is doing today. From a spec perspective, I'd be fine with this all living in HTML, with the caveat that it seems like a large amount of work to go through that spec to find all the places where URLs could be parsed and wire them up to some parsing proxy. I don't have time right now to do that work. :(
> That said, I'm worried about compatibility. I'm under the impression that hand > written URLs sometimes contain tabs, newlines, < and > for good reasons, but I > have no data to back that up.
FWIW, Chrome has been shipping this behavior since 2017.
Simon Pieters (:zcorpan)
Comment 5
2024-05-30 04:26:16 PDT
The current proposal is
https://github.com/whatwg/html/pull/10022
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug