Bug 34668 - WebKit seems willing to load URLs it considers "invalid"
Status: RESOLVED INVALID
Alias: None
Product: WebKit
Classification: Unclassified
Component: Platform
Version: 528+ (Nightly build)
Hardware: All All
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 37641
Reported: 2010-02-05 16:53 PST by Brett Wilson (Google)
Modified: 2023-05-22 03:44 PDT
Description Brett Wilson (Google) 2010-02-05 16:53:47 PST
This report is related to this bug in Chromium:
  http://code.google.com/p/chromium/issues/detail?id=160

The bug itself does not actually affect Safari, even though tracing through KURL shows that it clearly considers
  ed2k://|serverlist|http://www.gruk.org/server.met|/
an invalid URL because of the "|" in the host. KURL only bothers checking the host at all because it sees "://" and treats the URL as hierarchical.

Chromium has been very strict about never doing anything with invalid URLs, since crazily formatted URLs have been the source of a number of security bugs in different browsers in the past. We fail fast on any load (or other operation) if the valid bit isn't set on the URL. I think this is a generally good idea.
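
To make the fast-fail idea concrete, here is a minimal sketch in Python. It is not Chromium or WebKit code: ParsedURL, InvalidURLError and load are hypothetical names made up for this illustration, and the valid bit is assumed to have been set by whatever parser produced the URL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedURL:
    """Hypothetical stand-in for a parsed URL that carries a valid bit."""
    spec: str
    is_valid: bool


class InvalidURLError(ValueError):
    """Raised when an operation is attempted on a URL marked invalid."""


def load(url: ParsedURL) -> str:
    # Fast fail: check the valid bit before the URL can reach the
    # network stack or an external protocol handler.
    if not url.is_valid:
        raise InvalidURLError(f"refusing to load invalid URL: {url.spec!r}")
    # Placeholder for the real load; this sketch only reports the decision.
    return f"loading {url.spec}"
```

The point of putting the check at the single entry point is that no later code has to reason about what a malformed URL might do.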

This bug report boils down to: "Are you sure you want to be sending URLs you think are invalid to the network stack and external protocol handlers?" I can't see all the networking code and other stuff in Safari, so it's hard for me to evaluate all the details of how it works and how big a risk this is.



In this case, of course, you do want to send the URL to the external protocol handler, and failing to do so is what the Chromium bug is about in the first place. The only way to make this work is to do what Firefox does, which I've implemented for Chromium (but not yet checked in). Firefox keeps a whitelist of known hierarchical schemes, and any URL whose scheme is not on that list is treated as non-hierarchical, like "data". Chromium/Google-URL already has such a list containing a few schemes, and even KURL has one (consisting of just "http" and "https", which are hardcoded).

This results in less validation being done on the URL, which solves the problem with the eDonkey URLs and a few other types. Some minimal validation still occurs, however, and you can always rely on the isValid flag to test whether a URL might be dangerous.
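
The scheme-whitelist approach described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the Firefox, Google-URL or KURL implementation: the contents of HIERARCHICAL_SCHEMES, the set of rejected host characters, and the "minimal validation" rule (no control characters) are all assumptions chosen for the example, and classify is a hypothetical helper.

```python
import re
from typing import Tuple

# Assumed whitelist for illustration only; the real lists differ.
HIERARCHICAL_SCHEMES = frozenset({"http", "https", "ftp"})

# scheme ":" rest, with the scheme syntax from RFC 3986.
_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):(.*)$", re.DOTALL)

# Characters this sketch rejects in a host (not a full host grammar).
_BAD_HOST_CHARS = set('|<>"` \\^{}')


def classify(url: str) -> Tuple[str, bool]:
    """Return (kind, is_valid), where kind is 'hierarchical', 'opaque' or 'none'."""
    match = _SCHEME_RE.match(url)
    if match is None:
        return ("none", False)
    scheme, rest = match.group(1).lower(), match.group(2)

    if scheme not in HIERARCHICAL_SCHEMES:
        # Not whitelisted: treat like "data:" and never look for a host,
        # even if "://" is present. Only minimal validation is applied.
        has_control = any(ord(c) < 0x20 or ord(c) == 0x7F for c in rest)
        return ("opaque", not has_control)

    # Whitelisted: require an authority and validate the host strictly.
    if not rest.startswith("//"):
        return ("hierarchical", False)
    authority = re.split(r"[/?#]", rest[2:], maxsplit=1)[0]
    host = authority.rsplit("@", 1)[-1]  # drop any userinfo
    valid = bool(host) and not any(c in _BAD_HOST_CHARS for c in host)
    return ("hierarchical", valid)
```

With this sketch the eDonkey URL from the description is classified as opaque and valid, because "ed2k" is not on the list and the "|" characters are never interpreted as part of a host, while the same text after "http://" is still rejected as an invalid host.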
Comment 1 Anne van Kesteren 2023-05-22 03:44:11 PDT
The URL parser intentionally supports hierarchy. Not doing so would go against the standard.