Bug 68094 - xssauditor - script block ending in comment can bypass auditor.
Summary: xssauditor - script block ending in comment can bypass auditor.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Minor
Assignee: Nobody
URL:
Keywords: XSSAuditor
Depends on:
Blocks: 66579
  Show dependency treegraph
 
Reported: 2011-09-14 11:21 PDT by Thomas Sepez
Modified: 2011-09-22 19:27 PDT (History)
3 users (show)

See Also:


Attachments
Patch + test cases. (25.71 KB, patch)
2011-09-20 14:30 PDT, Thomas Sepez
no flags Details | Formatted Diff | Diff
Patch - unrelated files picked up by mistake. (23.81 KB, patch)
2011-09-20 14:36 PDT, Thomas Sepez
abarth: review-
Details | Formatted Diff | Diff
Simpler patch (11.24 KB, patch)
2011-09-21 14:14 PDT, Thomas Sepez
abarth: review-
Details | Formatted Diff | Diff
Patch + changes for previous comments. (11.87 KB, patch)
2011-09-21 17:52 PDT, Thomas Sepez
tsepez: review-
tsepez: commit-queue-
Details | Formatted Diff | Diff
Revised patch closing hole with truncation case. (12.24 KB, patch)
2011-09-22 10:17 PDT, Thomas Sepez
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Sepez 2011-09-14 11:21:15 PDT
Second part of https://bugs.webkit.org/show_bug.cgi?id=27895.

This is for the situation where the page contains a naturally-occurring close script tag on the same line as the injecton, e.g.

<img src="$q"><script>foo()</script>

where $q is the injected vector from the URL, typically ?q="><script>alert(0)//, yeilding

<img src=""><script>alert(0)//"><script>foo()</script>

and the comment removes the unparseable fragment from consideration by JS. xssauditor looking at the entire script block will fail to match because the "><script>foo() fragment isn't in the URL.
Unlike the inline case, entities aren't an issue here, but because of false positives, stopping on the first comment isn't likely to be acceptable.  Many script blocks will begin with a comment, eg.

<script>
// Copyright (c) 2001 by some guy.
Comment 1 Thomas Sepez 2011-09-14 14:14:44 PDT
Copy some of dbates's additional details from the original bug:

We may want to file a separate bug to fix comments within script tags such as:

<script>alert(1);/*<!--
Example: http://good.webblaze.org/dbates/xsstest-script-comment.php?q=%3Cscript%3Ealert%28/XSS/%29%3B/*%3C%21--

<script>alert(1);/*
Example: http://good.webblaze.org/dbates/xsstest-script-comment.php?q=%3Cscript%3Ealert%28/XSS/%29%3B/*

<script>alert(1);//
Example: http://good.webblaze.org/dbates/xsstest-script-comment.php?q=%3Cscript%3Ealert%28/XSS/%29%3B//

<script>alert(1);<!--
Example: http://good.webblaze.org/dbates/xsstest-script-comment.php?q=%3Cscript%3Ealert%28/XSS/%29%3B%3C%21--
Comment 2 Thomas Sepez 2011-09-14 17:53:09 PDT
Even worse case is when attacker controls two injections on the page, and can escape the intervening page content with /* in one injection and */ in the other.  Not sure if we want to tackle this case [chromium bug http://code.google.com/p/chromium/issues/detail?id=96616 closed as wontfix for this case].

Consider a page

print '<img src="' . $q1 . '"><foo>bar<baz/><img src="' . $q2 . '">';

?q1="><script>/*&other=clutter&q2=bam*//**/alert(0)//foo%0a</script>

One rule I can think of that might catch this is something like this is:
- verify a <script> tag in the URL, else do nothing.
- without further decoding, parse the script block in the page, skipping over leading comments, and find the first actual code fragment, stopping at the next place a comment is introduced.
- reduce/canonicalize the fragment
- check if that fragment is present anywhere in the reduced/canonicalized URL.

so:

<img src=""><script>/*"><foo>bar<baz/><img src="bam*//**/alert(0)//foo
</script>

- verify <script> in URL.
- skip first comment /*"><foo>bar<baz/><img src="bam*/
- skip second comment /**/
- extract code fragment alert(0) up to next comment.
- decode and then find in URL.
Comment 3 Adam Barth 2011-09-15 11:07:54 PDT
Multiple injections are explicitly out of scope.
Comment 4 Thomas Sepez 2011-09-20 14:30:46 PDT
Created attachment 108052 [details]
Patch + test cases.  

Rough cut at code to lex out the relevant portions of the script block for comparison.
Comment 5 Thomas Sepez 2011-09-20 14:36:28 PDT
Created attachment 108054 [details]
Patch - unrelated files picked up by mistake.
Comment 6 Adam Barth 2011-09-20 14:38:14 PDT
Comment on attachment 108054 [details]
Patch - unrelated files picked up by mistake.

Woah there.  I don't think we should be lexing JavaScript.  That's way too complicated.  Isn't there something simpler we can do?
Comment 7 Thomas Sepez 2011-09-21 10:48:29 PDT
I still think we can't stop at the first comment as we do for inline handlers, since it is so common for legitimate script blocks to start with comments; doing so would cause any <script> in the URL to match, making it too noisy.  And having found the code, comparing it and the script tag in two comparisons against the URL seems to do a good job here (as it turns out, it happens to catch the case in the "wontfix bug" but that was not the intent).

I'm thinking that it is OK to eliminate the quote balancing when looking for the end of the code fragment.  I wanted to avoid stopping on something like "http://", but since legitimate code will have something before the string literal, we probably have enough signal even if we do stop there.  If you throw that requirement out, then it becomes more practical to do this via a series of finds() since you're not worried about the context. Simpler, in some sense, but less elegant and slower.  I'd expect that most of the code in here would reduce to state machines if we wanted to grease it ...
Comment 8 Adam Barth 2011-09-21 11:20:53 PDT
For the "starts with a comment" case, we could notice that there's only whitespace in the snippet and keep going.

Having a whole JavaScript lexer here is way more complexity that we want.
Comment 9 Thomas Sepez 2011-09-21 11:23:55 PDT
Well, as you can see, its not a full lexer, it only understands quotes and comment.  But point taken.
Comment 10 Adam Barth 2011-09-21 12:42:52 PDT
The main reason to grab more of the snippet is to reduce false positives.  In the case where you have a string, we've probably already got a good enough snippet to avoid false positives:

<script>
var x = "http://

It seems fine to terminate the snippet there. It's not super important that it's not really the beginning of a comment.
Comment 11 Thomas Sepez 2011-09-21 14:14:48 PDT
Created attachment 108232 [details]
Simpler patch
Comment 12 Adam Barth 2011-09-21 14:42:10 PDT
Comment on attachment 108232 [details]
Simpler patch

View in context: https://bugs.webkit.org/attachment.cgi?id=108232&action=review

> Source/WebCore/html/parser/XSSAuditor.cpp:311
> +        int start = 0;
> +        int end = token.endIndex() - token.startIndex();

snippetForRange doesn't use the same indicies as token.startIndex?  I guess this is the same way the code used to work.  (Also, these indicies should probably all be size_t, but that's a yak for another day.)

> Source/WebCore/html/parser/XSSAuditor.cpp:313
> +        if (isContainedInRequest(fullyDecodeString(snippet, decoder))) {

I see, we're not requiring the two pieces to be contiguous in the request.  Is that intentional?   I guess the code fragment isn't necessarily at the beginning of the script.

> Source/WebCore/html/parser/XSSAuditor.cpp:542
> +String XSSAuditor::extractCodeFragment(const String& string)

If this is specific to JavaScript, maybe we should call it snippetForJavaScript ?

> Source/WebCore/html/parser/XSSAuditor.cpp:548
> +    size_t foundPosition;

I would initialize this scalar with notFound, but that's just because I'm paranoid.

> Source/WebCore/html/parser/XSSAuditor.cpp:561
> +    if (string.substring(startPosition, 4) == "<!--") {
> +        while (startPosition < endPosition && string[startPosition] != '\r' && string[startPosition] != '\n')
> +            startPosition++;
> +        while (startPosition < endPosition && isHTMLSpace(string[startPosition]))
> +            startPosition++;
> +    }

I'm not sure why this is a special case.  It seems like we could just fold this into the following loop.  <!-- is really the same thing as //.

> Source/WebCore/html/parser/XSSAuditor.cpp:566
> +        if (string.substring(startPosition, 2) == "//") {

Here too we're doing a malloc for each character we scan.  This might be worth adding to WTFString.

> Source/WebCore/html/parser/XSSAuditor.cpp:568
> +            while (startPosition < endPosition && string[startPosition] != '\r' && string[startPosition] != '\n')
> +                startPosition++;

Maybe have an inline function to test for isLinebreak ?

> Source/WebCore/html/parser/XSSAuditor.cpp:584
> +    foundPosition = startPosition + kMaximumFragmentLength;

So, kMaximumFragmentLength isn't really the maximum.  It's some sort of target for the maximum.

> Source/WebCore/html/parser/XSSAuditor.cpp:595
> +    // Stop at next comment.
> +    if ((foundPosition = string.find("//", startPosition, endPosition - startPosition)) != notFound)
> +        endPosition = foundPosition;
> +    if ((foundPosition = string.find("/*", startPosition, endPosition - startPosition)) != notFound)
> +        endPosition = foundPosition;

We should probably stop at the first <!-- too.  In JavaScript <!-- is just like //.
Comment 13 Adam Barth 2011-09-21 14:43:35 PDT
Comment on attachment 108232 [details]
Simpler patch

This looks great.  I'd like to see another iteration of extractCodeFragment.
Comment 14 Thomas Sepez 2011-09-21 14:57:13 PDT
Comment on attachment 108232 [details]
Simpler patch

View in context: https://bugs.webkit.org/attachment.cgi?id=108232&action=review

>> Source/WebCore/html/parser/XSSAuditor.cpp:311
>> +        int end = token.endIndex() - token.startIndex();
> 
> snippetForRange doesn't use the same indicies as token.startIndex?  I guess this is the same way the code used to work.  (Also, these indicies should probably all be size_t, but that's a yak for another day.)

Will leave as is.

>> Source/WebCore/html/parser/XSSAuditor.cpp:313
>> +        if (isContainedInRequest(fullyDecodeString(snippet, decoder))) {
> 
> I see, we're not requiring the two pieces to be contiguous in the request.  Is that intentional?   I guess the code fragment isn't necessarily at the beginning of the script.

Yes, we need to deal with an injection like <script>// comment to trick you %0a alert(0) </script> now.  We won't include the comment in the snippet we extract.

>> Source/WebCore/html/parser/XSSAuditor.cpp:542
>> +String XSSAuditor::extractCodeFragment(const String& string)
> 
> If this is specific to JavaScript, maybe we should call it snippetForJavaScript ?

Sure.  Will rename.

>> Source/WebCore/html/parser/XSSAuditor.cpp:548
>> +    size_t foundPosition;
> 
> I would initialize this scalar with notFound, but that's just because I'm paranoid.

Sure.  would like an uninit reference to lead to string[notFound] taking an exception.

>> Source/WebCore/html/parser/XSSAuditor.cpp:561
>> +    }
> 
> I'm not sure why this is a special case.  It seems like we could just fold this into the following loop.  <!-- is really the same thing as //.

Ok.  My recollection of the spec was that the first instance of <!-- was a comment, but that subsequent ones were <!-- , as in if (a<!--b) {}  but I may be wrong here.

>> Source/WebCore/html/parser/XSSAuditor.cpp:566
>> +        if (string.substring(startPosition, 2) == "//") {
> 
> Here too we're doing a malloc for each character we scan.  This might be worth adding to WTFString.

Or just  sp + 1 < ep && string[sp] == '/' && string[sp+1] == '/'

>> Source/WebCore/html/parser/XSSAuditor.cpp:568
>> +                startPosition++;
> 
> Maybe have an inline function to test for isLinebreak ?

Sure.

>> Source/WebCore/html/parser/XSSAuditor.cpp:584
>> +    foundPosition = startPosition + kMaximumFragmentLength;
> 
> So, kMaximumFragmentLength isn't really the maximum.  It's some sort of target for the maximum.

Ok, I'll call it kMaximumFragmentLengthTarget.

>> Source/WebCore/html/parser/XSSAuditor.cpp:595
>> +        endPosition = foundPosition;
> 
> We should probably stop at the first <!-- too.  In JavaScript <!-- is just like //.

Yes, see above.
Comment 15 Adam Barth 2011-09-21 14:59:55 PDT
You're probably right w.r.t. <!-- only being a comment the first time.  Maybe run a quick test?
Comment 16 Thomas Sepez 2011-09-21 15:15:08 PDT
<script> <!-- this is a comment                                                 
alert(0) <!-- this may not be a comment                                         
// -->
</script>
<script>a=1;alert("case 1: " + (false<!--a));</script>                          
<script>a=1;alert("case 2: " + (false<! --a));</script>                         
<script>a=1;alert("case 3: " + (false< !--a));</script>                         

gives an alert: 0, syntax error, alert: case2:true, and alert: case3 true.  This implies that <!-- is always a comment.  At least on webkit and FF (6.0.2).

So will fold all those into the loop.
Comment 17 Adam Barth 2011-09-21 15:16:16 PDT
> So will fold all those into the loop.

Great.  Thanks.
Comment 18 Thomas Sepez 2011-09-21 17:00:13 PDT
Comment on attachment 108232 [details]
Simpler patch

View in context: https://bugs.webkit.org/attachment.cgi?id=108232&action=review

>>> Source/WebCore/html/parser/XSSAuditor.cpp:595
>>> +        endPosition = foundPosition;
>> 
>> We should probably stop at the first <!-- too.  In JavaScript <!-- is just like //.
> 
> Yes, see above.

Argh, as it turns out the 3-arg versions of these calls don't do what I expected them to do.  Will fix this too.
Comment 19 Thomas Sepez 2011-09-21 17:52:33 PDT
Created attachment 108261 [details]
Patch + changes for previous comments.
Comment 20 Adam Barth 2011-09-21 18:03:46 PDT
Comment on attachment 108261 [details]
Patch + changes for previous comments.

View in context: https://bugs.webkit.org/attachment.cgi?id=108261&action=review

Fantastic.

> Source/WebCore/html/parser/XSSAuditor.cpp:561
> +

Looks like there's an extra line here.
Comment 21 Thomas Sepez 2011-09-21 18:38:00 PDT
Comment on attachment 108261 [details]
Patch + changes for previous comments.

It dawns on me :There's a subtle hole I introduced when I merged the two loops together in the end.  I'll retry tomorrow.
Comment 22 Thomas Sepez 2011-09-22 10:17:53 PDT
Created attachment 108353 [details]
Revised patch closing hole with truncation case.
Comment 23 Adam Barth 2011-09-22 10:28:27 PDT
Comment on attachment 108261 [details]
Patch + changes for previous comments.

View in context: https://bugs.webkit.org/attachment.cgi?id=108261&action=review

> Source/WebCore/html/parser/XSSAuditor.cpp:600
> +            while (foundPosition < endPosition && !isHTMLSpace(string[foundPosition]))
> +                foundPosition++;

I see.  This could walk over a carefully-placed comment that was just after position kMaximumFragmentLengthTarget.  Good catch sir.
Comment 24 WebKit Review Bot 2011-09-22 19:27:40 PDT
Comment on attachment 108353 [details]
Revised patch closing hole with truncation case.

Clearing flags on attachment: 108353

Committed r95774: <http://trac.webkit.org/changeset/95774>
Comment 25 WebKit Review Bot 2011-09-22 19:27:46 PDT
All reviewed patches have been landed.  Closing bug.