Bug 68092

Summary: xssauditor - truncate inline snippets at a reasonable length before comparison
Product: WebKit Reporter: Thomas Sepez <tsepez>
Component: WebCore Misc.Assignee: Thomas Sepez <tsepez>
Status: RESOLVED FIXED    
Severity: Minor CC: abarth, dbates, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
Proposed patch plus new test case
abarth: review+, tsepez: commit-queue-
Patch, testcase, plus don't treat non-dangerous attributes as JS. none

Thomas Sepez
Reported 2011-09-14 11:09:45 PDT
Work postponed from https://bugs.webkit.org/show_bug.cgi?id=27895. Truncating the snipped will cause a problem if we truncate in the middle of a %-escape sequence. Maybe we should canonicalize before trimming the snippet.
Attachments
Proposed patch plus new test case (8.53 KB, patch)
2011-09-15 17:33 PDT, Thomas Sepez
abarth: review+
tsepez: commit-queue-
Patch, testcase, plus don't treat non-dangerous attributes as JS. (10.57 KB, patch)
2011-09-16 11:12 PDT, Thomas Sepez
no flags
Thomas Sepez
Comment 1 2011-09-14 17:30:53 PDT
alterative approach is to walk backwards from the end while the character is a hex digit, then check if we're preceded by % or %u.
Adam Barth
Comment 2 2011-09-14 17:32:12 PDT
What about the multiple encoding case?
Thomas Sepez
Comment 3 2011-09-15 15:35:06 PDT
Yup. Let's try re-ordering, then.
Thomas Sepez
Comment 4 2011-09-15 17:33:20 PDT
Created attachment 107570 [details] Proposed patch plus new test case Proposed patch plus test case. I manually tuned the length and confirmed the test case fails if we move the truncation prior to the decoding. The test case is kind of devious in that I wanted the alert to fire if the xss auditor didn't catch the issue, but without the ability to introduce strings or regexps via punctuation, I settled for a numeric expression, exploiting the dual nature of the %-sign -- an escape for URL characters versus a modulo operation in JS. Full tests still running on my box, hence no commit-queue "?" just yet. But please review. Thanks heaps.
Adam Barth
Comment 5 2011-09-15 17:39:52 PDT
Comment on attachment 107570 [details] Proposed patch plus new test case View in context: https://bugs.webkit.org/attachment.cgi?id=107570&action=review > Source/WebCore/html/parser/XSSAuditor.cpp:512 > - if ((position = snippet.find("=")) != notFound > - && (position = snippet.find(isNotHTMLSpace, position + 1)) != notFound > - && (position = snippet.find(isTerminatingCharacter, isHTMLQuote(snippet[position]) ? position + 1 : position)) != notFound) { > - snippet.truncate(position); > + if ((position = decodedSnippet.find("=")) != notFound > + && (position = decodedSnippet.find(isNotHTMLSpace, position + 1)) != notFound > + && (position = decodedSnippet.find(isTerminatingCharacter, isHTMLQuote(decodedSnippet[position]) ? position + 1 : position)) != notFound) { > + decodedSnippet.truncate(position); I'm slightly confused. Are URLs going through this path too? > Source/WebCore/html/parser/XSSAuditor.cpp:-516 > - ASSERT(!snippet.isEmpty()); Maybe we should assert the fullyDecoding the decodedSnippet doesn't change it?
Thomas Sepez
Comment 6 2011-09-16 09:40:47 PDT
Comment on attachment 107570 [details] Proposed patch plus new test case View in context: https://bugs.webkit.org/attachment.cgi?id=107570&action=review >> Source/WebCore/html/parser/XSSAuditor.cpp:512 >> + decodedSnippet.truncate(position); > > I'm slightly confused. Are URLs going through this path too? URLs (as in the request paramaters and/post body) being matched against don't go through this path. They go through fullyDecodeString as part of xssauditor::init(). URLs (as in <object src="URL"> go through this path, but the check for == "javascript:" is made against the parsed form of the attribute, not the snippet. Snippet is just used to see if injected. >> Source/WebCore/html/parser/XSSAuditor.cpp:-516 > > Maybe we should assert the fullyDecoding the decodedSnippet doesn't change it? The refactoring first moved the responsibility for decoding to each of the three callers of isContainedInrequest(). They could each assert this. The refactoring then removed the responsibility from two of the three callers by putting it into the middle of (now renamed) decodedSnippetForAttribute. It could assert this, too. There isn't any decoding going on in this function any longer, the sole remaining check is isEmpty(), so all the decoding transformations take place outside this.
Thomas Sepez
Comment 7 2011-09-16 09:41:24 PDT
Tests finished locally without any new noise.
Adam Barth
Comment 8 2011-09-16 10:43:05 PDT
Right, I'm saying that it could have: ASSERT(decodedSnippet == fullyDecodeString(decodedSnippet)); to make sure all the callers have already decoded the string. The reason I asked about whether URLs go through this path is whether we're going to get too many false positives from something like the following: <object data="http://example.com/foo/bar.swf"></object> which will generate the snippet: data="http:/ which doesn't seem specific enough to avoid false positives. We probably want this to be the snippet: data="http://example.com
Thomas Sepez
Comment 9 2011-09-16 10:49:03 PDT
(In reply to comment #8) > Right, I'm saying that it could have: > > ASSERT(decodedSnippet == fullyDecodeString(decodedSnippet)); > > to make sure all the callers have already decoded the string. > Ah, gotcha. Seems reasonable. Will do. > The reason I asked about whether URLs go through this path is whether we're going to get too many false positives from something like the following: > > <object data="http://example.com/foo/bar.swf"></object> > > which will generate the snippet: > > data="http:/ > > which doesn't seem specific enough to avoid false positives. We probably want this to be the snippet: > > data="http://example.com think you're right. that would have broken on previous commit. Need to avoid the comment-truncating path on the non-dangerous attributes. Will fix.
Adam Barth
Comment 10 2011-09-16 10:53:16 PDT
Thanks Tom.
Thomas Sepez
Comment 11 2011-09-16 10:54:18 PDT
(In reply to comment #9) > (In reply to comment #8) > > Right, I'm saying that it could have: > > > > ASSERT(decodedSnippet == fullyDecodeString(decodedSnippet)); > > > > to make sure all the callers have already decoded the string. > > > > Ah, gotcha. Seems reasonable. Will do. > Actually, don't think we can make this assertion. FullyDecoceString decodes in a loop then canonicalizes. So if your string was %[high char]23, FDS (despite the name) returns %23. No matter, what we want is %23 in this case because the intervening high char has meaning. Just not consistent meaning across charsets.
Adam Barth
Comment 12 2011-09-16 11:04:01 PDT
Comment on attachment 107570 [details] Proposed patch plus new test case You're totally right. We could grep for %, but that's probably not worth it.
Thomas Sepez
Comment 13 2011-09-16 11:12:10 PDT
Created attachment 107686 [details] Patch, testcase, plus don't treat non-dangerous attributes as JS.
Thomas Sepez
Comment 14 2011-09-16 11:35:40 PDT
Comment on attachment 107686 [details] Patch, testcase, plus don't treat non-dangerous attributes as JS. Please review revised.
Adam Barth
Comment 15 2011-09-16 11:43:02 PDT
Comment on attachment 107686 [details] Patch, testcase, plus don't treat non-dangerous attributes as JS. This looks great. I think this fixes another subtle bug when the characters we were looking to truncate at were URL-encoded.
WebKit Review Bot
Comment 16 2011-09-16 22:15:13 PDT
Comment on attachment 107686 [details] Patch, testcase, plus don't treat non-dangerous attributes as JS. Clearing flags on attachment: 107686 Committed r95366: <http://trac.webkit.org/changeset/95366>
WebKit Review Bot
Comment 17 2011-09-16 22:15:18 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.