67134 – XSSAuditor bypass under big5 encoding (also sjis).

RESOLVED FIXED67134

XSSAuditor bypass under big5 encoding (also sjis).

https://bugs.webkit.org/show_bug.cgi?id=67134

Summary XSSAuditor bypass under big5 encoding (also sjis).

Thomas Sepez

Reported 2011-08-29 10:13:41 PDT

XSSAuditor can be tricked under those encodings where not all bytes of a multibyte character are greater than 0x80. Big5 is one such example where trailing bytes need only be greater than 0x40. third_party/WebKit/Source/WebCore/platform/KURLGoogle.cpp: decodeURLEscapeSequences() isn't character-set aware. So given a single big5 character input like 0xc8 0x5f, this gets transformed into two characters (code points): c8 and 5f, due to the (correct) recovery when trying to interpret this as utf8. Later, the XSSAuditor removes all non-ascii code points in both the page contents and the URL, so as to be immune to these kinds of misinterpretations when comparing the page contents against the URL. This works well in the cases where a misinterpretation introduces two high-valued characters instead of one high-valued character, as typically happens when all the bytes in the multibyte sequence are greater than 0x80. But in the example page, it removes the (one) high-valued character, but in the URL removes the first misinterpreted byte but leaves the 5f. Not having such a character in the page, the match fails. Correct fix this is to make decodeURLEscapeSequences() encoding-aware (as is the non-G version in KURL.ccp), but this has its own issues as detailed in the comments in KURLGoogle.cpp).

Attachments
Proposed tests. (4.65 KB, patch) 2011-09-07 15:17 PDT, Thomas Sepez	dbates: review- dbates: commit-queue-	Details Formatted Diff Diff
patch using perl for testcase. (3.99 KB, patch) 2011-09-09 11:56 PDT, Thomas Sepez	no flags	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Thomas Sepez

Comment 1 2011-09-07 11:25:42 PDT

I've written a test for this, but it manifests only under --chromium. There's a happy co-incidence that generally masks this on the KURL.cpp side: namely what happens when a bad sequence is encountered. As indicated above, the KURLGoogle path is something like: %c9%5f => 00c9 005f => c3 89 5f => [ c389 big5 char ] 5f => [ hi unicode char ] 5f => 5f Where each byte of the invalid sequence is passed through as the equivlaent codepoint (after first arrow above). The KURL path is something like: %c9%5f => fffd 005f => ef bf bd 5f => [ efbf big 5 char] [ bd5f big5 char] => [hi unicode][hi unicode]=> empty string where each byte of the invalid sequence is replaced by the unicode replacement character U+fffd. The co-incidence is that during the KURL case, the replacement codepoint turns into an odd number of utf8 bytes (after the second arrow) but the KURLGoogle replacement code point turns into an even number of utf8 bytes. Hence the byte for the following 5f gets eaten as part of a second big5 character in one case, but gets passed through as-is in the other. There will be variations involving other charsets that don't hit this happy co-incidence, but this explains why the test case I was working on passed under the normal build.

Thomas Sepez

Comment 2 2011-09-07 15:13:16 PDT

Have constructed an example in shift_jis that will flunk against the present code but is expected to pass once the issue is fixed. We believe the fix to https://bugs.webkit.org/show_bug.cgi?id=66588 as proposed by Daniel Bates will cover this case. I'm just uploading the two prospective tests while waiting for that bug to close.

Thomas Sepez

Comment 3 2011-09-07 15:17:05 PDT

Created attachment 106655 [details] Proposed tests. Just to get code up for review even though we can't land at present.

Thomas Sepez

Comment 4 2011-09-09 10:13:35 PDT

These now passed for me on mac under both normal and --chromium cases following the fix to 66588.

Daniel Bates

Comment 5 2011-09-09 10:35:13 PDT

Comment on attachment 106655 [details] Proposed tests. Can we write these tests using some kind of Perl script that echos back the input data instead of hardcoding the response page? Adding such a script or modifying an existing script will make it easier to extend our test coverage for such encoding issues. If it's not possible to write such a script please elaborate on why. Take a look at the Encode module <http://perldoc.perl.org/Encode.html> for encoding text.

Thomas Sepez

Comment 6 2011-09-09 10:47:19 PDT

I'm just concerned about the correct functioning of the test on any given platform depending on the correctness of the perl implementation on that platform. Aren't there already a bunch of these xssauditor tests that flunk only on certain ports? Perhaps because the cgi implementation does some unexpected transformation or the perhaps the encoding doesn't come back as expected. I guess I want a test that says when these bytes go up, and these bytes come back, this is what happens.

Thomas Sepez

Comment 7 2011-09-09 11:56:37 PDT

Created attachment 106895 [details] patch using perl for testcase. Updated echo-intertag.pl to allow specification of charset in content-type header. We don't decode the strings because we want the exact byte sequence returned as passed in escaped in the q parameter. This appears to be the case when CGI is invoked without the utf8 option, treating the parameter as if it were a binary string.

Daniel Bates

Comment 8 2011-09-09 15:03:07 PDT

Comment on attachment 106895 [details] patch using perl for testcase. Thanks Tom. This looks good to me.

WebKit Review Bot

Comment 9 2011-09-09 16:04:34 PDT

Comment on attachment 106895 [details] patch using perl for testcase. Clearing flags on attachment: 106895 Committed r94884: <http://trac.webkit.org/changeset/94884>

WebKit Review Bot

Comment 10 2011-09-09 16:04:39 PDT

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component WebCore Misc.

Assignee

Nobody

Reported

2011-08-29 10:13 PDT

Modified

2011-09-09 16:04 PDT History

CC List

4 users Show

URL

Keywords XSSAuditor

Depends on

66588

Blocks

66579

Dependencies

tree graph