Bug 70095

Summary: Reject invalid MIME type strings for a file selection dialog parameter
Product: WebKit Reporter: Kent Tamura <tkent>
Component: FormsAssignee: Kent Tamura <tkent>
Status: RESOLVED FIXED    
Severity: Normal CC: fishd, joepeck
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 69598    
Bug Blocks: 70003    
Attachments:
Description Flags
Patch
none
Patch for landing
none
Patch for landing 2
none
Patch 2 darin: review+

Kent Tamura
Reported 2011-10-14 01:42:08 PDT
FileChooserSettings::acceptMIMETypes contains MIME type strings in wrong format. We had better drop invalid MIME type strings.
Attachments
Patch (4.20 KB, patch)
2011-10-14 01:48 PDT, Kent Tamura
no flags
Patch for landing (5.08 KB, patch)
2011-10-14 02:33 PDT, Kent Tamura
no flags
Patch for landing 2 (5.75 KB, patch)
2011-10-14 03:23 PDT, Kent Tamura
no flags
Patch 2 (2.57 KB, patch)
2011-10-18 01:12 PDT, Kent Tamura
darin: review+
Kent Tamura
Comment 1 2011-10-14 01:48:36 PDT
Darin Fisher (:fishd, Google)
Comment 2 2011-10-14 01:56:21 PDT
Comment on attachment 110981 [details] Patch Nice and simple! I'm glad you were able to reuse some existing code for this.
Kent Tamura
Comment 3 2011-10-14 01:59:26 PDT
Comment on attachment 110981 [details] Patch Thank you for midnight quick review!
Early Warning System Bot
Comment 4 2011-10-14 02:10:39 PDT
Kent Tamura
Comment 5 2011-10-14 02:33:16 PDT
Created attachment 110985 [details] Patch for landing
Kent Tamura
Comment 6 2011-10-14 03:23:40 PDT
Created attachment 110990 [details] Patch for landing 2 Windows build fix
Joseph Pecoraro
Comment 7 2011-10-14 11:28:44 PDT
RFC 2616 HTTP/1.1 Media Types Section and relevant parts of the grammar: http://tools.ietf.org/html/rfc2616#section-3.7 SP = <US-ASCII SP, space (32)> HT = <US-ASCII HT, horizontal-tab (9)> CTL = <any US-ASCII control character (octets 0 - 31) and DEL (127)> token = 1*<any CHAR except CTLs or separators> separators = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" | "\" | <"> | "/" | "[" | "]" | "?" | "=" | "{" | "}" | SP | HT media-type = type "/" subtype *( ";" parameter ) type = token subtype = token It looks like isTokenCharacter may be a little generous in validating a MIME type: bool ContentTypeParser::isTokenCharacter(UChar c) { return isASCII(c) && c > ' ' && c != '"' && c != '(' && c != ')' && c != ',' && c != '/' && (c < ':' || c > '@') && (c < '[' || c > ']'); } It looks to me like this would allow some non-token characters, like: '{' (separator 0x7B) '}' (separator 0x7D) DEL (CTL 0x7F) But I don't know how important this would be to fix. And maybe I'm actually missing something and this is okay.
Daniel Bates
Comment 8 2011-10-16 19:55:04 PDT
Comment on attachment 110990 [details] Patch for landing 2 Attachment 110990 [details] did not pass mac-ews (mac): Output: http://queues.webkit.org/results/10076816
Kent Tamura
Comment 9 2011-10-17 23:01:04 PDT
(In reply to comment #7) > RFC 2616 HTTP/1.1 Media Types Section and relevant parts of the grammar: > http://tools.ietf.org/html/rfc2616#section-3.7 > > SP = <US-ASCII SP, space (32)> > HT = <US-ASCII HT, horizontal-tab (9)> > CTL = <any US-ASCII control character (octets 0 - 31) and DEL (127)> > token = 1*<any CHAR except CTLs or separators> > separators = "(" | ")" | "<" | ">" | "@" > | "," | ";" | ":" | "\" | <"> > | "/" | "[" | "]" | "?" | "=" > | "{" | "}" | SP | HT > > media-type = type "/" subtype *( ";" parameter ) > type = token > subtype = token > > It looks like isTokenCharacter may be a little generous in validating a MIME type: > > bool ContentTypeParser::isTokenCharacter(UChar c) > { > return isASCII(c) && c > ' ' && c != '"' && c != '(' && c != ')' && c != ',' && c != '/' && (c < ':' || c > '@') && (c < '[' || c > ']'); > } > > It looks to me like this would allow some non-token characters, like: > > '{' (separator 0x7B) > '}' (separator 0x7D) > DEL (CTL 0x7F) > > But I don't know how important this would be to fix. And maybe I'm > actually missing something and this is okay. It seems RFC 2045 and RFC 2616 are not compatible. From RFC 2045: token := 1*<any (US-ASCII) CHAR except SPACE, CTLs, or tspecials> tspecials := "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "\" / <"> "/" / "[" / "]" / "?" / "=" ; Must be in quoted-string, ; to use within parameter values I'll update the patch so that it doesn't use ContentTypeParser.
Kent Tamura
Comment 10 2011-10-18 01:12:23 PDT
Created attachment 111406 [details] Patch 2 Do not use ContentTypeParser
Darin Adler
Comment 11 2011-10-18 09:56:28 PDT
Comment on attachment 111406 [details] Patch 2 View in context: https://bugs.webkit.org/attachment.cgi?id=111406&action=review > Source/WebCore/html/HTMLInputElement.cpp:1310 > +static bool isRFC2616TokenCharacter(UChar ch) Since this function is used only in one place, it would be better to mark it inline for compilers that aren’t as aggressive about inlining such things. > Source/WebCore/html/HTMLInputElement.cpp:1312 > + return isASCII(ch) && ch > ' ' && ch != '"' && ch != '(' && ch != ')' && ch != ',' && ch != '/' && (ch < ':' || ch > '@') && (ch < '[' || ch > ']') && ch != '{' && ch != '}' && ch != 0x7f; If there is a speed issue at all, a 128-byte array is a much faster way to implement this after the isASCII test. > Source/WebCore/html/HTMLInputElement.cpp:1334 > + if (j != slashPosition && !isRFC2616TokenCharacter(trimmedMimeType[j])) { You would get slightly faster performance if you reversed the && here.
Joseph Pecoraro
Comment 12 2011-10-18 10:18:31 PDT
Comment on attachment 111406 [details] Patch 2 I also think it would be nice to put the bulk of the HTMLInputElement::acceptMIMETypes changes into a new function, isValidMIMEType(), but that is totally up to you. I think it would be much easier to read, and someone later on might look for MIME type validation and find the function.
Kent Tamura
Comment 13 2011-10-19 19:08:24 PDT
Comment on attachment 111406 [details] Patch 2 View in context: https://bugs.webkit.org/attachment.cgi?id=111406&action=review Darrin, Joseph, thank you for the comments. I'll follow your comments except the following one, and land the patch. >> Source/WebCore/html/HTMLInputElement.cpp:1312 >> + return isASCII(ch) && ch > ' ' && ch != '"' && ch != '(' && ch != ')' && ch != ',' && ch != '/' && (ch < ':' || ch > '@') && (ch < '[' || ch > ']') && ch != '{' && ch != '}' && ch != 0x7f; > > If there is a speed issue at all, a 128-byte array is a much faster way to implement this after the isASCII test. acceptMIMETypes() is called once only when a file chooser dialog is opened. I think the current code is enough.
Kent Tamura
Comment 14 2011-10-19 19:12:04 PDT
Note You need to log in before you can comment on or make changes to this bug.