Bug 70095 - Reject invalid MIME type strings for a file selection dialog parameter
Summary: Reject invalid MIME type strings for a file selection dialog parameter
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Forms (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Kent Tamura
URL:
Keywords:
Depends on: 69598
Blocks: 70003
  Show dependency treegraph
 
Reported: 2011-10-14 01:42 PDT by Kent Tamura
Modified: 2011-10-19 19:12 PDT (History)
2 users (show)

See Also:


Attachments
Patch (4.20 KB, patch)
2011-10-14 01:48 PDT, Kent Tamura
no flags Details | Formatted Diff | Diff
Patch for landing (5.08 KB, patch)
2011-10-14 02:33 PDT, Kent Tamura
no flags Details | Formatted Diff | Diff
Patch for landing 2 (5.75 KB, patch)
2011-10-14 03:23 PDT, Kent Tamura
no flags Details | Formatted Diff | Diff
Patch 2 (2.57 KB, patch)
2011-10-18 01:12 PDT, Kent Tamura
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Kent Tamura 2011-10-14 01:42:08 PDT
FileChooserSettings::acceptMIMETypes contains MIME type strings in wrong format.
We had better drop invalid MIME type strings.
Comment 1 Kent Tamura 2011-10-14 01:48:36 PDT
Created attachment 110981 [details]
Patch
Comment 2 Darin Fisher (:fishd, Google) 2011-10-14 01:56:21 PDT
Comment on attachment 110981 [details]
Patch

Nice and simple!  I'm glad you were able to reuse some existing code for this.
Comment 3 Kent Tamura 2011-10-14 01:59:26 PDT
Comment on attachment 110981 [details]
Patch

Thank you for midnight quick review!
Comment 4 Early Warning System Bot 2011-10-14 02:10:39 PDT
Comment on attachment 110981 [details]
Patch

Attachment 110981 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/10067454
Comment 5 Kent Tamura 2011-10-14 02:33:16 PDT
Created attachment 110985 [details]
Patch for landing
Comment 6 Kent Tamura 2011-10-14 03:23:40 PDT
Created attachment 110990 [details]
Patch for landing 2

Windows build fix
Comment 7 Joseph Pecoraro 2011-10-14 11:28:44 PDT
RFC 2616 HTTP/1.1 Media Types Section and relevant parts of the grammar:
http://tools.ietf.org/html/rfc2616#section-3.7

    SP             = <US-ASCII SP, space (32)>
    HT             = <US-ASCII HT, horizontal-tab (9)>
    CTL            = <any US-ASCII control character (octets 0 - 31) and DEL (127)>
    token          = 1*<any CHAR except CTLs or separators>
    separators     = "(" | ")" | "<" | ">" | "@"
                     | "," | ";" | ":" | "\" | <">
                     | "/" | "[" | "]" | "?" | "="
                     | "{" | "}" | SP | HT

    media-type     = type "/" subtype *( ";" parameter )
    type           = token
    subtype        = token

It looks like isTokenCharacter may be a little generous in validating a MIME type:

    bool ContentTypeParser::isTokenCharacter(UChar c)
    {
        return isASCII(c) && c > ' ' && c != '"' && c != '(' && c != ')' && c != ',' && c != '/' && (c < ':' || c > '@') && (c < '[' || c > ']');
    }

It looks to me like this would allow some non-token characters, like:

    '{'   (separator 0x7B)
    '}'   (separator 0x7D)
    DEL   (CTL 0x7F)

But I don't know how important this would be to fix. And maybe I'm
actually missing something and this is okay.
Comment 8 Daniel Bates 2011-10-16 19:55:04 PDT
Comment on attachment 110990 [details]
Patch for landing 2

Attachment 110990 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/10076816
Comment 9 Kent Tamura 2011-10-17 23:01:04 PDT
(In reply to comment #7)
> RFC 2616 HTTP/1.1 Media Types Section and relevant parts of the grammar:
> http://tools.ietf.org/html/rfc2616#section-3.7
> 
>     SP             = <US-ASCII SP, space (32)>
>     HT             = <US-ASCII HT, horizontal-tab (9)>
>     CTL            = <any US-ASCII control character (octets 0 - 31) and DEL (127)>
>     token          = 1*<any CHAR except CTLs or separators>
>     separators     = "(" | ")" | "<" | ">" | "@"
>                      | "," | ";" | ":" | "\" | <">
>                      | "/" | "[" | "]" | "?" | "="
>                      | "{" | "}" | SP | HT
> 
>     media-type     = type "/" subtype *( ";" parameter )
>     type           = token
>     subtype        = token
> 
> It looks like isTokenCharacter may be a little generous in validating a MIME type:
> 
>     bool ContentTypeParser::isTokenCharacter(UChar c)
>     {
>         return isASCII(c) && c > ' ' && c != '"' && c != '(' && c != ')' && c != ',' && c != '/' && (c < ':' || c > '@') && (c < '[' || c > ']');
>     }
> 
> It looks to me like this would allow some non-token characters, like:
> 
>     '{'   (separator 0x7B)
>     '}'   (separator 0x7D)
>     DEL   (CTL 0x7F)
> 
> But I don't know how important this would be to fix. And maybe I'm
> actually missing something and this is okay.

It seems RFC 2045 and RFC 2616 are not compatible.

From RFC 2045:

     token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
                 or tspecials>

     tspecials :=  "(" / ")" / "<" / ">" / "@" /
                   "," / ";" / ":" / "\" / <">
                   "/" / "[" / "]" / "?" / "="
                   ; Must be in quoted-string,
                   ; to use within parameter values

I'll update the patch so that it doesn't use ContentTypeParser.
Comment 10 Kent Tamura 2011-10-18 01:12:23 PDT
Created attachment 111406 [details]
Patch 2

Do not use ContentTypeParser
Comment 11 Darin Adler 2011-10-18 09:56:28 PDT
Comment on attachment 111406 [details]
Patch 2

View in context: https://bugs.webkit.org/attachment.cgi?id=111406&action=review

> Source/WebCore/html/HTMLInputElement.cpp:1310
> +static bool isRFC2616TokenCharacter(UChar ch)

Since this function is used only in one place, it would be better to mark it inline for compilers that aren’t as aggressive about inlining such things.

> Source/WebCore/html/HTMLInputElement.cpp:1312
> +    return isASCII(ch) && ch > ' ' && ch != '"' && ch != '(' && ch != ')' && ch != ',' && ch != '/' && (ch < ':' || ch > '@') && (ch < '[' || ch > ']') && ch != '{' && ch != '}' && ch != 0x7f;

If there is a speed issue at all, a 128-byte array is a much faster way to implement this after the isASCII test.

> Source/WebCore/html/HTMLInputElement.cpp:1334
> +            if (j != slashPosition && !isRFC2616TokenCharacter(trimmedMimeType[j])) {

You would get slightly faster performance if you reversed the && here.
Comment 12 Joseph Pecoraro 2011-10-18 10:18:31 PDT
Comment on attachment 111406 [details]
Patch 2

I also think it would be nice to put the bulk of the HTMLInputElement::acceptMIMETypes changes
into a new function, isValidMIMEType(), but that is totally up to you. I think it would be much
easier to read, and someone later on might look for MIME type validation and find the function.
Comment 13 Kent Tamura 2011-10-19 19:08:24 PDT
Comment on attachment 111406 [details]
Patch 2

View in context: https://bugs.webkit.org/attachment.cgi?id=111406&action=review

Darrin, Joseph, thank you for the comments.
I'll follow your comments except the following one, and land the patch.


>> Source/WebCore/html/HTMLInputElement.cpp:1312
>> +    return isASCII(ch) && ch > ' ' && ch != '"' && ch != '(' && ch != ')' && ch != ',' && ch != '/' && (ch < ':' || ch > '@') && (ch < '[' || ch > ']') && ch != '{' && ch != '}' && ch != 0x7f;
> 
> If there is a speed issue at all, a 128-byte array is a much faster way to implement this after the isASCII test.

acceptMIMETypes() is called once only when a file chooser dialog is opened.  I think the current code is enough.
Comment 14 Kent Tamura 2011-10-19 19:12:04 PDT
Landed: http://trac.webkit.org/changeset/97918