Bug 46968 - [Qt] QWebPage MIME type handling inconsistency with other web browsers
Summary: [Qt] QWebPage MIME type handling inconsistency with other web browsers
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Major
Assignee: Luiz Agostini
URL:
Keywords: Qt, QtTriaged
Depends on: 58848
Blocks: 45523
  Show dependency treegraph
 
Reported: 2010-10-01 01:49 PDT by Simon Hausmann
Modified: 2011-05-04 12:59 PDT (History)
16 users (show)

See Also:


Attachments
patch (39.73 KB, patch)
2011-03-30 18:32 PDT, Luiz Agostini
no flags Details | Formatted Diff | Diff
patch (42.37 KB, patch)
2011-03-31 10:14 PDT, Luiz Agostini
benjamin: review-
Details | Formatted Diff | Diff
patch (1.47 MB, patch)
2011-04-04 17:23 PDT, Luiz Agostini
no flags Details | Formatted Diff | Diff
patch (1.47 MB, patch)
2011-04-07 15:23 PDT, Luiz Agostini
no flags Details | Formatted Diff | Diff
patch (1.47 MB, patch)
2011-04-08 15:15 PDT, Luiz Agostini
no flags Details | Formatted Diff | Diff
patch (1.47 MB, patch)
2011-04-08 15:22 PDT, Luiz Agostini
no flags Details | Formatted Diff | Diff
patch (1.47 MB, patch)
2011-04-11 15:54 PDT, Luiz Agostini
no flags Details | Formatted Diff | Diff
patch (1.47 MB, patch)
2011-04-11 17:00 PDT, Luiz Agostini
kenneth: review+
kenneth: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Simon Hausmann 2010-10-01 01:49:21 PDT
Reported at http://bugreports.qt.nokia.com/browse/QTBUG-13323 against Qt 4.6.3:


It's not possible to load a page in QWebPage/QWebFrame that has no mime type. On other web browsers (Chrome, Firefox and IE), pages with no mime type are treated as text/html.
An example page can be found here http://ad.bannerconnect.net/st?ad_type=iframe&ad_size=160x600&section=845611 which has the response header (at time of writing)

Response header
Age:0
Cache-Control:no-store
Connection:close
Content-Length:4523
Date:Wed, 01 Sep 2010 01:43:25 GMT
Last-Modified:Wed, 01 Sep 2010 01:43:25 GMT
P3P:policyref="/w3c/p3p.xml", CP="NOI DSP COR NID CURa ADMa DEVa PSAa PSDa OUR BUS COM INT OTC PUR STA"
Pragma:no-cache
Server:YTS/1.18.4
I first thought I could override this by creating a Plugin that handles empty MIME type. It's not possible as all calls in FrameLoaderClientQt::canShowMIMEType(const String& MIMEType) in FrameLoaderClientQt.cpp first check if MIMEType.isEmpty() & return false.

FrameLoaderClientQt.cpp:508
bool FrameLoaderClientQt::canShowMIMEType(const String& MIMEType) const
{
    if (MIMETypeRegistry::isSupportedImageMIMEType(MIMEType)) <-- This checks if MIMEType.isEmpty() and returns false
        return true;

    if (MIMETypeRegistry::isSupportedNonImageMIMEType(MIMEType)) <-- This checks if MIMEType.isEmpty() and returns false
        return true;

    if (m_frame && m_frame->settings()  && m_frame->settings()->arePluginsEnabled()
        && PluginDatabase::installedPlugins()->isMIMETypeRegistered(MIMEType)) <-- This checks if MIMEType.isEmpty() and returns false
        return true;

    return false;
}
Comment 1 Robert Hogan 2010-10-31 06:19:21 PDT
The example loads fine in QtTestBrowser for me.
Comment 2 Robert Hogan 2010-11-05 15:26:02 PDT
Another site that reproduces this error is:

http://extranet.dpd.de/cgi-bin/delistrack?pknr=dsanndsa&submit=&lang=de

from https://bugs.webkit.org/show_bug.cgi?id=45523

HTTP/1.1 200 OK
Server: Sun-ONE-Web-Server/6.1
Date: Fri, 05 Nov 2010 22:19:05 GMT
<!doctype html public "-//w3c//dtd html 4.01 //en"   "http: //www.w3.org/TR/html4/strict.dtd">
Transfer-encoding: chunked
Connection: close

0bc5
<html><head><meta content="text/html; charset=utf-8" http-equiv="content-type"/><title>DPD Online - Tracking</title><link href="/css/common.css" rel="stylesheet" type="text/css"/><link href="/css/rms.css" rel="stylesheet" type="text/css"/><link href="/css/forms.css" rel="stylesheet" type="text/css"/><link href="/css/content.css" rel="stylesheet" type="text/css"/><link href="/css/navigation.css" rel="stylesheet" type="text/css"/><script type="text/javascript">function showPrintView(){var URL = window.location;URL +='&printView=0';window.open(URL,'Tracking','resizable=yes,scrollbars=yes,toolbar=yes,menubar=yes');}</script></head><body><div id="main" style="width:750px;padding-left:23px;"><div id="header" style="position: relative; left: 8px; width:742px;top:20px;"><div id="headLine" style="position: relative;left: 180px; top: 18px;width:200px;"><p><font color="#FFFFFF" size="6" weight="bold">Tracking</font></p></div><table border="0" cellpadding="0" cellspacing="0" ><tr><td><img alt="" id="header_corner_left" src="/images/header_corner_left.gif"><img alt="" id="header_corner_right" style="left:736px" src="/images/header_corner_right.gif"><img id="dpd_logo" src="/images/dpd_logo.gif" border="0"></td></tr></table></div><div id="spacer" style="height:50px;"></div><div id="contentFrame" width="100%"><table cellpadding="0" cellspacing="0">
<tr><td><img alt="" src="/images/content_corner_TL.gif" ></td><td><img alt="" src="/images/content_corner_TL2.gif" ></td><td class="content_border_top"><img alt="" border="0" height="1" src="/images/dummy.gif" width="645"></td><td><img alt="" src="/images/content_corner_TR.gif" ></td></tr><tr><td class="content_border_left"><img height="363" src="/images/content_border_left_grad.gif" width="3" alt="" ></td><td class="content_bg_inside" colspan="2"><div id="content" style="position:relative;padding-left:23px;padding-right:23px;padding-top:5px;"><div id="error" style="width:515px;"><div class="modul_head_hd">Ein Fehler ist aufgetreten</div><table border="0" cellpadding="0" cellspacing="0" style="width:515px"><tr class="modul_bg_blue1_wl"><td style="padding:15px;color:#FF0000"><b>Die eingegebene Nummer ist zu kurz!</b></td></tr></table></div></div></td><td class="content_border_right"><img alt="" height="363" src="/images/content_border_right_grad.gif" width="3" ></td></tr><tr><td class="content_border_left"><img alt="" height="30" src="/images/dummy.gif" width="1" ></td><td class="content_bg_inside" colspan="2"></td><td class="content_border_right"></td></tr><tr><td><img alt="" src="/images/content_corner_BL.gif" ></td><td class="content_border_bottom" colspan="2"><img alt="" height="1" src="/images/dummy.gif" width="1" ></td><td><img alt="" src="/images/content_corner_BR.gif" ></td></tr></table></div><div id="DPDfooter" ><br><a href="http://www.dpd.com/portal/DPD-Portal/Siteutilities/Imprint" style="text-decoration:none"><img src="/images/icon_arrow_red.gif"><b>&nbsp;&nbsp;Impressum</b></a><br><br><br><br></div></div></body></html>
0
Comment 3 Robert Hogan 2010-11-05 15:33:52 PDT
I think the proper solution is mime-type sniffing in QNAM.

The chromium implementation is well-documented and would be a good starting point:

http://src.chromium.org/viewvc/chrome/trunk/src/net/base/mime_sniffer.cc?view=markup
Comment 4 Benjamin Poulain 2010-11-12 09:06:43 PST
Giving it P2, this is quite bad.
Comment 5 Robert Hogan 2010-11-12 10:43:57 PST
(In reply to comment #4)
> Giving it P2, this is quite bad.

My thinking on this was to introduce a set of functions in QNetworkReplyHandler or utility function in WebCoreSupport that sniff for HTML content when the Content-Type header is empty. This isn't full on mime-type sniffing but it resolves this bug and might be a good way to dip our toe in the water.

I'm also not totally sure that sniffing should happen in QNAM - maybe actual content versus content reported in headers is something only QtWebKit should worry about. Putting in QNAM seems like a bit of a layering violation, I'm not sure.

I'd like to solve this bug so thoughts on the right place to put the solution welcome!
Comment 6 Luiz Agostini 2011-03-01 09:10:51 PST
(In reply to comment #5)
> (In reply to comment #4)
> > Giving it P2, this is quite bad.
> 
> My thinking on this was to introduce a set of functions in QNetworkReplyHandler or utility function in WebCoreSupport that sniff for HTML content when the Content-Type header is empty. This isn't full on mime-type sniffing but it resolves this bug and might be a good way to dip our toe in the water.
> 
> I'm also not totally sure that sniffing should happen in QNAM - maybe actual content versus content reported in headers is something only QtWebKit should worry about. Putting in QNAM seems like a bit of a layering violation, I'm not sure.
> 
> I'd like to solve this bug so thoughts on the right place to put the solution welcome!

I agree that the sniffing should not be in QNAM. Robert, do you plan to work on this issue soon?
Comment 7 Luiz Agostini 2011-03-30 18:32:50 PDT
Created attachment 87651 [details]
patch
Comment 8 Early Warning System Bot 2011-03-30 18:43:03 PDT
Attachment 87651 [details] did not build on qt:
Build output: http://queues.webkit.org/results/8305243
Comment 9 Luiz Agostini 2011-03-30 19:39:02 PDT
(In reply to comment #8)
> Attachment 87651 [details] did not build on qt:
> Build output: http://queues.webkit.org/results/8305243

Oops. WebKit2 part is missing...
Comment 10 Luiz Agostini 2011-03-31 10:14:51 PDT
Created attachment 87757 [details]
patch

fixing webkit2 issue.
Comment 11 Benjamin Poulain 2011-03-31 10:29:53 PDT
Comment on attachment 87757 [details]
patch

This patch will be a bit difficult to get in because it is huge.
It really needs to have solid tests coverage.
Comment 12 Kenneth Rohde Christiansen 2011-03-31 11:20:03 PDT
Comment on attachment 87757 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=87757&action=review

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:34
> +    for (size_t i = 0; i < size; ++i)
> +        if (text == data[i])
> +            return true;

The for needs braces

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:96
> +    return binaryFLags[data];

FLags? Should be Flags I suppose

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:103
> +    for (size_t i = 0; i < size; ++i)
> +        if (isBinaryChar(data[i]))
> +            return true;

For needs braces

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:110
> +static inline bool isWhiteSpace(char data)
> +{
> +    return data == 0x09 || data == 0x0A || data == 0x0C || data == 0x0D || data == 0x20;
> +}

Code like this isn't defined anywhere else?

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:170
> +    bool matches(const char* data, size_t dataSize) const
> +    {

Why not just define these methods in the .h file? This doesnt seem the way we usually do things

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:186
> +    bool maskedMatches(const char* data, size_t dataSize) const

The method name, makes me think this method returns matches

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:193
> +        const uint32_t* pattern32 = (const uint32_t*)m_pattern;
> +        const uint32_t* mask32 = (const uint32_t*)m_mask;
> +        const uint32_t* data32 = (const uint32_t*)data;

Why use C casts?

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:225
> +static const size_t securityConstrainedTypesSize = 19;
> +static const MagicNumbers securityConstrainedTypes[securityConstrainedTypesSize] = {
> +    MagicNumbers("<!DOCTYPE HTML", "\xFF\xFF\xDF\xDF\xDF\xDF\xDF\xDF\xDF\xFF\xDF\xDF\xDF\xDF", "text/html", 14, true, true),
> +    MagicNumbers("<HTML", "\xFF\xDF\xDF\xDF\xDF", "text/html", 5, true, true),

Where does these come from?

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:283
> +static const char* checkTextOrBinaryType(const char* data, size_t dataSize)

This method actually returns a mimetype, so the name should be changed.

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:305
> +static const char* checkUnknownType(const char* data, size_t dataSize)

Same here I think. The names can be more clear

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:392
> +static const char* checkFeedOrHtmlType(const char* data, size_t dataSize)

WebKit style is to call it HTML and not Html

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:421
> +} // MimeSniffing
> +
> +QMimeTypeSniffer::QMimeTypeSniffer(QNetworkReply* reply, const QString& advertisedMimeType, bool isSupportedImageType, int timeout)
> +    : QObject(0)

The rest in this file seems like somthing that is definately sharable across ports, so I think the Qt parts should be in a separate file

> Source/WebKit/qt/WebCoreSupport/FrameNetworkingContextQt.cpp:31
> +FrameNetworkingContextQt::FrameNetworkingContextQt(Frame* frame, QObject* originatingObject, bool mimeIsniffingEnabled, QNetworkAccessManager* networkAccessManager)

mimeIsniffingEnabled <- something wrong with that name

Maybe shouldSniffMIMEType would be a lot more WebKit'ish

> Source/WebKit2/ChangeLog:9
> +        http://tools.ietf.org/html/draft-abarth-mime-sniff-06 .

Why the . ?
Comment 13 Kenneth Rohde Christiansen 2011-03-31 11:21:45 PDT
cc'ing Adam Barth. Hi Adam, this seems to be based on your sniffing draft - can you have a [detailed] look? Thanks
Comment 14 Kenneth Rohde Christiansen 2011-03-31 11:35:50 PDT
I just looked quickly over the chromium version and they have good comments on wehre things comes from. You should do the same:

// Whether a given byte looks like it might be part of binary content.
// Source: HTML5 spec
static char kByteLooksBinary[] = {
  1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1,  // 0x00 - 0x0F
  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1,  // 0x10 - 0x1F

  static const char* kSniffableTypes[] = {
    // Many web servers are misconfigured to send text/plain for many
    // different types of content.
    "text/plain",
    // We want to sniff application/octet-stream for
    // application/x-chrome-extension, but nothing else.
    "application/octet-stream",
    // XHTML and Atom/RSS feeds are often served as plain xml instead of
    // their more specific mime types.
    "text/xml",
    "application/xml",
  };

etc etc... Things like this makes the code more understandable and somewhat reviewable.

link: http://src.chromium.org/viewvc/chrome/trunk/src/net/base/mime_sniffer.cc?view=markup
Comment 15 Kenneth Rohde Christiansen 2011-03-31 11:40:23 PDT
Another interesting link (from abarth):

Secure Content Sniffing for Web Browsers, or How to Stop Papers from Reviewing Themselves
Adam Barth, Juan Caballero, and Dawn Song
In Proc. of the 30th IEEE Symposium on Security and Privacy (Oakland 2009)
http://www.adambarth.com/papers/2009/barth-caballero-song.pdf
Comment 16 Luiz Agostini 2011-04-04 17:23:33 PDT
Created attachment 88170 [details]
patch
Comment 17 WebKit Review Bot 2011-04-04 17:25:55 PDT
Attachment 88170 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style']" exit_code: 1

Source/WebKit/qt/tests/MIMESniffing/tst_MIMESniffing.cpp:20:  Found header this file implements before WebCore config.h. Should be: config.h, primary header, blank line, and then alphabetically sorted.  [build/include_order] [4]
Source/WebCore/platform/network/qt/MIMESniffing.cpp:20:  Found header this file implements before WebCore config.h. Should be: config.h, primary header, blank line, and then alphabetically sorted.  [build/include_order] [4]
Total errors found: 2 in 27 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 18 Luiz Agostini 2011-04-04 17:30:21 PDT
(In reply to comment #12)
> (From update of attachment 87757 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=87757&action=review
> 
> > Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:170
> > +    bool matches(const char* data, size_t dataSize) const
> > +    {
> 
> Why not just define these methods in the .h file? This doesnt seem the way we usually do things

It is a class that is not supposed to be used outside this cpp file. That is why it is defined in the cpp.

> 
> > Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:225
> > +static const size_t securityConstrainedTypesSize = 19;
> > +static const MagicNumbers securityConstrainedTypes[securityConstrainedTypesSize] = {
> > +    MagicNumbers("<!DOCTYPE HTML", "\xFF\xFF\xDF\xDF\xDF\xDF\xDF\xDF\xDF\xFF\xDF\xDF\xDF\xDF", "text/html", 14, true, true),
> > +    MagicNumbers("<HTML", "\xFF\xDF\xDF\xDF\xDF", "text/html", 5, true, true),
> 
> Where does these come from?

see http://tools.ietf.org/html/draft-abarth-mime-sniff-06#page-12

> 
> > Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:421
> > +} // MimeSniffing
> > +
> > +QMimeTypeSniffer::QMimeTypeSniffer(QNetworkReply* reply, const QString& advertisedMimeType, bool isSupportedImageType, int timeout)
> > +    : QObject(0)
> 
> The rest in this file seems like somthing that is definately sharable across ports, so I think the Qt parts should be in a separate file

done
Comment 19 Luiz Agostini 2011-04-04 17:42:53 PDT
(In reply to comment #14)
> I just looked quickly over the chromium version and they have good comments on wehre things comes from. You should do the same:
> 
> // Whether a given byte looks like it might be part of binary content.
> // Source: HTML5 spec
> static char kByteLooksBinary[] = {
>   1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1,  // 0x00 - 0x0F
>   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1,  // 0x10 - 0x1F
> 
>   static const char* kSniffableTypes[] = {
>     // Many web servers are misconfigured to send text/plain for many
>     // different types of content.
>     "text/plain",
>     // We want to sniff application/octet-stream for
>     // application/x-chrome-extension, but nothing else.
>     "application/octet-stream",
>     // XHTML and Atom/RSS feeds are often served as plain xml instead of
>     // their more specific mime types.
>     "text/xml",
>     "application/xml",
>   };

I added some pointers to the spec in comments in the source code.

For example the binaryFlags array came from this:


                         +-------------------------+
                         | Binary Data Byte Ranges |
                         +-------------------------+
                         | 0x00 -- 0x08            |
                         | 0x0B                    |
                         | 0x0E -- 0x1A            |
                         | 0x1C -- 0x1F            |
                         +-------------------------+

I simplly created a function returning true or false based in this information and then used qDebug() to generate the array you see in the code, considering that it would improve performance. Then I created a static inline method with a descriptive name (isBinaryChar) just to do the table lookup.

I did the same now for white spaces.

> 
> etc etc... Things like this makes the code more understandable and somewhat reviewable.
> 
> link: http://src.chromium.org/viewvc/chrome/trunk/src/net/base/mime_sniffer.cc?view=markup

I tried to give to the methods clear names and in several occasions I created static inline methods with descriptive names to make the code more readable. My understanding is that this is the recommended way to proceed. But some short comments pointing to the spec were added to the code as well.
Comment 20 Luiz Agostini 2011-04-04 17:45:36 PDT
(In reply to comment #17)
> Attachment 88170 [details] did not pass style-queue:
> 
> Failed to run "['Tools/Scripts/check-webkit-style']" exit_code: 1
> 
> Source/WebKit/qt/tests/MIMESniffing/tst_MIMESniffing.cpp:20:  Found header this file implements before WebCore config.h. Should be: config.h, primary header, blank line, and then alphabetically sorted.  [build/include_order] [4]
> Source/WebCore/platform/network/qt/MIMESniffing.cpp:20:  Found header this file implements before WebCore config.h. Should be: config.h, primary header, blank line, and then alphabetically sorted.  [build/include_order] [4]
> Total errors found: 2 in 27 files
> 
> 
> If any of these errors are false positives, please file a bug against check-webkit-style.

Those errors were expected. I did it to be able to implement the Qt tests without needing to add too much of WebKit to its include path.

MIMESniffing does not depend on anything from WebKit or Qt.
Comment 21 Luiz Agostini 2011-04-06 00:28:10 PDT
ping review
Comment 22 Benjamin Poulain 2011-04-06 06:20:19 PDT
Comment on attachment 88170 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=88170&action=review

> Source/WebCore/platform/network/qt/MIMESniffing.cpp:29
> +static inline bool findType(const char* text, size_t size, const char** data)

I think findType() returning a bool is a bit confusing. I would expect a function named like that to return a type.

> Source/WebCore/platform/network/qt/MIMESniffing.cpp:41
> +static const size_t textTypesSize = 4;
> +static const char* textTypes[textTypesSize] = {

I would drop the "static" unless you have a good reason for it.

> Source/WebCore/platform/network/qt/MIMESniffing.cpp:55
> +static const size_t unknownTypesSize = 3;
> +static const char* unknownTypes[textTypesSize] = {

I would drop the "static" unless you have a good reason for it.

> Source/WebCore/platform/network/qt/MIMESniffing.cpp:67
> +static const size_t xmlTypesSize = 2;
> +static const char* xmlTypes[textTypesSize] = {

I would drop the "static" unless you have a good reason for it.

> Source/WebCore/platform/network/qt/MIMESniffing.cpp:75
> +    static const char* const suffix = "+xml";
> +    static const size_t suffixSize = strlen(suffix);

I would drop the "static" unless you have a good reason for it.
You could also hardcode suffix size to 4 as you did for the type tables.

> Source/WebCore/platform/network/qt/MIMESniffing.cpp:79
> +    if (typeSize >= suffixSize && !memcmp(type + typeSize - suffixSize, suffix, suffixSize))

If speed is a constraint, I would just to the comparison myself to avoid the function call and the few branches at the beginning of memcmp().
If speed is not an issue, just ignore this comment :)

> Source/WebCore/platform/network/qt/MIMESniffing.cpp:86
> +static const char binaryFlags[256] = {

static?

> Source/WebCore/platform/network/qt/MIMESniffing.cpp:120
> +static const char whiteSpaceChars[256] = {

static?

> Source/WebCore/platform/network/qt/MIMESniffing.cpp:144
> +static inline bool skipWhiteSpaces(const char*& data, size_t& pos, size_t dataSize)

const char*&? why not just const char *?
We usually use ref-to-const only for non base types.

The function name is also a bit strange for what it does.
With such name, I would expect char *skipWhiteSpaces(...) where the pointer to the new position is returned.

> Source/WebCore/platform/network/qt/MIMESniffing.cpp:153
> +    if (pos >= dataSize || !isWhiteSpace(data[pos]))
> +        return false;
> +
> +    do {
> +        ++pos;
> +    } while (pos < dataSize && isWhiteSpace(data[pos]));
> +
> +    return true;

Why not just the loop? I don't see the return value used anywhere.

Enough for now, I'll continue when Luiz ping me again :)
Comment 23 Benjamin Poulain 2011-04-07 04:55:36 PDT
Comment on attachment 88170 [details]
patch

Second round.

View in context: https://bugs.webkit.org/attachment.cgi?id=88170&action=review

Any reason to put MIMESniffing in .../qt/ ?
I would guess other ports might be interested, it could be easier to have it in platform/network.

>> Source/WebCore/platform/network/qt/MIMESniffing.cpp:86
>> +static const char binaryFlags[256] = {

Note to self: some Python says the talbe is correct :)

> Source/WebCore/platform/network/qt/MIMESniffing.cpp:321
> +    mimeType = findMimeType(data, dataSize, bomTypes, bomTypesSize);
> +    if (mimeType)
> +        return mimeType;

Isn't that overkill to use MagicNumbers for this condition?
it could be:
if (dataSize >= 2)
    if ((data[0] == 0xFE && data[1] 0xFF) // UTF-16BE BOM
        || (data[0] == 0xFF && data[1] 0xFE)) // UTF-16LE BOM
        return "text/plain";
etc

The problem I have is that MagicNumbers is a big beast with masking and the part to skip spaces. You are abusing it here which increase complexity while not increasing code reuse IMHO.

> Source/WebCore/platform/network/qt/MIMESniffing.cpp:477
> +// http://tools.ietf.org/html/draft-abarth-mime-sniff-06#page-6
> +MIMESniffer MIMESniffer::create(const char* advetizedMIMEType, bool isSupportedImageType)
> +{
> +    if (!advetizedMIMEType)
> +        return MIMESniffer(512, &MimeSniffing::unknownTypeSniffingProcedure);
> +
> +    if (MimeSniffing::isTextOrBinaryType(advetizedMIMEType))
> +        return MIMESniffer(512, &MimeSniffing::textOrBinaryTypeSniffingProcedure);
> +
> +    if (MimeSniffing::isUnknownType(advetizedMIMEType))
> +        return MIMESniffer(512, &MimeSniffing::unknownTypeSniffingProcedure);
> +
> +    if (MimeSniffing::isXMLType(advetizedMIMEType))
> +        return MIMESniffer();
> +
> +    if (isSupportedImageType)
> +        return MIMESniffer(8, &MimeSniffing::imageTypeSniffingProcedure);
> +
> +    if (!strcmp(advetizedMIMEType, "text/html"))
> +       return MIMESniffer(512, &MimeSniffing::feedTypeSniffingProcedure);
> +
> +    return MIMESniffer();
> +}

This code just determine the parameters of the class internals and pass them to the constructor. It is not clear to me why you need the ::create() function instead of passing (const char* advetizedMIMEType, bool isSupportedImageType) to the constructor and let it populate the internals?

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:32
> +    , m_sniffer(MIMESniffer::create(advertisedMimeType.toLatin1().constData(), isSupportedImageType))
> +    , m_isFinished(!m_sniffer.dataSize() || sniff())

I suggest to put those in the function body, not in the initialization.
It is tricky to notice the intialization of m_isFinished call sniff().

I also think using m_sniffer.dataSize() is a bad condition. You must know how the MIMESniffer works to understand this condition. I would rather have something like "isValid()" or "typeMayNeedDynamicSniffing()" or something alike.

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:44
> +    if (!m_reply->isFinished() && m_reply->bytesAvailable() < m_sniffer.dataSize())
> +        return false;

Couldn't the sniffing be more dynamic than that? For me it looks like there is no reason to always wait for dataSize(). You just need one invalid character to determine some types.

Let lay you are in "Text or Binary" and the first n (<3) characters are a UTF BOM, you can already say it is text/plain.

This is just an idea to take the decision as early as possible so the rest of the engine can start working.

> Source/WebCore/platform/network/qt/QNetworkReplyHandler.cpp:182
> +    if (m_sniffer) {
> +        delete m_sniffer;
> +        m_sniffer = 0;
> +    }

It would be cleaner to store QMimeTypeSniffer with a OwnPtr. I am not a fan of the mutliple copy of this code.
Also note that the if() is unnecessary, it is valid to delete a null pointer.
Comment 24 George Guo 2011-04-07 08:14:59 PDT
This is a great work, if needed, I can help testing on Symbian side.
Comment 25 Luiz Agostini 2011-04-07 15:23:41 PDT
Created attachment 88716 [details]
patch
Comment 26 WebKit Review Bot 2011-04-07 15:26:53 PDT
Attachment 88716 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1

Source/WebKit/qt/tests/MIMESniffing/tst_MIMESniffing.cpp:20:  Found header this file implements before WebCore config.h. Should be: config.h, primary header, blank line, and then alphabetically sorted.  [build/include_order] [4]
Total errors found: 1 in 27 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 27 Luiz Agostini 2011-04-07 15:31:08 PDT
(In reply to comment #22)
> (From update of attachment 88170 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=88170&action=review
> 
> > Source/WebCore/platform/network/qt/MIMESniffing.cpp:29
> > +static inline bool findType(const char* text, size_t size, const char** data)
> 
> I think findType() returning a bool is a bit confusing. I would expect a function named like that to return a type.

ok

> 
> > Source/WebCore/platform/network/qt/MIMESniffing.cpp:41
> > +static const size_t textTypesSize = 4;
> > +static const char* textTypes[textTypesSize] = {
> 
> I would drop the "static" unless you have a good reason for it.
> 
> > Source/WebCore/platform/network/qt/MIMESniffing.cpp:55
> > +static const size_t unknownTypesSize = 3;
> > +static const char* unknownTypes[textTypesSize] = {
> 
> I would drop the "static" unless you have a good reason for it.
> 
> > Source/WebCore/platform/network/qt/MIMESniffing.cpp:67
> > +static const size_t xmlTypesSize = 2;
> > +static const char* xmlTypes[textTypesSize] = {
> 
> I would drop the "static" unless you have a good reason for it.
> 
> > Source/WebCore/platform/network/qt/MIMESniffing.cpp:75
> > +    static const char* const suffix = "+xml";
> > +    static const size_t suffixSize = strlen(suffix);
> 
> I would drop the "static" unless you have a good reason for it.
> You could also hardcode suffix size to 4 as you did for the type tables.
> 
> > Source/WebCore/platform/network/qt/MIMESniffing.cpp:86
> > +static const char binaryFlags[256] = {
> 
> static?
> 
> > Source/WebCore/platform/network/qt/MIMESniffing.cpp:120
> > +static const char whiteSpaceChars[256] = {
> 
> static?> static?

ok. I put everything in an anonymous namespace and removed the static keyword.

> > Source/WebCore/platform/network/qt/MIMESniffing.cpp:144
> > +static inline bool skipWhiteSpaces(const char*& data, size_t& pos, size_t dataSize)
> 
> const char*&? why not just const char *?
> We usually use ref-to-const only for non base types.

It was left there by mistake after a refactoring. :)

> 
> The function name is also a bit strange for what it does.
> With such name, I would expect char *skipWhiteSpaces(...) where the pointer to the new position is returned.
> 
> > Source/WebCore/platform/network/qt/MIMESniffing.cpp:153
> > +    if (pos >= dataSize || !isWhiteSpace(data[pos]))
> > +        return false;
> > +
> > +    do {
> > +        ++pos;
> > +    } while (pos < dataSize && isWhiteSpace(data[pos]));
> > +
> > +    return true;
> 
> Why not just the loop? I don't see the return value used anywhere.

ok

> 
> Enough for now, I'll continue when Luiz ping me again :)

thanks!
Comment 28 Kenneth Rohde Christiansen 2011-04-07 15:38:32 PDT
Comment on attachment 88716 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=88716&action=review

> Source/WebCore/platform/network/MIMESniffing.cpp:145
> +static inline void skipWhiteSpaces(const char* data, size_t& pos, size_t dataSize)

I think that people generally use the term "whitespace" as whitespace in general. ie without the s.

> Source/WebCore/platform/network/MIMESniffing.cpp:297
> +            return types[i].m_mimeType;

If something is obviously public, why add the m_?

> Source/WebCore/platform/network/MIMESniffing.cpp:314
> +    mimeType = findSimpleMimeType(data, dataSize, safeTypes, safeTypesSize);

findSimpleMIMEType! MIME not Mime. I have seen this other places... please search for Mime.

> Source/WebCore/platform/network/MIMESniffing.cpp:378
> +    bool rdf = false;

isRDF or similar seems more webkitísh

> Source/WebCore/platform/network/MIMESniffing.cpp:401
> +static inline bool skipTag(const char*& data, size_t& pos, size_t dataSize, const char* tag, size_t tagSize, const char* tagEnd, size_t tagEndSize)

I would personally use something like outData, outPos, to make it obvious what you are doing

> Source/WebCore/platform/network/MIMESniffing.cpp:431
> +    if (checkText(data, pos, dataSize, "<feed", 5))
> +        return "application/atom+xml";

Does it work if these documents start with <XML ? those are valid atom documents.

> Source/WebCore/platform/network/MIMESniffing.cpp:442
> +MIMESniffer::MIMESniffer(const char* advetizedMIMEType, bool isSupportedImageType)

advertized .... missed an "r"

> Source/WebCore/platform/network/MIMESniffing.h:29
> +    MIMESniffer(const char* advetizedMIMEType, bool isSupportedImageType);

advertized

> Source/WebCore/platform/network/MIMESniffing.h:41
> +namespace MimeSniffing {

MIMESniffing!

> Source/WebCore/platform/network/qt/QMimeTypeSniffer.h:21
> +#ifndef QMimeTypeSniffer_h
> +#define QMimeTypeSniffer_h

This is not a public Qt class, should be called QtMimeTypeSniffer or QtMIMETypeSniffer... not sure about our naming for Qt only classes.

> Source/WebCore/platform/network/qt/QNetworkReplyHandler.cpp:182
> +    if (m_sniffer) {
> +        delete m_sniffer;
> +        m_sniffer = 0;
> +    }

Can't this be an OwnPtr?

> Source/WebCore/platform/network/qt/QNetworkReplyHandler.cpp:202
> +    m_advertisedMimeType = extractMIMETypeFromMediaType(contentType);

advertised - I saw you wrote it with z elsewhere... you should make all those with s :-)

> Source/WebCore/platform/network/qt/QNetworkReplyHandler.cpp:237
> +    delete m_sniffer;
> +    m_sniffer = 0;

OwnPtr would be better... It must have a clear() or you can just assign 0 to it.

> Source/WebKit/qt/WebCoreSupport/FrameLoaderClientQt.cpp:1652
> +    QVariant value = m_webFrame->page()->property("_q_mimeSniffingEnabled");
> +    bool mimeSniffingEnabled = !value.isValid() || value.toBool();

Any reason that this is not enabled by default? I think it needs a lot of testing so we could add this just on our branches if needed.
Comment 29 Luiz Agostini 2011-04-07 15:42:54 PDT
(In reply to comment #23)
> (From update of attachment 88170 [details])
> Second round.
> 
> View in context: https://bugs.webkit.org/attachment.cgi?id=88170&action=review
> 
> Any reason to put MIMESniffing in .../qt/ ?
> I would guess other ports might be interested, it could be easier to have it in platform/network.

ok

> 
> >> Source/WebCore/platform/network/qt/MIMESniffing.cpp:86
> >> +static const char binaryFlags[256] = {
> 
> Note to self: some Python says the talbe is correct :)
> 
> > Source/WebCore/platform/network/qt/MIMESniffing.cpp:321
> > +    mimeType = findMimeType(data, dataSize, bomTypes, bomTypesSize);
> > +    if (mimeType)
> > +        return mimeType;
> 
> Isn't that overkill to use MagicNumbers for this condition?
> it could be:
> if (dataSize >= 2)
>     if ((data[0] == 0xFE && data[1] 0xFF) // UTF-16BE BOM
>         || (data[0] == 0xFF && data[1] 0xFE)) // UTF-16LE BOM
>         return "text/plain";
> etc
> 
> The problem I have is that MagicNumbers is a big beast with masking and the part to skip spaces. You are abusing it here which increase complexity while not increasing code reuse IMHO.

I created a new method for the simple type (the ones that do not need masking or space skipping).

> 
> > Source/WebCore/platform/network/qt/MIMESniffing.cpp:477
> > +// http://tools.ietf.org/html/draft-abarth-mime-sniff-06#page-6
> > +MIMESniffer MIMESniffer::create(const char* advetizedMIMEType, bool isSupportedImageType)
> > +{
> > +    if (!advetizedMIMEType)
> > +        return MIMESniffer(512, &MimeSniffing::unknownTypeSniffingProcedure);
> > +
> > +    if (MimeSniffing::isTextOrBinaryType(advetizedMIMEType))
> > +        return MIMESniffer(512, &MimeSniffing::textOrBinaryTypeSniffingProcedure);
> > +
> > +    if (MimeSniffing::isUnknownType(advetizedMIMEType))
> > +        return MIMESniffer(512, &MimeSniffing::unknownTypeSniffingProcedure);
> > +
> > +    if (MimeSniffing::isXMLType(advetizedMIMEType))
> > +        return MIMESniffer();
> > +
> > +    if (isSupportedImageType)
> > +        return MIMESniffer(8, &MimeSniffing::imageTypeSniffingProcedure);
> > +
> > +    if (!strcmp(advetizedMIMEType, "text/html"))
> > +       return MIMESniffer(512, &MimeSniffing::feedTypeSniffingProcedure);
> > +
> > +    return MIMESniffer();
> > +}
> 
> This code just determine the parameters of the class internals and pass them to the constructor. It is not clear to me why you need the ::create() function instead of passing (const char* advetizedMIMEType, bool isSupportedImageType) to the constructor and let it populate the internals?

sure.

> 
> > Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:32
> > +    , m_sniffer(MIMESniffer::create(advertisedMimeType.toLatin1().constData(), isSupportedImageType))
> > +    , m_isFinished(!m_sniffer.dataSize() || sniff())
> 
> I suggest to put those in the function body, not in the initialization.
> It is tricky to notice the intialization of m_isFinished call sniff().

I forgot this one in last patch. :( if I receive a r+ I can make this change before landing. if not I will put it in next patch.

> 
> I also think using m_sniffer.dataSize() is a bad condition. You must know how the MIMESniffer works to understand this condition. I would rather have something like "isValid()" or "typeMayNeedDynamicSniffing()" or something alike.
> 
> > Source/WebCore/platform/network/qt/QMimeTypeSniffer.cpp:44
> > +    if (!m_reply->isFinished() && m_reply->bytesAvailable() < m_sniffer.dataSize())
> > +        return false;
> 
> Couldn't the sniffing be more dynamic than that? For me it looks like there is no reason to always wait for dataSize(). You just need one invalid character to determine some types.
> 
> Let lay you are in "Text or Binary" and the first n (<3) characters are a UTF BOM, you can already say it is text/plain.
> 
> This is just an idea to take the decision as early as possible so the rest of the engine can start working.ng.

Yes, but we will wait for the whole content anyway. And sniffing will not wait for a large amount of data. I decided to keep it simple.

> 
> > Source/WebCore/platform/network/qt/QNetworkReplyHandler.cpp:182
> > +    if (m_sniffer) {
> > +        delete m_sniffer;
> > +        m_sniffer = 0;
> > +    }
> 
> It would be cleaner to store QMimeTypeSniffer with a OwnPtr. I am not a fan of the mutliple copy of this code.
> Also note that the if() is unnecessary, it is valid to delete a null pointer.

sure. The if was left there after some refactoring as well.
I forgot this one as well. But will put it in code before landing if I receive the r+. otherwise it will be in next patch.
Comment 30 Luiz Agostini 2011-04-07 15:47:47 PDT
(In reply to comment #28)
> (From update of attachment 88716 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=88716&action=review
> 
> > Source/WebCore/platform/network/MIMESniffing.cpp:145
> > +static inline void skipWhiteSpaces(const char* data, size_t& pos, size_t dataSize)
> 
> I think that people generally use the term "whitespace" as whitespace in general. ie without the s.

ok

> 
> > Source/WebCore/platform/network/MIMESniffing.cpp:297
> > +            return types[i].m_mimeType;
> 
> If something is obviously public, why add the m_?

ok.

> 
> > Source/WebCore/platform/network/MIMESniffing.cpp:314
> > +    mimeType = findSimpleMimeType(data, dataSize, safeTypes, safeTypesSize);
> 
> findSimpleMIMEType! MIME not Mime. I have seen this other places... please search for Mime.

ok.

> 
> > Source/WebCore/platform/network/MIMESniffing.cpp:378
> > +    bool rdf = false;
> 
> isRDF or similar seems more webkitísh

ok

> 
> > Source/WebCore/platform/network/MIMESniffing.cpp:401
> > +static inline bool skipTag(const char*& data, size_t& pos, size_t dataSize, const char* tag, size_t tagSize, const char* tagEnd, size_t tagEndSize)
> 
> I would personally use something like outData, outPos, to make it obvious what you are doing

ok

> 
> > Source/WebCore/platform/network/MIMESniffing.cpp:431
> > +    if (checkText(data, pos, dataSize, "<feed", 5))
> > +        return "application/atom+xml";
> 
> Does it work if these documents start with <XML ? those are valid atom documents.\

id does not. I just implemented what I found in spec. but of course we may add more later.

> 
> > Source/WebCore/platform/network/MIMESniffing.cpp:442
> > +MIMESniffer::MIMESniffer(const char* advetizedMIMEType, bool isSupportedImageType)
> 
> advertized .... missed an "r"

ok

> 
> > Source/WebCore/platform/network/MIMESniffing.h:29
> > +    MIMESniffer(const char* advetizedMIMEType, bool isSupportedImageType);
> 
> advertized

ok

> 
> > Source/WebCore/platform/network/MIMESniffing.h:41
> > +namespace MimeSniffing {
> 
> MIMESniffing!

ok

> 
> > Source/WebCore/platform/network/qt/QMimeTypeSniffer.h:21
> > +#ifndef QMimeTypeSniffer_h
> > +#define QMimeTypeSniffer_h
> 
> This is not a public Qt class, should be called QtMimeTypeSniffer or QtMIMETypeSniffer... not sure about our naming for Qt only classes.

Yeah, old question. We had been talking about it before. :)

> 
> > Source/WebCore/platform/network/qt/QNetworkReplyHandler.cpp:182
> > +    if (m_sniffer) {
> > +        delete m_sniffer;
> > +        m_sniffer = 0;
> > +    }
> 
> Can't this be an OwnPtr?

sure

> 
> > Source/WebCore/platform/network/qt/QNetworkReplyHandler.cpp:202
> > +    m_advertisedMimeType = extractMIMETypeFromMediaType(contentType);
> 
> advertised - I saw you wrote it with z elsewhere... you should make all those with s :-)

ok.
> 
> > Source/WebCore/platform/network/qt/QNetworkReplyHandler.cpp:237
> > +    delete m_sniffer;
> > +    m_sniffer = 0;
> 
> OwnPtr would be better... It must have a clear() or you can just assign 0 to it.

ok.

> 
> > Source/WebKit/qt/WebCoreSupport/FrameLoaderClientQt.cpp:1652
> > +    QVariant value = m_webFrame->page()->property("_q_mimeSniffingEnabled");
> > +    bool mimeSniffingEnabled = !value.isValid() || value.toBool();
> 
> Any reason that this is not enabled by default? I think it needs a lot of testing so we could add this just on our branches if needed.

it is enabled by default!
Comment 31 Luiz Agostini 2011-04-08 15:15:37 PDT
Created attachment 88881 [details]
patch
Comment 32 WebKit Review Bot 2011-04-08 15:19:28 PDT
Attachment 88881 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1

Source/WebKit/qt/tests/MIMESniffing/tst_MIMESniffing.cpp:21:  You should add a blank line after implementation file's own header.  [build/include_order] [4]
Total errors found: 1 in 27 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 33 Luiz Agostini 2011-04-08 15:22:21 PDT
Created attachment 88882 [details]
patch

style fix
Comment 34 Benjamin Poulain 2011-04-11 10:47:58 PDT
Comment on attachment 88882 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=88882&action=review

I have read the section 4, "Text or Binary" and compared to the code and I could not think of any way to get the wrong type.
Will do "Unknown Type" after some rest  :)

> Source/WebCore/platform/network/MIMESniffing.cpp:165
> +#define MAGIC_NUMBERS_1(pattern, mask, mimeType, flags) {(pattern), (mask), (mimeType), sizeof(pattern) - 1, (flags)}
> +#define MAGIC_NUMBERS_2(pattern, mimeType) {(pattern), 0, (mimeType), sizeof(pattern) - 1, 0}

I think I would prefer a inline constructor intead of structure initialization with #defines, just for clarity. (ignore this comment if you want)

I really don't like the _1 and _2. I would prefer something more explicit. For me it looks like MAGIC_NUMBERS_MASKED and MAGIC_NUMBERS_SIMPLE.

> Source/WebCore/platform/network/MIMESniffing.cpp:191
> +// http://tools.ietf.org/html/draft-abarth-mime-sniff-06#page-13

Looks like page-8 to me :)

> Source/WebCore/platform/network/MIMESniffing.cpp:212
> +    MAGIC_NUMBERS_1("RIFF\x00\x00\x00\x00WAVE", "\xFF\xFF\xFF\xFF\x00\x00\x00\x00\xFF\xFF\xFF\xFF", "audio/x-wave", 0), // "RIFF" followed by four bytes, followed by "WAVE".

Wait a minute, the array is imageTypes, and the types is audio/... that does not look like an image to me :)

> Source/WebCore/platform/network/MIMESniffing.cpp:276
> +    bool result = info.mask ? maskedCompare(info, data, info.size) : dataSize >= info.size && !memcmp(data, info.pattern, info.size);

I think a branch would be clearer than a ternary operator.
if (info.mask) is an important condition for this part.

> Source/WebCore/platform/network/MIMESniffing.cpp:320
> +    mimeType = findSimpleMIMEType(data, dataSize, safeTypes, safeTypesSize);
> +    if (mimeType)
> +        return mimeType;
> +
> +    mimeType = findMIMEType(data, dataSize, imageTypes, imageTypesSize);
> +    if (mimeType)
> +        return mimeType;

For this step, the spec says:
WARNING!  It is critical that this step not ever return a
          scriptable type (e.g., text/html), because otherwise that
          would allow a privilege escalation attack.
I think it would be nice to have an assertion for that.

> Source/WebCore/platform/network/MIMESniffing.cpp:361
> +    const char* mimeType = 0;
> +
> +    mimeType = findMIMEType(data, dataSize, imageTypes, imageTypesSize);
> +    if (mimeType)
> +        return mimeType;
> +
> +    return 0;

Why not just:
return findMIMEType(data, dataSize, imageTypes, imageTypesSize);

It does not seem the branch adds any value here.

> Source/WebKit/qt/WebCoreSupport/FrameLoaderClientQt.cpp:1653
> +    QVariant value = m_webFrame->page()->property("_q_MIMESniffingEnabled");
> +    bool MIMESniffingEnabled = !value.isValid() || value.toBool();
> +

Hum, not a fan of this. Why not add a new setting?
Comment 35 Luiz Agostini 2011-04-11 15:41:27 PDT
(In reply to comment #34)
> (From update of attachment 88882 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=88882&action=review
> 
> > Source/WebCore/platform/network/MIMESniffing.cpp:191
> > +// http://tools.ietf.org/html/draft-abarth-mime-sniff-06#page-13
> 
> Looks like page-8 to me :)

yep. there are items for BOMs in the table in page 13 as well, but I think that it page 8 is better.

> 
> > Source/WebKit/qt/WebCoreSupport/FrameLoaderClientQt.cpp:1653
> > +    QVariant value = m_webFrame->page()->property("_q_MIMESniffingEnabled");
> > +    bool MIMESniffingEnabled = !value.isValid() || value.toBool();
> > +
> 
> Hum, not a fan of this. Why not add a new setting?

This makes the configuration possible without making it public and without creating compatibility issues. :)
Comment 36 Benjamin Poulain 2011-04-11 15:51:55 PDT
(In reply to comment #35)
> > Hum, not a fan of this. Why not add a new setting?
> 
> This makes the configuration possible without making it public and without creating compatibility issues. :)

I am often the first to shoot when someone want to add new APIs but I think in this case I think it is ok. I think the detection should be enabled by default, but could be disabled by user if needed for whatever reason.

If you think new API is a bad idea, we should enable it by default, and disable with the property _q_MIMESniffingDisabled. I think MIME sniffing is a good think for our "work out of the box" philosophy.
Comment 37 Luiz Agostini 2011-04-11 15:54:50 PDT
Created attachment 89109 [details]
patch
Comment 38 Luiz Agostini 2011-04-11 16:48:50 PDT
(In reply to comment #36)
> (In reply to comment #35)
> > > Hum, not a fan of this. Why not add a new setting?
> > 
> > This makes the configuration possible without making it public and without creating compatibility issues. :)
> 
> I am often the first to shoot when someone want to add new APIs but I think in this case I think it is ok. I think the detection should be enabled by default, but could be disabled by user if needed for whatever reason.
> 
> If you think new API is a bad idea, we should enable it by default, and disable with the property _q_MIMESniffingDisabled. I think MIME sniffing is a good think for our "work out of the box" philosophy.

it is enabled by default as it in now, but I will change the property from _q_MIMESniffingEnabled to _q_MIMESniffingDisabled.
Comment 39 Luiz Agostini 2011-04-11 17:00:14 PDT
Created attachment 89123 [details]
patch
Comment 40 Kenneth Rohde Christiansen 2011-04-13 01:01:24 PDT
Comment on attachment 89123 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=89123&action=review

> Source/WebCore/platform/network/MIMESniffing.cpp:178
> +    MAGIC_NUMBERS_MASKED("<A", "\xFF\xDF", "text/html", SkipWhiteSpace | TrailingSpaceOrBracket), // <A

Why the // <A comment?

> Source/WebCore/platform/network/MIMESniffing.cpp:246
> +    for (size_t i = 0; i < count; ++i)
> +        if ((*data32++ & *mask32++) != *pattern32++)
> +            return false;

Needs braces

> Source/WebCore/platform/network/MIMESniffing.cpp:250
> +    const char* p = reinterpret_cast<const char*>(pattern32);
> +    const char* m = reinterpret_cast<const char*>(mask32);
> +    const char* d = reinterpret_cast<const char*>(data32);

Generally we should avoid variable names as these

> Source/WebCore/platform/network/MIMESniffing.cpp:257
> +    for (size_t i = 0; i < count; ++i)
> +        if ((*d++ & *m++) != *p++)
> +            return false;
> +

Needs braces

> Source/WebCore/platform/network/MIMESniffing.cpp:388
> +const char RSSUrl[] = "http://purl.org/rss/1.0";
> +const char RDFUrl[] = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

rssUrl, rdfUrl

> Source/WebCore/platform/network/MIMESniffing.h:35
> +
> +

one newline too much

> Source/WebKit/qt/WebCoreSupport/FrameNetworkingContextQt.h:32
> +    FrameNetworkingContextQt(Frame*, QObject* originatingObject, bool MIMESniffingEnabled, QNetworkAccessManager*);

mimeTypeSniffingEnabled

> Source/WebKit/qt/WebCoreSupport/FrameNetworkingContextQt.h:40
> +    bool m_MIMESniffingEnabled;

m_mimeTypeSniffingEnabled or mimeSniffingEnabled

> Source/WebKit/qt/tests/MIMESniffing/TestData.h:25
> +    const char* advetisedType;

missing r!

> Source/WebKit/qt/tests/MIMESniffing/TestData.h:984
> +#endif // TESTDATA_H

TestData_h right?

> Source/WebKit/qt/tests/MIMESniffing/tst_MIMESniffing.cpp:54
> +    for (int i = 0; i < testListSize; ++i) {
> +
> +
> +        QFile file(testList[i].file);

two not needed newlines there

> Source/WebKit2/WebProcess/WebCoreSupport/qt/WebFrameNetworkingContext.cpp:54
> +bool WebFrameNetworkingContext::MIMESniffingEnabled() const
> +{
> +    return m_MIMESniffingEnabled;
> +}

a method name does not start with capital :-) mimeTypeSniffingEnabled

> Source/WebKit2/WebProcess/WebCoreSupport/qt/WebFrameNetworkingContext.h:39
> +    bool m_MIMESniffingEnabled;

Same here
Comment 41 WebKit Review Bot 2011-04-15 18:22:29 PDT
http://trac.webkit.org/changeset/84057 might have broken WinCairo Debug (Build)
Comment 42 Simon Hausmann 2011-04-19 01:34:08 PDT
This bug seems very related to bug #25064 , which is a more generic attempt to this problem AFAICS. Luiz, are you still working on this? The r+'ed patch was never landed.
Comment 43 Simon Hausmann 2011-04-19 01:41:32 PDT
Ohh, looks like it was landed in r84057 :)
Comment 44 Ademar Reis 2011-04-25 07:19:58 PDT
Blocking 2.1.x, as requested by Jeff.
Comment 45 Joel Parks 2011-04-26 07:51:00 PDT
relates to BR-7305 and Case ou1cimx1#723140
Comment 46 Ademar Reis 2011-04-29 07:26:30 PDT
(In reply to comment #45)
> relates to BR-7305 and Case ou1cimx1#723140

This patch is too large and complex for 2.1. These problems will have to be fixed in a different way.