Bug 24904

Summary: Disentangle standards mode CSS type check from content sniffing
Product: WebKit Reporter: Adam Barth <abarth>
Component: CSSAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, ap, eroman, mjs, wtc
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
URL: https://myrewardzone.bestbuy.com/
Attachments:
Description Flags
patchzor
none
patchzor2
ap: review-
patch the third
none
git format-patch
none
more tests darin: review+

Description Adam Barth 2009-03-27 16:14:33 PDT
In standards mode, we requires that that stylesheets loaded via a link tag have the proper MIME type, "text/css."  Apparently, this is required to pass ACID3.  Unfortunately, a number of Web sites supply stylesheets to standards mode pages without specifying a Content-Type header.  These pages work fine in Safari and Firefox, but fail to load their stylesheets in Chromium because Chromium sniffs their MIME type as "text/plain."

There are a couple of solutions to this problem:

1) Accept text/plain stylesheets in standards mode.

2) Change the content sniffer to guess text/css for URLs that end in ".css"

3) Provide a hint from WebCore to the content sniffer that we're expecting a text/css resource.

I think options (2) and (3) are workable.  Wan-Teh debugged Firefox's handling of this situation and determined that they take approach (3).  This is observable because they accept a stylesheet named style.foo in standards mode even if the stylesheet lacks a Content-Type header.  Without access to Safari's source code, we were unable to determine how Safari handles this case.

I recommend that we provide a MIME type hint to the network layer when loading CSS files so Chromium can perfectly match Firefox's behavior here.

Patch forthcoming.
Comment 1 Adam Barth 2009-03-27 16:21:35 PDT
Created attachment 29032 [details]
patchzor
Comment 2 Adam Barth 2009-03-27 16:26:13 PDT
For more background about how we got to this point, see http://code.google.com/p/chromium/issues/detail?id=7448
Comment 3 Eric Seidel (no email) 2009-03-28 00:58:10 PDT
Maciej seems like the kind of person who would review this type of change.
Comment 4 Alexey Proskuryakov 2009-03-28 09:39:32 PDT
My understanding is that one cannot reliably add data members to ResourceRequest - they are converted to platform requests and  back, so any members that cannot be stored inside a platform request will be lost after a delegate call.

Now, this specific use is only necessary on Chromium, which may not use willSendRequest-style delegate calls, but the member is added to ResourceRequestBase, which one would expect to work reliably on all platforms.

Also, "defaultMimeType" should be "defaultMIMEType" per our coding style guide.
Comment 5 Adam Barth 2009-03-28 09:57:44 PDT
(In reply to comment #4)
> My understanding is that one cannot reliably add data members to
> ResourceRequest - they are converted to platform requests and  back, so any
> members that cannot be stored inside a platform request will be lost after a
> delegate call.

Are platform requests in webkit.org source repository somewhere?  I'd like to understand this better.

> Now, this specific use is only necessary on Chromium, which may not use
> willSendRequest-style delegate calls, but the member is added to
> ResourceRequestBase, which one would expect to work reliably on all platforms.

We can move the member to Chromium's ResourceRequest.  I have another patch in progress that adds some data for mixed content detection, but for that patch the reviewer asked to move the member to ResourceRequestBase because it would be useful to other ports.

In this case, any port that uses the HTML 5 content sniffing algorithm will need this member to render sites correctly.

> Also, "defaultMimeType" should be "defaultMIMEType" per our coding style guide.

I'll fix this in the next patch update.

Comment 6 Alexey Proskuryakov 2009-03-28 11:47:37 PDT
See e.g. platform/network/mac/ResourceRequestMac.mm. It's interesting that ResourceRequestBase already has a member that does not map to anything in platform request (m_reportUploadProgress) - I think that's a mistake, but someone better familiar with the design might correct me.
Comment 7 Adam Barth 2009-03-28 17:34:48 PDT
(In reply to comment #6)
> See e.g. platform/network/mac/ResourceRequestMac.mm

Maybe I don't understand what you're getting at, but from http://trac.webkit.org/browser/trunk/WebCore/platform/network/mac/ResourceRequest.h it looks like the platform request for Mac inherits from ResourceRequestBase so it seems like defaultMIMEType and reportUploadProgress can be stored in a platform request...  Maybe you're referring to NSURLRequest?

What's the right way to associate more data with a ResourceRequest?
Comment 8 Alexey Proskuryakov 2009-03-29 03:15:51 PDT
See WebFrameLoaderClient::dispatchWillSendRequest() in WebFrameLoaderClient.mm:

    if (implementations->willSendRequestFunc)
        request = (NSURLRequest *)CallResourceLoadDelegate(implementations->willSendRequestFunc, webView, @selector(webView:resource:willSendRequest:redirectResponse:fromDataSource:), [webView _objectForIdentifier:identifier], request.nsURLRequest(), redirectResponse.nsURLResponse(), dataSource(loader));

If the client implements a willSendRequest delegate method, then ResourceRequest will be completely re-created from NSURLRequest.

> What's the right way to associate more data with a ResourceRequest?

My understanding is that the only cross-platform way is to provide the data on demand via a ResourceHandleClient method.

If the data is not cross-platform, and the platform doesn't implement Mac-style FrameLoaderClient callbacks, then it should be possible to add the data to the platform's ResourceRequest.
Comment 9 Adam Barth 2009-03-29 09:49:11 PDT
(In reply to comment #8)
> See WebFrameLoaderClient::dispatchWillSendRequest() in WebFrameLoaderClient.mm:

Ugg...  That's too bad.

> My understanding is that the only cross-platform way is to provide the data on
> demand via a ResourceHandleClient method.

I'll investigate this option.

> If the data is not cross-platform, and the platform doesn't implement Mac-style
> FrameLoaderClient callbacks, then it should be possible to add the data to the
> platform's ResourceRequest.

For this change, would you prefer that we go for a cross-platform implementation or would a Chromium-only implementation be more expedient?  I think Chromium is the only the port that needs this data today, but other ports will need it in the future if they want to implement the HTML 5 content sniffing algorithm.

(Thanks for explaining this all to me, by the way.)

Comment 10 Adam Barth 2009-03-29 09:49:47 PDT
Comment on attachment 29032 [details]
patchzor

Clearing the review flag to rework this patch.
Comment 11 Alexey Proskuryakov 2009-03-29 10:12:53 PDT
> For this change, would you prefer that we go for a cross-platform
> implementation or would a Chromium-only implementation be more expedient?

I'd suggest hunting down the person who designed the ResourceRequest class - as I was just learning the answers during this discussion, I don't feel qualified to make suggestions.
Comment 12 Adam Barth 2009-04-06 16:56:32 PDT
Morphing title to reflect new perspective.
Comment 13 Adam Barth 2009-04-06 16:58:09 PDT
Created attachment 29292 [details]
patchzor2

On #webkit, Maciej suggested just looking at the Content-Type header directly instead of looking at the computed mimeType.  This has the advantage of disentangling this check from content sniffing and should fix the issue.
Comment 14 Wan-Teh Chang 2009-04-06 18:48:29 PDT
Can we remove mimeType.isEmpty() and
equalIgnoringCase(mimeType, "application/x-unknown-content-type")
from CachedCSSStyleSheet::canUseSheet()?

These were blindly copied from Firefox source code.  Adam's change
will make the WebKit code differ more from the Firefox code, and
there will be less reason to allow mimeType.isEmpty() and
equalIgnoringCase(mimeType, "application/x-unknown-content-type").
Comment 15 Adam Barth 2009-04-06 20:56:15 PDT
We need to keep the isEmpty check for sure.  We should probably keep the other check too to keep matching Firefox perfectly.
Comment 16 Alexey Proskuryakov 2009-04-06 23:23:37 PDT
Comment on attachment 29292 [details]
patchzor2

> +    String mimeType = response().httpHeaderField("Content-Type");
>      return mimeType.isEmpty() || equalIgnoringCase(mimeType, "text/css") || equalIgnoringCase(mimeType, "application/x-unknown-content-type");

This won't allow e.g. "text/css; charset=utf-8". We have an extractMIMETypeFromMediaType() method to extract the MIME type from header value.

By the way, could you please provide some examples of sites that serve CSS resources without a Content-Type header? I'm surprised that there is a number of those.
Comment 17 Adam Barth 2009-04-07 00:07:55 PDT
(In reply to comment #16)
> By the way, could you please provide some examples of sites that serve CSS
> resources without a Content-Type header? I'm surprised that there is a number
> of those.

We originally received a report of this problem on https://myrewardzone.bestbuy.com/ (mentioned in the URL field above).  I think we successfully evangelized that site, but then we received reports of the same issue with these sites:

http://www.duels.com/
http://farecast.live.com/

This issue appears to be relatively common.  In any case, its a standards compliance issue.  Updated patch forthcoming.
Comment 18 Adam Barth 2009-04-07 00:25:53 PDT
Created attachment 29300 [details]
patch the third
Comment 19 Alexey Proskuryakov 2009-04-07 00:37:46 PDT
Do you know how exactly sniffing works in Safari now? For an empty Content-Type, I'm getting "text/css" in mimeType() - but will this patch change the behavior for sites that send e.g. "text/plain"?

Also, wouldn't a regression test for this change be helpful?
Comment 20 Adam Barth 2009-04-07 01:25:40 PDT
(In reply to comment #19)
> Do you know how exactly sniffing works in Safari now?  For an empty
> Content-Type, I'm getting "text/css" in mimeType()

I don't understand how that works.  The answer is buried somewhere in the closed-source CFNetwork.  Maciej hinted that it had to do with the file extension, but I couldn't verify that.

> - but will this patch change
> the behavior for sites that send e.g. "text/plain"?

It shouldn't change that behavior, as far as I know.  If CFNetwork is somehow magically sniffing text/css from text/plain, that violates the HTML 5 spec and is inconsistent with Firefox's behavior.

> Also, wouldn't a regression test for this change be helpful?

I could add a test, but it won't appear to be any different since we're not changing the CFNetwork behavior.  I'll add a test on the Chromium side, where we will see a difference.
Comment 21 Alexey Proskuryakov 2009-04-07 01:42:33 PDT
> It shouldn't change that behavior, as far as I know.  If CFNetwork is somehow
> magically sniffing text/css from text/plain, that violates the HTML 5 spec and
> is inconsistent with Firefox's behavior.

So, the answer is that Safari behavior might change, but not in a way that will break the Web, right?

> I could add a test, but it won't appear to be any different since we're not
> changing the CFNetwork behavior.

It's fine to add a test that checks for status quo, to avoid breaking the behavior in the future (or in other ports).
Comment 22 Adam Barth 2009-04-07 17:59:30 PDT
Comment on attachment 29300 [details]
patch the third

> So, the answer is that Safari behavior might change, but not in a way that will
> break the Web, right?

Yes.  I haven't seen a case where the Safari behavior would change, but I haven't tested the text/plain case.

> It's fine to add a test that checks for status quo, to avoid breaking the
> behavior in the future (or in other ports).

Clearing the review flag while I do this.

Thanks for being patient with me.  :)
Comment 23 Adam Barth 2009-04-07 18:51:48 PDT
Created attachment 29323 [details]
git format-patch
Comment 24 Alexey Proskuryakov 2009-04-08 00:47:33 PDT
Comment on attachment 29323 [details]
git format-patch

Testing with Safari on Mac OS X 10.5.6, this patch actually fixes a regression on the included test - since its extension is .cgi, the resource is sniffed as text/plain, so the stylesheet isn't applied in ToT.

I'd love to also have tests for a stylesheet with .css extension but no Content-Type, and for text/plain content type. Since an earlier version of this patch broke stylesheets with charset in Content-Type, we definitely need a test verifying that those work.

In failure case, Web Inspector shows a warning such as "Resource interpreted as stylesheet but transferred with MIME type text/plain." That's a lie - it should be an error, saying that that the stylesheet wasn't applied because of this issue. It might be appropriate to fix this now.

+        <...>  This test
+        isn't perfect: style.cgi sends "Content-Type: " instead of
+        not sending the header at all <..>

That's not necessarily bad - at least one of the sites you gave as a reference in comment 17 sends exactly that.

+    layoutTestController.waitUntilDone();

There shouldn't be any need to do this - onload fires before a test finishes by default in DRT.

+    // For protocols, like file:// and ftp:// that don't have a Content-Type

There is an extra comma after "protocols". Also, the comment doesn't match what the code does - even though file:// and ftp:// were likely the reason to permit an empty content type, we permit them for all protocols.

r=me with only these minor issues fixed if you'd like to be done with this bug ASAP, but please consider making more tests, and maybe fixing the Web Inspector problem.
Comment 25 Adam Barth 2009-04-08 08:28:46 PDT
Thanks Alexey.  I'll land this patch (with the fixes you suggest) because this bug is fairly high-priority in Chromium-land.  I'll do the rest of the work (more tests, Web inspector) in a follow up patch.
Comment 26 Adam Barth 2009-04-08 09:52:29 PDT
Comment on attachment 29323 [details]
git format-patch

Landed "git format-patch" in r42322.  Clearing review flag to remove from commit queue.
Comment 27 Adam Barth 2009-04-08 09:54:05 PDT
Created attachment 29337 [details]
more tests

Here are some more tests.  I couldn't figure out how to test the empty / no Content-Type header with the .css extension.  The only way I know to fake the extension is using PHP, but then I can't seem to set an empty Content-Type header.
Comment 28 Alexey Proskuryakov 2009-04-08 10:13:23 PDT
> Here are some more tests.  I couldn't figure out how to test the empty / no
> Content-Type header with the .css extension.  The only way I know to fake the
> extension is using PHP, but then I can't seem to set an empty Content-Type
> header.

See e.g. http/tests/misc/.htaccess for how this can be done with Apache.
Comment 29 Adam Barth 2009-04-08 10:34:41 PDT
(In reply to comment #28)
> See e.g. http/tests/misc/.htaccess for how this can be done with Apache.

Apache doesn't seem to like ForceType "".  It causes a 500 internal server error.  :(
Comment 30 Darin Adler 2009-04-08 11:15:29 PDT
Comment on attachment 29337 [details]
more tests

r=me on adding these tests; however, I don't know if you can land them without first landing a fix, so this may incorrectly show up in the "needs commit" queue
Comment 31 Alexey Proskuryakov 2009-04-08 11:57:43 PDT
(In reply to comment #29)
> Apache doesn't seem to like ForceType "".  

Ugh, I didn't know that. Well, one can force PHP processing for any file, I think that is done with "AddType application/x-httpd-php .css", or maybe with some AddHandler trick.
Comment 32 Adam Barth 2009-04-08 12:19:20 PDT
Additional tests landed in r42330.  I'm inclined to treat the Web inspector issue in a separate bug.  Marking as fixed.
Comment 33 Darin Adler 2009-04-24 12:33:50 PDT
See bug 25352 -- this broke bmwusa.com