Bug 17844 - Acid2 (no data): WebKit starts a download for the data in an embedded object element
Status: NEW
: WebKit
Page Loading
: 528+ (Nightly build)
: Macintosh Mac OS X 10.5
: P2 Normal
Assigned To:
: http://hixie.ch/tests/evil/acid/002-n...
: HasReduction
:
: 4911
Reported: 2008-03-14 04:48 PST by
Modified: 2010-06-10 15:46 PST (History)


Attachments
minimal test case (254 bytes, text/html)
2008-03-14 04:49 PST, Robert Blaut
Further reduction (121 bytes, text/html)
2008-03-23 00:58 PST, Cameron Zwarich (cpst)
test case with data URI (ok) (77 bytes, text/html)
2008-03-23 11:58 PST, Robert Blaut
Preliminary patch (836 bytes, patch)
2008-03-23 21:29 PST, Cameron Zwarich (cpst)
Test for user navigation (1.50 KB, application/octet-stream)
2008-03-23 21:32 PST, Cameron Zwarich (cpst)



Description From 2008-03-14 04:48:41 PST
This is not a duplicate of bug 4911. It is a brand-new Acid2 (no data) bug.

Summary:
On this page, http://hixie.ch/tests/evil/acid/002-no-data/ , there is a combination of two nested object elements:

<object data="data006"><object data="http://www.damowmow.com/404/" type="text/html">[...]</object></object>

data006 is a text/html file:

curl -I "http://hixie.ch/tests/evil/acid/002-no-data/006"
HTTP/1.1 404 Not Found
Date: Fri, 14 Mar 2008 11:28:00 GMT
Server: Apache/2.0.61 (Unix) PHP/4.4.7 mod_ssl/2.0.61 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2 SVN/1.4.2
Accept-Ranges: bytes
Vary: Accept-Encoding
X-Pingback: http://tracking.damowmow.com/
Content-Language: en-GB-x-Hixie
Content-Type: text/html; charset=utf-8

The second data resource is also a text/html file:
curl -I "http://www.damowmow.com/404/"
HTTP/1.1 404 Not Found
Date: Fri, 14 Mar 2008 11:43:16 GMT
Server: Apache/2.0.61 (Unix) PHP/4.4.7 mod_ssl/2.0.61 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2 SVN/1.4.2
Last-Modified: Mon, 11 Jul 2005 02:17:49 GMT
ETag: "ed2bb7-360-63d98d40"
Accept-Ranges: bytes
Content-Length: 864
Vary: Accept-Encoding
X-Pingback: http://tracking.damowmow.com/
Content-Language: en-GB-x-Hixie
Link: </resources/images/astrophy/16>; rel="icon"
Content-Type: text/html; charset=utf-8

So if you load the test case in WebKit, the engine should not download any file. That is not what happens in the latest WebKit nightly: WebKit downloads the 006 file.

Steps to reproduce:
1) Go to this page: http://hixie.ch/tests/evil/acid/002-no-data/
2) Notice that WebKit downloads the '006' file.

Expected behavior:
No file should be downloaded.

Current behavior:
The 006 file is downloaded.
------- Comment #1 From 2008-03-14 04:49:10 PST -------
Created an attachment (id=19759) [details]
minimal test case
------- Comment #2 From 2008-03-14 07:38:02 PST -------
Strange. Currently the 006 file is sent with Content-Type: application/x-unknown instead of text/html:

curl -I "http://hixie.ch/tests/evil/acid/002-no-data/data006"
HTTP/1.1 200 OK
Date: Fri, 14 Mar 2008 14:33:46 GMT
Server: Apache/2.0.61 (Unix) PHP/4.4.7 mod_ssl/2.0.61 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2 SVN/1.4.2
Last-Modified: Tue, 31 May 2005 12:02:17 GMT
ETag: "1a9dad7-5-c6926440"
Accept-Ranges: bytes
Content-Length: 5
X-Pingback: http://tracking.damowmow.com/
Content-Language: en-GB-x-Hixie
Content-Type: application/x-unknown
------- Comment #3 From 2008-03-23 00:54:59 PST -------
I can't reproduce this if I just host the data file somewhere else, although my hosting gives it either text/plain or text/html, not application/x-unknown. Since the problem still occurred back when it was hosted as text/html, I don't think that's the problem. The 404 doesn't seem to be special, because the problem still occurs with other 404 pages out there.

I'll try to get to the bottom of this.
------- Comment #4 From 2008-03-23 00:58:05 PST -------
Created an attachment (id=19979) [details]
Further reduction

The 404 and the fallback content are seemingly irrelevant to this bug. Here is a file without them that still reproduces the problem.
------- Comment #5 From 2008-03-23 11:24:58 PST -------
(In reply to comment #2)
> Strange. Currently 006 file is sent with Content-Type: application/x-unknown
> instead of text/html:

I think I figured this out. The page was probably never sent with text/html. In your original post you paste the following snippet of HTML:

<object data="data006"><object data="http://www.damowmow.com/404/"
type="text/html">[...]</object></object>

However, you curl'd "006", which doesn't exist, not "data006". It was downloading data006, which has type application/x-unknown, just like the corresponding data URL in the single page test.
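
The mismatch can be checked by resolving the object element's data attribute against the page URL. This is only a minimal sketch using Python's standard library; the helper name resolve_object_data is made up for illustration and is not part of WebKit or of the test page.

```python
from urllib.parse import urljoin

# Page that contains the <object data="data006"> element (from the report).
BASE = "http://hixie.ch/tests/evil/acid/002-no-data/"


def resolve_object_data(base_url: str, data_attr: str) -> str:
    """Resolve an object element's data attribute against the page URL.

    Hypothetical helper for illustration only.
    """
    return urljoin(base_url, data_attr)


# URL the object element actually loads.
fetched = resolve_object_data(BASE, "data006")
# URL that was passed to curl in the original description.
curled = "http://hixie.ch/tests/evil/acid/002-no-data/006"

print(fetched)
print(fetched == curled)  # the two URLs are different resources
```

The resolved URL ends in "data006", so the headers from the original curl call describe a different resource than the one the object element requests.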

Starting a download is still a bug, though, and I'll change the name to reflect that. 
------- Comment #6 From 2008-03-23 11:39:48 PST -------
(In reply to comment #5)

> However, you curl'd "006", which doesn't exist, not "data006". 

Uh... My fault. Sorry for the mess :(
------- Comment #7 From 2008-03-23 11:44:23 PST -------
(In reply to comment #6)
> (In reply to comment #5)
> 
> > However, you curl'd "006", which doesn't exist, not "data006". 
> 
> Uh... My fault. Sorry for the mess :(

No mess, I saw it as soon as I started debugging. ;-)

There are two ways to fix this. The first is to change the policy check code on the WebKit side, and the second is to change MainResourceLoader to ignore it in this case. I am leaning towards the second, but I will need to run some tests with other browsers to see what happens with iframes, framesets, etc.
------- Comment #8 From 2008-03-23 11:58:18 PST -------
Created an attachment (id=19984) [details]
test case with data URI (ok)

Cameron, the crucial question in this case is: why does the data URI behave differently?
------- Comment #9 From 2008-03-23 12:08:08 PST -------
(In reply to comment #8)
> Created an attachment (id=19984) [edit] [details]
> test case with data URI (ok)
> 
> Cameron, the crucial question in this case is: why does the data URI behave
> differently?

There is never a policy check on the data URI, so the issue doesn't come up. I'll try to figure out what happens instead.
------- Comment #10 From 2008-03-23 18:32:49 PST -------
The data URI version never creates a MainResourceLoader for the data URI, so it is using completely different logic. I have a fix for this, which blocks all downloads from object elements by simply ignoring the content policy from the WebFrameLoaderClient. Maciej says that WebKit may be the wrong place to make this policy, and that it may require changes in the API to allow WebKit clients to make the choice themselves.
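
As a rough model of the idea (not the patch itself, and not WebKit code), the decision could look like the sketch below. The names Policy and effective_policy are hypothetical, and mapping an overridden download to "ignore" is an assumption about how the override would behave.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical content policy values, for illustration only."""
    USE = "use"            # render the response in place
    DOWNLOAD = "download"  # hand the response to the download machinery
    IGNORE = "ignore"      # drop the response


def effective_policy(client_policy: Policy, loaded_by_object: bool) -> Policy:
    """Model of the proposed fix.

    Assumption: when the client asks to download a response that was
    loaded for an object element, the request is overridden and the
    response is ignored, so the object can show its fallback content.
    All other decisions pass through unchanged.
    """
    if loaded_by_object and client_policy is Policy.DOWNLOAD:
        return Policy.IGNORE
    return client_policy


# Object element pointing at application/x-unknown data: no download.
print(effective_policy(Policy.DOWNLOAD, loaded_by_object=True))
# Same response in a normal frame load: the client's decision stands.
print(effective_policy(Policy.DOWNLOAD, loaded_by_object=False))
```

Putting the override in one function like this also shows why it could be a client preference instead: the loaded_by_object branch is the only part that encodes policy.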

I'll post the fix when I get to school, along with some tests and examples of what other browsers do in similar situations.
------- Comment #11 From 2008-03-23 21:29:54 PST -------
Created an attachment (id=19993) [details]
Preliminary patch

Here is a preliminary patch. Maciej says that WebKit may be the wrong place to make this policy decision, and that we should possibly make it a client preference.
------- Comment #12 From 2008-03-23 21:32:59 PST -------
Created an attachment (id=19994) [details]
Test for user navigation

Here is a test case for what happens when a user navigates inside of an object element to content with application/x-unknown. Both Firefox and Opera prompt the user for a download, but my fix disables it entirely, displaying the fallback content instead. Should we match their behaviour?
------- Comment #13 From 2008-03-25 11:35:21 PST -------
I've been talking with brady about this bug over IRC, and I just wanted to make a summary. Here are the scenarios:

Scenario #1 - data of object element refers to application/x-unknown content

WebKit (with Safari) - download and display fallback content
Firefox and Opera - don't download and display fallback content

Scenario #2 - application/x-unknown content reached via navigation inside of an object element

WebKit (with Safari) - download and display fallback content
Firefox and Opera - download and don't display fallback content

Scenario #3 - src of iframe element refers to application/x-unknown content

WebKit (with Safari) - download
Firefox and Opera - download

Scenario #4 - application/x-unknown content reached via navigation inside of an iframe element

WebKit (with Safari) - download
Firefox and Opera - download

Note that we only differ on #1 and #2.
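
The four scenarios can be written down as a small table and compared mechanically. This is just a sketch of the summary above; the dictionary layout and the function name differing_scenarios are made up for illustration.

```python
# Behaviour per scenario as (downloads, shows_fallback).
# shows_fallback is None where the summary does not mention fallback
# content (the iframe scenarios).
SCENARIOS = {
    1: {"webkit": (True, True), "firefox_opera": (False, True)},
    2: {"webkit": (True, True), "firefox_opera": (True, False)},
    3: {"webkit": (True, None), "firefox_opera": (True, None)},
    4: {"webkit": (True, None), "firefox_opera": (True, None)},
}


def differing_scenarios(table):
    """Return the scenario numbers where WebKit and Firefox/Opera disagree."""
    return sorted(
        number
        for number, row in table.items()
        if row["webkit"] != row["firefox_opera"]
    )


print(differing_scenarios(SCENARIOS))  # → [1, 2]
```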
------- Comment #14 From 2008-03-25 11:44:49 PST -------
> Here are the scenarios:
> Scenario #1 - data of object element refers to application/x-unknown content
> 
> WebKit (with Safari) - download and display fallback content
> Firefox and Opera - don't download and display fallback content

The only difference here is app-level.  Firefox and Opera prompt on downloads, but Safari downloads without prompting.  Safari is the WebKit PolicyDelegate in this context and makes the download decision.  If different behavior is desired, a Safari bug should be filed at http://bugreporter.apple.com

> Scenario #2 - application/x-unknown content reached via navigation inside of an
> object element
> 
> WebKit (with Safari) - download and display fallback content
> Firefox and Opera - download and don't display fallback content

This is the interesting difference - and I'm having a hard time wrapping my head around why it is we're wrong and Opera/FFX are right.  Can someone explain to me why we shouldn't display the fallback content?
The "downloading the resource" bug popped up with Acid2, but does it cause us to fail anything?
------- Comment #15 From 2008-03-25 11:50:00 PST -------
(In reply to comment #14)
> > Scenario #2 - application/x-unknown content reached via navigation inside of an
> > object element
> > 
> > WebKit (with Safari) - download and display fallback content
> > Firefox and Opera - download and don't display fallback content
> 
> This is the interesting difference - and I'm having a hard time wrapping my
> head around why it is we're wrong and Opera/FFX are right.  Can someone explain
> to me why we shouldn't display the fallback content?

I think the idea is that if the page inside of the object element were the main document, you would still see the page after the download, so why not inside of the object element? I am not sure if this argument is really correct. It seems that either way you have to introduce some inconsistencies.

> The "downloading the resource" bug popped up with Acid2, but does it cause us
> to fail anything?

It doesn't cause a failure with Acid 2, although some people would definitely see it as a problem.
------- Comment #16 From 2008-07-16 23:40:16 PST -------
(In reply to comment #14)
> > Here are the scenarios:
> > Scenario #1 - data of object element refers to application/x-unknown content
> > 
> > WebKit (with Safari) - download and display fallback content
> > Firefox and Opera - don't download and display fallback content
> 
> The only difference here is app-level.  Firefox and Opera prompt on downloads,
> but Safari downloads without prompting.  Safari is the WebKit PolicyDelegate in
> this context and makes the download decision.  If different behavior is
> desired, a Safari bug should be filed at http://bugreporter.apple.com

Brady, Firefox and Opera don't prompt for a download on the Acid2 (no data) page. This behavior is absolutely correct. Browsers should neither download nor prompt for a download in this case. Only the fallback content should be displayed, as described in the object handling algorithm: http://www.whatwg.org/specs/web-apps/current-work/#the-object.
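
A heavily simplified model of that fallback behaviour, applied to the nested objects from the test page, might look like this. It is a sketch under assumptions, not the algorithm from the specification: the classes, the render function and the set of renderable types are all hypothetical, and a failed load or an unsupported type is simply treated as "use the fallback".

```python
from dataclasses import dataclass
from typing import Optional, Union

# Assumption: the types this toy browser can render inside an object.
RENDERABLE_TYPES = {"text/html", "text/plain", "image/png"}


@dataclass
class Response:
    status: int
    content_type: str


@dataclass
class ObjectElement:
    response: Optional[Response]
    # Fallback is either another object element or plain fallback content.
    fallback: Union["ObjectElement", str]


def render(node: Union[ObjectElement, str]) -> str:
    """Return what is displayed; never triggers a download in this model."""
    if isinstance(node, str):
        return node
    r = node.response
    if r is not None and r.status == 200 and r.content_type in RENDERABLE_TYPES:
        return f"<resource {r.content_type}>"
    # Load failed or type is not renderable: fall back to the children.
    return render(node.fallback)


# Inner object: the 404 page served as text/html.
inner = ObjectElement(Response(404, "text/html"), "[...]")
# Outer object: data006, served as application/x-unknown.
outer = ObjectElement(Response(200, "application/x-unknown"), inner)

print(render(outer))  # the innermost fallback content
```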

Cameron, do you plan to improve the patch?