Bug 10015

Summary: objects with TYPE="application/pdf" only use internal PDF image, don't use PDF plugins
Product: WebKit Reporter: Rudi Sherry <rsherry>
Component: PDF Assignee: Nobody <webkit-unassigned>
Status: RESOLVED WORKSFORME    
Severity: Normal CC: ap, ddkilzer, emacemac7, j, rsherry
Priority: P2 Keywords: InRadar
Version: 420+   
Hardware: Mac   
OS: OS X 10.4   
URL: http://acroeng.adobe.com/bugzilla/pdfbug.html

Description Rudi Sherry 2006-07-19 13:51:49 PDT
If you have an object with the property TYPE="application/pdf" and data that's a PDF file, it will image the PDF file using the native NSImage stuff.  If you remove the TYPE property, it will invoke a plug-in that handles PDF.

This happens even if you set the "WebKitOmitPDFSupport" preference.

See the URL.  The top square has the TYPE and you see it's blank; if you have Adobe Reader installed the bottom square tries to load it.  Note that since Adobe Reader 7 doesn't support embedded objects on the mac you will get an error -- but there is no way that WebKit can account (or not) for that fact, and it should try to load the plug-in if it can.

The problem is twofold:
(1) when the "type" token is read in, HTMLObjectElement::isImageType() is called to see if it's an image type... first, the plug-ins should be polled to see if they are asking to handle this type.

(2) HTMLObjectElement::isImageType() is returning true for "application/pdf" because it returns true for all the types that NSImage can handle, but it never checks the boolean preference "WebKitOmitPDFSupport" (WebKit checks it in some places but WebCore does not). As a result, "application/pdf" is considered an internal image type, and later, when the check for plug-ins would normally happen for unsupported types, the plug-ins are never polled.  If that boolean preference is set, the HTMLElement image types should specifically remove "application/pdf" from the list (as WebKit does).

Either one of the above changes would fix this, but I believe that both changes should be made.
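
For illustration only, the two proposed changes could be sketched roughly as follows. This is not WebCore's actual code: ObjectTypeContext, Handler, chooseHandler and the pollPluginsFirst flag are hypothetical names invented for the sketch, and the NSImage type list and the plug-in query are stubbed as plain data and a callback.

```cpp
#include <functional>
#include <set>
#include <string>

// Hypothetical stand-ins for state that WebCore/WebKit would supply.
struct ObjectTypeContext {
    std::set<std::string> nativeImageTypes;                 // types NSImage reports it can draw
    std::function<bool(const std::string&)> pluginHandles;  // "does any plug-in claim this type?"
    bool omitPDFSupport = false;                            // value of the boolean PDF preference
};

enum class Handler { Plugin, InternalImage, Unsupported };

// Change (2): when the preference is set, application/pdf is not an image type.
bool isImageType(const ObjectTypeContext& ctx, const std::string& mimeType) {
    if (ctx.omitPDFSupport && mimeType == "application/pdf")
        return false;
    return ctx.nativeImageTypes.count(mimeType) > 0;
}

// Change (1): optionally poll plug-ins before the image check.
// With pollPluginsFirst == false this models the behavior described above,
// where plug-ins are only consulted for types that are not internal images.
Handler chooseHandler(const ObjectTypeContext& ctx, const std::string& mimeType,
                      bool pollPluginsFirst) {
    const bool pluginWantsIt = ctx.pluginHandles && ctx.pluginHandles(mimeType);
    if (pollPluginsFirst && pluginWantsIt)
        return Handler::Plugin;
    if (isImageType(ctx, mimeType))
        return Handler::InternalImage;
    if (pluginWantsIt)
        return Handler::Plugin;
    return Handler::Unsupported;
}
```

In this sketch, either setting omitPDFSupport or passing pollPluginsFirst = true is enough to route "application/pdf" to a plug-in that claims it, which matches the claim that either change alone would fix the bug.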
Comment 1 Alexey Proskuryakov 2006-07-23 13:41:04 PDT
Confirmed.

I am not sure if (2) is a bug - this undocumented preference just disables WebPDFView support, and I wouldn't necessarily expect that it will limit image drawing, too. At any rate, this is a separate issue, so it would be better to have a separate bug about it.

The top OBJECT being empty is IMO a bug, too - it's empty because the lower left corner of the PDF is drawn, while I would expect to see the upper left corner. A few quick experiments even suggest that TOT may behave even worse (my test PDF was distorted and drawn partially outside of its box). We'll need to file a separate bug for this, too.
Comment 2 Alice Liu 2007-01-31 11:47:50 PST
<rdar://problem/4663328>
Comment 3 Dave Hyatt 2007-04-12 18:55:42 PDT
I disagree with the behavior described in (1).  Plugins should not be allowed to take over the browser's image types ever.  If we want to make a special case for PDF through the preference, then I agree we could fix (2).
Comment 4 Dave Hyatt 2007-04-12 18:56:36 PDT
I mention this since we've had bugs in the past where Quicktime ended up handling PNG if it was used in an <object> tag.  That's just awful.
Comment 5 Rudi Sherry 2007-04-13 10:00:11 PDT
(In reply to comment #4)

I think we have a philosophical difference.

I like the flexibility where I can create plug-ins to handle almost anything.  Plug-ins are used to extend and customize behavior, based on mimetype; saying that there are a certain set of mimetypes that are simply off-limits to plug-ins seems arbitrary to me, especially when we decide there are a few that need special behavior.  Indeed, we don't even know what the list is since it includes "anything an NSImage can handle" and that set of mimetypes may increase in the future.

Why is it awful that Quicktime handles PNG?  I don't know any details about what makes it awful, but I believe it's really a Quicktime problem, and it is not an indication that allowing it is bad architecture. 

Given all that, though, for performance and other practical reasons I'm willing to go along with the status quo and add the loophole specifically for application/pdf; I'd like to put my two cents in, though, to structure the bug fix such that it can someday be extended to all mime types.
Comment 6 Jake Logan 2007-04-16 22:46:28 PDT
(In reply to comment #5)
> (In reply to comment #4)
> 
> I think we have a philosophical difference.
> 
> I like the flexibility where I can create plug-ins to handle almost anything. 
> Plug-ins are used to extend and customize behavior, based on mimetype; saying
> that there are a certain set of mimetypes that are simply off-limits to
> plug-ins seems arbitrary to me, especially when we decide there are a few that
> need special behavior.  Indeed, we don't even know what the list is since it
> includes "anything an NSImage can handle" and that set of mimetypes may
> increase in the future.
> 
> Why is it awful that Quicktime handles PNG?  I don't know any details about
> what makes it awful, but I believe it's really a Quicktime problem, and it is
> not an indication that allowing it is bad architecture. 
> 
> Given all that, though, for performance and other practical reasons I'm willing
> to go along with the status quo and add the loophole specifically for
> application/pdf; I'd like to put my two cents in, though, to structure the bug
> fix such that it can someday be extended to all mime types.
> 

Rudi Sherry from Adobe says:

Reader 8 on the Mac can now work embedded in HTML documents, using the <object> or <embed> HTML elements.  There are increasing numbers of workflows using this mechanism now that there is support for bidirectional JavaScript between the PDF and the HTML DOM.

Unfortunately, any object with type="application/pdf" will not load Reader in the current Safari because WebKit gives preference to the innate PDF image type in Mac OS X; this means that until this bug is fixed, web servers won't be able to support Safari or WebKit-based applications for these workflows.

Here's my take on the issue.

The documented and expected behavior of the "type" attribute for the OBJECT tag is as follows:

This attribute specifies the content type for the data specified by data. This attribute is optional but recommended when data is specified since it allows the user agent to avoid loading information for unsupported content types. If the value of this attribute differs from the HTTP Content-Type returned by the server when the object is retrieved, the HTTP Content-Type takes precedence.

http://www.w3.org/TR/html401/struct/objects.html#adef-type-OBJECT
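
As a rough sketch of the precedence rule quoted above (the function name effectiveContentType is hypothetical, and an empty string is assumed to mean "not provided"):

```cpp
#include <string>

// Returns the content type a user agent would act on under the quoted rule:
// the HTTP Content-Type wins when the server supplies one; otherwise the
// "type" attribute is used as the hint. Empty string means "not provided".
std::string effectiveContentType(const std::string& typeAttribute,
                                 const std::string& httpContentType) {
    if (!httpContentType.empty())
        return httpContentType;
    return typeAttribute;
}
```

For example, effectiveContentType("application/pdf", "") yields "application/pdf", while a mismatched server header replaces the attribute's value.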

My thinking is that WebKit should properly support it (other browsers do, according to the definition above) and we really need it for SAP support. SAP relies heavily on the in-built scripting support inside Adobe Reader 8 and other features specific to Adobe Reader 8. Showing an 'image' or the first page of the PDF won't help SAP.

Certainly something needs to be done, since right now using the 'type' attribute causes a blank page to show in WebKit which is definitely wrong behavior. (the PDF object is present but you can't see it)
Comment 7 Dave Hyatt 2007-04-17 05:02:59 PDT
I think you're mixing up a few things with your last two comments.

(1) Properly supporting the content type as an override will not address any of your issues.  It's a bug right now that if the type attribute is omitted that we assume a plug-in should be used.  Once we close that bug, the same behavior will start applying as though the content type was specified on a type attribute.  In other words, we'll start checking the MIME type to see if it's an image.

(2) Fundamental image types and fundamental document types should not be overridable by plugins.  This would allow plugins to take over basic image rendering and even HTML document or XML document rendering.  This is roughly analogous to the way browsers will still display HTML on their own even if the OS thinks another browser is registered as the default.

(3) Is PDF a fundamental image type?  As I said in an earlier comment, in my opinion it isn't and could actually just be excluded from consideration as an image when used in an <object> tag.

Comment 8 Rudi Sherry 2007-04-17 10:35:31 PDT
I don't understand the assertion in (2); why shouldn't plug-ins have that ability?  Plain text is a fundamental type, but the OS allows you to choose what application it will launch for .txt files; why shouldn't the browser allow the same for plug-ins?  It seems by that analogy (OS selecting a browser maps to browser selecting a plug-in, HTML-handler maps to mimetype-handler) a plug-in *should* be able to override anything.  The user choosing an application that gets launched when .jpg files are double-clicked maps to the user installing a plug-in that handles "image/jpg".

That said, though, your approach with Fundamental Types is logical and consistent (and definitely better for performance).  I think that my main issue was finding that NSImage's capabilities -- whatever they happened to be on that system -- were determining Fundamental Image Types.  Perhaps we can have an explicit list of fundamental image types in WebKit, rather than asking NSImage what it can handle?

a. if it's on the list: NSImage gets it no questions asked
b. not on the list?  Ask plug-ins; if one handles it, it gets it
c. not a. or b.? Ask NSImage; if it handles it, it gets it
d. whatever we do now with non-image types
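
The order a. through d. above could be sketched like this. It is only an illustration of the proposal: TypeResolver, Owner and resolve are hypothetical names, and the plug-in and NSImage queries are stubbed as callbacks rather than real WebKit calls.

```cpp
#include <functional>
#include <set>
#include <string>

enum class Owner { NativeImage, Plugin, Other };

// Hypothetical inputs to the proposed lookup order.
struct TypeResolver {
    std::set<std::string> fundamentalImageTypes;             // explicit list (step a)
    std::function<bool(const std::string&)> pluginHandles;   // plug-in poll (step b)
    std::function<bool(const std::string&)> nsImageHandles;  // NSImage query (step c)
};

Owner resolve(const TypeResolver& r, const std::string& mimeType) {
    // a. On the explicit list: NSImage gets it, plug-ins are not asked.
    if (r.fundamentalImageTypes.count(mimeType) > 0)
        return Owner::NativeImage;
    // b. Not on the list: a plug-in that handles the type gets it.
    if (r.pluginHandles && r.pluginHandles(mimeType))
        return Owner::Plugin;
    // c. Neither a. nor b.: fall back to whatever NSImage can handle.
    if (r.nsImageHandles && r.nsImageHandles(mimeType))
        return Owner::NativeImage;
    // d. Existing handling for non-image types.
    return Owner::Other;
}
```

With "image/png" on the explicit list and "application/pdf" left off it, a plug-in claiming both would receive only the PDF, and types that only NSImage knows about would still be drawn natively.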

How does that sound?

BTW, I definitely agree with (1); it always seemed like a bug to me that the data was handled differently depending on how the mimetype was detected -- thus omitting mimetypes was, to me, always a workaround and not a solution.

Comment 9 Alexey Proskuryakov 2011-05-05 14:51:22 PDT
This doesn't occur with Safari 5.0.5 or WebKit nightlies with Adobe PDF plug-in 10.0.3 for me.