Bug 14824 - REGRESSION(r14258): +setMIMETypesShownAsHTML has no effect if MIME type is text/* (works in Safari 2, doesn't work in Safari 3)
Summary: REGRESSION(r14258): +setMIMETypesShownAsHTML has no effect if MIME type is te...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: 523.x (Safari 3)
Hardware: Mac OS X 10.4
: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar, Regression
Depends on:
Blocks:
 
Reported: 2007-07-31 05:07 PDT by Andre Pang
Modified: 2007-10-08 11:48 PDT (History)
3 users (show)

See Also:


Attachments
Proposed safe fix - for now (8.92 KB, patch)
2007-08-04 00:16 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff
Patch to fix regression from Bug 14882 (4.29 KB, patch)
2007-08-05 15:15 PDT, David Kilzer (:ddkilzer)
beidson: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andre Pang 2007-07-31 05:07:23 PDT
I am attempting to load a filename on the local filesystem with an extension of .php, e.g. foo.php.  While this filename has a .php extension, it has already been processed with PHP and only contains normal HTML and no PHP commands.  A WebView is able to load the PHP file and display it as HTML if the following Objective-C code is used in an application:

[WebView setMIMETypesShownAsHTML:[[WebView MIMETypesShownAsHTML]] arrayByAddingObject:@"text/php"]]

This works in Safari 2: WebKit interprets the .php file as HTML and renders it correctly.  This does not work when Safari 3 is installed.  In Safari 3, the .php file is viewed as plain text.

I would expect this behaviour to work as it did in Safari 2, since I have told the WebView class to specifically view the text/php MIME type as HTML.
Comment 1 Andre Pang 2007-07-31 05:14:20 PDT
I also tested this out with r15121, and the bug is present there.  So it broke somewhere between Safari 2 and r15121.

Comment 2 Mark Rowe (bdash) 2007-07-31 05:23:40 PDT
This was broken in r14258 with the switch to handling plain text files in WebCore.  DOMImplementation::createDocument checks whether the resource's mime type should be displayed as a plain text document using:

bool DOMImplementation::isTextMIMEType(const String& mimeType)
{
    if (mimeType == "application/x-javascript" ||
        (mimeType.startsWith("text/") && mimeType != "text/html" &&
         mimeType != "text/xml" && mimeType != "text/xsl"))
        return true;
    
    return false;
}

MIME types not recognized as text, images, or plugins, are treated as HTML.  This is clearly not aware of any MIME types registered in WebKit via +[WebView setMIMETypesShownAsHTML:], so any MIME type starting with "text/" will be displayed as plain text.  +setMIMETypesShownAsHTML: needs to somehow be hooked into the logic of DOMImplementation::createDocument.
Comment 3 Mark Rowe (bdash) 2007-07-31 05:27:11 PDT
<rdar://problem/5372989>
Comment 4 Brady Eidson 2007-07-31 10:00:54 PDT
I got this, should be an easy fix.
Comment 5 Brady Eidson 2007-07-31 10:40:09 PDT
Hmmmm, it might be more complicated that I originally thought - I'd assumed this was an easy fix with the MIMETypeRegistry but it turns out the WebKit API concept of "MIME types displayable as HTML documents" is entirely contained within WebKit and is based on the view/representation classes.

I still think the MIMETypeRegistry is the solution to the issue, but there's a disjoint between WebCore and WebKit here.

WebCore, by default, looks for "text/html" *only*.
WebKit has some preset list of MIMETypes that will "show as HTML" - text/html is obviously included, but I need to explore and see what the others are.
Comment 6 Brady Eidson 2007-07-31 10:47:44 PDT
In an extremely round-about manner, WebKit's list of MIMEtypes that show as HTML corresponds to the MIMETypeRegistry::isSupportedNonImageMIMEType() list

That list includes many text/* mime types, including -
      "text/html",
      "text/xml",
      "text/xsl",
      "text/plain",
      "text/"

That list includes things we want to create a TextDocument for, and not an HTMLDocument.  But a TextDocument in WebCore speak still displays as an HTML document via a WebHTMLView in WebKit-land.

So we're in a dill of a pickle here - how to maintain the dual-concept of "Shows as HTML" at the WebKit API level, and "TextDocument vs HTMLDocument" down in WebCore?
Comment 7 Mark Rowe (bdash) 2007-08-01 04:50:29 PDT
What does the Windows port do with this?  It seems to me that we should be pushing most of this logic down into WebCore so that we aren't making two similar decisions based on different information at the different layers.
Comment 8 Brady Eidson 2007-08-01 17:19:57 PDT
It turns out the logic *is* all in WebCore - WebKitMac just takes an extremely round about way of arriving at that.

I had a patch almost complete before the simple fix fell apart.  There's a big problem here.

WebCore::MIMETypeRegistry has the concept of "supportedNonImageMIMETypes" and all of those are considered to be "as HTML".  This set of mime types is what WebKit ends up using when managing this API.

But then FrameLoader and/or DOMImplementation makes certain hard coded decisions without calling to the MIMETypeRegistry.  Mark already pointed this out.

This includes assuming that any "text/*" type besides "text/html" should display as text.  That decision overrides the "as-html" registration.

I think theres a few fixes that can be made here.
1 - "cosmetically", collapse all the roundaboutness WebKitMac uses to connect the API concept to the MIMETypeRegistry
2 - Find a clean way to move the concept of "text/* should be text" from hard coded DOMImplementation logic into the MIMETypeRegistry. 

There's a few details that make this kind of complex - namely the fact that "text/plain" and "text/" are in this hard coded list that are considered "HTML" mimetypes by the WebKit API, but are hard coded to be considered text mime types by WebCore.

=/
Comment 9 Brady Eidson 2007-08-01 20:57:37 PDT
This WebKit code is *sooo* convoluted, I can't believe it.  Definitely leftovers from the old way of think about what responsibilities WebKit has. 

An extensive reworking is straightforward but now still might not be the time for it.
Comment 10 Mark Rowe (bdash) 2007-08-01 21:01:07 PDT
As it stands this API that worked fine in Tiger is rather broken.  Is there some less-than-perfect solution we can come up with that gets things working in the mean time so we can leave the cleanup until a more appropriate time?
Comment 11 Brady Eidson 2007-08-01 22:02:41 PDT
Oh, beautiful.

I just ran MiniBrowser against ToT WebKit and compared results to Safari 2 WebKit

In Safari 2, the default set of MIMETypes in the [WebView MIMETypesShownAsHTML] list are:
text/html
application/atom+xml
multipart/x-mixed-replace
application/xml
application/xhtml+xml
type text/xsl
text/xml
application/rss+xml
application/x-webarchive

ToT (incorrectly?) adds text/ and text/plain, as well as application/x-javascript

I think if we restore the Tiger version of that list, fixing this bug will be a LOT easier
Comment 12 Brady Eidson 2007-08-01 22:14:46 PDT
Confirmed - with the same minibrowser test - that in Safari 2 you could remove text/html from the list and have html displayed as plain text.  You can no longer do this in ToT.  Yet another regression.
Comment 13 Brady Eidson 2007-08-01 22:30:39 PDT
There's two warring concepts here.
"Non-image MIMETypes we support"
and
"Non-image MIMETypes we display as HTML"

At some point, these two concepts were inadvertently merged, which is really the crux of making this fix difficult to do correctly.

That is also the reason why you can no longer remove text/html as a supported type and force it to display as text.
Comment 14 Brady Eidson 2007-08-02 13:55:17 PDT
Ugh, even more bugs found.

Running the following snippet - 
    NSLog(@"Before -\n%@", [WebView MIMETypesShownAsHTML]);
    [WebView setMIMETypesShownAsHTML:[[WebView MIMETypesShownAsHTML] arrayByAddingObject:@"text/php"]];
    NSLog(@"After - \n%@", [WebView MIMETypesShownAsHTML]);

Results in the following output -
2007-08-02 13:53:33.198 Safari[14732:10b] Before -
(
    "text/html",
    "application/x-ftp-directory",
    "application/atom+xml",
    "multipart/x-mixed-replace",
    "application/xml",
    "application/xhtml+xml",
    "text/xsl",
    "text/xml",
    "image/svg+xml",
    "application/rss+xml",
    "application/x-webarchive",
    "application/x-javascript"
)
2007-08-02 13:53:33.198 Safari[14732:10b] After - 
(
    "application/atom+xml",
    "image/vnd.adobe.photoshop",
    "image/targa",
    "multipart/x-mixed-replace",
    "image/gif",
    "application/xml",
    "image/pict",
    "image/x-quicktime",
    "image/sgi",
    "image/fpx",
    "text/xsl",
    "image/bmp",
    "application/rss+xml",
    "image/png",
    "image/jp2",
    "image/jpeg",
    "application/x-webarchive",
    "text/html",
    "application/x-ftp-directory",
    "image/x-canon-crw",
    "application/xhtml+xml",
    "image/x-xbitmap",
    "text/xml",
    "image/tiff",
    "image/svg+xml",
    "text/php",
    "image/pjpeg",
    "application/x-javascript"
)

All of these additional types are magically being added to the "shownAsHTML" list by the mere procedure of setting that list
Comment 15 Brady Eidson 2007-08-02 14:10:38 PDT
This gets even worse - 
Because of how [WebView setMIMETypesShownAsHTML:] is made, it only stores the mimetypes locally in WebKit.

It's becoming clear that the link between WebCore MIMETypeRegistry and Webkit was not designed with the intention for the MIMETypeRegistry to ever be mutated - WebKit constructs a list of mimetypes from the registry exactly once.  It then never passes any changes down into the WebCore level, and never asks WebCore for the list again.

*sigh*
Comment 16 Brady Eidson 2007-08-04 00:16:14 PDT
Created attachment 15836 [details]
Proposed safe fix - for now

Coded this one up on the train to Seattle when I was internet-less.

I think this is an extremely safe and simple fix for the bug in question.  As it stands, this API is COMPLETELY broken.  This fix will at least re-enable the ability to register mime types as "shown as html" without breaking any other quirks.

There are many other things wrong with the current situation with this API that can be resolved later when we're not quite as locked down in what types of fixes we're considering
Comment 17 Oliver Hunt 2007-08-04 00:30:47 PDT
Comment on attachment 15836 [details]
Proposed safe fix - for now

Replace the if in MIMETypeRegistry::shouldTreatAsText with a straight return

make the changelog readable, and r=me
Comment 18 Brady Eidson 2007-08-04 00:34:51 PDT
Landed in r24866

Need to do some bugzilla paperwork to file bugs on the remaining issues, but for now its bedtime!
Comment 19 David Kilzer (:ddkilzer) 2007-08-04 11:08:50 PDT
This caused a regression when viewing text/plain documents in Safari.  See Bug 14882.

Comment 20 David Kilzer (:ddkilzer) 2007-08-04 19:11:57 PDT
Fixed by beidson in r24866 (with comment tweak in r24867).
Comment 21 Brady Eidson 2007-08-04 22:07:23 PDT
http://trac.webkit.org/projects/webkit/changeset/24869 TOTALLY rebroke this

Comment 22 David Kilzer (:ddkilzer) 2007-08-05 15:15:01 PDT
Created attachment 15845 [details]
Patch to fix regression from Bug 14882

Proposed patch to fix the regression caused by Bug 14882:

- Reverts changes from patch in Bug 14882 to fix this bug.
- Changes +[WebDataSource _repTypesAllowImageTypeOmission:] and +[WebFrameView _viewTypesAllowImageTypeOmission:] to include now-missing "text/" and "text/plain" MIME types so that text/plain files may be viewed in the browser.
Comment 23 Mark Rowe (bdash) 2007-08-07 14:09:55 PDT
Comment on attachment 15836 [details]
Proposed safe fix - for now

Clearing the review flag as this was already landed.
Comment 24 Brady Eidson 2007-10-08 11:41:13 PDT
Comment on attachment 15845 [details]
Patch to fix regression from Bug 14882

What does this patch do?

It feels like it is more focused on solving the regression I caused re: downloading text files, but that regression has already been fixed.

I can't see how, with this patch, text mime types *can* be registered as HTML, and thats the bug that remains to be solved here.

If I'm completely off base and theres an explanation how, I will totally re-review :)
Comment 25 David Kilzer (:ddkilzer) 2007-10-08 11:48:45 PDT
(In reply to comment #24)
> (From update of attachment 15845 [details] [edit])
> What does this patch do?
> 
> It feels like it is more focused on solving the regression I caused re:
> downloading text files, but that regression has already been fixed.

Yes, it was intended to fix the regression.  Closing since it has already been fixed.