25939 – [GTK] Text attachments in bugzilla are downloaded instead of being opened

RESOLVED FIXED 25939

[GTK] Text attachments in bugzilla are downloaded instead of being opened

https://bugs.webkit.org/show_bug.cgi?id=25939

Summary [GTK] Text attachments in bugzilla are downloaded instead of being opened

Xan Lopez

Reported 2009-05-21 10:11:29 PDT

Title says it all pretty much. One example is attachment in comment #4 here: http://bugzilla.gnome.org/show_bug.cgi?id=583180

Attachments
Don't bailout if content-type is text/plain so we can setup the headers correctly (3.49 KB, patch) 2009-06-13 06:40 PDT, Jan Alonzo	no flags	Details Formatted Diff Diff
updated patch to sniff text/plain data too - but only once we have the data (2.92 KB, patch) 2009-06-13 17:32 PDT, Jan Alonzo	no flags	Details Formatted Diff Diff
update patch with updated comments and changelog (3.33 KB, patch) 2009-06-14 01:22 PDT, Jan Alonzo	abarth: review-	Details Formatted Diff Diff
remove workaround and rely on libsoup to do the sniffing (2.95 KB, patch) 2009-07-03 05:30 PDT, Jan Alonzo	gustavo: review-	Details Formatted Diff Diff
Show Obsolete (3) View All Add attachment proposed patch, testcase, etc.

Jan Alonzo

Comment 1 2009-06-06 02:52:25 PDT

*** Bug 25400 has been marked as a duplicate of this bug. ***

Jan Alonzo

Comment 2 2009-06-13 06:40:39 PDT

Created attachment 31233 [details] Don't bailout if content-type is text/plain so we can setup the headers correctly We should only bail out of the got-headers callback if the content is not modified and we don't have a content-type. At the moment we are also bailing out if content-type is text/plain, hence we don't get to setup the headers correctly for text/plain data. I've stumbled on this while trying to enable more tests in http/tests/navigation. I've also removed a couple more tests and can probably serve as tests for this patch. I've tested with the launcher and works fine and can view text/plain now :)

Holger Freyther

Comment 3 2009-06-13 07:56:15 PDT

Thanks, this will allow me to use epiphany for reviewing. I will let kov review this one.

Xan Lopez

Comment 4 2009-06-13 08:17:10 PDT

Comment on attachment 31233 [details] Don't bailout if content-type is text/plain so we can setup the headers correctly > diff --git a/WebCore/platform/network/soup/ResourceHandleSoup.cpp b/WebCore/platform/network/soup/ResourceHandleSoup.cpp > index 6be13e2..04725b4 100644 > --- a/WebCore/platform/network/soup/ResourceHandleSoup.cpp > +++ b/WebCore/platform/network/soup/ResourceHandleSoup.cpp > @@ -151,7 +151,7 @@ static void fillResponseFromMessage(SoupMessage* msg, ResourceResponse* response > response->setHTTPHeaderField(name, value); > > GHashTable* contentTypeParameters = 0; > - String contentType = soup_message_headers_get_content_type(msg->response_headers, &contentTypeParameters); > + String contentType = String::fromUTF8(soup_message_headers_get_content_type(msg->response_headers, &contentTypeParameters)); This seems to be unrelated, and probably not needed? > > // When the server sends multiple Content-Type headers, soup will > // give us their values concatenated with commas as a separator; > @@ -255,8 +255,8 @@ static void gotHeadersCallback(SoupMessage* msg, gpointer data) > // headers; we will not do content sniffing for 304 responses, > // though, since they do not have a body. > const char* contentType = soup_message_headers_get_content_type(msg->response_headers, NULL); > - if ((msg->status_code != SOUP_STATUS_NOT_MODIFIED) > - && (!contentType || !g_ascii_strcasecmp(contentType, "text/plain"))) > + > + if (msg->status_code != SOUP_STATUS_NOT_MODIFIED && !contentType) > return; > > RefPtr<ResourceHandle> handle = static_cast<ResourceHandle*>(data); If you do a git blame you'll see you are basically un-doing what kov did to fix https://bugs.webkit.org/show_bug.cgi?id=24930, so I'd say this does not seem right.

Xan Lopez

Comment 5 2009-06-13 08:21:42 PDT

And for what is worth, IIRC the problem with this bug was text subtypes, not text/plain itself (that's why you can see some patches in bugzilla, but not those marked as patches I think). The actual fix was something like Christian tried on bug https://bugs.webkit.org/show_bug.cgi?id=24903.

Jan Alonzo

Comment 6 2009-06-13 16:32:24 PDT

(In reply to comment #4) > > @@ -151,7 +151,7 @@ static void fillResponseFromMessage(SoupMessage* msg, ResourceResponse* response > > response->setHTTPHeaderField(name, value); > > > > GHashTable* contentTypeParameters = 0; > > - String contentType = soup_message_headers_get_content_type(msg->response_headers, &contentTypeParameters); > > + String contentType = String::fromUTF8(soup_message_headers_get_content_type(msg->response_headers, &contentTypeParameters)); > > This seems to be unrelated, and probably not needed? I think I meant to put this in a separate patch. We have a bug https://bugs.webkit.org/show_bug.cgi?id=17432 that should avoid using String(char*) in Gtk and instead we should be using String::fromUTF8. > > > > > // When the server sends multiple Content-Type headers, soup will > > // give us their values concatenated with commas as a separator; > > @@ -255,8 +255,8 @@ static void gotHeadersCallback(SoupMessage* msg, gpointer data) > > // headers; we will not do content sniffing for 304 responses, > > // though, since they do not have a body. > > const char* contentType = soup_message_headers_get_content_type(msg->response_headers, NULL); > > - if ((msg->status_code != SOUP_STATUS_NOT_MODIFIED) > > - && (!contentType || !g_ascii_strcasecmp(contentType, "text/plain"))) > > + > > + if (msg->status_code != SOUP_STATUS_NOT_MODIFIED && !contentType) > > return; > > > > RefPtr<ResourceHandle> handle = static_cast<ResourceHandle*>(data); > > If you do a git blame you'll see you are basically un-doing what kov did to fix > https://bugs.webkit.org/show_bug.cgi?id=24930, so I'd say this does not seem > right. The patch in bug #24930 comment #7 probably seemed right at that time but looking at it now, it became a blanket solution for text/plain. What we really want is to just avoid displaying content that we can't display eventhough server sends a Content-type: text-plain. And I think we can do that in navigation-requested signal right (please correct me if i'm wrong)?

Jan Alonzo

Comment 7 2009-06-13 16:36:01 PDT

(In reply to comment #5) > And for what is worth, IIRC the problem with this bug was text subtypes, not > text/plain itself (that's why you can see some patches in bugzilla, but not > those marked as patches I think). The actual fix was something like Christian > tried on bug https://bugs.webkit.org/show_bug.cgi?id=24903. Thanks for pointing this out. I'd like to fix our issue with text/plain here and leave the media subtypes issue in bug #24903, as it's affecting some of the tests I want to run (if that's ok with you =)

Jan Alonzo

Comment 8 2009-06-13 16:43:50 PDT

(In reply to comment #6) > > The patch in bug #24930 comment #7 probably seemed right at that time but > looking at it now, it became a blanket solution for text/plain. What we really > want is to just avoid displaying content that we can't display eventhough > server sends a Content-type: text-plain. And I think we can do that in > navigation-requested signal right (please correct me if i'm wrong)? Sorry, I meant "decide policy for mime type" handlers.

Jan Alonzo

Comment 9 2009-06-13 17:32:15 PDT

Created attachment 31246 [details] updated patch to sniff text/plain data too - but only once we have the data Updated patch to bring back the fix in bug #24930 but only sniff once we have the data (i.e. in the got-chunk callback) and if the content-type in the header is text/plain. This also removes the blanket solution we're currently applying for text/plain. One caveat of this update is the tests I've enabled now fail. But those will be fixed once we have the text/ media subtypes issue fixed, which is discussed in a separate bug report.

Jan Alonzo

Comment 10 2009-06-14 01:22:29 PDT

Created attachment 31256 [details] update patch with updated comments and changelog This is almost the same patch except for updated inline comments and the use of g_content_type_equals instead of g_strcasecmp.

Adam Barth

Comment 11 2009-06-14 14:51:36 PDT

Comment on attachment 31256 [details] update patch with updated comments and changelog You really shouldn't sniff from text/plain. That causes big security problems.

Adam Barth

Comment 12 2009-06-14 14:53:11 PDT

Instead, we should use the HTML 5 content sniffing algorithm. See Bug 25064.

Jan Alonzo

Comment 13 2009-06-15 03:43:54 PDT

(In reply to comment #12) > Instead, we should use the HTML 5 content sniffing algorithm. See Bug 25064. Thanks for the review. Making it dependent on 25064.

Gustavo Noronha (kov)

Comment 14 2009-06-15 14:42:22 PDT

(In reply to comment #13) > (In reply to comment #12) > > Instead, we should use the HTML 5 content sniffing algorithm. See Bug 25064. > > Thanks for the review. Making it dependent on 25064. > Though by doing nothing we _are_ still sniffing text/plain. So my suggestion is this: sniff text/plain, and if the type resulting from the sniffing starts with "text/", do not change what we received (keep text/plain). If the type is something else, make it application/octet-stream. This is better than what we currently have security-wise, and gives us a simple solution while we do not get the mime types list updated, and have no HTML5 sniffing.

Adam Barth

Comment 15 2009-06-15 16:11:07 PDT

> Though by doing nothing we _are_ still sniffing text/plain. So my suggestion is > this: sniff text/plain, and if the type resulting from the sniffing starts with > "text/", do not change what we received (keep text/plain). If the type is > something else, make it application/octet-stream. That sounds like a big hack. > This is better than what we > currently have security-wise, and gives us a simple solution while we do not > get the mime types list updated, and have no HTML5 sniffing. If you're mainly concerned about text/plain <=> application/octet-stream, you can implement the text-or-binary algorithm, which is quite simple. Is there a big rush for resolving this issue? I think the best solution is to wait for Bug 25064 to be fixed.

Gustavo Noronha (kov)

Comment 16 2009-06-16 06:41:21 PDT

(In reply to comment #15) > > Though by doing nothing we _are_ still sniffing text/plain. So my suggestion is > > this: sniff text/plain, and if the type resulting from the sniffing starts with > > "text/", do not change what we received (keep text/plain). If the type is > > something else, make it application/octet-stream. > > That sounds like a big hack. Well, our whole sniffing code is a big hack. > > This is better than what we > > currently have security-wise, and gives us a simple solution while we do not > > get the mime types list updated, and have no HTML5 sniffing. > > If you're mainly concerned about text/plain <=> application/octet-stream, you > can implement the text-or-binary algorithm, which is quite simple. Yes, that's the main concern. Do you have a pointer on how to get this algorithm implemented? > Is there a big rush for resolving this issue? I think the best solution is to > wait for Bug 25064 to be fixed. Somewhat. We do snapshot releases quite often, and people are already using the code in preview releases of their software. If nothing else, we need to at least stop sniffing text/plain unconditionally, I think. The whole sniffing hack got written because we had no proper content sniffing in our http library, nor in webkit, and people were getting numerous download requests when, say, flash content was sent as application/octet-stream. What we need is really a quick fix that, even if not ideal, works well enough while we wait for proper sniffing to happen in either libsoup (I'm working on this), or webkit.

Adam Barth

Comment 17 2009-06-16 11:34:13 PDT

> Yes, that's the main concern. Do you have a pointer on how to get this > algorithm implemented? The algorithm is super simple. You can see it specified here: http://tools.ietf.org/html/draft-abarth-mime-sniff-01#section-4 or implemented here in the "LooksBinary" function here: http://src.chromium.org/viewvc/chrome/trunk/src/net/base/mime_sniffer.cc > Somewhat. We do snapshot releases quite often, and people are already using the > code in preview releases of their software. If nothing else, we need to at > least stop sniffing text/plain unconditionally, I think. I'd be happy to review a patch that does that. > What we need is really a quick fix that, even if not ideal, works well enough > while we wait for proper sniffing to happen in either libsoup (I'm working on > this), or webkit. Do you have a pointer to the libsoup discussion? I'd like to make sure it implements the HTML 5 algorithm.

Gustavo Noronha (kov)

Comment 18 2009-06-16 11:41:21 PDT

(In reply to comment #17) > > Yes, that's the main concern. Do you have a pointer on how to get this > > algorithm implemented? > > The algorithm is super simple. You can see it specified here: Thanks for the pointers! > > What we need is really a quick fix that, even if not ideal, works well enough > > while we wait for proper sniffing to happen in either libsoup (I'm working on > > this), or webkit. > > Do you have a pointer to the libsoup discussion? I'd like to make sure it > implements the HTML 5 algorithm. Implementing the HTML5 algorithm is the idea =): http://bugzilla.gnome.org/show_bug.cgi?id=572589

Jan Alonzo

Comment 19 2009-07-03 05:30:49 PDT

Created attachment 32236 [details] remove workaround and rely on libsoup to do the sniffing

Gustavo Noronha (kov)

Comment 20 2009-07-04 16:03:47 PDT

Comment on attachment 32236 [details] remove workaround and rely on libsoup to do the sniffing > + Remove workarounds for content type sniffing and rely on libsoup's > + content sniffer (now that it has one). This is not enough. We need to actually add the sniffer feature to the session, and use the new signal. I have a now-outdated webkitgtk+ patch at http://bugzilla.gnome.org/show_bug.cgi?id=572589 which I was planning to update and propose for reviewing. I'll do it tomorrow, but if you'd like to beat me to it, go ahead =D.

Jan Alonzo

Comment 21 2009-07-07 03:49:48 PDT

Fix landed in 45558. Related bug: https://bugs.webkit.org/show_bug.cgi?id=26982

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware PC

OS Linux

Product WebKit

Component WebKitGTK

Assignee

Gustavo Noronha (kov)

Reported

2009-05-21 10:11 PDT

Modified

2009-07-07 03:49 PDT History

CC List

5 users Show

URL

Keywords

Duplicates (1)

25400 View as bug list

Depends on

25064

Blocks

Dependencies

tree graph