Bug 27267 - HTTP Accept header gives preference of XML over HTML
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc.
Version: 528+ (Nightly build)
Hardware: PC All
Importance: P2 Normal
Assignee: Nobody
URL: http://newmediacampaigns.com/page/web...
Keywords:
Duplicates: 41914
Depends on:
Blocks:
 
Reported: 2009-07-14 10:33 PDT by Kris Jordan
Modified: 2011-03-10 17:05 PST
CC List: 10 users

Attachments
Patch to fix the bug with solution described in bug thread. (2.02 KB, patch)
2011-02-15 12:37 PST, Kris Jordan
Patch with Tests (3.08 KB, patch)
2011-02-15 12:53 PST, Kris Jordan

Description Kris Jordan 2009-07-14 10:33:07 PDT
For developers who want to use HTTP to implement RESTful content negotiation, where a resource can be represented as either HTML or XML, WebKit unhelpfully prefers XML over HTML. Maciej has acknowledged this error:

"Most WebKit-based browsers (and Safari in particular) would probably do a better job rendering HTML than XHTML or generic XML, if only because the code paths are much better tested. So the Accept header is somewhat in error."

Here is a demo URL that opens as XML in WebKit and as HTML in Firefox 3+:
http://recessframework.org/demo/content-negotiation/tweet

The accept header appears to be defined on line 200 of WebCore/loader/FrameLoader.cpp

I recommend two alternatives:

Copy Firefox 3.5's: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
or
Simplify: text/html,*/*
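
To illustrate what the two proposed headers say, here is a minimal Python sketch that reads the q-values out of an Accept string. The helper name parse_accept is hypothetical and not part of WebKit. It ignores every parameter other than q, which is a simplification.

```python
def parse_accept(header):
    """Return a list of (media_range, q) pairs in header order."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0]
        q = 1.0  # in HTTP, a missing q parameter means q=1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        entries.append((media_range, q))
    return entries


# The two alternatives proposed above.
FIREFOX_35 = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
SIMPLIFIED = "text/html,*/*"

firefox_prefs = dict(parse_accept(FIREFOX_35))
simplified_prefs = dict(parse_accept(SIMPLIFIED))

print(firefox_prefs["text/html"], firefox_prefs["application/xml"])  # 1.0 0.9
print(simplified_prefs)  # {'text/html': 1.0, '*/*': 1.0}
```

With the Firefox 3.5 string, text/html (q=1) outranks application/xml (q=0.9). With the simplified string, application/xml only matches */* and ties with text/html on q, so the outcome would depend on how a given server breaks ties.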

For more background information please see:
http://newmediacampaigns.com/page/browser-rest-http-accept-headers
http://newmediacampaigns.com/page/webkit-team-admits-accept-header-error

Thanks!
Comment 1 Mark Rowe (bdash) 2009-07-14 12:30:57 PDT
This could be considered to be a duplicate of bug 12296.
Comment 2 Maciej Stachowiak 2009-07-14 16:31:41 PDT
bug 12296 made a specific proposal for how to change Accept, but I think following the lead of Firefox 3.5 would be better than the suggested header in that bug. So not quite a duplicate.
Comment 3 Matt Bishop 2009-07-29 17:15:52 PDT
FWIW, here are some other default ACCEPT headers:

IE8: image/gif, image/jpeg, image/pjpeg, image/pjpeg, application/x-shockwave-flash, application/xaml+xml, application/vnd.ms-xpsdocument, application/x-ms-xbap, application/x-ms-application, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/x-silverlight, */*

Opera: text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1

Chrome: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
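
As a rough illustration of how these headers play out, the Python sketch below models a server that can offer either text/html or application/xml and picks whichever has the higher q-value. The functions parse_accept, quality and negotiate are hypothetical names. The tie-breaking rule (first offered type wins) is an assumption, and real servers differ. Parameters other than q are ignored.

```python
def parse_accept(header):
    """Return a list of (media_range, q) pairs in header order."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        q = 1.0  # a missing q parameter means q=1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        entries.append((fields[0], q))
    return entries


def quality(media_type, accept_header):
    """q-value of the most specific media range that matches media_type."""
    main_type = media_type.split("/")[0]
    best_rank, best_q = -1, 0.0
    for media_range, q in parse_accept(accept_header):
        if media_range == media_type:
            rank = 2  # exact type/subtype match
        elif media_range == main_type + "/*":
            rank = 1  # subtype wildcard
        elif media_range == "*/*":
            rank = 0  # full wildcard
        else:
            continue
        if rank > best_rank:
            best_rank, best_q = rank, q
    return best_q


def negotiate(offered, accept_header):
    """Pick the offered type with the highest q; the first offered wins ties (assumption)."""
    return max(offered, key=lambda t: quality(t, accept_header))


OPERA = ("text/html, application/xml;q=0.9, application/xhtml+xml, image/png, "
         "image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1")
CHROME = ("application/xml,application/xhtml+xml,text/html;q=0.9,"
          "text/plain;q=0.8,image/png,*/*;q=0.5")

offered = ["text/html", "application/xml"]
print(negotiate(offered, OPERA))   # text/html
print(negotiate(offered, CHROME))  # application/xml
```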
Comment 4 Sven-S. Porst 2010-04-28 07:10:17 PDT
I had trouble with Safari on author URLs on the arXiv preprint server:

e.g. http://arxiv.org/a/bahns_d_1

While this works perfectly in Firefox (returning and displaying HTML content, presumably because text/html is the first entry in its Accept header), it gives a poor experience in Safari: the server takes the application/xml entry in the Accept header as a hint that we'd really prefer XML/Atom content and redirects to the Atom feed. That, in turn, causes Safari not to display the page at all on my machine but to launch my separate RSS reader and open the feed there. In other words, it results in rather annoying behaviour.

Just in case you needed another example where the difference in Accept headers causes real-world differences in the results seen by the user.
Comment 5 Alexey Proskuryakov 2010-07-09 11:47:49 PDT
*** Bug 41914 has been marked as a duplicate of this bug. ***
Comment 6 yustin.knows 2011-02-15 03:30:31 PST
This bug has been around for quite a while and was last updated half a year ago.

As a backend developer, I ran into the same problem lately and had to resort to some seriously ugly hacks on my server that change the Accept header received from a WebKit request into something more decent.

Has there been any progress on this problem?

Are there any plans to implement a better accept-header in webkit in the near future?

Regards,
Yustin
Comment 7 Alexey Proskuryakov 2011-02-15 10:28:13 PST
I think that there is consensus about making the Accept string match Firefox again (the current one matches an older version). Someone just needs to do the work: <http://www.webkit.org/coding/contributing.html>.
Comment 8 Kris Jordan 2011-02-15 11:17:58 PST
I'll take a stab here shortly.

(In reply to comment #7)
> I think that there is consensus about making the Accept string match Firefox again (the current one matches an older version). Someone just needs to do the work: <http://www.webkit.org/coding/contributing.html>.
Comment 9 Kris Jordan 2011-02-15 12:37:47 PST
Created attachment 82505 [details]
Patch to fix the bug with solution described in bug thread.

The default Accept header now mirrors Firefox's. The meaningful change is that 'text/html' is now preferred over 'application/xml'.
Comment 10 James Leigh 2011-02-15 12:44:04 PST
Thanks for taking this on, Kris. The file LayoutTests/http/tests/misc/xhtml-expected.txt will need to be changed as well. This test calls xhtml.php, which echoes the Accept header back to the client. The header needs to match the expected-results file above for the test to pass.
Comment 11 Alexey Proskuryakov 2011-02-15 12:46:01 PST
The link I gave above (<http://www.webkit.org/coding/contributing.html>) mentions running layout tests, as well as some other necessary steps.
Comment 12 Kris Jordan 2011-02-15 12:53:24 PST
Created attachment 82508 [details]
Patch with Tests

The expected result of the xhtml test is now included in the patch.
Comment 13 yustin.knows 2011-03-10 08:28:43 PST
So there is a patch now. Good. :)

Any chance of getting this patch into an official release anytime soon?

The bug is still in the 'NEW' Status.
Comment 14 Alexey Proskuryakov 2011-03-10 08:58:22 PST
Please mark the patch for review by clicking the Details link to the right of it. Otherwise, it looks fine at first glance.
Comment 15 Kris Jordan 2011-03-10 09:16:49 PST
Marked for review (I believe). Let me know if I did not do that correctly.
Comment 16 Alexey Proskuryakov 2011-03-10 09:55:20 PST
It should be r?, not r+.
Comment 17 Kris Jordan 2011-03-10 10:00:01 PST
That makes sense. Sorry, first time through. Updated accordingly.
Comment 18 Alexey Proskuryakov 2011-03-10 10:08:54 PST
Comment on attachment 82508 [details]
Patch with Tests

I don't understand the change in XHTMLMP case. Why downgrade application/xml, but not application/xhtml+xml?

I suggest leaving XHTMLMP unchanged, for people who care about it (if any) to consider consequences of changing the Accept string.
Comment 19 James Leigh 2011-03-10 10:46:08 PST
My understanding is that XHTMLMP is not used in any products, so this is purely theoretical and I only post it as an FYI. According to http://www.passani.it/gap/#MIME_TYPE application/xml is not a recommended content type for mobile. It also says application/xhtml+xml is a better choice than text/html. The current patch (id=82508) is consistent with best practices as of April 2010.
Comment 20 Kris Jordan 2011-03-10 11:02:45 PST
(In reply to comment #18)
> (From update of attachment 82508 [details])
> I don't understand the change in XHTMLMP case. Why downgrade application/xml, but not application/xhtml+xml?

Two motivations:

1) The heart of this issue is making sure application/xml is downgraded below text/html. How text/html and application/xhtml+xml are arranged is, relatively, inconsequential. I did not want to change any more than was necessary to resolve the fundamental issue of this ticket.

2) (This is *much* more nitpicking and *much* less important.) "application/xhtml+xml;profile='http://www.wapforum.org/xhtml'" is not equivalent to "application/xhtml+xml". It implies the server should only serve application/xhtml+xml if it can do so with that specific WAP profile. Technically, by the HTTP spec, the XHTMLMP Accept header prioritizes a generic application/xhtml+xml below application/xml, because the generic type only matches */*. In practice, servers often ignore the profile parameter.

> I suggest leaving XHTMLMP unchanged, for people who care about it (if any) to consider consequences of changing the Accept string.

I don't know who or what uses the XHTMLMP case, and I don't have a problem leaving it alone, but I do believe it is a win to get HTML > XML in any case. Let me know if you need me to resubmit sans the XHTMLMP change for this to continue making progress.
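
Point 2 above can be sketched in Python under a strict reading of HTTP media-range matching, where a range that carries parameters only matches a type with the same parameters. The function names are hypothetical. EXAMPLE_HEADER is an illustrative string built around the profile entry quoted above, not WebKit's actual XHTMLMP Accept header.

```python
def parse_range(item):
    """Split one Accept entry into (media_range, params, q)."""
    fields = [f.strip() for f in item.split(";")]
    params, q = {}, 1.0
    for field in fields[1:]:
        name, _, value = field.partition("=")
        name = name.strip().lower()
        value = value.strip().strip("'\"")  # drop surrounding quotes
        if name == "q":
            q = float(value)
        else:
            params[name] = value
    return fields[0].lower(), params, q


def quality(media_type, type_params, accept_header):
    """q-value of the most specific range matching media_type with type_params."""
    main_type = media_type.split("/")[0]
    best = (-1, 0.0)  # (specificity rank, q)
    for item in accept_header.split(","):
        media_range, params, q = parse_range(item)
        if media_range == media_type:
            # A range with parameters only matches a type carrying the same ones.
            if any(type_params.get(k) != v for k, v in params.items()):
                continue
            rank = 3 if params else 2
        elif media_range == main_type + "/*":
            rank = 1
        elif media_range == "*/*":
            rank = 0
        else:
            continue
        best = max(best, (rank, q))
    return best[1]


# Illustrative header only (assumption), not the real XHTMLMP string.
EXAMPLE_HEADER = ("application/xhtml+xml;profile='http://www.wapforum.org/xhtml',"
                  "application/xml;q=0.9,*/*;q=0.8")
WAP = {"profile": "http://www.wapforum.org/xhtml"}

print(quality("application/xhtml+xml", WAP, EXAMPLE_HEADER))  # 1.0
print(quality("application/xhtml+xml", {}, EXAMPLE_HEADER))   # 0.8
print(quality("application/xml", {}, EXAMPLE_HEADER))         # 0.9
```

Under this reading, generic application/xhtml+xml only matches */* and so ranks below application/xml. A server that ignores the profile parameter would instead treat the first entry as a plain application/xhtml+xml with q=1.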
Comment 21 Alexey Proskuryakov 2011-03-10 11:16:36 PST
Comment on attachment 82508 [details]
Patch with Tests

OK.
Comment 22 WebKit Commit Bot 2011-03-10 15:44:06 PST
Comment on attachment 82508 [details]
Patch with Tests

Clearing flags on attachment: 82508

Committed r80776: <http://trac.webkit.org/changeset/80776>
Comment 23 WebKit Commit Bot 2011-03-10 15:44:10 PST
All reviewed patches have been landed.  Closing bug.
Comment 24 Kris Jordan 2011-03-10 17:05:31 PST
Thanks for reviewing and getting it into the commit queue, Alexey! Great to see this ticket resolved.