Bug 10359

Summary: Change WebKit website to use Apache MultiViews for content negotiation
Product: WebKit
Component: WebKit Website
Status: RESOLVED WONTFIX
Severity: Normal
Priority: P2
Version: 420+
Hardware: Mac
OS: OS X 10.4
Reporter: Nicholas Shanks <nickshanks>
Assignee: Nobody <webkit-unassigned>
CC: mrowe, vladimir.olexa
Attachments:
Description Flags
patch hyatt: review-

Description Nicholas Shanks 2006-08-12 00:30:16 PDT
MultiViews = good. use it :)
Comment 1 Nicholas Shanks 2006-08-12 00:32:27 PDT
Created attachment 9998
patch
Comment 2 Darin Adler 2006-08-12 16:09:00 PDT
Comment on attachment 9998
patch

I'd review this if I knew what multiviews was and why it's good to take off the .html extension from all these links. A change log entry would be one way to enlighten me.
Comment 3 Dave Hyatt 2006-08-12 16:28:57 PDT
Comment on attachment 9998
patch

I would prefer the filenames be explicit.
Comment 4 Nicholas Shanks 2006-08-13 01:42:36 PDT
Darin:
It allows for easy changing of the file type used to generate resources. For example at the moment the site is serving flat HTML files with a .html name extension. If at a later date we should wish to change them to PHP files, it would require changing all the links from .html to .php. We may then decide that was a bad idea and have to change them all back again. By just leaving off the extension, we afford ourselves much more freedom to change/add/remove the implementation details for any individual resource.

It also allows apache to serve multiple files for a single resource, depending on what the user has requested. e.g.
User sends GET /resource   Accept-Language: fr, en;q=0.5
Apache serves /resource.fr.html
User reads page in preferred language
-or, say-
User sends GET /resource   Accept: image/gif, image/jpeg, image/png
Apache has three variants available, gif@200KB, png@150KB and bmp@180KB; since the browser cannot understand bmp, it discards that from being selected, then checks quality values. pngs have been configured at q=1.0, and gifs at 0.5, final values are png = 150KB/1.0 = 150,000; gif = 200KB/0.5 = 300,000. Then it serves the one with the lowest score (i.e. highest quality to file size ratio), so /resource.png
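
The arithmetic in this example can be checked with a short sketch. It models the size-divided-by-quality scoring exactly as this comment describes it; Apache's actual selection rules in mod_negotiation may differ. The function name, the file names, and the quality value for bmp are assumptions made for illustration (the comment gives no quality for bmp, and it is filtered out anyway).

```python
# Sketch of the size/quality scoring described in the comment above.
# It follows the comment's model, not necessarily Apache's real algorithm.

def pick_variant(variants, accepted_types):
    """variants: list of (filename, mime_type, size_bytes, quality)."""
    # Drop variants whose MIME type the client did not list in Accept.
    candidates = [v for v in variants if v[1] in accepted_types]
    if not candidates:
        return None  # nothing acceptable: the server would answer 406
    # The lowest size/quality score wins.
    return min(candidates, key=lambda v: v[2] / v[3])[0]


VARIANTS = [
    ("resource.gif", "image/gif", 200_000, 0.5),
    ("resource.png", "image/png", 150_000, 1.0),
    ("resource.bmp", "image/bmp", 180_000, 1.0),  # quality assumed
]
ACCEPT = {"image/gif", "image/jpeg", "image/png"}

# Scores for the variants the client can accept.
scores = {name: size / q for name, mime, size, q in VARIANTS if mime in ACCEPT}
chosen = pick_variant(VARIANTS, ACCEPT)
```

Under these numbers the png scores 150,000 and the gif scores 400,000 (200,000 divided by 0.5), so `chosen` is `resource.png`.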

Apache can only do content negotiation if you leave off the file extensions. It's the most powerful and under-used feature of apache IMHO. I will repost with a changelog (didn't realise there was a need for one for the website).


Dave: could you give a reason as to why you prefer the filenames to be left as they are? I would strongly disagree with doing that, for the reasons explained to darin above.
Comment 5 Nicholas Shanks 2006-08-13 01:44:29 PDT
Obviously I can't divide by 0.5 :-P I meant 400,000 for gif.
Comment 6 Mark Rowe (bdash) 2006-08-14 00:41:19 PDT
(In reply to comment #4)
> Darin:
> It allows for easy changing of the file type used to generate resources. For
> example at the moment the site is serving flat HTML files with a .html name
> extension. If at a later date we should wish to change them to PHP files, it
> would require changing all the links from .html to .php. We may then decide
> that was a bad idea and have to change them all back again. By just leaving off
> the extension, we afford ourselves much more freedom to change/add/remove the
> implementation details for any individual resource.

While this is nice in theory, it's troublesome to deal with correctly under Apache 1.3.  Apache 1.3 uses a fake MIME type of application/x-httpd-php on .php files to pass them off to mod_php.  Unless the client specifies that it will handle application/x-httpd-php in the "Accept" request header (either explicitly, or via */*) they will see the wonderful "406 Not Acceptable" page.  When I last looked, Googlebot only included text/html in the "Accept" header of its requests.
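
A minimal sketch of the failure mode described above, assuming the server compares the variant's internal type against the media ranges in the client's Accept header. The helper names and the second Accept header are hypothetical, and the sketch ignores q-values (including q=0) for brevity.

```python
# Sketch: why a client that only accepts text/html would get a 406 when
# the only variant is typed application/x-httpd-php.

def accepts(accept_header, mime_type):
    """Return True if mime_type matches any media range in accept_header."""
    main_type = mime_type.split("/")[0]
    for item in accept_header.split(","):
        # Strip parameters such as ;q=0.5 and normalise case.
        media_range = item.split(";")[0].strip().lower()
        if media_range in ("*/*", mime_type, main_type + "/*"):
            return True
    return False


def status_for(accept_header, variant_type):
    """HTTP status a negotiating server would return for a single variant."""
    return 200 if accepts(accept_header, variant_type) else 406


PHP_TYPE = "application/x-httpd-php"

# A crawler that sends only text/html, as described for Googlebot above.
crawler_status = status_for("text/html", PHP_TYPE)
# A client that also sends a */* wildcard (hypothetical header).
browser_status = status_for("text/html, */*;q=0.1", PHP_TYPE)
```

Here `crawler_status` is 406 and `browser_status` is 200, which matches the behaviour described: the wildcard is what lets most browsers through.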

> It also allows apache to serve multiple files for a single resource, depending
> on what the user has requested. e.g.
> User sends GET /resource   Accept-Language: fr, en;q=0.5
> Apache serves /resource.fr.html
> User reads page in preferred language
> -or, say-
> User sends GET /resource   Accept: image/gif, image/jpeg, image/png
> Apache has three variants available, gif@200KB, png@150KB and bmp@180KB; since
> the browser cannot understand bmp, it discards that from being selected, then
> checks quality values. pngs have been configured at q=1.0, and gifs at 0.5,
> final values are png = 150KB/1.0 = 150,000; gif = 200KB/0.5 = 300,000. Then it
> serves the one with the lowest score (i.e. highest quality to file size ratio),
> so /resource.png
> 
> Apache can only do content negotiation if you leave off the file extensions.

This is only partially true.  Apache can only do content negotiation based on MIME type if the request URL omits the file extension, but it can do language-based negotiation even when the request URL includes the extension.
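
One way to picture the difference is a sketch of the variant lookup, assuming MultiViews treats files named like the requested path plus further extensions as candidates when the exact file does not exist. The function and the file names (including the resource.html.fr naming scheme) are hypothetical.

```python
# Sketch: which files could be negotiated between for a given request path,
# assuming candidates are files named "<requested path>.<more extensions>".

def multiviews_candidates(requested_name, directory_listing):
    # An exact match is served directly, with nothing to negotiate.
    if requested_name in directory_listing:
        return [requested_name]
    prefix = requested_name + "."
    return sorted(f for f in directory_listing if f.startswith(prefix))


FILES = ["resource.html.en", "resource.html.fr", "logo.png", "logo.gif"]

# The URL keeps .html, yet the language variants are still candidates.
by_language = multiviews_candidates("resource.html", FILES)
# The URL omits the extension, so both image types are candidates.
by_type = multiviews_candidates("logo", FILES)
# The URL names one type explicitly, so no type negotiation can happen.
with_extension = multiviews_candidates("logo.png", FILES)
```

With this naming, a request for `resource.html` still has two language variants to choose from, while a request for `logo.png` has only itself.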

> It's the most powerful and under-used feature of apache IMHO. I will repost
> with a changelog (didn't realise there was a need for one for the website).
> 
> Dave: could you give a reason as to why you prefer the filenames to be left as
> they are? I would strongly disagree with doing that, for the reasons explained
> to darin above.

While using content negotiation to provide a multi-lingual website or a more diverse range of file formats is a nice idea, it is completely lacking in practicality.  We have no non-English content for the website nor anyone to create and maintain it, and the addition of extra graphical assets in other formats would increase the ongoing effort required for maintenance while providing almost no extra value.  Nice in theory, not very useful in practice.

Personally I would rather see '.html'-less URLs, but that is mainly based on aesthetics.  Some form of technical justification can be found in the W3C's "Common HTTP Implementation Problems" (http://www.w3.org/TR/chips/), specifically guideline three.  The benefits are mainly theoretical, and the various problems with Apache make dealing with these issues in the correct manner sufficiently painful that few websites bother to do it.
Comment 7 Darin Adler 2006-08-14 11:37:11 PDT
I like the fact that the website works locally without a web server. For a simple site like this one I'd prefer to retain that unless we're getting significant benefit from the server-side feature.
Comment 8 Nicholas Shanks 2006-08-16 10:22:03 PDT
Valid point. I never considered using it locally as I didn't realise I had the source until two days ago!
You could always set up a VirtualHost to point its DocumentRoot at your WebKitSite SVN tree :-)
Comment 9 Vladimir Olexa (vladinecko) 2006-10-11 07:47:21 PDT
Should this bug still be open, since the consensus was to keep the website as it is?
Comment 10 Mark Rowe (bdash) 2007-03-19 03:06:32 PDT
I think we've decided to leave this as-is.