Bug 7396

Summary: links in this site come up as source code - sometimes
Product: WebKit Reporter: the codist <codist>
Component: Evangelism Assignee: Nobody <webkit-unassigned>
Status: RESOLVED INVALID    
Severity: Major CC: abdulalis, ap, beidson, ddkilzer, ian, p.pedaci, rachael, wdc-opendarwin
Priority: P2 Keywords: InRadar
Version: 417.x   
Hardware: Mac   
OS: OS X 10.4   
URL: http://consonancemusic.com/
Attachments:
Description Flags
Original source of projecteuclid.org/jdg none
Original source of projecteuclid.org/Dienst/UI/1.0/Journal?authority=euclid.jdg none

Description the codist 2006-02-20 16:16:36 PST
The index page and subpages in this site come up as plain html text the first time you click on it. No DOM at all. Subsequent reloads correctly display the content. However a later view of the page may do it again.

This does not happen on any other browser I tried. All the pages are strict XHTML 1.0 and validated.

Can anyone reproduce? Any clue what's going on here?

Other sites from the same server (I control it) come up fine, only this one has the problem. This is the only strict XHTML 1.0 site however.
Comment 1 Joost de Valk (AlthA) 2006-02-20 22:08:20 PST
I cannot confirm this behavior, either in the latest nightly or in released Safari. Could you give some more information on what you're doing when this happens?
Comment 2 the codist 2006-02-21 05:11:27 PST
Nothing special, I just go to the site. I tried it again this morning and it still happens. Perhaps it's some preference setting or environment issue. I do have PithHelmet installed; I will try this evening with a clean version.
Comment 3 the codist 2006-02-21 16:12:23 PST
I just did a Reset Safari... and now the site comes up normally. Pithhelmet on or off didn't matter but when I cleared out everything it seemed to work normally again. Some cached item must have been the cause but now I don't know what it was.
Comment 4 the codist 2006-02-21 19:44:18 PST
After using Safari for a couple of hours, I retried the site and again got the source code display. Resetting Safari again cleared the issue. Some timing thing or cache item is causing this, but I still can't pin down any specific cause.
Comment 5 Salman Abdulali 2006-03-17 12:44:23 PST
I frequently see the same problem at projecteuclid.org.  Examples of pages where this occurs frequently enough to be annoying are

http://projecteuclid.org/jdg
http://projecteuclid.org/annm
Comment 6 Alex Taylor 2006-04-27 16:46:15 PDT
Intermittent problem. The original submitter's link appears to be down, but it is reproducible with the links in comment #5.

I haven't managed to find a common cause; the XHTML seems to just get dumped onto the screen as text/plain on some renders. It first occurred when I used 'Back' to go back to the cached version in history.
Comment 7 tim bates 2006-07-18 02:52:13 PDT
(In reply to comment #5)
> I frequently see the same problem at projecteuclid.org
> http://projecteuclid.org/jdg
> http://projecteuclid.org/annm

Loaded both 20 times each. No problems with 2.0.4 (419.3), revision 15498.
Comment 8 Salman Abdulali 2006-07-18 11:31:30 PDT
(In reply to comment #7)

The problem is still around in 2.0.4 (419.3). Since it is intermittent, it may not always be visible.
Comment 9 Peter Pedaci 2007-01-10 04:26:38 PST
Same thing at www.liebedeinestadt.de
No way to pin it down to a specific reason, can't reproduce the error on the pages linked in comment #5, though... 
Comment 10 David Kilzer (:ddkilzer) 2007-01-10 05:43:30 PST
Need to capture a packet trace of getting a page returned as source code.  It may simply be a server-side issue.  Does the same thing happen in browsers other than Safari, or just in Safari?

Comment 11 Peter Pedaci 2007-01-10 13:58:27 PST
As far as I've been testing it (and that was quite a bit) it only appeared with Safari. Could indeed be a server side thing, though...
Comment 12 the codist 2007-01-13 09:15:12 PST
When I originally got this error I tried extensively in Firefox but never got the same behavior. The original site I referenced is different now. Don't use it for testing.
Comment 13 David Kilzer (:ddkilzer) 2007-01-13 15:05:17 PST
Do any of you have Haxies or Input Managers installed, like SafariStand, PithHelmet or Saft?

http://www.unsanity.com/haxies/ape
http://pimpmysafari.com/

If so, could you try removing/disabling them, then see if the problem still occurs?

Comment 14 the codist 2007-01-13 16:51:23 PST
When I tested the bug originally, I tested with both pith helmet installed and not, didn't make a difference.
Comment 15 Peter Pedaci 2007-01-14 08:09:54 PST
I have Application Enhancer installed, but since I can't reproduce the error at the moment, I can't test it with AE disabled either. I will try again and ask my colleague, who also saw the error, whether he had AE installed.
Comment 16 Salman Abdulali 2007-01-14 13:47:38 PST
(In reply to comment #13)

I have none of these installed, but regularly run across this problem. A typical (but not always reproducible) experience is:

1.  Go to http://projecteuclid.org/jdg . Page displays correctly.
2.  Visit some other web page.
3.  Now go back to http://projecteuclid.org/jdg . See the source code being displayed.
4.  Click reload. Page displays normally.
Comment 17 David Kilzer (:ddkilzer) 2007-01-14 14:07:50 PST
(In reply to comment #16)
> A typical (but not always reproducible) experience is:
> 
> 1.  Go to http://projecteuclid.org/jdg . Page displays correctly.
> 2.  Visit some other web page.
> 3.  Now go back to http://projecteuclid.org/jdg . See the source code being
> displayed.
> 4.  Click reload. Page displays normally.

W00t!  I just reproduced this!  Confirmed using locally-built debug build of WebKit r18845 with Safari 2.0.4 (419.3) on Mac OS X 10.4.8 (8L127).

Comment 18 David Kilzer (:ddkilzer) 2007-01-14 14:12:51 PST
Steps to reproduce:

1. Open Safari/WebKit.
2. Empty cache (cmd-option-E).
3. Paste URL into address bar and hit Enter: http://projecteuclid.org/jdg
4. Paste same URL into address bar and hit Enter again: http://projecteuclid.org/jdg

Expected results:

Page is rendered as HTML.

Actual results:

Page is rendered as source!

Regression:

Haven't tested shipping Safari yet.

Notes:

None.

Wild speculation:

This appears to have something to do with the back/forward cache and page loading.  It may be specific to XHTML documents, too.

Comment 19 David Kilzer (:ddkilzer) 2007-01-14 14:20:06 PST
(In reply to comment #18)
> Regression:
> 
> Haven't tested shipping Safari yet.

Same issue occurs in shipping Safari 2.0.4 (419.3) on Mac OS X 10.4.8 (8L127).

> Wild speculation:
> 
> This appears to have something to do with the back/forward cache and page
> loading.  It may be specific to XHTML documents, too.

May also have something to do with redirects.

Comment 20 David Kilzer (:ddkilzer) 2007-01-14 17:18:48 PST
Here's what happens at the HTTP protocol level (assuming you're starting with an empty browser cache; use Cmd-Opt-E to empty it):

1. Safari requests http://projecteuclid.org/jdg

GET /jdg HTTP/1.1
Accept-Language: en
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Safari/419.3
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Connection: keep-alive
Host: projecteuclid.org

2. Web server returns response:

HTTP/1.1 200 OK
Date: Mon, 15 Jan 2007 00:31:54 GMT
Server: Apache/1.3.31 (Unix) mod_perl/1.29
Last-Modified: Fri, 01 Oct 2004 16:17:43 GMT
ETag: "12d85a-214-415d8327"
Accept-Ranges: bytes
Content-Length: 532
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html

3. JavaScript in response causes Safari to load: http://projecteuclid.org/Dienst/UI/1.0/Journal?authority=euclid.jdg

GET /Dienst/UI/1.0/Journal?authority=euclid.jdg HTTP/1.1
Accept-Language: en
Accept-Encoding: gzip, deflate
Referer: http://projecteuclid.org/jdg
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Safari/419.3
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Connection: keep-alive
Host: projecteuclid.org

4. Web server returns response:

HTTP/1.1 200 OK
Server: Apache/1.3.31 (Unix) mod_perl/1.29
Date: Mon, 15 Jan 2007 00:31:54 GMT
Content-Type: text/html; charset=utf-8

5. User requests /jdg in Safari again:  http://projecteuclid.org/jdg

GET /jdg HTTP/1.1
Accept-Language: en
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Safari/419.3
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Connection: keep-alive
Host: projecteuclid.org

6. Web server returns response:

HTTP/1.1 200 OK
Date: Mon, 15 Jan 2007 00:31:58 GMT
Server: Apache/1.3.31 (Unix) mod_perl/1.29
Last-Modified: Fri, 01 Oct 2004 16:17:43 GMT
ETag: "12d85a-214-415d8327"
Accept-Ranges: bytes
Content-Length: 532
Keep-Alive: timeout=5, max=89
Connection: Keep-Alive
Content-Type: text/html

7. JavaScript in response causes Safari to load: http://projecteuclid.org/Dienst/UI/1.0/Journal?authority=euclid.jdg

GET /Dienst/UI/1.0/Journal?authority=euclid.jdg HTTP/1.1
Accept-Language: en
Accept-Encoding: gzip, deflate
Referer: http://projecteuclid.org/jdg
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Safari/419.3
If-Modified-Since: Mon, 15 Jan 2007 00:31:54 GMT
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Connection: keep-alive
Host: projecteuclid.org

8. Web server returns response:

HTTP/1.1 304
Date: Mon, 15 Jan 2007 00:31:58 GMT
Server: Apache/1.3.31 (Unix) mod_perl/1.29
Keep-Alive: timeout=5, max=88
Connection: Keep-Alive
Content-Type: text/plain; charset=ISO-8859-1

Connection: Keep-Alive, Keep-Alive
Keep-Alive: timeout=5, max=87


Note two problems here:

1. The Content-Type is set to text/plain instead of text/html.
2. The Content-Type line has a blank line after it (two sets of \r\n characters), which makes the last two lines a part of the body of the response instead of the header.
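The second problem can be illustrated with a minimal Python sketch. It is not part of the original trace: the byte string below is reconstructed from the response shown in step 8, assuming every line ends in \r\n, and split_http_response is a hypothetical helper rather than anything WebKit or Foundation uses. It shows that a parser treats the first blank line as the end of the headers, so the trailing Connection and Keep-Alive lines land in the body.

```python
def split_http_response(raw: bytes):
    """Split a raw HTTP response into (status_line, headers, body).

    The header block ends at the first empty line (\r\n\r\n); everything
    after that is treated as the message body.
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return status_line, headers, body


# Reconstructed from step 8 of the trace (assumed byte-for-byte layout).
RAW_304 = (
    b"HTTP/1.1 304\r\n"
    b"Date: Mon, 15 Jan 2007 00:31:58 GMT\r\n"
    b"Server: Apache/1.3.31 (Unix) mod_perl/1.29\r\n"
    b"Keep-Alive: timeout=5, max=88\r\n"
    b"Connection: Keep-Alive\r\n"
    b"Content-Type: text/plain; charset=ISO-8859-1\r\n"
    b"\r\n"
    b"Connection: Keep-Alive, Keep-Alive\r\n"
    b"Keep-Alive: timeout=5, max=87\r\n"
    b"\r\n"
)

status_line, headers, body = split_http_response(RAW_304)
print(status_line)                    # the status line, with no reason phrase
print(dict(headers)["Content-Type"])  # the unexpected text/plain type
print(body)                           # the stray header lines, parsed as body
```

A 304 response is not supposed to carry a body at all, so a client that reads these extra bytes as content has to decide how to interpret them.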

I have no idea what's causing this, other than some bad server-side code.  (Since it happens on more than one web site, I'm going to have to guess that it's some open-source code that is behaving consistently incorrectly on whatever site it's installed on.)

I'm also not sure whether Safari should "try harder" to get the content type right, or use its original cached MIME type (text/html) instead of the MIME type given by the 304 response (text/plain).
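As a sketch of the second option (keeping the cached MIME type), here is one possible revalidation policy in Python. This is only an illustration under stated assumptions: CachedEntry, HEADERS_KEPT_FROM_CACHE and apply_not_modified are hypothetical names, and nothing here is claimed to match what Safari or the Foundation classes actually do.

```python
from dataclasses import dataclass


@dataclass
class CachedEntry:
    headers: dict  # header names stored in lowercase
    body: bytes


# Hypothetical policy: headers that describe the stored body are never
# overwritten by a 304, because a 304 carries no new body.
HEADERS_KEPT_FROM_CACHE = {"content-type", "content-length", "content-encoding"}


def apply_not_modified(entry: CachedEntry, response_headers: dict) -> CachedEntry:
    """Return the cached entry refreshed with headers from a 304 response."""
    merged = dict(entry.headers)
    for name, value in response_headers.items():
        key = name.lower()
        if key in HEADERS_KEPT_FROM_CACHE:
            continue  # keep what was stored alongside the cached body
        merged[key] = value
    return CachedEntry(headers=merged, body=entry.body)


cached = CachedEntry(
    headers={
        "content-type": "text/html; charset=utf-8",
        "date": "Mon, 15 Jan 2007 00:31:54 GMT",
    },
    body=b"<html>...</html>",
)
refreshed = apply_not_modified(
    cached,
    {
        "Date": "Mon, 15 Jan 2007 00:31:58 GMT",
        "Content-Type": "text/plain; charset=ISO-8859-1",
    },
)
print(refreshed.headers["content-type"])
```

With this policy the page would keep rendering as text/html even when the server sends the wrong type on revalidation, at the cost of ignoring a Content-Type change that a server might, in principle, intend.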

One other interesting thing is that neither Firefox 2.0.0.1 nor Opera 9.10 exhibits this behavior, although OmniWeb 5.5.2 does.

Comment 21 David Kilzer (:ddkilzer) 2007-01-14 18:13:09 PST
A much shorter way to reproduce the issue (sending If-Modified-Since header with request):

$ telnet projecteuclid.org 80
Trying 128.84.158.74...
Connected to projecteuclid.org.
Escape character is '^]'.
GET /Dienst/UI/1.0/Journal?authority=euclid.annm HTTP/1.1
Host: projecteuclid.org
If-Modified-Since: Mon, 15 Jan 2007 01:35:16 GMT
Connection: close

HTTP/1.1 304
Date: Mon, 15 Jan 2007 01:41:35 GMT
Server: Apache/1.3.31 (Unix) mod_perl/1.29
Connection: close
Content-Type: text/plain; charset=ISO-8859-1

Connection: close

Connection closed by foreign host.

--

Why is WebKit caching this page?  It has a question mark in the URL, which commonly means it's a CGI script and shouldn't be cached.  Perhaps it's the use of location.replace() that's causing it to be cached?

This seems to be the root of the problem since other browsers don't appear to cache this document, or they at least don't send an If-Modified-Since header when requesting the document a second time.
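The telnet session above could also be scripted. The following Python sketch is an assumption-laden equivalent, not something used in this investigation: build_conditional_get and fetch_raw are hypothetical helper names, and a raw socket is used so that the response bytes can be inspected exactly as the server sent them.

```python
import socket


def build_conditional_get(host: str, path: str, if_modified_since: str) -> bytes:
    """Build the same request that was typed into telnet, as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"If-Modified-Since: {if_modified_since}",
        "Connection: close",
        "",  # empty line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_raw(host: str, path: str, if_modified_since: str,
              port: int = 80, timeout: float = 10.0) -> bytes:
    """Send the conditional GET and return every byte the server sends back."""
    request = build_conditional_get(host, path, if_modified_since)
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Example call (performs a real network request, so it is left commented out):
# raw = fetch_raw("projecteuclid.org",
#                 "/Dienst/UI/1.0/Journal?authority=euclid.annm",
#                 "Mon, 15 Jan 2007 01:35:16 GMT")
# print(raw.decode("iso-8859-1"))
```

Whether the server still answers this way depends on its current configuration, so no expected output is shown.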

Comment 22 David Kilzer (:ddkilzer) 2007-01-14 18:42:29 PST
After talking to Maciej on IRC, it seems that the Foundation classes (like NSURLRequest) are actually doing the caching of this URL and then adding the If-Modified-Since header on the subsequent requests which causes the server to send a broken 304 response.  I will file a Radar bug for the Foundation caching issue.
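A rough Python sketch of what such a caching layer might do when it revalidates a stored response follows. Note that in the trace in comment 20, the If-Modified-Since value in step 7 equals the Date header of the response in step 4, which carried no Last-Modified header. The fallback to Date below is an inference from that trace, not confirmed Foundation behavior, and conditional_headers is a hypothetical name.

```python
def conditional_headers(cached_headers: dict) -> dict:
    """Return the extra request headers a cache might add when revalidating.

    Assumption: Last-Modified is preferred, and the response's Date is used
    as a fallback validator when Last-Modified is absent.
    """
    extra = {}
    validator = cached_headers.get("Last-Modified") or cached_headers.get("Date")
    if validator:
        extra["If-Modified-Since"] = validator
    etag = cached_headers.get("ETag")
    if etag:
        extra["If-None-Match"] = etag
    return extra


# Headers of the step 4 response from comment 20.
step4_headers = {
    "Server": "Apache/1.3.31 (Unix) mod_perl/1.29",
    "Date": "Mon, 15 Jan 2007 00:31:54 GMT",
    "Content-Type": "text/html; charset=utf-8",
}
print(conditional_headers(step4_headers))
```

Under that assumption the sketch produces the same If-Modified-Since value that appears in the step 7 request.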

In the meantime, it may be prudent to evangelize the projecteuclid.org web site administrator to fix the 304 response for their pages.

Comment 23 David Kilzer (:ddkilzer) 2007-01-14 19:13:26 PST
Filed Radar bug:

<rdar://problem/4923955>
WebKit sends If-Modified-Since header for URL that other browsers do not (7396)

I'm going to leave this bug open and change the Component to Evangelism until the projecteuclid.org site is fixed.  It seems the original web site (consonancemusic.com) has already been fixed.

Comment 24 David Kilzer (:ddkilzer) 2007-01-14 19:15:28 PST
Created attachment 12437 [details]
Original source of projecteuclid.org/jdg
Comment 25 David Kilzer (:ddkilzer) 2007-01-14 19:16:29 PST
Created attachment 12438 [details]
Original source of projecteuclid.org/Dienst/UI/1.0/Journal?authority=euclid.jdg
Comment 26 Alexey Proskuryakov 2007-01-15 07:22:09 PST
(In reply to comment #21)
> Why is WebKit caching this page?  It has a question mark in the URL, which
> commonly means it's a CGI script and shouldn't be cached.

FWIW, I couldn't find anything to this effect in the RFC (the only thing special to query URLs seems to be that the cache shouldn't ever treat those as fresh, section 13.9).
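A small Python sketch of that rule, as I read RFC 2616 section 13.9: a response to a URL with a query string must not be treated as fresh unless the server gave an explicit expiration time. The function names are hypothetical, and only Expires, max-age and s-maxage are considered, which is a simplification.

```python
from urllib.parse import urlsplit


def has_explicit_expiration(headers: dict) -> bool:
    """True if the response carries Expires or a max-age/s-maxage directive."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if "expires" in lowered:
        return True
    directives = [d.strip().lower()
                  for d in lowered.get("cache-control", "").split(",")]
    return any(d.startswith(("max-age=", "s-maxage=")) for d in directives)


def may_treat_as_fresh(url: str, headers: dict) -> bool:
    """Apply only the query-URL rule; other freshness checks are out of scope."""
    if urlsplit(url).query and not has_explicit_expiration(headers):
        return False  # must revalidate with the server before reuse
    return True


url = "http://projecteuclid.org/Dienst/UI/1.0/Journal?authority=euclid.jdg"
print(may_treat_as_fresh(url, {"Content-Type": "text/html; charset=utf-8"}))
```

Note that this rule only forbids serving the stored copy without asking the server; it does not forbid storing the response and revalidating it, which is what the conditional request in the trace amounts to.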

Respecting the Content-Type of a 304 response sounds like a bug to me. 
Comment 27 David Kilzer (:ddkilzer) 2007-01-15 13:01:32 PST
(In reply to comment #26)
> Respecting the Content-Type of a 304 response sounds like a bug to me. 

Added note to Radar bug about this as well.  I don't see where WebKit does any 304 http status handling.
Comment 28 David Kilzer (:ddkilzer) 2007-01-23 03:55:06 PST
See also Bug 12341 Comment #19.

Comment 29 David Kilzer (:ddkilzer) 2007-03-03 12:30:11 PST
Note that this issue has been resolved for Safari in a recent Leopard build.

Closing this bug as RESOLVED/INVALID since it is not a WebKit bug, and since it's no longer necessary for the web site administrator to fix the 304 HTTP response.