Summary: | document.lastModified gives no output if the response doesn't have a Last-Modified header | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Rüdiger Cordes <rc> | ||||||||||||||||
Component: | WebKit Misc. | Assignee: | Adam Barth <abarth> | ||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||
Severity: | Major | CC: | abarth, ap, darin, eric, ian, myrdred, webkit.review.bot | ||||||||||||||||
Priority: | P2 | ||||||||||||||||||
Version: | 420+ | ||||||||||||||||||
Hardware: | Mac | ||||||||||||||||||
OS: | OS X 10.4 | ||||||||||||||||||
Attachments: |
|
Description
Rüdiger Cordes
2006-04-19 03:17:58 PDT
Created attachment 7832 [details]
test case
document.lastModified works fine in Safari when the server provides "Last-Modified" HTTP header, see <http://nypop.com/~ap/webkit/last-modified.html>. Although Firefox syntesizes the time if this header is missing (maybe it uses "Date" header?), it's not obvious to me if this is correct behavior. This seems to be a duplicate of bug <A HREF="http://bugzilla.opendarwin.org/show_bug.cgi?id=4363">4363</A> Bug 4363 is related, but different - it is only about preserving the exact value of the Last-Modified header, while this bug asks to synthesize Last-Modified if it's not present. I'm confirming this bug as a difference from Firefox, but additional research is nedded in order to decide if we want to fix it. A really easy way to repro this bug is to stick in a file: <html><head><script>alert(document.lastModified);</script></head><body/></html> Open it from a file:/// URL, you'll get a blank alert. Put it on a Web server, you'll get the actual last modified date. document.lastModified is not part of the HTML4 specification, but it did make it into HTML5. HTML5 is very specific about how the value should be returned (as a precisely formatted string, which seems odd to me... why not use a DOMTimestamp or whatever it's called?) and mandates that document.lastModified returns the "source file's last modification", which seems pretty clear to me that it should return the correct date for local files. Ok. But please note that HTTP responses without a Last-Modified header are different - HTML5 says that the attribute should return "01/01/1970 00:00:00", and that's not what Firefox did when I tested. It is likely that fixes for HTTP and for local files will be separate. Created attachment 27200 [details]
Patch to fix last modified when header is missing.
I recognize that IE and FireFox both return the current time when the last modified is empty. However, the spec says to return the Unix Epoc.
Created attachment 27201 [details]
Fix for lastModified when there is no header from server
Dimitry noticed that unixEpoc should have been static.
Firefox and IE don't return the current date and time for file:// URLs, they return the file's last modified datetime from the OS. This complies with the HTML5 spec at 2.1.2: "The Document's source file's last modification date and time must be derived from relevant features of the networking protocols used, e.g. from the value of the HTTP Last-Modified header of the document, or from metadata in the file system for local files." Returning the epoch is not in compliance when there exists metadata in the filesystem to indicate the actual last modified datetime. (In reply to comment #7) > I recognize that IE and FireFox both return the current time when the last > modified is empty. However, the spec says to return the Unix Epoc. Ian, could you confirm if this is intentional, or a bug in HTML5? Comment on attachment 27201 [details] Fix for lastModified when there is no header from server > + static String unixEpoc = "01/01/1970 00:00:00"; A static String will be destroyed at process exit time, affecting performance. You can use a DEFINE_STATIC_LOCAL macro to create the variable in a way that won't affect performance. > + if (!header || !header.length()) { > + return unixEpoc; > + } > + return header; We don't put braces around single line statements. Also, the check can be just header.isEmpty(). This patch needs to include an automated test, and a ChangeLog modification. r- due to the reasons given above, and because it's not clear if the specified behavior is intentional. The idea is for the spec to exactly match IE here. If that's not what the spec says, please e-mail the list with the test case showing the difference. Thanks! Note that "epoch" is how the word is spelled, not "epoc". Created attachment 27287 [details]
Incorporates the style and other feedback from the review.
I have incorporated the feedback from the review. I still need to figure out how to create a test that returns a custom header.
I understand that this behavior is in conflict with what IE and FF do when the web server does not return a value for the Last-Modified header. However, this is what the specification requires. Although returning the current time may seem like the right thing to do, and 1970 is very unlikely, the fact is that scripts which are returning content should return a valid date and 1970 is more likely to catch someone's eye and lead to a fix for their incomplete headers. This is certainly better than the current behavior of an empty String.
This code has nothing to do with the last modified date of local files returned with file://. I don't know where that URL is handled so I don't know where it is generating the Document. It is true that when the browser is reading a file it should properly set the last-modified date from the file system information. When that is fixed this fix will not longer return 1970.
I appreciate your time in reviewing this and helping me match the WebKit style.
(In reply to comment #14) > Although returning the current time may > seem like the right thing to do, and 1970 is very unlikely, the fact is that > scripts which are returning content should return a valid date and 1970 is more > likely to catch someone's eye and lead to a fix for their incomplete headers. Sure, but helping someone notice they're doing something wrong and getting them to fix their website is a secondary goal. Ensuring compatibility with websites that have been tested only with the two top market share web browsers is a primary goal. Maybe its not a bug? When the header of the webserver contains no entry for last.Modified its the pure truth when JavaScript has no value. Rüdiger Why are providers not giving this information? Rüdiger, scripts are not providing it when they are generating their own headers. It is probably just an oversight on the author's part. I don't think an empty string is ever intentional. The header is required by HTTP so it is just bad form to not have it at all. Darin, I will change the fix to match IE and FireFox. It would be nice to give this feedback to the HTML5 team. It seems to me that the current time is a better answer given the number of files online that were generated in 1970. :) Can someone point me to a test that mocks a custom header so I can create a test? Are their mock time objects since the new behavior will be showing the current time without one? Within JavaScript you probably want to know if the given time is from the server in the header and true or if its just a mock of the browser. What is easier: checking if its empty, if its 1970 or if its the actual time? The last modified time for a static HTML file is easy. What is the time for a PHP file? Maybe when the time of a PHP file I S the actual time you will not be able to decide if it is a mock of the browser or the true value? > Note that "epoch" is how the word is spelled, not "epoc". Epoch fail! > This code has nothing to do with the last modified date > of local files returned with file:// In my view, the problem with file:// URLs *is* the bug. In comparison, the problem when an http:// URL does not return a Last-Modified header is trivial. I mean, sure, WebKit doesn't comply with the spec in that case. Nor does it comply with IE's nor Firefox's breed of noncompliance. And sure, this fix is necessary to bring WebKit closer to full spec compliance. But really, how often does this happen? What modern web server doesn't return a Last-Modified header as a matter of course? The case of opening a file:// URL on the other hand is common, and WebKit fails to comply with the spec every single time a file:// URL is opened. Both Firefox and IE do comply with this very common scenario. I think a good fix would be to have an m_lastModified member of Document itself, and have the loader be responsible for filling this value. The HTTP loader can fill the value with the contents of a Last-Modified header, or the Unix epoch if the header is missing, while the file loader can do whatever OS-specific stuff is needed to extract the file's last modified date and fill in the document's field appropriately. It's not good for Document to refer to HTTP headers anyway; Document is supposed to be scheme-agnostic. That sounds great! ;-) To provide a custom header, you can use PHP, see e.g. <http://trac.webkit.org/browser/trunk/LayoutTests/http/tests/misc/resources/404image.php>. The simplest example is just "<? header("Date: Thu, 01 Jun 2006 06:09:43 GMT"); ?><script>alert(document.lastModified)</script>". This is really about compatibility with other browsers, not deciding what an ideal behavior would be. As Ian asked to, please tell the WHATWG mailing list if the spec doesn't match IE and Firefox (and which versions). While I agree that we should fix document.lastModified for local files, too, but it's not clear whether this can even be done without modifications to underlying libraries such as CFNetwork on the Mac. It will be more practical to track fixing local files as a separate bug. Comment on attachment 27287 [details] Incorporates the style and other feedback from the review. r-, because we want to match other browsers, not the draft spec. You have tabs in ChangeLog, please use spaces. + DEFINE_STATIC_LOCAL(AtomicString,unixEpoch,("01/01/1970 00:00:00")); Not that this matters any more if we are not going to use the UNIX epoch, but our coding style asks for spaces after commas. + // See http://www.w3.org/TR/html5/dom.html#lastmodified + // See http://crbug.com/2400 I'm not sure if these comments add much - once we settle on a certain behavior, and the spec is fixed, it will hopefully be pretty clear why we chose it, even without bug references. And there's always svn blame. What was the conclusion for HTML5? Still needs testing in IE/Firefox. One thing I'm unsure of is whether the local date, or Date header is used to synthesize Last-Modified. I did some testing and changed the spec to return the current time instead of 1970. At the moment even with generating a header in PHP the Javascript alert gives no last.Modified time. <?php header("Date: Thu, 01 Jun 2006 06:09:43 GMT"); ?> <script>alert(document.lastModified)</script> Firefox 3 gives a time but its not the time specified in the php command. Its the time of requesting the file. So seems that the biggest web hosting provider in Germany transmitts no last.Modified time at all? Or its deleted while transfered over DSL? Whereas http://www.topster.de/http-header reports: HTTP/1.1 302 Found Date: Sun, 26 Apr 2009 11:01:23 GMT Server: Apache Location: https://macshot.de/php-aquarium/aquarium/code.php Connection: close Content-Type: text/html; charset=iso-8859-1 The same provider and a different domain WITH last.Modified: (file not mentioned: http://britenweb.de/menue.php) HTTP/1.1 200 OK Date: Sun, 26 Apr 2009 11:42:03 GMT Server: Apache X-Powered-By: PHP/4.4.9 Last-Modified: Sun, 05 Oct 2008 16:56:23 GMT Connection: close Content-Type: text/html Ok. It's time for this bug to come to Jesus. Sayth the spec: The Document's source file's last modification date and time must be derived from relevant features of the networking protocols used, e.g. from the value of the HTTP Last-Modified header of the document, or from metadata in the file system for local files. If the last modification date and time are not known, the attribute must return the current date and time in the above format. http://www.whatwg.org/specs/web-apps/current-work/#dom-document-lastmodified Created attachment 67532 [details]
Patch
Yes, ladies and gentlemen. It appears there were previously zero tests for this feature. Created attachment 67533 [details]
Patch
Comment on attachment 67533 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=67533&action=prettypatch > WebCore/dom/Document.cpp:3623 > + // FIXME: If this document came from the file system, HTML5 tells us to > + // read the last modification date from the file system. I’m not sure HTML5 is the best colloquial name for the HTML specification. > WebCore/dom/Document.cpp:3628 > + return String::format("%02d/%02d/%04d %02d:%02d:%02d", > + date.month()+1, date.monthDay(), date.fullYear(), > + date.hour(), date.minute(), date.second()); We normally do not line subsequent lines up with parentheses like this. We normally put spaces around operators like the "+" here in "+1". Comment on attachment 67533 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=67533&action=prettypatch > WebCore/dom/Document.cpp:3623 > + // read the last modification date from the file system. Ok. What name would you prefer? My thought is that this comment won't be addressed for a long time, so I wanted to be specific in case future versions of HTML do something different. > WebCore/dom/Document.cpp:3628 > + date.hour(), date.minute(), date.second()); Okiedokes. Created attachment 67582 [details]
Patch for landing
Comment on attachment 67582 [details] Patch for landing Clearing flags on attachment: 67582 Committed r67513: <http://trac.webkit.org/changeset/67513> All reviewed patches have been landed. Closing bug. http://trac.webkit.org/changeset/67513 might have broken Qt Linux Release The following changes are on the blame list: http://trac.webkit.org/changeset/67513 http://trac.webkit.org/changeset/67514 Looks like the regression test added in this patch is failing on one GTK and one Qt bot. (In reply to comment #43) > Looks like the regression test added in this patch is failing on one GTK and one Qt bot. It might time zone thing. I'll try to fix it. |