Bug 22466 - REGRESSION (35867): Many resources missing when saving webarchive of webkit.org
Summary: REGRESSION (35867): Many resources missing when saving webarchive of webkit.org
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: Macintosh OS X 10.5
: P2 Normal
Assignee: David Kilzer (:ddkilzer)
URL: http://webkit.org/
Keywords: HasReduction, InRadar, Regression
Depends on:
Blocks:
 
Reported: 2008-11-24 13:55 PST by David Kilzer (:ddkilzer)
Modified: 2008-12-01 17:08 PST (History)
3 users (show)

See Also:


Attachments
WebArchive saved from nightly build r38707 (56.29 KB, application/x-webarchive)
2008-11-24 13:56 PST, David Kilzer (:ddkilzer)
no flags Details
Patch v1 (naïve fix) (1.32 KB, patch)
2008-11-24 14:07 PST, David Kilzer (:ddkilzer)
no flags Details | Formatted Diff | Diff
Patch v2 (22.71 KB, patch)
2008-11-25 06:14 PST, David Kilzer (:ddkilzer)
no flags Details | Formatted Diff | Diff
Patch v3 (23.01 KB, patch)
2008-11-25 13:56 PST, David Kilzer (:ddkilzer)
no flags Details | Formatted Diff | Diff
Patch v4 (23.87 KB, patch)
2008-11-26 21:40 PST, David Kilzer (:ddkilzer)
no flags Details | Formatted Diff | Diff
Patch v5 (24.00 KB, patch)
2008-11-29 12:49 PST, David Kilzer (:ddkilzer)
beidson: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description David Kilzer (:ddkilzer) 2008-11-24 13:55:56 PST
* SUMMARY
When saving a webarchive of <http://webkit.org/> with WebKit nightly build r38707, there are many resources missing that are linked from the main web page.  (See Bug 11850 for images referenced within CSS that are not saved.)

* STEPS TO REPRODUCE
1. Launch Safari.
2. Open URL:  http://webkit.org/
3. Select "Save As..." from the File menu.
4. Change Format: to "Web Archive".
5. Save webkit.webarchive.
6. Open webkit.webarchive in "Property List Editor" application (installed with Xcode).

* RESULTS
Only 5 subresources are saved from the main web page.  Some of the <color>.css files are missing from the archive.

* REGRESSION
I think this is a regression, but I don't have the ability to verify this now.
Comment 1 David Kilzer (:ddkilzer) 2008-11-24 13:56:44 PST
Created attachment 25450 [details]
WebArchive saved from nightly build r38707

This is the webarchive file saved from WebKit nightly build r38707.
Comment 2 David Kilzer (:ddkilzer) 2008-11-24 14:07:37 PST
Created attachment 25451 [details]
Patch v1 (naïve fix)

This patch fixes the bug, but I don't understand why DocumentLoader::cachedResource() doesn't have a reference to resources loaded on the page.  Are we violating a contract, or should we just be checking for all resources in the WebCore cache instead?
Comment 3 David Kilzer (:ddkilzer) 2008-11-24 14:16:04 PST
Comment on attachment 25451 [details]
Patch v1 (naïve fix)

Clearing review flag and obsoleting since we need a ChangeLog and a layout test.
Comment 4 Antti Koivisto 2008-11-24 15:29:56 PST
http://trac.webkit.org/changeset/35867
Comment 5 David Kilzer (:ddkilzer) 2008-11-25 06:14:19 PST
Created attachment 25479 [details]
Patch v2

Patch with layout test and changelog.
Comment 6 Alexey Proskuryakov 2008-11-25 11:32:53 PST
See also: bug 17151.
Comment 7 David Kilzer (:ddkilzer) 2008-11-25 13:56:38 PST
Created attachment 25497 [details]
Patch v3

Same as Patch v2, except we make sure "Etag" and "Last-Modified" headers are normalized as well.
Comment 8 David Kilzer (:ddkilzer) 2008-11-26 11:23:24 PST
<rdar://problem/6403593>
Comment 9 David Kilzer (:ddkilzer) 2008-11-26 21:40:23 PST
Created attachment 25551 [details]
Patch v4

Null check ArchiveResource after creating it from a CacheResource.  Rearranged the logic in LegacyWebArchive::create() to use more continue statements.
Comment 10 David Kilzer (:ddkilzer) 2008-11-29 12:49:14 PST
Created attachment 25599 [details]
Patch v5

Changes since Patch v4:
- Also normalize the "Server" HTTP response header.
- Added radar bug number to ChangeLogs.
Comment 11 Brady Eidson 2008-12-01 16:34:51 PST
Comment on attachment 25599 [details]
Patch v5

Nice, r+
Comment 12 David Kilzer (:ddkilzer) 2008-12-01 16:54:03 PST
$ git svn dcommit
Committing to http://svn.webkit.org/repository/webkit/trunk ...
        M       LayoutTests/ChangeLog
        A       LayoutTests/http/tests/webarchive/resources/test-preload-resources.css
        A       LayoutTests/http/tests/webarchive/test-preload-resources-expected.webarchive
        A       LayoutTests/http/tests/webarchive/test-preload-resources.html
        M       LayoutTests/platform/qt/Skipped
        M       LayoutTests/platform/win/Skipped
        M       WebCore/ChangeLog
        M       WebCore/loader/archive/cf/LegacyWebArchive.cpp
        M       WebKitTools/ChangeLog
        M       WebKitTools/DumpRenderTree/mac/DumpRenderTree.mm
Committed r38884

http://trac.webkit.org/changeset/38884

Comment 13 David Kilzer (:ddkilzer) 2008-12-01 17:08:22 PST
Follow-up fix:

$ git svn dcommit
Committing to http://svn.webkit.org/repository/webkit/trunk ...
        M       WebCore/ChangeLog
        M       WebCore/loader/archive/cf/LegacyWebArchive.cpp
Committed r38885

http://trac.webkit.org/changeset/38885