It is possible to call FrameLoader::load with substitute data that is a WebArchive (or another type of archive, such as MHTML for Chromium). This is exposed via both WebKit1 and WebKit2 API on Mac, see e.g. WKPageLoadWebArchiveData() for the latter. Our bisection shows that this functionality got broken in <http://trac.webkit.org/changeset/137607>. By the time MainDocumentLoader tries to decode the archive, the data it tries to use is null. Here is how a debug stack trace looks like, with an assertion being fired: ASSERTION FAILED: data /Volumes/Data/SafariSources/OpenSource/Source/WebCore/loader/archive/cf/LegacyWebArchive.cpp(269) : static PassRefPtr<WebCore::LegacyWebArchive> WebCore::LegacyWebArchive::create(const WebCore::KURL &, WebCore::SharedBuffer *) 1 0x104657d5a WebCore::LegacyWebArchive::create(WebCore::KURL const&, WebCore::SharedBuffer*) 2 0x10350402f WTF::PassRefPtr<WebCore::Archive> WebCore::archiveFactoryCreate<WebCore::LegacyWebArchive>(WebCore::KURL const&, WebCore::SharedBuffer*) 3 0x103503f18 WebCore::ArchiveFactory::create(WebCore::KURL const&, WebCore::SharedBuffer*, WTF::String const&) 4 0x1038f26ee WebCore::DocumentLoader::maybeCreateArchive() 5 0x1038f254c WebCore::DocumentLoader::finishedLoading() 6 0x1046772d1 WebCore::MainResourceLoader::didFinishLoading(double) 7 0x104677085 WebCore::MainResourceLoader::continueAfterContentPolicy(WebCore::PolicyAction, WebCore::ResourceResponse const&) 8 0x1046773e6 WebCore::MainResourceLoader::continueAfterContentPolicy(WebCore::PolicyAction) 9 0x10467732b WebCore::MainResourceLoader::callContinueAfterContentPolicy(void*, WebCore::PolicyAction) 10 0x1046779f8 WebCore::MainResourceLoader::responseReceived(WebCore::CachedResource*, WebCore::ResourceResponse const&) 11 0x104675824 WebCore::MainResourceLoader::handleSubstituteDataLoadNow(WebCore::RunLoopTimer<WebCore::MainResourceLoader>*) 12 0x104679f83 WebCore::RunLoopTimer<WebCore::MainResourceLoader>::fired() One easy way to reproduce this that I'm aware of is: 1. Open http://www.apple.com/startpage/ in Safari. 2. Select Email This Page from the Share menu (a button to the left of address bar, or just Command+I). 3. In the new email message that appears in Mail.app, change the "Send as web content" selection at top right to PDF.
In Radar as <rdar://problem/13089321>. It seems easy to find some way to access substitute data from maybeCreateArchive(), but not easy to decide which design is right. I suspect that more code paths that use DocumentLoader::mainResourceData() function for substitute data could be broken. Nate, would you be interested to take a look at this?
(In reply to comment #1) > In Radar as <rdar://problem/13089321>. > > It seems easy to find some way to access substitute data from maybeCreateArchive(), but not easy to decide which design is right. I suspect that more code paths that use DocumentLoader::mainResourceData() function for substitute data could be broken. > > Nate, would you be interested to take a look at this? I think this is a pretty simple fix, MainResourceLoader::resourceData() should check m_substituteData for data if it is valid. I haven't tried to write a test yet, though.
Do you think that other callers of this function would be OK with that? An archive is not really truly the main resource, even though it's handled like that at the first stage of loading. I do not know much about how archives work.
(In reply to comment #3) > Do you think that other callers of this function would be OK with that? An archive is not really truly the main resource, even though it's handled like that at the first stage of loading. Another option, it appears that we can safely exit early and get the right result in DocumentLoader::maybeCreateArchive() if mainResourceData() returns null. I don't know whether that has side effects, though. I'm trying to decide whether there are other situations where this underlying issue (Non-empty SubstituteData loads leave DocumentLoader::mainResourceData() empty) could be a problem. My guess is Yes, but I don't have definitive proof.
Created attachment 186185 [details] patch (no test) I'm not sure I like this solution, and I'm not sure if there's a good way to test this (besides the repro steps in the initial report), but here's what I'm thinking at the moment.
Comment on attachment 186185 [details] patch (no test) Attachment 186185 [details] did not pass win-ews (win): Output: http://queues.webkit.org/results/16332442
Comment on attachment 186185 [details] patch (no test) View in context: https://bugs.webkit.org/attachment.cgi?id=186185&action=review > Source/WebCore/loader/DocumentLoader.cpp:134 > if (m_mainResourceLoader) > return m_mainResourceLoader->resourceData(); I've built r137606 now, and how it worked was that m_mainResourceLoader->resourceData() was returning non-null data. Is it intentional that it no longer works this way?
The reason ResourceData used to be non-null is that the data was accumulated in ResourceLoader once received. Now that MainResourceLoader does not inherit from ResourceLoader, the data goes nowhere. A fix that restores previous behavior would be one that makes MainResourceLoader::resourceData() return actual data for substitute loads. I can think of two approaches - either try MainResourceLoader::m_substituteData when MainResourceLoader::m_resource is null, or make sure to generate m_resource for these loads. Which one better fits the overall design?
(In reply to comment #8) > The reason ResourceData used to be non-null is that the data was accumulated in ResourceLoader once received. Now that MainResourceLoader does not inherit from ResourceLoader, the data goes nowhere. > > A fix that restores previous behavior would be one that makes MainResourceLoader::resourceData() return actual data for substitute loads. I can think of two approaches - either try MainResourceLoader::m_substituteData when MainResourceLoader::m_resource is null, or make sure to generate m_resource for these loads. > > Which one better fits the overall design? Given that m_resource has no knowledge of SubstituteData loads, fixing this in MainResourceLoader::resourceData() probably makes more sense. Will fix shortly.
Created attachment 186438 [details] Fix in MainResourceLoader
Comment on attachment 186438 [details] Fix in MainResourceLoader View in context: https://bugs.webkit.org/attachment.cgi?id=186438&action=review It seems wasteful to copy all the data. Is there a way to redesign the storage to avoid such copying? > Source/WebCore/loader/MainResourceLoader.cpp:213 > + if (m_substituteData.isValid()) > + return ResourceBuffer::create(m_substituteData.content()->data(), m_substituteData.content()->size()); > return m_resource ? m_resource->resourceBuffer() : 0; I'd have written this differently, putting common case first. But this is OK too.
(In reply to comment #11) > (From update of attachment 186438 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=186438&action=review > > It seems wasteful to copy all the data. Is there a way to redesign the storage to avoid such copying? > > > Source/WebCore/loader/MainResourceLoader.cpp:213 > > + if (m_substituteData.isValid()) > > + return ResourceBuffer::create(m_substituteData.content()->data(), m_substituteData.content()->size()); > > return m_resource ? m_resource->resourceBuffer() : 0; > > I'd have written this differently, putting common case first. But this is OK too. Good point, will land with the reordering.
Created attachment 186448 [details] Patch for landing
Comment on attachment 186438 [details] Fix in MainResourceLoader View in context: https://bugs.webkit.org/attachment.cgi?id=186438&action=review >>> Source/WebCore/loader/MainResourceLoader.cpp:213 >>> return m_resource ? m_resource->resourceBuffer() : 0; >> >> I'd have written this differently, putting common case first. But this is OK too. > > Good point, will land with the reordering. When it comes to early return, I like “unusual case first”, but I guess here there is a small performance reason to put the common case first.
(In reply to comment #14) > (From update of attachment 186438 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=186438&action=review > > >>> Source/WebCore/loader/MainResourceLoader.cpp:213 > >>> return m_resource ? m_resource->resourceBuffer() : 0; > >> > >> I'd have written this differently, putting common case first. But this is OK too. > > > > Good point, will land with the reordering. > > When it comes to early return, I like “unusual case first”, but I guess here there is a small performance reason to put the common case first. I strongly agree with you in the case where the unusual case is much shorter than the common case. When they're both so short, I'm happy to do whatever the reviewer suggests :)
Comment on attachment 186448 [details] Patch for landing Clearing flags on attachment: 186448 Committed r141811: <http://trac.webkit.org/changeset/141811>
All reviewed patches have been landed. Closing bug.