Bug 25562 - Potential crash after ApplicationCacheStorage::storeNewestCache() fails.
Summary: Potential crash after ApplicationCacheStorage::storeNewestCache() fails.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-05-05 04:18 PDT by Andrei Popescu
Modified: 2009-06-04 13:07 PDT (History)
2 users (show)

See Also:


Attachments
Check return value of ApplicationCacheStorage::storeNewestCache() (2.06 KB, patch)
2009-05-05 04:34 PDT, Andrei Popescu
ap: review-
Details | Formatted Diff | Diff
Fix for 25562 (take 2) (10.29 KB, patch)
2009-05-06 05:40 PDT, Andrei Popescu
no flags Details | Formatted Diff | Diff
30047: Fix for 25562 (take 3) (10.29 KB, patch)
2009-05-06 07:39 PDT, Andrei Popescu
ap: review-
Details | Formatted Diff | Diff
Fix for 25562 (take 4) (12.19 KB, patch)
2009-05-27 04:27 PDT, Andrei Popescu
ap: review-
Details | Formatted Diff | Diff
30702: Fix for 25562 (take 5) (11.51 KB, patch)
2009-06-02 13:53 PDT, Andrei Popescu
no flags Details | Formatted Diff | Diff
30880: 30702: Fix for 25562 (take 5) (13.09 KB, patch)
2009-06-03 04:16 PDT, Andrei Popescu
ap: review+
Details | Formatted Diff | Diff
30899: 30880: 30702: Fix for 25562 (take 6) (13.30 KB, patch)
2009-06-03 07:34 PDT, Andrei Popescu
ap: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andrei Popescu 2009-05-05 04:18:04 PDT
In ApplicationCacheGroup.cpp:736 we have:

(...)
cacheStorage().storeNewestCache(this, maxAppCacheSize());
if (oldNewestCache)
    cacheStorage().remove(oldNewestCache.get());
postListenerTask(someSuccessCallback);
(...)

The problem with this is that the return value of
cacheStorage().storeNewestCache() is never checked. If it fails (e.g. the
user's data partition if full), the code should fire an error event and bail
out. Instead, the existing newest cache is wiped out, thereby leaving an orphan
cache group in the CachedGroups table. This orphan cache group now points to a
non-existing "newest cache" so next time this group will be loaded, there will
be no associated manifest resource and an assertion will fire in
applicationcachegroup.cpp:597:

ApplicationCacheResource* newestManifest = m_newestCache->manifestResource();
ASSERT(newestManifest);
Comment 1 Andrei Popescu 2009-05-05 04:34:08 PDT
Created attachment 30017 [details]
Check return value of ApplicationCacheStorage::storeNewestCache()
Comment 2 Alexey Proskuryakov 2009-05-05 05:30:43 PDT
Comment on attachment 30017 [details]
Check return value of ApplicationCacheStorage::storeNewestCache()

+        Check return value of ApplicationCacheStorage::storeNewestCache()
+        at its call site in ApplicationCacheGroup::checkIfLoadIsComplete().
+        https://bugs.webkit.org/show_bug.cgi?id=25562
+
+        * loader/appcache/ApplicationCacheGroup.cpp:
+        (WebCore::ApplicationCacheGroup::checkIfLoadIsComplete):

I'd structure the ChangeLog differently:

        https://bugs.webkit.org/show_bug.cgi?id=25562
        Potential crash after ApplicationCacheStorage::storeNewestCache() fails

        * loader/appcache/ApplicationCacheGroup.cpp:
        (WebCore::ApplicationCacheGroup::checkIfLoadIsComplete):
        Check return value of ApplicationCacheStorage::storeNewestCache(), and <...>
        if it failed.

+            // We failed to store the new cache. Report an error.
+            postListenerTask(&DOMApplicationCache::callErrorListener, m_associatedDocumentLoaders);

Is this sufficient? I would expect cache failure steps to be run <http://www.whatwg.org/specs/web-apps/current-work/#cache-failure-steps>, just like in case when subresource loading fails. With this code, we just get a non-stored newest cache, which will likely cause problems later on.

r-, because I think that cache failure steps need to be run here.
Comment 3 Andrei Popescu 2009-05-06 05:39:04 PDT
(In reply to comment #2)
> (From update of attachment 30017 [details] [review])
> +        Check return value of ApplicationCacheStorage::storeNewestCache()
> +        at its call site in ApplicationCacheGroup::checkIfLoadIsComplete().
> +        https://bugs.webkit.org/show_bug.cgi?id=25562
> +
> +        * loader/appcache/ApplicationCacheGroup.cpp:
> +        (WebCore::ApplicationCacheGroup::checkIfLoadIsComplete):
> 
> I'd structure the ChangeLog differently:
> 
>         https://bugs.webkit.org/show_bug.cgi?id=25562
>         Potential crash after ApplicationCacheStorage::storeNewestCache() fails
> 

Done.

>         * loader/appcache/ApplicationCacheGroup.cpp:
>         (WebCore::ApplicationCacheGroup::checkIfLoadIsComplete):
>         Check return value of ApplicationCacheStorage::storeNewestCache(), and
> <...>
>         if it failed.
> 
> +            // We failed to store the new cache. Report an error.
> +            postListenerTask(&DOMApplicationCache::callErrorListener,
> m_associatedDocumentLoaders);
> 
> Is this sufficient? I would expect cache failure steps to be run
> <http://www.whatwg.org/specs/web-apps/current-work/#cache-failure-steps>, just
> like in case when subresource loading fails. With this code, we just get a
> non-stored newest cache, which will likely cause problems later on.
> 

Done. 

New patch uploaded.
Comment 4 Andrei Popescu 2009-05-06 05:40:36 PDT
Created attachment 30047 [details]
Fix for 25562 (take 2)
Comment 5 Andrei Popescu 2009-05-06 07:39:06 PDT
Created attachment 30048 [details]
30047: Fix for 25562 (take 3)

Uploading the correct patch this time.
Comment 6 Alexey Proskuryakov 2009-05-07 03:58:52 PDT
I'm yet to think through what happens here, but here are some preliminary comments.

It would be easier to understand this patch if it had detailed per-function comments in the ChangeLog.

+            mainResourceCache->group()->associateDocumentLoaderWithCache(documentLoader, mainResourceCache, false);

We are not doing this in code consistently, but we try to stop using booleans as parameters in cases like this, which are difficult to read. You could define an enum to have named parameters, or you could split this function in two.

+void ApplicationCacheGroup::associateDocumentLoaderWithCache(DocumentLoader* loader, ApplicationCache* cache, bool wasPendingEntry)

There is a slight clash of terminology between HTML5 and WebCore - the pending entries correspond to document loaders, not to any entries in m_pendingEntries map. My choice was to have a comment before definition m_pendingMasterResourceLoaders, and to avoid using the word "entry" for pending master resource loaders. The argument name doesn't follow this convention.

What are the possible side effects of cacheStorage().storeNewestCache() that need to be undone when this function returns false?

+    // The mapped flag is true if the loader was a 'pending master entry' and false otherwise.

This comment could be a little more helpful if it had some sort of a hint at why it is important to remember this forever (my guess is that it's transient information that is only necessary during update, in which case it would be best to keep it in a separate data member).

It's too bad that we don't have any way to test this. One way to simulate errors would be to create a shim library for sqlite that would return errors when asked to. That's out of the scope for this bug, but would be extremely helpful for improving appcache robustness, as well as other uses of sqlite in WebCore.
Comment 7 Alexey Proskuryakov 2009-05-07 04:04:40 PDT
Comment on attachment 30048 [details]
30047: Fix for 25562 (take 3)

Oh yes, and marking r- to get this out of review queue for now, as this needs some more work in any case.
Comment 8 Alexey Proskuryakov 2009-05-13 08:48:18 PDT
A notable discrepancy between WebCore implementation and the HTML5 cache update algorithm is that in HTML5, master entries are not removed from the list of pending master entries after each one is downloaded. While there is of course no requirement to follow the HTML5 algorithm as long as observable effects are the same, in this case it might make code simpler and easier to understand.
Comment 9 Andrei Popescu 2009-05-27 04:26:53 PDT
Hi Alexey,

Thanks a lot for the review. Back from vacation and a work trip so I can get back to fixing this bug:

(In reply to comment #6)
> I'm yet to think through what happens here, but here are some preliminary
> comments.
> 
> It would be easier to understand this patch if it had detailed per-function
> comments in the ChangeLog.
> 

Added.

> +           
> mainResourceCache->group()->associateDocumentLoaderWithCache(documentLoader,
> mainResourceCache, false);
> 
> We are not doing this in code consistently, but we try to stop using booleans
> as parameters in cases like this, which are difficult to read. You could define
> an enum to have named parameters, or you could split this function in two.
> 

Used an enum.

> +void ApplicationCacheGroup::associateDocumentLoaderWithCache(DocumentLoader*
> loader, ApplicationCache* cache, bool wasPendingEntry)
> 
> There is a slight clash of terminology between HTML5 and WebCore - the pending
> entries correspond to document loaders, not to any entries in m_pendingEntries
> map. My choice was to have a comment before definition
> m_pendingMasterResourceLoaders, and to avoid using the word "entry" for pending
> master resource loaders. The argument name doesn't follow this convention.
> 

Fixed.

> What are the possible side effects of cacheStorage().storeNewestCache() that
> need to be undone when this function returns false?
> 

As far as I can tell, the following:

- The document loaders that correspond to the formerly-pedning master entries need to be removed from m_associatedDocumentLoaders. We do this by calling disassociateDocumentLoader().
- If there was an old newset cache for this application cache group, that needs to be restored and the failed cache needs to be deleted.
- If there wasn't any old newset cache, then deleting the failed cache will delete the groups as well.

> +    // The mapped flag is true if the loader was a 'pending master entry' and
> false otherwise.
> 
> This comment could be a little more helpful if it had some sort of a hint at
> why it is important to remember this forever (my guess is that it's transient
> information that is only necessary during update, in which case it would be
> best to keep it in a separate data member).
> 

Added a comment above the new enum. I don't think we need to add a new member for this.

> It's too bad that we don't have any way to test this. One way to simulate
> errors would be to create a shim library for sqlite that would return errors
> when asked to. That's out of the scope for this bug, but would be extremely
> helpful for improving appcache robustness, as well as other uses of sqlite in
> WebCore.
> 

The way I tested was to force a maximum size on the database file. We will be able to do this properly when we fix 22700 (https://bugs.webkit.org/show_bug.cgi?id=22700) as that allows the chrome client implementer to set the maximum size. The test cases can then try to store something that's larger then the limit and cause SQLite to return an error.

Andrei

Comment 10 Andrei Popescu 2009-05-27 04:27:58 PDT
Created attachment 30702 [details]
Fix for 25562 (take 4)
Comment 11 Alexey Proskuryakov 2009-05-29 09:56:12 PDT
Comment on attachment 30702 [details]
Fix for 25562 (take 4)

Discussed this on IRC in some depth. I have two technical concerns about this patch:
1) Since failing ApplicationCacheStorage::store() leaves memory objects with hanging StorageIDs, it's hard to prove that this broken invariant doesn't cause problems later on.
2) There is a good deal of confusion about m_pendingMasterResourceLoaders and "pending master entries", e.g. in comments about NewDocumentLoader. The names New/OldDocumentLoader names are not good, which I think is a sign of using data structures that don't match the logic well.

I'm really holding you up to a somewhat higher bar than customary, sorry for being so picky. That's because appcache code is rather complicated, and because this doesn't have automated testing to guarantee sustained progress in future patches. So, we'd better get it right on the first try.

More globally, it doesn't seem obvious at all that reinstating the old cache is indeed the best behavior. We already have a fully downloaded up to date application in memory, why not use it? Looks like the spec should talk about out of quota situations.
Comment 12 Andrei Popescu 2009-06-02 13:53:19 PDT
Created attachment 30880 [details]
30702: Fix for 25562 (take 5)
Comment 13 Andrei Popescu 2009-06-02 13:57:19 PDT
Thanks Alexey,

(In reply to comment #11)
> (From update of attachment 30702 [details] [review])
> Discussed this on IRC in some depth. I have two technical concerns about this
> patch:
> 1) Since failing ApplicationCacheStorage::store() leaves memory objects with
> hanging StorageIDs, it's hard to prove that this broken invariant doesn't cause
> problems later on.

Done.

> 2) There is a good deal of confusion about m_pendingMasterResourceLoaders and
> "pending master entries", e.g. in comments about NewDocumentLoader. The names
> New/OldDocumentLoader names are not good, which I think is a sign of using data
> structures that don't match the logic well.
> 

Fixed.

> I'm really holding you up to a somewhat higher bar than customary, sorry for
> being so picky. That's because appcache code is rather complicated, and because
> this doesn't have automated testing to guarantee sustained progress in future
> patches. So, we'd better get it right on the first try.
> 

Agreed.

> More globally, it doesn't seem obvious at all that reinstating the old cache is
> indeed the best behavior. We already have a fully downloaded up to date
> application in memory, why not use it? Looks like the spec should talk about
> out of quota situations.
> 

IMHO, we should reinstate the old cache to match what is on the disk. Else, the user will get the new version of the content after a failed update and will fallback to the old content once the browser is restarted. But I will send an email to WHATWG / Hixie and ask this question while pointing to the discussion in this bug.

All the best,
Andrei

Comment 14 Andrei Popescu 2009-06-03 04:16:41 PDT
Created attachment 30899 [details]
30880: 30702: Fix for 25562 (take 5)

Small improvement: the logger object needs to live inside storeNewestCache() in order to be able to also catch the (unlikely) case where the "Update the newest cache in the group" step fails.

Also improved the comments and removed some whitespace changes.
Comment 15 Alexey Proskuryakov 2009-06-03 05:25:22 PDT
Comment on attachment 30880 [details]
30702: Fix for 25562 (take 5)

> Index: WebCore/ChangeLog

There are tabs in ChangeLog that will need to be fixed before landing.

>  void ApplicationCacheGroup::setNewestCache(PassRefPtr<ApplicationCache> newestCache)
>  {
> -    ASSERT(!m_caches.contains(newestCache.get()));
> +    // We can set an old cache as the new cache. This can happen if we failed
> +    // to store a newly downloaded cache to the database and we are now reinstating
> +    // the former newest cache.
>  
>      m_newestCache = newestCache;

This comment will look a bit surprising - it would be perfect in ChangeLog, but it answers a question that wasn't asked for a person reading the code without assertion.

> +class ResourceStorageIDLogger {

All the "log" names look as if we were logging something to console. I'd probably say "journal", not "log", e.g. storageIDJournal.add() instead of logger.log(). Maybe there's a better yet name.

> +        LogRecord() : m_resource(0), m_storageID(0) {}

We usually put a space in "{ }".

> +        RefPtr<ApplicationCacheResource> m_resource;

Can this be a plain pointer, not a RefPtr?

> +            if (!store(it->second.get(), cacheStorageID)) {
>                  return false;
> +            }

We don't put braces around single line blocks.

Please consider addressing some of these comments, but this looks good enough to be landed as is, too.

> But I will send an email to WHATWG / Hixie and ask this question while pointing to
> the discussion in this bug.

Please do! It's great to get a well defined defined behavior where we used to crash - it would be even better to have an agreed upon behavior.
Comment 16 Alexey Proskuryakov 2009-06-03 05:28:05 PDT
Comment on attachment 30899 [details]
30880: 30702: Fix for 25562 (take 5)

Transferring r+ to the newer patch that was added while I was reviewing.
Comment 17 Andrei Popescu 2009-06-03 07:34:47 PDT
Created attachment 30904 [details]
30899: 30880: 30702: Fix for 25562 (take 6)

Thanks Alexey, I've addressed your comments and uploaded a new patch. I wasn't sure whether to obsolete the old ones or not so I didn't :)
Comment 18 Andrei Popescu 2009-06-03 07:37:16 PDT
(In reply to comment #15)
> (From update of attachment 30880 [details] [review])
> > Index: WebCore/ChangeLog
> 
> There are tabs in ChangeLog that will need to be fixed before landing.
> 

Fixed.

> >  void ApplicationCacheGroup::setNewestCache(PassRefPtr<ApplicationCache> newestCache)
> >  {
> > -    ASSERT(!m_caches.contains(newestCache.get()));
> > +    // We can set an old cache as the new cache. This can happen if we failed
> > +    // to store a newly downloaded cache to the database and we are now reinstating
> > +    // the former newest cache.
> >  
> >      m_newestCache = newestCache;
> 
> This comment will look a bit surprising - it would be perfect in ChangeLog, but
> it answers a question that wasn't asked for a person reading the code without
> assertion.
> 

Moved to Changelog.

> > +class ResourceStorageIDLogger {
> 
> All the "log" names look as if we were logging something to console. I'd
> probably say "journal", not "log", e.g. storageIDJournal.add() instead of
> logger.log(). Maybe there's a better yet name.
> 

Used "journal" instead.

> > +        LogRecord() : m_resource(0), m_storageID(0) {}
> 
> We usually put a space in "{ }".
> 

Added.

> > +        RefPtr<ApplicationCacheResource> m_resource;
> 
> Can this be a plain pointer, not a RefPtr?
> 

It can. I was over-cautious, I guess. The resources are guaranteed to live for at least as long as the cache object does. But the cache should survive until after storeNewsetCache() has returned.

> > +            if (!store(it->second.get(), cacheStorageID)) {
> >                  return false;
> > +            }
> 
> We don't put braces around single line blocks.
> 

Removed braces.

> Please consider addressing some of these comments, but this looks good enough
> to be landed as is, too.
> 

Sure, done.

> > But I will send an email to WHATWG / Hixie and ask this question while pointing to
> > the discussion in this bug.
> 
> Please do! It's great to get a well defined defined behavior where we used to
> crash - it would be even better to have an agreed upon behavior.
> 

Ok, sending in a bit.

Thanks,
Andrei
Comment 19 Brent Fulgham 2009-06-04 12:59:03 PDT
Assign for landing.
Comment 20 Brent Fulgham 2009-06-04 13:07:29 PDT
Landed in @r44424.  Corrected tab characters in ChangeLog file.