Bug 22700 - ApplicationCache should have size limit
Summary: ApplicationCache should have size limit
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.5
: P2 Normal
Assignee: Adam Barth
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2008-12-05 16:40 PST by Aaron Golden
Modified: 2009-07-24 02:12 PDT (History)
11 users (show)

See Also:


Attachments
Patch to enforce size limit on app cache (2.39 KB, patch)
2008-12-05 16:56 PST, Aaron Golden
darin: review-
Details | Formatted Diff | Diff
New proposal to limit the maximum size of the application cache. (32.86 KB, patch)
2009-04-28 08:32 PDT, Andrei Popescu
no flags Details | Formatted Diff | Diff
New proposal to limit the maximum size of the application cache (take 2). (32.87 KB, patch)
2009-04-28 08:46 PDT, Andrei Popescu
no flags Details | Formatted Diff | Diff
New proposal to limit the maximum size of the application cache (take 3). (34.34 KB, patch)
2009-04-29 11:34 PDT, Andrei Popescu
no flags Details | Formatted Diff | Diff
New proposal to limit the maximum size of the application cache (take 4). (35.07 KB, patch)
2009-04-30 05:12 PDT, Andrei Popescu
no flags Details | Formatted Diff | Diff
29909: New proposal to limit the maximum size of the application cache (take 5) (34.70 KB, patch)
2009-05-05 04:09 PDT, Andrei Popescu
no flags Details | Formatted Diff | Diff
New proposal to limit the maximum size of the application cache (take 6) (36.63 KB, patch)
2009-06-17 07:37 PDT, Andrei Popescu
abarth: review-
Details | Formatted Diff | Diff
New proposal to limit the maximum size of the application cache (take 7) (64.80 KB, patch)
2009-07-01 17:00 PDT, Andrei Popescu
no flags Details | Formatted Diff | Diff
New proposal to limit the maximum size of the application cache (take 7) (64.23 KB, patch)
2009-07-01 17:08 PDT, Andrei Popescu
ap: review-
Details | Formatted Diff | Diff
New proposal to limit the maximum size of the application cache (take 8) (64.93 KB, patch)
2009-07-14 06:41 PDT, Andrei Popescu
no flags Details | Formatted Diff | Diff
New proposal to limit the maximum size of the application cache (take 9) (65.54 KB, patch)
2009-07-16 14:10 PDT, Andrei Popescu
andersca: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Aaron Golden 2008-12-05 16:40:29 PST
Right now the offline application cache could grow arbitrarily large.  I think we need a mechanism to limit the size of the application cache to something reasonable.
Comment 1 Aaron Golden 2008-12-05 16:56:08 PST
Created attachment 25801 [details]
Patch to enforce size limit on app cache

This patch enforces a size limit on the application cache.  The limit will be either 1 / 32 times the size of the file system or 25% of the size of the remaining space available in the file system, whichever is smaller.  This will allow machines with larger hard drives to store larger application caches, but will prevent machines with large hard drives that happen to be almost out of space anyway from having unreasonably large application caches.
Comment 2 Darin Adler 2008-12-10 14:56:25 PST
Comment on attachment 25801 [details]
Patch to enforce size limit on app cache

> +#include <sys/statvfs.h>

I don't believe that statvfs exists on all the platforms that WebKit supports, so the function to get this information needs to go into FileSystem.h in the platform directory.

> +static long long _maximumCacheDatabaseSize(void)
> +{
> +    static long long maximumCacheDatabaseSize = -1LL;
> +    if (maximumCacheDatabaseSize == -1LL) {
> +    	struct statvfs statistics;
> +    	if (!statvfs("/", &statistics)) {
> +            // Just for readability, pull the relevant bits out of the struct.
> +            long long blocksOnFileSystem = statistics.f_blocks;
> +            long long blocksAvailable = statistics.f_bavail;
> +            long long blockSize = statistics.f_frsize;
> +            
> +            // We limit the cache size to the size of the file system divided by 32 or the size of the remaining
> +            // free space in the file system divided by four, whichever is smaller.
> +            maximumCacheDatabaseSize = MIN(blocksOnFileSystem * blockSize / 32LL, blocksAvailable * blockSize / 4LL);
> +    	} else {
> +            // Fallback strategy:  If for any reason statvfs fails then we will simply use an extremely
> +            // conservative limit for the cache size, 256MB.
> +            maximumCacheDatabaseSize = 256 * 1024 * 1024;
> +        }
> +    }
> +    return maximumCacheDatabaseSize;
> +}

We don't use underscores in our private function names.

There's a cleaner way to write this, which is:

    static long long calculateMaximumCacheDatabaseSize()
    {
        // code goes here; no need to use -1LL as a magic value
    }

    static long long maximumCacheDatabaseSize()
    {
        static long long size = calculateMaximumCacheDatabaseSize();
        return size;
    }

> +        // If the cache has grown too large, wipe it out and start rebuilding.
> +        long long fileSize = 0;
> +        getFileSize(applicationCachePath, fileSize);
> +        if (fileSize > _maximumCacheDatabaseSize())
> +            deleteFile(applicationCachePath);

Is wiping out the entire cache the policy we want? This could be really inconvenient if you're starting up the web browser and the application you expect to be present is entirely gone.

Can we do better?

review- because this needs to be written in a way that will not break compilation on platforms that don't have a <sys/statvfs.h> header.
Comment 3 Andrei Popescu 2009-04-28 08:32:25 PDT
Created attachment 29847 [details]
New proposal to limit the maximum size of the application cache.

In this proposal, the maximum size of the application cache is decided by each ChromeClient implementation and is enforced using SQLiteDatabase::setMaximumSize();

When adding new caches or new resources to existing caches, and the database insertion fails due to reaching the maximum size, the old caches are removed from the database in LRU fashion. The number of caches to remove is determined based on an approximation of the space required to be freed in order to accommodate the new cache.
Comment 4 Andrei Popescu 2009-04-28 08:46:03 PDT
Created attachment 29848 [details]
New proposal to limit the maximum size of the application cache (take 2).

The previous patch (29847) had the wrong column name on the "CREATE INDEX ON Caches" statement in ApplicationCacheStorage.cpp:423. This new patch fixes that.
Comment 5 Michael Nordman 2009-04-28 15:40:01 PDT
Some drive-by comments and questions...

* What happens if a cacheGroup that is in the act of being updated gets evicted? Maybe those not be subject to eviction?

* What precautions prevent the particular cache group that is being updated (or added to) and causing the need to evict something from being evicted itself?

* The eviction logic examines 'caches' and decides which to boot, maybe this should look at 'cacheGroups' instead.

* I'm not sure if a partially updated caches are left around (after the network connection drops midway thru) and the update picks up where it left off upon the next attempt. If so, these partial updates could be the first to go without having to blow away the associated group altogether.

* Maybe the decision making about which cacheGroups to evict should be left to a delegate api provided by the embedding browser? The system currently has no way of determining which groups are more or less important (outside of LRU). I could see that being value added by host browsers.



Comment 6 Andrei Popescu 2009-04-29 11:33:59 PDT
Thanks Michael,

(In reply to comment #5)
> Some drive-by comments and questions...
> 
> * What happens if a cacheGroup that is in the act of being updated gets
> evicted? Maybe those not be subject to eviction?
> 

It seems that an update will simply create a new cache and set it as the "newest cache" for the group, while the old cache is obsoleted. If the update fails due to maximum size being reached, the insertion of this newest cache is rolled back. It could happen that, while trying to make some room for this new cache, the relevant group is wiped out. However, in such a case, it will simply be re-added once the insertion of the new cache is retried after the cache eviction is completed.

> * What precautions prevent the particular cache group that is being updated (or
> added to) and causing the need to evict something from being evicted itself?
> 

If I understand the existing code right, adding a new cache group is an atomic operation. If it turns out that there isn't enough room for it, then all operations will be rolled back. When the eviction routine kicks in, it will not find this new cache group. As for updating and existing cache group, please see my answer above. I could be missing some edge case, though...

> * The eviction logic examines 'caches' and decides which to boot, maybe this
> should look at 'cacheGroups' instead.
> 

Erm, why would that be better than looking at the caches?

> * I'm not sure if a partially updated caches are left around (after the network
> connection drops midway thru) and the update picks up where it left off upon
> the next attempt. If so, these partial updates could be the first to go without
> having to blow away the associated group altogether.
> 

As said above I don't think this is possible (unless I am missing something). The entire cache seems to be downloaded first and then saved, so you'll never have a partially updated cache in the database.

> * Maybe the decision making about which cacheGroups to evict should be left to
> a delegate api provided by the embedding browser? The system currently has no
> way of determining which groups are more or less important (outside of LRU). I
> could see that being value added by host browsers.
> 

It could. However, the chromeclient (i.e. embedding browser) currently doesn't know when an application is being cached. I therefore don't know if it makes sense to ask it about which applications should be evicted either.

I've also uploaded a new patch as I found a couple of problems with my last one:
- I wasn't updating EmptyClients.h so the build would break if SVG is enabled.
- the "PRAGMA page_count" was added only in SQLite 3.6.0 so in older versions getting the current size of the database that way will fail.

I'd be grateful if you could have another look.

Thanks,
Andrei
Comment 7 Andrei Popescu 2009-04-29 11:34:57 PDT
Created attachment 29883 [details]
New proposal to limit the maximum size of the application cache (take 3).
Comment 8 Andrei Popescu 2009-04-30 05:12:54 PDT
Created attachment 29909 [details]
New proposal to limit the maximum size of the application cache (take 4).

While poking around the existing code, I found a bug that can cause a crash:

- in ApplicationCacheGroup.cpp:736 we have:

(...)
cacheStorage().storeNewestCache(this, maxAppCacheSize());
if (oldNewestCache)
    cacheStorage().remove(oldNewestCache.get());
postListenerTask(someSuccessCallback);
(...)

The problem with this is that the return value of cacheStorage().storeNewestCache() is never checked. If it fails (e.g. the user's data partition if full), the code should fire an error event and bail out. Instead, the existing newest cache is wiped out, thereby leaving an orphan cache group in the CachedGroups table. This orphan cache group now points to a non-existing "newest cache" so next time this group will be loaded, there will be no associated manifest resource and an assertion will fire in applicationcachegroup.cpp:597:

ApplicationCacheResource* newestManifest = m_newestCache->manifestResource();
ASSERT(newestManifest);

Rather than creating a separate bug for this, I thought I'd fix it in the same patch.
Comment 9 Alexey Proskuryakov 2009-05-04 12:46:43 PDT
<rdar://problem/6283425>
Comment 10 Alexey Proskuryakov 2009-05-05 01:58:55 PDT
Comment on attachment 29909 [details]
New proposal to limit the maximum size of the application cache (take 4).

Based IRC discussion, clearing the review flag for now. David Levin has provided a lot of style comments, and I have some concerns about whether LRU is an appropriate policy for the application cache.
Comment 11 Andrei Popescu 2009-05-05 04:09:56 PDT
Created attachment 30016 [details]
29909: New proposal to limit the maximum size of the application cache (take 5)

Fixes the style problems pointed out by David Levin. Will only set the review flag after the discussion on webkit-dev, which will hopefully clarify whether LRU is an appropriate behavior when the app cache is full.
Comment 12 Antti Koivisto 2009-05-06 13:37:22 PDT
Comment on attachment 30016 [details]
29909: New proposal to limit the maximum size of the application cache (take 5)

        
> -    bool storeNewestCache(ApplicationCacheGroup*); // Updates the cache group, but doesn't remove old cache.
> -    void store(ApplicationCacheResource*, ApplicationCache*);
> +    bool storeNewestCache(ApplicationCacheGroup*, int64_t maxAppCacheSize); // Updates the cache group, but doesn't remove old cache.
> +    void store(ApplicationCacheResource*, ApplicationCache*, int64_t maxAppCacheSize);
>      bool storeUpdatedType(ApplicationCacheResource*, ApplicationCache*);

Can we avoid adding the maxAppCacheSize arguments here? Cache size should be reachable through the group or cache pointers.
Comment 13 Andrei Popescu 2009-06-17 07:37:42 PDT
Created attachment 31413 [details]
New proposal to limit the maximum size of the application cache (take 6)

Based on the discussion at 

https://lists.webkit.org/pipermail/webkit-dev/2009-May/007560.html

I have removed the LRU mechanism. Instead, a ChromeClient callback is invoked
when the application cache max size is reached.
Comment 14 Adam Barth 2009-06-23 04:19:11 PDT
Comment on attachment 31413 [details]
New proposal to limit the maximum size of the application cache (take 6)

> +ChromeClient* ApplicationCacheGroup::client()
> +{
> +    ASSERT(m_frame);
> +    return m_frame->page()->chrome()->client();
> +}

Because m_frame can be null, it might make more sense to return null if !m_frame.  That at least warns consumers of this API that the client might not be there.

> +int64_t ApplicationCacheResource::size()
> +{
> +    if (!m_size) {

WebKit prefers early exit to increasing nesting depth.  So, we'd prefer:

if (m_size)
    return m_size;

> +            m_size += (it->first.length() + it->second.length() + 2) * sizeof(UChar);

It would be good to name this magic constant 2.  I presume this is strlen(": ")?

> +        m_size += url().string().length() * sizeof(UChar);
> +        m_size += sizeof(int); // response().m_httpStatusCode
> +        m_size += response().url().string().length() * sizeof(UChar);
> +        m_size += sizeof(unsigned); // dataId
> +        m_size += response().mimeType().length() * sizeof(UChar);
> +        m_size += response().textEncodingName().length() * sizeof(UChar);

This code is kind of ridiculous.  There must be a cleaner way to do this.  Do we really need to track the four bytes from the HTTP status code?

> +        // Returns the maximum size (in bytes) of the application caches file.
> +        virtual int64_t maxAppCacheSize() = 0;

Why is this a method on ChromeClient?  This seems more appropriate as a value on the Settings object.  ChromeClient is able showing UI elements that WebCore can't show itself (e.g., a JavaScript alert).

> +        // Callback invoked when the application cache gets hits the maximum size.
> +        // The chrome client would need to take some action such as evicting some
> +        // old caches.
> +        // spaceNeeded denotes the amount of space that would need to be freed
> +        // in order for the operation that caused the out-of-space condition to
> +        // succeed.
> +        virtual void reachedMaxAppCacheSize(int64_t spaceNeeded) = 0;

Again, I don't think ChromeClient is the right place for this.  I suspect the FrameLoaderClient is more correct here.  Something like didReachMaxAppCacheSize.  You should check with Maciej to make sure you put this on the right client.
Comment 15 Adam Barth 2009-06-23 04:20:36 PDT
In the next iteration, you might want to include some tests to exercise the new functionality.
Comment 16 Andrei Popescu 2009-07-01 17:00:37 PDT
Created attachment 32160 [details]
New proposal to limit the maximum size of the application cache (take 7)

- Moved the setMaximumSize() method out of the ChromeClient interface
- The reachedMaximumSize() ChrimeClient callback is now invoked asynchronously
- After the callback is invoked, we retry to save the new (failed) cache.
- Added unit test
Comment 17 Andrei Popescu 2009-07-01 17:08:10 PDT
Created attachment 32162 [details]
New proposal to limit the maximum size of the application cache (take 7)

- Moved the setMaximumSize() method out of the ChromeClient interface
- The reachedMaximumSize() ChrimeClient callback is now invoked asynchronously
- After the callback is invoked, we retry to save the new (failed) cache.
- Added unit test
Comment 18 Andrei Popescu 2009-07-01 17:13:26 PDT
(In reply to comment #14)
> (From update of attachment 31413 [details] [review])
> > +ChromeClient* ApplicationCacheGroup::client()
> > +{
> > +    ASSERT(m_frame);
> > +    return m_frame->page()->chrome()->client();
> > +}
> 
> Because m_frame can be null, it might make more sense to return null if
> !m_frame.  That at least warns consumers of this API that the client might not
> be there.
>

This method is no longer needed.
 
> > +int64_t ApplicationCacheResource::size()
> > +{
> > +    if (!m_size) {
> 
> WebKit prefers early exit to increasing nesting depth.  So, we'd prefer:
> 
> if (m_size)
>     return m_size;
> 

Done.

> > +            m_size += (it->first.length() + it->second.length() + 2) * sizeof(UChar);
> 
> It would be good to name this magic constant 2.  I presume this is strlen(":
> ")?
> 
> > +        m_size += url().string().length() * sizeof(UChar);
> > +        m_size += sizeof(int); // response().m_httpStatusCode
> > +        m_size += response().url().string().length() * sizeof(UChar);
> > +        m_size += sizeof(unsigned); // dataId
> > +        m_size += response().mimeType().length() * sizeof(UChar);
> > +        m_size += response().textEncodingName().length() * sizeof(UChar);
> 
> This code is kind of ridiculous.  There must be a cleaner way to do this.  Do
> we really need to track the four bytes from the HTTP status code?
>

This is the code that is used to save a resource to the app cache db. It therefore seems reasonable to also use it
for approximating the size of a resource. As discussed with Alexey on IRC, I added a comment in ApplicationCacheStorage.cpp to make sure that anyone who changes the code for saving a resource into the
database also changes this code.

 
> > +        // Returns the maximum size (in bytes) of the application caches file.
> > +        virtual int64_t maxAppCacheSize() = 0;
> 
> Why is this a method on ChromeClient?  This seems more appropriate as a value
> on the Settings object.  ChromeClient is able showing UI elements that WebCore
> can't show itself (e.g., a JavaScript alert).
> 

Removed.

> > +        // Callback invoked when the application cache gets hits the maximum size.
> > +        // The chrome client would need to take some action such as evicting some
> > +        // old caches.
> > +        // spaceNeeded denotes the amount of space that would need to be freed
> > +        // in order for the operation that caused the out-of-space condition to
> > +        // succeed.
> > +        virtual void reachedMaxAppCacheSize(int64_t spaceNeeded) = 0;
> 
> Again, I don't think ChromeClient is the right place for this.  I suspect the
> FrameLoaderClient is more correct here.  Something like
> didReachMaxAppCacheSize.  You should check with Maciej to make sure you put
> this on the right client.
>

I disagree. I think this is the right place for this callback, just as the Storage module has defined a similar callback for the situation when an origin hits its quota. A ChromeClient implementation may wish to display some UI that allows
the user to free some space, so having this callback here seems correct.
 
Thanks,
Andrei
Comment 19 Andrei Popescu 2009-07-01 17:14:43 PDT
(In reply to comment #15)
> In the next iteration, you might want to include some tests to exercise the new
> functionality.
> 

Done. Unfortunately, this required quite a few extra changes to the LayoutTestController and WebKit/mac/WebCoreSupport, which made this patch even bigger...
Comment 20 Andrei Popescu 2009-07-09 17:36:25 PDT
Ping
Comment 21 Adam Barth 2009-07-09 23:20:25 PDT
(In reply to comment #20)
> Ping

Since the previous round of reviews, I've decided that I should be more conservative in what patches I review.  My experience with this area of the code is such that the project would benefit from my deferring to other reviewers.

I realize you might find this a bit frustrating.  You've certainly put a lot of effort into this patch, and it's clearly functionality we should support.  Ultimately, it's important to consider the bigger picture and respect the processes that have made WebKit successful.

ap or andersca, would you be willing to review this patch?

Thanks,
Adam
Comment 22 Alexey Proskuryakov 2009-07-10 09:38:13 PDT
Comment on attachment 32162 [details]
New proposal to limit the maximum size of the application cache (take 7)

> +        * loader/appcache/ApplicationCache.cpp:
> +        * loader/appcache/ApplicationCache.h:
> +        Adds the ability to calculate the approximate size of an ApplicationCache object.

I think that the data member and accessor naming should make it clear how the data relates to reality (size() is not a great name for approximate/estimated size).

> +    , m_calledChromeClient(false)

This is a super confusing name. What method was called? Why was it called? Was it at any point in the past, or in some time range we care about?

> +    m_frame->page()->chrome()->client()->reachedMaxAppCacheSize(cacheStorage().spaceNeeded(m_cacheBeingUpdated->size()));
> +    m_calledChromeClient = true;
> +    checkIfLoadIsComplete();

The synchronous call to client matches how it is done for database quotas, but in the grand scheme of things, I'm not sure if WebCore should be blocked while waiting for the UI here.

> +            // the maximum size. In such a case, m_manifestResource may be NULL, as

We do not use NULL in code, so using it in comments isn't appropriate either. I suggest lower case "null" or 0.

> +class ChromeClientCallbackTimer: public TimerBase {
...
> +    ApplicationCacheGroup* m_cacheGroup;

A comment explaining why m_cacheGroup won't be deleted before the timer fires would help.

> +void ApplicationCacheGroup::scheduleReachedMaxAppCacheSize()

Looks like a noun is missing in this name ("callback"?).

> +class ChromeClientCallbackTimer;

I don't think you need a forward declaration here - a friend declaration should be sufficient.

> +#include <sqlite3.h>

ApplicationCacheStorage shouldn't be talking to SQLite directly.
>          unsigned newestCacheID = static_cast<unsigned>(statement.getColumnInt64(2));
>          RefPtr<ApplicationCache> cache = loadCache(newestCacheID);
> +        if (!cache)
> +            continue;

When can this fail? Wouldn't it mean that we left the disk database in inconsistent state?

> +        // Callback invoked when the application cache gets hits the maximum size.

s/gets//. But also, is this comment 100% correct? I'd expect it to be called when something doesn't fit.

> +        // spaceNeeded denotes the amount of space that would need to be freed
> +        // in order for the operation that caused the out-of-space condition to
> +        // succeed.

That part seems obvious from argument name, I'd just drop it.

This looks pretty close to being ready to me. It would be great if Anders took a look, too - especially at ApplicationCacheStorage.cpp changes, which I didn't study very closely.
Comment 23 Andrei Popescu 2009-07-14 06:41:18 PDT
Created attachment 32709 [details]
New proposal to limit the maximum size of the application cache (take 8)
Comment 24 Andrei Popescu 2009-07-14 06:50:11 PDT
Hi Alexey,

Thanks for the comments!

(In reply to comment #22)
> (From update of attachment 32162 [details])
> > +        * loader/appcache/ApplicationCache.cpp:
> > +        * loader/appcache/ApplicationCache.h:
> > +        Adds the ability to calculate the approximate size of an ApplicationCache object.
> 
> I think that the data member and accessor naming should make it clear how the
> data relates to reality (size() is not a great name for approximate/estimated
> size).
>

Done.
 
> > +    , m_calledChromeClient(false)
> 
> This is a super confusing name. What method was called? Why was it called? Was
> it at any point in the past, or in some time range we care about?
>

Changed the name and added a comment that explains when this flag is set, used and reset.
 
> > +    m_frame->page()->chrome()->client()->reachedMaxAppCacheSize(cacheStorage().spaceNeeded(m_cacheBeingUpdated->size()));
> > +    m_calledChromeClient = true;
> > +    checkIfLoadIsComplete();
> 
> The synchronous call to client matches how it is done for database quotas, but
> in the grand scheme of things, I'm not sure if WebCore should be blocked while
> waiting for the UI here.
> 

I agree with you. Given the size of this patch, would it be ok with you to land it 'as is' and then fix this issue in a separate patch?


> > +            // the maximum size. In such a case, m_manifestResource may be NULL, as
> 
> We do not use NULL in code, so using it in comments isn't appropriate either. I
> suggest lower case "null" or 0.
> 

Fixed.

> > +class ChromeClientCallbackTimer: public TimerBase {
> ...
> > +    ApplicationCacheGroup* m_cacheGroup;
> 
> A comment explaining why m_cacheGroup won't be deleted before the timer fires
> would help.
>

Added.
 
> > +void ApplicationCacheGroup::scheduleReachedMaxAppCacheSize()
> 
> Looks like a noun is missing in this name ("callback"?).
>

Yep, added.
 
> > +class ChromeClientCallbackTimer;
> 
> I don't think you need a forward declaration here - a friend declaration should
> be sufficient.
>

True, removed fwd declaration.
 
> > +#include <sqlite3.h>
> 
> ApplicationCacheStorage shouldn't be talking to SQLite directly.

Fixed.

> >          unsigned newestCacheID = static_cast<unsigned>(statement.getColumnInt64(2));
> >          RefPtr<ApplicationCache> cache = loadCache(newestCacheID);
> > +        if (!cache)
> > +            continue;
> 
> When can this fail? Wouldn't it mean that we left the disk database in
> inconsistent state?
> 

It can fail if SQLite fails for whatever reason. It cannot leave the disk database in an inconsistent state since loadCache doesn't write anything to the database. The pattern of checking the result of loadCache() was the same everywhere else: check the result and bail out if null. Here, the check was missing and I think this was a simple omission so I merely fixed it to match the existing pattern. 

Also, I think that 'continue;' is the right thing to do. We are in the cache selection algorithm so if we cannot load a cache, we simply ignore it and move to the next group.

> > +        // Callback invoked when the application cache gets hits the maximum size.
> 
> s/gets//. But also, is this comment 100% correct? I'd expect it to be called
> when something doesn't fit.
> 

Right. Changed the comment.

> > +        // spaceNeeded denotes the amount of space that would need to be freed
> > +        // in order for the operation that caused the out-of-space condition to
> > +        // succeed.
> 
> That part seems obvious from argument name, I'd just drop it.
>

Dropped.
 
> This looks pretty close to being ready to me. It would be great if Anders took
> a look, too - especially at ApplicationCacheStorage.cpp changes, which I didn't
> study very closely.

Thanks again. I have uploaded a new patch.

Andrei
Comment 25 Alexey Proskuryakov 2009-07-16 12:40:38 PDT
Comment on attachment 32709 [details]
New proposal to limit the maximum size of the application cache (take 8)

> +        https://bugs.webkit.org/show_bug.cgi?id=22700
> +
> +        https://lists.webkit.org/pipermail/webkit-dev/2009-May/007560.html

We usually put bug title after its URL in ChangeLogs.

> +    , m_estimatedSizeInDatabase(0)

I don't think that Database is the right word here - do we use it elsewhere in appcache interface? I'd prefer "OnDisk" or "InStorage".
> +    // Note that there is no need to use a RefPtr here.
> +    // The ApplicationCacheGroup instance is guaranteed
> +    // to be alive when the timer fires since invoking
> +    // the ChromeClient callback is part of its normal
> +    // update machinery and nothing can yet cause it to
> +    // get deleted.
> +    ApplicationCacheGroup* m_cacheGroup;

You could make lines longer (I'm not sure if there's an official recommendation, but we certainly don't wrap at 80 characters). Due to its vertical size, this comment looks more important than it is - I usually expect that six-line comments describe architecture.

> +    // This flag is set immediately after the ChromeClient::reachedMaxAppCacheSize()
> +    // callback is invoked as a result of the storage layer failing to save a cache
> +    // due to reaching the maximum size of the application cache database file.
> +    // This flag is used by ApplicationCacheGroup::checkIfLoadIsComplete() to decide
> +    // the course of action in case of this failure (i.e. call the ChromeClient
> +    // callback or run the failure steps).
> +    // This flag is reset in ApplicationCacheGroup::checkIfLoadIsComplete().
> +    bool m_calledReachedMaxAppCacheSize;

Ditto. I'm not sure if this even needs a comment in header.

> @@ -35,6 +35,7 @@ ApplicationCacheResource::ApplicationCac
>      : SubstituteResource(url, response, data)
>      , m_type(type)
>      , m_storageID(0)
> +    , m_size(0)
>  {
>  }

Is resource size exact, or also an estimate?
> +     while (selectURLs.step() == SQLResultRow)
> +        urls->append(selectURLs.getColumnText(0));

Too much indent before "while".

> +    if (!m_database.isOpen())
> +       return false;

Too little before "return".


Anders may have more comments about the SQL part of this patch.
Comment 26 Alexey Proskuryakov 2009-07-16 12:41:32 PDT
(in case it wasn't obvious from the comments, I'm fine with the patch as is).
Comment 27 Andrei Popescu 2009-07-16 14:10:33 PDT
Created attachment 32891 [details]
New proposal to limit the maximum size of the application cache (take 9)
Comment 28 Andrei Popescu 2009-07-16 14:14:33 PDT
(In reply to comment #25)
> (From update of attachment 32709 [details])
> > +        https://bugs.webkit.org/show_bug.cgi?id=22700
> > +
> > +        https://lists.webkit.org/pipermail/webkit-dev/2009-May/007560.html
> 
> We usually put bug title after its URL in ChangeLogs.
> 

Added.

> > +    , m_estimatedSizeInDatabase(0)
> 
> I don't think that Database is the right word here - do we use it elsewhere in
> appcache interface? I'd prefer "OnDisk" or "InStorage".

True, we don't use it elsewhere. For consistency, I used 'InStorage".

> > +    // Note that there is no need to use a RefPtr here.
> > +    // The ApplicationCacheGroup instance is guaranteed
> > +    // to be alive when the timer fires since invoking
> > +    // the ChromeClient callback is part of its normal
> > +    // update machinery and nothing can yet cause it to
> > +    // get deleted.
> > +    ApplicationCacheGroup* m_cacheGroup;
> 
> You could make lines longer (I'm not sure if there's an official
> recommendation, but we certainly don't wrap at 80 characters). Due to its
> vertical size, this comment looks more important than it is - I usually expect
> that six-line comments describe architecture.
> 

Made longer.

> > +    // This flag is set immediately after the ChromeClient::reachedMaxAppCacheSize()
> > +    // callback is invoked as a result of the storage layer failing to save a cache
> > +    // due to reaching the maximum size of the application cache database file.
> > +    // This flag is used by ApplicationCacheGroup::checkIfLoadIsComplete() to decide
> > +    // the course of action in case of this failure (i.e. call the ChromeClient
> > +    // callback or run the failure steps).
> > +    // This flag is reset in ApplicationCacheGroup::checkIfLoadIsComplete().
> > +    bool m_calledReachedMaxAppCacheSize;
> 
> Ditto. I'm not sure if this even needs a comment in header.
>

Made longer (and a bit shorter, too) so I left it in.

> > @@ -35,6 +35,7 @@ ApplicationCacheResource::ApplicationCac
> >      : SubstituteResource(url, response, data)
> >      , m_type(type)
> >      , m_storageID(0)
> > +    , m_size(0)
> >  {
> >  }
> 
> Is resource size exact, or also an estimate?

Oh, it's an estimate as well. Renamed.

> > +     while (selectURLs.step() == SQLResultRow)
> > +        urls->append(selectURLs.getColumnText(0));
> 
> Too much indent before "while".
> 

Fixed.

> > +    if (!m_database.isOpen())
> > +       return false;
> 
> Too little before "return".
>

Fixed.
 
> 
> Anders may have more comments about the SQL part of this patch.

I also tried pinging Anders a while ago but without success...

Many thanks,
Andrei
Comment 29 Andrei Popescu 2009-07-16 14:17:44 PDT
(In reply to comment #28)
> (In reply to comment #25)
>
> Made longer (and a bit shorter, too) so I left it in.
> 

Heh, what I mean is that I made the lines longer but I also made the comment more concise :)
Comment 30 Anders Carlsson 2009-07-17 14:41:02 PDT
Comment on attachment 32891 [details]
New proposal to limit the maximum size of the application cache (take 9)

SQL/storage parts look fine by me!
Comment 31 Adam Barth 2009-07-24 01:34:30 PDT
Will land.
Comment 32 Adam Barth 2009-07-24 02:12:38 PDT
Sending        LayoutTests/ChangeLog
Adding         LayoutTests/http/tests/appcache/max-size-expected.txt
Adding         LayoutTests/http/tests/appcache/max-size.html
Adding         LayoutTests/http/tests/appcache/resources/maxsize.manifest
Sending        WebCore/ChangeLog
Sending        WebCore/WebCore.base.exp
Sending        WebCore/loader/EmptyClients.h
Sending        WebCore/loader/appcache/ApplicationCache.cpp
Sending        WebCore/loader/appcache/ApplicationCache.h
Sending        WebCore/loader/appcache/ApplicationCacheGroup.cpp
Sending        WebCore/loader/appcache/ApplicationCacheGroup.h
Sending        WebCore/loader/appcache/ApplicationCacheResource.cpp
Sending        WebCore/loader/appcache/ApplicationCacheResource.h
Sending        WebCore/loader/appcache/ApplicationCacheStorage.cpp
Sending        WebCore/loader/appcache/ApplicationCacheStorage.h
Sending        WebCore/page/ChromeClient.h
Sending        WebCore/platform/sql/SQLiteDatabase.cpp
Sending        WebCore/platform/sql/SQLiteDatabase.h
Sending        WebKit/ChangeLog
Sending        WebKit/WebKit.xcodeproj/project.pbxproj
Sending        WebKit/gtk/ChangeLog
Sending        WebKit/gtk/WebCoreSupport/ChromeClientGtk.cpp
Sending        WebKit/gtk/WebCoreSupport/ChromeClientGtk.h
Sending        WebKit/mac/ChangeLog
Adding         WebKit/mac/WebCoreSupport/WebApplicationCache.h
Adding         WebKit/mac/WebCoreSupport/WebApplicationCache.mm
Sending        WebKit/mac/WebCoreSupport/WebChromeClient.h
Sending        WebKit/mac/WebCoreSupport/WebChromeClient.mm
Sending        WebKit/mac/WebKit.exp
Sending        WebKit/qt/ChangeLog
Sending        WebKit/qt/WebCoreSupport/ChromeClientQt.cpp
Sending        WebKit/qt/WebCoreSupport/ChromeClientQt.h
Sending        WebKit/win/ChangeLog
Sending        WebKit/win/WebCoreSupport/WebChromeClient.cpp
Sending        WebKit/win/WebCoreSupport/WebChromeClient.h
Sending        WebKit/wx/ChangeLog
Sending        WebKit/wx/WebKitSupport/ChromeClientWx.cpp
Sending        WebKit/wx/WebKitSupport/ChromeClientWx.h
Sending        WebKitTools/ChangeLog
Sending        WebKitTools/DumpRenderTree/LayoutTestController.cpp
Sending        WebKitTools/DumpRenderTree/LayoutTestController.h
Sending        WebKitTools/DumpRenderTree/gtk/LayoutTestControllerGtk.cpp
Sending        WebKitTools/DumpRenderTree/mac/LayoutTestControllerMac.mm
Sending        WebKitTools/DumpRenderTree/win/LayoutTestControllerWin.cpp
Sending        WebKitTools/DumpRenderTree/wx/LayoutTestControllerWx.cpp
Transmitting file data .............................................
Committed revision 46344.
http://trac.webkit.org/changeset/46344