Bug 67135

Summary: ApplicationCache: feature request - scriptable APIs to manage the set of appcaches created by a site.
Product: WebKit Reporter: Michael Nordman <michaeln>
Component: WebCore Misc. Assignee: Michael Nordman <michaeln>
Status: RESOLVED WONTFIX    
Severity: Normal CC: annevk, ap, dgrogan, ian, me, peter
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   

Description Michael Nordman 2011-08-29 10:18:56 PDT
The feature is a JavaScript API to enable the creation, enumeration, update, and deletion of appcaches on the current origin.  Calls might look something like this:

/** Creates a new cache or updates an existing one with the given manifest URL.  Manifest URL must be in the same origin as the JS */
createOrUpdateCache(String manifestUri, completionCallback, errorCallback);

/** Enumerates the caches present on the current origin */
enumerateCaches(CacheEnumerationCallback callback, ErrorCallback errorCallback);

interface CacheEnumerationCallback { 
  void handleEvent(Cache[] caches); 
}

interface Cache {
  String getManifestUri();
  number getSizeInBytes();
  String getManifestAsText();
  String[] getMasterEntryUris();
  String[] getExplicitEntryUris();
  FallbackEntry[] getFallbackEntries();
  String[] getNetworkWhitelistUris();
  boolean isNetworkWhitelistOpen();
  DateTime getCreationTime();
  DateTime getLastManifestFetchTime(); // The last time the manifest was fetched and checked for differences
  DateTime getLastUpdateTime(); // The last time a manifest fetch caused an actual update
  DateTime getLastAccessTime(); // The last time the cache actually bound to a browsing context
  // Maybe some APIs to signal whether the cache is currently being updated, and whether there is currently a running browsing context bound to it.

  void delete(... some callbacks ...); // Probably fails if there's a running browsing context bound to the cache
  void update(... some callbacks ...); // I guess a no-op if an update is currently in progress or maybe even if it happened very recently
}

interface FallbackEntry {
  String getTriggerUri();
  String getTargetUri();
}

Additional characteristics:
* Must be usable from pages not themselves bound to an appcache, as long as they are served from the same origin as the caches being operated on.
* Must work from workers, shared workers, and background pages, again subject to a same origin check.

The above is a very rough sketch, and needs a bunch of work, but illustrates the features we'd find useful.  An obvious flaw is that it doesn't fit in with the system of progress events etc on the current API, but there are probably many others.  View it mainly as a list of requirements.  Our use cases are as follows:

* Docs maintains a set of appcaches which it uses for various purposes.  Each editor, for example, has a cache.  There are also cases where different documents require different versions of the same editor.
* The set of caches required on a particular browser depends on the documents synced there.  A given set of documents will require a particular (much smaller) set of caches to open.  The set of caches required on a given browser is therefore dynamic, changing as documents enter and leave the set of those synchronized.
* Each time anybody opens a docs property, and perhaps during the lifetimes of some of them, we perform a procedure called 'appcache maintenance', which ensures that the caches necessary for the current set of documents are synced.  This is a fairly nasty process involving many iframes, but it works.  We would like, however, to make this code much simpler, not have it involve the iframes, and make the process of piping progress events back to the host application less awful.  Right now it's such a pain we're not bothering with it.
* We'd like to perform appcache maintenance on existing caches less often, reducing server load.  The timestamps included above would allow us to do that.
* When an appcache is no longer needed by the current set of documents, it is currently just left there.  We would like to be able to clean it up.
* We would like to be able to perform our appcache maintenance procedure from a shared worker, as we have one that can bring new documents into storage.  Right now that is impossible, as I can't open an iframe there.
* When the user 'opts out' of offline, I'd like to be able to remove all their caches.  Right now I have a process by which I can remove many of them, but any not needed by the current set of documents will be left behind.  We set a short-lived cookie on the domain from the client side and then perform our usual maintenance procedure, which loads an invisible iframe referring to each cache.  The presence of the cookie causes the manifest fetches to serve a 404, deleting the cache.  I'd like this cleanup to be more straightforward and for it to cover caches not needed by the current set of documents.  I'd also like to be able to perform it without the server - there's no inherent reason why they shouldn't be able to opt out of offline while actually being offline.
* When a failure in the offline system is detected, or as part of manual bug reports, I'd like to be able to send a dump of the caches currently stored on the browser to the server, thus the APIs which interrogate the details of the cache.
* If we ever want to migrate to an alternate architecture involving a different use of the appcache system, this API would be invaluable for the migration.
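
As a rough illustration of the maintenance use cases above, the sketch below plans one maintenance pass from an enumeration result. It is only a sketch under stated assumptions: `planMaintenance`, the `requiredManifests` list, and the `minRecheckMs` threshold are hypothetical names, and the cache records are plain objects shaped like the `Cache` interface proposed in this description, which no browser implements.

```javascript
// Hypothetical helper: decide what one "appcache maintenance" pass should do.
// `caches` is an array of plain objects mimicking the proposed Cache interface
// (manifestUri plus lastManifestFetchTime in epoch milliseconds).
function planMaintenance(caches, requiredManifests, nowMs, minRecheckMs) {
  const required = new Set(requiredManifests);
  const existing = new Map(caches.map((c) => [c.manifestUri, c]));

  const toCreate = [];
  const toUpdate = [];
  for (const uri of required) {
    const cache = existing.get(uri);
    if (!cache) {
      toCreate.push(uri); // needed by a synced document but not cached yet
    } else if (nowMs - cache.lastManifestFetchTime >= minRecheckMs) {
      toUpdate.push(uri); // cached, but the manifest was not checked recently
    }
  }

  // Caches that no synced document needs any more can be cleaned up.
  const toDelete = caches
    .map((c) => c.manifestUri)
    .filter((uri) => !required.has(uri));

  return { toCreate, toUpdate, toDelete };
}

const HOUR = 60 * 60 * 1000;
const plan = planMaintenance(
  [
    { manifestUri: "/editor-v1.manifest", lastManifestFetchTime: 0 },
    { manifestUri: "/editor-v2.manifest", lastManifestFetchTime: 9 * HOUR },
    { manifestUri: "/old-viewer.manifest", lastManifestFetchTime: 0 },
  ],
  ["/editor-v1.manifest", "/editor-v2.manifest", "/spreadsheet.manifest"],
  10 * HOUR, // "now"
  4 * HOUR // skip manifests that were checked within the last four hours
);
console.log(plan);
```

Keeping the planning step a pure function means the same logic could run in a page or a shared worker, with only the final create, update, and delete calls depending on whatever API is eventually provided.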
Comment 1 Michael Nordman 2011-09-15 12:56:54 PDT
We have similar desirements to enumerate other object types like indexedDBs and sqlDBs, and related desirements to delete these other object types.


We could introduce a top level object in the window/worker namespace that contains methods to perform these things.

window.<something>.enumerateApplicationCaches(...)
window.<something>.enumerateIndexedDatabases(...)
window.<something>.enumerateSqlDatabases(...)
window.<something>.deleteApplicationCache(manifestUrl);
window.<something>.deleteIndexedDatabase(name);
window.<something>.deleteSqlDatabase(name);
Comment 2 Michael Nordman 2011-09-20 15:35:37 PDT
> We could introduce a top level object in the window/worker namespace that contains methods to perform these things.
> window.<something>.enumerateApplicationCaches(...)

Looks like window.storageInfo could work for this.

window.storageInfo.enumerateApplicationCaches(ApplicationCacheEnumerationCallback callback);
window.storageInfo.getApplicationCacheInfo(String manifestUrl, ApplicationCacheInfoCallback callback) raises(DOMException);

interface ApplicationCacheEnumerationCallback { 
  void handleEvent(in ApplicationCacheInfoArray cacheInfos);
}

interface ApplicationCacheInfoCallback { 
  void handleEvent(in ApplicationCacheInfo cacheInfo);
}

interface ApplicationCacheInfo : public DOMApplicationCache {
  // meta data accessors
  readonly attribute String manifestUrl;
  readonly attribute number size;
  readonly attribute DateTime creationTime;
  readonly attribute DateTime lastCheckTime; // The last time a manifest was checked for a newer version
  readonly attribute DateTime lastUpdateTime; // The last time a manifest fetch caused an actual update
  readonly attribute DateTime lastAccessTime; // The last time the cache actually bound to a browsing context

  // ability to delete an appcache
  void delete() raises(DOMException);

  // Other stuff  is inherited from DOMApplicationCache
}
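
To make the shape of this proposal concrete, here is a minimal sketch of the "opt out of offline" cleanup written against it. `FakeStorageInfo` is an in-memory stand-in invented for illustration, not a real browser object, and `deleteAllCaches` is a hypothetical helper; only `enumerateApplicationCaches`, `manifestUrl`, and `delete()` come from the interfaces above.

```javascript
// In-memory stand-in for the proposed window.storageInfo additions.
// Nothing here is a real browser API; it only mirrors the shapes in comment 2.
class FakeStorageInfo {
  constructor(manifestUrls) {
    this.caches = new Map();
    for (const manifestUrl of manifestUrls) {
      this.caches.set(manifestUrl, {
        manifestUrl,
        // Proposed ApplicationCacheInfo.delete(): removes this cache.
        delete: () => this.caches.delete(manifestUrl),
      });
    }
  }

  // Proposed enumerateApplicationCaches(callback).
  enumerateApplicationCaches(callback) {
    callback(Array.from(this.caches.values()));
  }
}

// Opt-out cleanup: remove every cache on the origin, not only those the
// current set of documents still needs, and without contacting the server.
function deleteAllCaches(storageInfo, done) {
  storageInfo.enumerateApplicationCaches((cacheInfos) => {
    const removed = cacheInfos.map((info) => info.manifestUrl);
    cacheInfos.forEach((info) => info.delete());
    done(removed);
  });
}

const storageInfo = new FakeStorageInfo(["/docs.manifest", "/sheets.manifest"]);
let removedUrls = [];
deleteAllCaches(storageInfo, (removed) => {
  removedUrls = removed;
});
console.log(removedUrls, storageInfo.caches.size);
```

The stand-in invokes its callback synchronously to keep the example short; a real implementation would presumably call back asynchronously.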
Comment 3 Michael Nordman 2011-09-20 15:38:51 PDT
(In reply to comment #2)
> > We could introduce a top level object in the window/worker namespace that contains methods to perform these things.
> > window.<something>.enumerateApplicationCaches(...)
> 
> Looks like window.storageInfo could work for this.


For this to work, storageInfo should also be available in WorkerContext.idl.
Comment 4 Michael Nordman 2011-09-20 15:40:44 PDT
I think to proceed with this, it'd be good to put it behind a new ENABLE(APPLICATION_CACHE_INFO) flag until it's fully ready to go.
Comment 5 Michael Nordman 2011-10-05 19:09:47 PDT
Some notes on our latest thoughts.

StorageInfo {
  // Support for listing, creating, updating, deleting, and monitoring the status of appcaches.
  void getApplicationCaches(ApplicationCachesCallback callback);
  void deleteApplicationCache(String manifestUrl);
  void getApplicationCacheInfo(String manifestUrl, ApplicationCacheInfoCallback callback);

  // Support for listing and deleting sql databases.
  void getSqlDatabases(SqlDatabasesCallback callback);
  void deleteSqlDatabase(String databaseName);
}

interface ApplicationCacheInfo  {
  readonly attribute String manifestUrl;
  readonly attribute number sizeInBytes;
  readonly attribute DateTime lastAccessTime; // The last time the cache was bound to a browsing context

  void createOrUpdate();

  readonly attribute unsigned short status;  // UNCACHED,IDLE,CHECKING,DOWNLOADING,OBSOLETE
  attribute EventListener onchecking;
  attribute EventListener onerror;
  attribute EventListener onnoupdate;
  attribute EventListener ondownloading;
  attribute EventListener onprogress;
  attribute EventListener oncached;
  attribute EventListener onobsolete;
}

interface ApplicationCachesCallback { 
  void handleEvent(in ApplicationCacheInfo[] info);
}

interface ApplicationCacheInfoCallback { 
  void handleEvent(in ApplicationCacheInfo cache);
}

interface SqlDatabaseInfo  {
  readonly attribute String name;
  readonly attribute number sizeInBytes;
  readonly attribute DateTime lastAccessTime; // The last time the database was opened
}

interface SqlDatabasesCallback  { 
  void handleEvent(in SqlDatabaseInfo[] info);
}
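
A small sketch of how a caller might monitor `createOrUpdate()` through the event handler attributes listed above. `FakeApplicationCacheInfo` is a hypothetical stand-in that simulates one successful first-time download and fires its events synchronously; the numeric status values are an assumption borrowed from the existing ApplicationCache constants, since the notes above do not assign any.

```javascript
// Status codes named in the `status` comment. The numbers are assumed to match
// the existing ApplicationCache constants (UPDATEREADY = 4 is left out).
const Status = { UNCACHED: 0, IDLE: 1, CHECKING: 2, DOWNLOADING: 3, OBSOLETE: 5 };

// Hypothetical stand-in for the proposed ApplicationCacheInfo object.
class FakeApplicationCacheInfo {
  constructor(manifestUrl) {
    this.manifestUrl = manifestUrl;
    this.status = Status.UNCACHED;
    this.onchecking = null;
    this.ondownloading = null;
    this.onprogress = null;
    this.oncached = null;
  }

  // Calls the matching on<type> handler, if one is set.
  fire(type) {
    const handler = this["on" + type];
    if (handler) handler({ type, target: this });
  }

  // Simulates a successful first-time download of a two-resource manifest.
  createOrUpdate() {
    this.status = Status.CHECKING;
    this.fire("checking");
    this.status = Status.DOWNLOADING;
    this.fire("downloading");
    this.fire("progress");
    this.fire("progress");
    this.status = Status.IDLE;
    this.fire("cached");
  }
}

const info = new FakeApplicationCacheInfo("/editor.manifest");
const seen = [];
for (const type of ["checking", "downloading", "progress", "cached"]) {
  info["on" + type] = (event) => seen.push(event.type);
}
info.createOrUpdate();
console.log(seen.join(" > "), info.status === Status.IDLE);
```

Because the handlers live on the info object itself, progress can be reported to the host application without routing events through hidden iframes.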
Comment 6 Ian 'Hixie' Hickson 2011-10-05 20:19:14 PDT
Seems mostly reasonable.

Any particular reason for using a separate interface than ApplicationCache to represent the cache? Most of this object seems to duplicate the existing interface.

The name "createOrUpdate()" is a bit ugly, I'd go with the less precise but cleaner "update()", personally.

Why have deleteApplicationCache() as a separate method rather than a method on the ApplicationCache object?

Don't forget to vendor-prefix the API since it's not specced yet.
Comment 7 Michael Nordman 2011-10-06 13:40:58 PDT
(pinch me... some feedback/discussion that doesn't start and end with "what's your use case" :)

> Any particular reason for using a separate interface than ApplicationCache to represent the cache? Most of this object seems to duplicate the existing interface.

The existing DOMApplicationCache and the new 'info' interfaces overlap quite a lot, particularly the event handlers and status accessor, but not entirely. The swapCache() method doesn't apply to the info object, and a status value of UPDATE_READY doesn't make sense either. There is no 'create' capability with DOMApplicationCache.

I see them as different object types. The DOMApplicationCache is that which is bound to your document. Info objects are a new object type, not bound to your document or frame in any way.

I'd be more in favor of adding a 'manifestUrl' attribute to the DOMApplicationCache interface than merging them together. Given the attr, callers could easily look up the 'info' for that manifestUrl as needed for management purposes.

> The name "createOrUpdate()" is a bit ugly, I'd go with the less precise but cleaner "update()", personally.

Wanted to be clear that this gave a means of creating it.

> Why have deleteApplicationCache() as a separate method rather than a method on the ApplicationCache object?

A usecase I've seen would be streamlined by not having to first 'get' and then 'delete'. Here's what they do now. 
   <iframe style='display: none;' src='unwantedManifest1'></iframe>
   <iframe style='display: none;' src='unwantedManifest2'></iframe>
Also symmetry with the deleteSqlDatabase() function. In some cases we're seeing a need to delete databases because they can't be opened.

> Don't forget to vendor-prefix the API since it's not specced yet.

Yup, the 'storageInfo' object is itself prefixed since that is not yet specced. I'm not sure if these additions need additional prefixing.
   webkitStorageInfo.getApplicationCaches(...); 
     vs
   webkitStorageInfo.webkitGetApplicationCaches(...)
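
Whichever naming option is chosen, calling code can be written to tolerate both. The sketch below is a hypothetical helper (`resolvePrefixed` is not part of any proposal) exercised against two plain stand-in objects rather than a real `webkitStorageInfo`.

```javascript
// Hypothetical helper: find a method on `host` under its plain name or under a
// vendor-prefixed name such as webkitGetApplicationCaches.
function resolvePrefixed(host, name, prefixes = ["webkit"]) {
  const candidates = [name];
  for (const prefix of prefixes) {
    candidates.push(prefix + name[0].toUpperCase() + name.slice(1));
  }
  for (const candidate of candidates) {
    if (typeof host[candidate] === "function") {
      return host[candidate].bind(host);
    }
  }
  return null; // feature not available under any known name
}

// Two stand-in objects, one per naming option discussed above.
const unprefixedHost = { getApplicationCaches: (cb) => cb(["a"]) };
const prefixedHost = { webkitGetApplicationCaches: (cb) => cb(["b"]) };

const results = [];
resolvePrefixed(unprefixedHost, "getApplicationCaches")((list) => results.push(...list));
resolvePrefixed(prefixedHost, "getApplicationCaches")((list) => results.push(...list));
const missing = resolvePrefixed({}, "getApplicationCaches");
console.log(results, missing);
```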
Comment 8 Ian 'Hixie' Hickson 2011-10-06 14:56:08 PDT
(In reply to comment #7)
> (pinch me... some feedback/discussion that doesn't start and end with "what's your use case" :)

I'm assuming your use cases are "a big site with many applications would like to be able to provide a page that deletes all the cached applications on that site", and "a big site with many applications would like to be able to install or update multiple applications at once".


Regarding the objects: I think it would be unfortunate to use an entirely separate interface. IMHO we should either use the same objects (my preference), in particular having the navigator.applicationCache object === the object in the enumeration, if the current page is cached, or, if there's really good reasons not to do that, we should at least have the Info objects inherit from an AbstractApplicationCache interface that hosts all the common bits. Having two almost entirely duplicate interfaces seems like poor design.


> swapCache()

It can just throw an exception in the case where the cache isn't the current one (InvalidStateError).


> status value of UPDATE_READY doesn't make sense

That's no big deal. The attribute would just never take that value.


> no 'create' capability with DOMApplicationCache.

We can just have update() create the cache instead of throwing an exception.


Regarding deleting, it's not like it's going to happen so often that saving a few keystrokes is going to matter:

   window.storageInfo.getApplicationCacheInfo(manifestURL, function (o) { o.delete(); });

...vs:

   window.storageInfo.deleteApplicationCacheInfo(manifestURL);

I mean, you could just as easily suggest that we add an "updateApplicationCacheInfo(manifestURL)" method so that you don't have to get the object to update it, or even a "getApplicationCacheStatus(manifestURL)" method... Why is deletion special?
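
To illustrate the point, a one-step delete can be layered on top of the get-then-delete form by the page itself. This is a sketch only: `makeFakeStorageInfo` is a hypothetical in-memory stand-in, its `has` method exists purely for the demonstration, and the plain `Error` stands in for whatever DOMException a real implementation would raise.

```javascript
// Stand-in for the proposed storageInfo object; only the "get" entry point exists.
function makeFakeStorageInfo(manifestUrls) {
  const caches = new Set(manifestUrls);
  return {
    // Demonstration-only helper, not part of any proposal.
    has: (manifestUrl) => caches.has(manifestUrl),
    // Proposed getApplicationCacheInfo(manifestUrl, callback).
    getApplicationCacheInfo(manifestUrl, callback) {
      if (!caches.has(manifestUrl)) {
        throw new Error("No cache for " + manifestUrl); // assumed failure mode
      }
      callback({ manifestUrl, delete: () => caches.delete(manifestUrl) });
    },
  };
}

// One-step deletion as a page-level convenience built on get + delete().
function deleteApplicationCache(storageInfo, manifestUrl) {
  storageInfo.getApplicationCacheInfo(manifestUrl, (info) => info.delete());
}

const fake = makeFakeStorageInfo([
  "/unwanted1.manifest",
  "/unwanted2.manifest",
  "/keep.manifest",
]);
deleteApplicationCache(fake, "/unwanted1.manifest");
deleteApplicationCache(fake, "/unwanted2.manifest");
console.log(fake.has("/unwanted1.manifest"), fake.has("/keep.manifest"));
```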


> Also symmetry with the deleteSqlDatabase() function.

Since WebSQLDatabase is a vestigial feature only implemented in certain UAs, and not part of the Web platform proper, I don't think that's a great goal. Soon enough the SQL things will disappear and being consistent with it would be pointless. Having said that, though, I'd move the consistency the other way: just put delete() into SqlDatabaseInfo.


> > Don't forget to vendor-prefix the API since it's not specced yet.
> 
> Yup, the 'storageInfo' object is itself prefixed since that is not yet specced.

LGTM.
Comment 9 Michael Nordman 2011-10-06 16:13:49 PDT
Thnx for taking the time hixie!

(In reply to comment #8)
> I'm assuming your use cases are "a big site with many applications would like to be able to provide a page that deletes all the cached applications on that site", and "a big site with many applications would like to be able to install or update multiple applications at once".

Yup, of course.

> Regarding the objects: I think it would be unfortunate to use an entirely separate interface. IMHO we should either use the same objects (my preference), in particular having the navigator.applicationCache object === the object in the enumeration, if the current page is cached, or, if there's really good reasons not to do that, we should at least have the Info objects inherit from an AbstractApplicationCache interface that hosts all the common bits. Having two almost entirely duplicate interfaces seems like poor design.

Interesting, an earlier draft employed interface inheritance. Other feedback told me that was misleading since it didn't employ object inheritance. I can see reusing the same interface or deriving from a common interface to compose two different ones. But I maintain that there are two different object types, associatedCache vs cacheInfo.

In particular, having window.applicationCache === oneParticularInfoInstance is troublesome. That particular info object would behave subtly differently than the rest if your page happened to be bound to a manifest (it would enter the UPDATE_READY state, .swapCache() would have meaning, you would not be able to see the latest 'info' about the current version w/o calling swapCache()). Also, it's mysteriously tied to a particular Document. When that document goes away, it would break (or mysteriously revert to being an 'unassociated' instance)... all that is wonky.

> > swapCache()
> It can just throw an exception in the case where the cache isn't the current one (InvalidStateError).
> > status value of UPDATE_READY doesn't make sense
> That's no big deal. The attribute would just never take that value.
> > no 'create' capability with DOMApplicationCache.
> We can just have update() create the cache instead of throwing an exception.

Sure, it could be that way, if we reuse the same interface.

> Regarding deleting, it's not like it's going to happen so often that saving a few keystrokes is going to matter:
>    window.storageInfo.getApplicationCacheInfo(manifestURL, function (o) { o.delete(); });
> I mean, you could just as easily suggest that we add an "updateApplicationCacheInfo(manifestURL)" method so that you don't have to get the object to update it, or even a "getApplicationCacheStatus(manifestURL)" method... Why is deletion special?

Good point, symmetry with IndexedDB which has a delete() method is more desirable anyway.

> Since WebSQLDatabase is a vestigial feature only implemented in certain UAs, and not part of the Web platform proper, I don't think that's a great goal. Soon enough the SQL things will disappear and being consistent with it would be pointless. Having said that, though, I'd move the consistency the other way: just put delete() into SqlDatabaseInfo.

A delete() method on the db object actually doesn't work in some cases. There are difficult cases where open() can fail to produce an instance and throws an exception instead (the sync nature + version checking semantics built into open() are problematic). This new vestige wants to be a standalone method.
Comment 10 Ian 'Hixie' Hickson 2011-10-06 17:02:24 PDT
(In reply to comment #9)
> Thnx for taking the time hixie!

My pleasure. I'm likely to end up having to spec this anyway one day. ;-)


Regarding window.applicationCache === oneParticularInfoInstance, you raise some very valid points. Let's not do that.

Having a common interface may be the way to go. Having almost duplicate interface definitions really raises some red flags for me. We're very likely to end up with subtle differences in the APIs if we go that route, e.g. someone typos it in one of the definitions, or the copy/paste is done slightly wrong, or whatnot. Better to clearly indicate that they're supposed to be identical right up front by having just one copy of the definitions (and ideally sharing as much implementation code as possible too).
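
One way to picture the common-interface option is a shared base that defines the overlapping members exactly once. The class names below (`AbstractApplicationCache` is the name floated in comment 8; `AssociatedCache` and `CacheInfo` are hypothetical) and the string status values are illustrative assumptions, not a proposed design.

```javascript
// Hypothetical shared base holding the members both object types have in common.
class AbstractApplicationCache {
  constructor(manifestUrl) {
    this.manifestUrl = manifestUrl;
    this.status = "UNCACHED";
  }

  // In this sketch update() also creates the cache when it does not exist yet.
  update() {
    this.status = "IDLE";
  }
}

// The cache bound to the current document: adds swapCache().
class AssociatedCache extends AbstractApplicationCache {
  constructor(manifestUrl) {
    super(manifestUrl);
    this.swapCount = 0;
  }

  swapCache() {
    this.swapCount += 1;
  }
}

// A management handle not bound to any document: adds delete().
class CacheInfo extends AbstractApplicationCache {
  delete() {
    this.status = "OBSOLETE";
  }
}

const bound = new AssociatedCache("/editor.manifest");
const handle = new CacheInfo("/viewer.manifest");
bound.update();
handle.update();
handle.delete();
console.log(bound.status, handle.status, typeof handle.swapCache);
```

With this split, `swapCache()` simply does not exist on the management handle, so neither type needs a method that only ever throws.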


> A delete() method on the db object actually doesn't work in some cases.

Not on the db object, on the SqlDatabaseInfo object.
Comment 11 Michael Nordman 2011-10-06 17:36:21 PDT
> My pleasure. I'm likely to end up having to spec this anyway one day. ;-)

nice!

> Regarding window.applicationCache === oneParticularInfoInstance, you raise some very valid points. Let's not do that.
> 
> Having a common interface may be the way to go. Having almost duplicate interface definitions really raises some red flags for me. We're very likely to end up with subtle differences in the APIs if we go that route, e.g. someone typos it in one of the definitions, or the copy/paste is done slightly wrong, or whatnot. Better to clearly indicate that they're supposed to be identical right up front by having just one copy of the definitions (and ideally sharing as much implementation code as possible too).

Sgtm. All of the additional members on the 'info' object in comment #5 (plus a delete method) could also be provided directly on the window.applicationCache object. My plan for delete is to have the group asynchronously enter the OBSOLETE state and raise events accordingly.
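
A minimal sketch of that delete behavior, assuming a runtime with the standard `EventTarget` and `Event` globals. `FakeCacheGroup` is a hypothetical stand-in, and the string status values are placeholders for the real status constants.

```javascript
// Stand-in for a cache group whose delete() completes asynchronously.
class FakeCacheGroup extends EventTarget {
  constructor(manifestUrl) {
    super();
    this.manifestUrl = manifestUrl;
    this.status = "IDLE";
  }

  // Returns immediately; the state change and the event happen in a later task.
  delete() {
    setTimeout(() => {
      this.status = "OBSOLETE";
      this.dispatchEvent(new Event("obsolete"));
    }, 0);
  }
}

const group = new FakeCacheGroup("/editor.manifest");
const obsoleteSeen = new Promise((resolve) => {
  group.addEventListener("obsolete", () => resolve(group.status));
});
group.delete();
// Read synchronously after the call: the deletion has not taken effect yet.
const statusRightAfterCall = group.status;
obsoleteSeen.then((status) => console.log(statusRightAfterCall, "->", status));
```

Callers therefore learn about completion from the `obsolete` event rather than from the return of `delete()`.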

> Not on the db object, on the SqlDatabaseInfo object.

yes, that works for me


I'll run these refinements by some would-be internal customers for this. I'm not sure we really need the datetime values, I'll see about dropping those too.
Comment 12 Ian 'Hixie' Hickson 2012-10-23 18:33:20 PDT
Did any of this end up getting implemented?
Comment 13 Michael Nordman 2012-10-24 11:52:05 PDT
(In reply to comment #12)
> Did any of this end up getting implemented?

Nope.
Comment 14 Ian 'Hixie' Hickson 2012-10-25 12:00:41 PDT
Ok, I've started spec work on this (based on the comments above) at:
   https://www.w3.org/Bugs/Public/show_bug.cgi?id=18609
Comment 15 Anne van Kesteren 2024-03-06 00:21:14 PST
This is a feature we've disabled/removed.