In future API versions, where WebKitWebsiteDataManager's localstorage and IndexedDB directory configuration APIs are removed, we can change WebsiteDataStore::defaultShouldUseCustomStoragePaths to allow WPE and GTK to use the new general storage directory instead of the legacy custom localstorage and IndexedDB directories.
It was removed in https://commits.webkit.org/258591@main
I quickly skimmed through that commit. * Old WPE/GTK API has to keep using UnifiedOriginStorageLevel::None to support use of custom directories for different data types. * New WPE/GTK API no longer allows setting custom directories for every data type, and only allows configuring the base data and base cache directories. So it should use at least UnifiedOriginStorageLevel::Basic. This should partition everything except cache? * We certainly want to partition cache data too. Is that the difference between UnifiedOriginStorageLevel::Basic vs. UnifiedOriginStorageLevel::Standard? Need to look closer. Apple is now putting cache under their origin directory, commingling cache and data. This would not be desirable for us because we should maintain clear separation between cache and data. I think we want one origin storage directory for cache and one for data.
(In reply to Michael Catanzaro from comment #2) > I quickly skimmed through that commit. > > * Old WPE/GTK API has to keep using UnifiedOriginStorageLevel::None to > support use of custom directories for different data types. > * New WPE/GTK API no longer allows setting custom directories for every > data type, and only allows configuring the base data and base cache > directories. So it should use at least UnifiedOriginStorageLevel::Basic. > This should partition everything except cache? Technically UnifiedOriginStorageLevel only indicates whether to the "General" directory for data types. In our previous implementation, the directory structure looks like: - WebsiteData/ - CacheStorage - [Hashed Partitioned Origin]/ - [Origin File] - [Record]/ Each type has its own directory, and origin file if the the origin directory name is hashed. We later introduced a General directory, and the disk layout becomes: - WebsiteData/ - [General] - [Hashed Partitioned Origin] - [Origin File] - CacheStorage - [Record] - IndexedDB - LocalStorage With new layout, we could eliminate unnecessary origin files and directories (good for backup). And it will be easier to apply same policy to all types (e.g. when a new type is added, we can just add another [Type] directory under partitioned origin directory without considering partitioning), and perform storage management (quote computation, data deletion, etc). The new layout involves data migration (from custom directory to new directory) and some other work (refactoring, new architecture, testing, etc), so we are doing this type by type. We might not use the general directory for all types in the end, since some types use one single file for all origins for performance, and some types have different partition strategy, e.g. cookies. Basically UnifiedOriginStorageLevel indicates the stage we are right now: UnifiedOriginStorageLevel::None - don't use general directory for existing types UnifiedOriginStorageLevel::Basic - use it for IndexedDB, LocalStorage UnifiedOriginStorageLevel::Standard - use it for IndexedDB, LocalStorage, and CacheStorage We are working towards UnifiedOriginStorageLevel::Standard now. (Note FileSystem is a new type added after General directory is introduced, so it will be under this directory regardless of the level). > * We certainly want to partition cache data too. Is that the difference > between UnifiedOriginStorageLevel::Basic vs. > UnifiedOriginStorageLevel::Standard? Need to look closer. Apple is now > putting cache under their origin directory, commingling cache and data. This > would not be desirable for us because we should maintain clear separation > between cache and data. I think we want one origin storage directory for > cache and one for data. If an existing type is partitioned now, it will be partitioned and UnifiedOriginStorageLevel only decides where the data is. As shown in example above, CacheStorage data is partitioned no matter it is Basic or Standard. CacheStorage is not "cache", as its lifetime should be controlled by web clients (https://w3c.github.io/ServiceWorker/#cache-lifetimes), like IndexedDB. For the other real cache types (e.g. http cache), we currently have no plan to move them. It might make sense to have another [General Cache] directory in parallel with [General] if we are going to do this.
(In reply to Sihui Liu from comment #3) > CacheStorage is not "cache", as its lifetime should be controlled by web > clients (https://w3c.github.io/ServiceWorker/#cache-lifetimes), like > IndexedDB. OK, surprise! I suppose that makes sense, because it looks like we want this data to persist even if users delete their cache dirs. (Users will report bugs complaining that we've put CacheStorage in a data directory rather than a cache directory, but oh well.) Unfortunately, WebsiteDataStoreConfiguration::initializePaths does not agree, nor does WebsiteDataStoreConfiguration::defaultCacheStorageDirectory. So we've got a migration problem. (There's also WebsiteDataStore::defaultApplicationCacheDirectory which is classed as cache rather than data. But surely that should be removed at some point, as it's deprecated in favor of service workers?)
(In reply to Michael Catanzaro from comment #4) > (In reply to Sihui Liu from comment #3) > > CacheStorage is not "cache", as its lifetime should be controlled by web > > clients (https://w3c.github.io/ServiceWorker/#cache-lifetimes), like > > IndexedDB. > > OK, surprise! I suppose that makes sense, because it looks like we want this > data to persist even if users delete their cache dirs. (Users will report > bugs complaining that we've put CacheStorage in a data directory rather than > a cache directory, but oh well.) > > Unfortunately, WebsiteDataStoreConfiguration::initializePaths does not > agree, nor does WebsiteDataStoreConfiguration::defaultCacheStorageDirectory. > So we've got a migration problem. Yes, we treated CacheStorage as cache, and got report from web developers about CacheStorage data disappears (since cache can be evicted in certain conditions, but data will not), so we are going to fix this by migration to general directory. > > (There's also WebsiteDataStore::defaultApplicationCacheDirectory which is > classed as cache rather than data. But surely that should be removed at some > point, as it's deprecated in favor of service workers?) For deprecated types I think we are going to leave them as it is now, and delete their data (and related code) at some point.
In that case, it sounds safe for us to use UnifiedOriginStorageLevel::Standard in the new GTK/WPE API version, and stick with UnifiedOriginStorageLevel::None for the old API version. I guess we'd have a one-time loss of CacheStorage data since there is no migration yet, but I think that's OK.
Sorry, I misread Sihui's comment and didn't look closely enough at the code. I see WebKit really does handle the migration, so even better.
I notice the default UnifiedStorageLevel in NetworkSessionParameters is UnifiedOriginStorageLevel::Basic, but in WebsiteDataStore::defaultUnifiedOriginStorageLevel it is UnifiedOriginStorageLevel::None. I understand the WebsiteDataStore's value takes precedence, but that's certainly a little confusing.
Pull request: https://github.com/WebKit/WebKit/pull/10168
Committed 260375@main (c5ec4e0204d8): <https://commits.webkit.org/260375@main> Reviewed commits have been landed. Closing PR #10168 and removing active labels.