Bug 244058

Summary: [GLib] New API should use UnifiedOriginStorageLevel::Basic
Product: WebKit Reporter: Michael Catanzaro <mcatanzaro>
Component: WebKitGTKAssignee: Michael Catanzaro <mcatanzaro>
Status: RESOLVED FIXED    
Severity: Normal CC: bugs-noreply, cgarcia, mcatanzaro, sihui_liu
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 210100    

Michael Catanzaro
Reported 2022-08-17 14:02:55 PDT
In future API versions, where WebKitWebsiteDataManager's localstorage and IndexedDB directory configuration APIs are removed, we can change WebsiteDataStore::defaultShouldUseCustomStoragePaths to allow WPE and GTK to use the new general storage directory instead of the legacy custom localstorage and IndexedDB directories.
Attachments
Carlos Garcia Campos
Comment 1 2023-02-01 01:04:28 PST
Michael Catanzaro
Comment 2 2023-02-01 07:58:00 PST
I quickly skimmed through that commit. * Old WPE/GTK API has to keep using UnifiedOriginStorageLevel::None to support use of custom directories for different data types. * New WPE/GTK API no longer allows setting custom directories for every data type, and only allows configuring the base data and base cache directories. So it should use at least UnifiedOriginStorageLevel::Basic. This should partition everything except cache? * We certainly want to partition cache data too. Is that the difference between UnifiedOriginStorageLevel::Basic vs. UnifiedOriginStorageLevel::Standard? Need to look closer. Apple is now putting cache under their origin directory, commingling cache and data. This would not be desirable for us because we should maintain clear separation between cache and data. I think we want one origin storage directory for cache and one for data.
Sihui Liu
Comment 3 2023-02-01 10:21:34 PST
(In reply to Michael Catanzaro from comment #2) > I quickly skimmed through that commit. > > * Old WPE/GTK API has to keep using UnifiedOriginStorageLevel::None to > support use of custom directories for different data types. > * New WPE/GTK API no longer allows setting custom directories for every > data type, and only allows configuring the base data and base cache > directories. So it should use at least UnifiedOriginStorageLevel::Basic. > This should partition everything except cache? Technically UnifiedOriginStorageLevel only indicates whether to the "General" directory for data types. In our previous implementation, the directory structure looks like: - WebsiteData/ - CacheStorage - [Hashed Partitioned Origin]/ - [Origin File] - [Record]/ Each type has its own directory, and origin file if the the origin directory name is hashed. We later introduced a General directory, and the disk layout becomes: - WebsiteData/ - [General] - [Hashed Partitioned Origin] - [Origin File] - CacheStorage - [Record] - IndexedDB - LocalStorage With new layout, we could eliminate unnecessary origin files and directories (good for backup). And it will be easier to apply same policy to all types (e.g. when a new type is added, we can just add another [Type] directory under partitioned origin directory without considering partitioning), and perform storage management (quote computation, data deletion, etc). The new layout involves data migration (from custom directory to new directory) and some other work (refactoring, new architecture, testing, etc), so we are doing this type by type. We might not use the general directory for all types in the end, since some types use one single file for all origins for performance, and some types have different partition strategy, e.g. cookies. Basically UnifiedOriginStorageLevel indicates the stage we are right now: UnifiedOriginStorageLevel::None - don't use general directory for existing types UnifiedOriginStorageLevel::Basic - use it for IndexedDB, LocalStorage UnifiedOriginStorageLevel::Standard - use it for IndexedDB, LocalStorage, and CacheStorage We are working towards UnifiedOriginStorageLevel::Standard now. (Note FileSystem is a new type added after General directory is introduced, so it will be under this directory regardless of the level). > * We certainly want to partition cache data too. Is that the difference > between UnifiedOriginStorageLevel::Basic vs. > UnifiedOriginStorageLevel::Standard? Need to look closer. Apple is now > putting cache under their origin directory, commingling cache and data. This > would not be desirable for us because we should maintain clear separation > between cache and data. I think we want one origin storage directory for > cache and one for data. If an existing type is partitioned now, it will be partitioned and UnifiedOriginStorageLevel only decides where the data is. As shown in example above, CacheStorage data is partitioned no matter it is Basic or Standard. CacheStorage is not "cache", as its lifetime should be controlled by web clients (https://w3c.github.io/ServiceWorker/#cache-lifetimes), like IndexedDB. For the other real cache types (e.g. http cache), we currently have no plan to move them. It might make sense to have another [General Cache] directory in parallel with [General] if we are going to do this.
Michael Catanzaro
Comment 4 2023-02-01 15:26:42 PST
(In reply to Sihui Liu from comment #3) > CacheStorage is not "cache", as its lifetime should be controlled by web > clients (https://w3c.github.io/ServiceWorker/#cache-lifetimes), like > IndexedDB. OK, surprise! I suppose that makes sense, because it looks like we want this data to persist even if users delete their cache dirs. (Users will report bugs complaining that we've put CacheStorage in a data directory rather than a cache directory, but oh well.) Unfortunately, WebsiteDataStoreConfiguration::initializePaths does not agree, nor does WebsiteDataStoreConfiguration::defaultCacheStorageDirectory. So we've got a migration problem. (There's also WebsiteDataStore::defaultApplicationCacheDirectory which is classed as cache rather than data. But surely that should be removed at some point, as it's deprecated in favor of service workers?)
Sihui Liu
Comment 5 2023-02-01 16:06:33 PST
(In reply to Michael Catanzaro from comment #4) > (In reply to Sihui Liu from comment #3) > > CacheStorage is not "cache", as its lifetime should be controlled by web > > clients (https://w3c.github.io/ServiceWorker/#cache-lifetimes), like > > IndexedDB. > > OK, surprise! I suppose that makes sense, because it looks like we want this > data to persist even if users delete their cache dirs. (Users will report > bugs complaining that we've put CacheStorage in a data directory rather than > a cache directory, but oh well.) > > Unfortunately, WebsiteDataStoreConfiguration::initializePaths does not > agree, nor does WebsiteDataStoreConfiguration::defaultCacheStorageDirectory. > So we've got a migration problem. Yes, we treated CacheStorage as cache, and got report from web developers about CacheStorage data disappears (since cache can be evicted in certain conditions, but data will not), so we are going to fix this by migration to general directory. > > (There's also WebsiteDataStore::defaultApplicationCacheDirectory which is > classed as cache rather than data. But surely that should be removed at some > point, as it's deprecated in favor of service workers?) For deprecated types I think we are going to leave them as it is now, and delete their data (and related code) at some point.
Michael Catanzaro
Comment 6 2023-02-03 08:45:39 PST
In that case, it sounds safe for us to use UnifiedOriginStorageLevel::Standard in the new GTK/WPE API version, and stick with UnifiedOriginStorageLevel::None for the old API version. I guess we'd have a one-time loss of CacheStorage data since there is no migration yet, but I think that's OK.
Michael Catanzaro
Comment 7 2023-02-06 10:22:11 PST
Sorry, I misread Sihui's comment and didn't look closely enough at the code. I see WebKit really does handle the migration, so even better.
Michael Catanzaro
Comment 8 2023-02-06 10:56:05 PST
I notice the default UnifiedStorageLevel in NetworkSessionParameters is UnifiedOriginStorageLevel::Basic, but in WebsiteDataStore::defaultUnifiedOriginStorageLevel it is UnifiedOriginStorageLevel::None. I understand the WebsiteDataStore's value takes precedence, but that's certainly a little confusing.
Michael Catanzaro
Comment 9 2023-02-15 13:26:24 PST
EWS
Comment 10 2023-02-16 09:28:41 PST
Committed 260375@main (c5ec4e0204d8): <https://commits.webkit.org/260375@main> Reviewed commits have been landed. Closing PR #10168 and removing active labels.
Note You need to log in before you can comment on or make changes to this bug.