WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
239521
[WPE][GTK] Website data leaked outside specified data/cache directories
https://bugs.webkit.org/show_bug.cgi?id=239521
Summary
[WPE][GTK] Website data leaked outside specified data/cache directories
dave
Reported
2022-04-19 15:16:00 PDT
I'm initializing a new web context like this, according to the API instructions, to put all cache/data directories under data_dir, which is a hidden subdirectory of my home directory: manager = webkit_website_data_manager_new ("base-cache-directory", data_dir, "base-data-directory", data_dir, "disk-cache-directory", data_dir, "dom-cache-directory", data_dir, "hsts-cache-directory", data_dir, "indexeddb-directory", data_dir, "itp-directory", data_dir, "local-storage-directory", data_dir, "offline-application-cache-directory", data_dir, "service-worker-registrations-directory", data_dir, NULL); context = webkit_web_context_new_with_website_data_manager (manager); I've also tried it using a separate subdirectory for each line, all under the same directory. In either case, it does not work. The subdirectory gets created and there are a few (empty) entries put there, but actual local storage remains under ~/.local/share/webkitgtk and the actual cache is put in a ~/.cache subdirectory. The hsts-cache does appear to use the user-supplied path, but nothing else does. I'm marking this as a major bug, because it prevents different WebKit browsers from using their own individual cookies, cache, etc, which totally screws me.
Attachments
Add attachment
proposed patch, testcase, etc.
Michael Catanzaro
Comment 1
2022-07-01 05:52:07 PDT
(In reply to dave from
comment #0
)
> I'm marking this as a major bug
Storing user data outside of the specified directories is pretty dire indeed. You *should* only need to set base-cache-directory and base-data-directory: all the other directories should be relative to those two by default. But specifying them all should not hurt anything. (Note: for your bug report to be noticed in a timely manner, it must be assigned to the WebKitGTK or WPE WebKit component.)
Michael Catanzaro
Comment 2
2022-07-01 12:21:18 PDT
Huh, we are leaking a lot. Just starting Epiphany creates: ~/.var/app/org.gnome.Epiphany.Devel/data/webkitgtk/deviceidhashsalts ~/.var/app/org.gnome.Epiphany.Devel/data/webkitgtk/storage Opening the web inspector additionally creates: ~/.var/app/org.gnome.Epiphany.Devel/data/webkitgtk-4.1/WebInspector/deviceidhashsalts ~/.var/app/org.gnome.Epiphany.Devel/data/webkitgtk-4.1/WebInspector/indexeddb ~/.var/app/org.gnome.Epiphany.Devel/data/webkitgtk-4.1/WebInspector/localstorage I don't see anything else being created in the wrong place, so I do not believe that everything but HSTS data is being leaked unless you can demonstrate otherwise.
dave
Comment 3
2022-07-01 13:38:12 PDT
I initialize the data manager like this: manager = webkit_website_data_manager_new ("base-cache-directory", dir1, "base-data-directory", dir2, "disk-cache-directory", dir3, "dom-cache-directory", dir4, "hsts-cache-directory", dir5, "indexeddb-directory", dir6, "itp-directory", dir7, "local-storage-directory", dir8, "offline-application-cache-directory", dir9, "service-worker-registrations-directory", dir10, NULL); None of these directories is set to ~/.local/share/webkitgtk. And yet that is where WebKit decides to put 'database', 'deviceidhashsalts', 'localstorage', and 'storage', the latter of which contain data stored from web sites. Consequently, all WebKit browsers on the system share this data.
Michael Catanzaro
Comment 4
2022-07-01 15:41:34 PDT
I know how to fix "deviceidhashsalts" and "storage" but "database" and "localstorage" both look fine to me. Are you sure you're actually using the WebKitWebsiteDataManager properly, giving it to either a WebKitWebView or a WebKitWebContext?
Michael Catanzaro
Comment 5
2022-07-01 15:44:14 PDT
BTW it seems the web inspector intentionally uses a separate WebsiteDataStore.
dave
Comment 6
2022-07-01 16:04:46 PDT
Yes, I am sure that this is a bug in WebKit. The very next line is correct: context = webkit_web_context_new_with_website_data_manager (manager); No, it is not correct for ~/.local/share/webkitgtk to contain local storage data from websites, when I specifically asked WebKit to store that data in 'dir8.'
Michael Catanzaro
Comment 7
2022-07-01 16:24:53 PDT
You'll need to provide test code that reproduces the bug, then, because it doesn't happen for me and it's not clear how it could happen.
Michael Catanzaro
Comment 8
2022-07-05 13:04:58 PDT
(In reply to Michael Catanzaro from
comment #7
)
> it's not clear how it could happen.
It could happen if WebsiteDataStore::defaultDataStore gets used somewhere by mistake. This is fragile.
Michael Catanzaro
Comment 9
2022-07-05 14:53:40 PDT
(In reply to Michael Catanzaro from
comment #8
)
> It could happen if WebsiteDataStore::defaultDataStore gets used somewhere by > mistake. This is fragile.
It's never actually created, fortunately, but we should do something about it anyway.
Michael Catanzaro
Comment 10
2022-07-06 14:50:10 PDT
(In reply to Michael Catanzaro from
comment #4
)
> I know how to fix "deviceidhashsalts" and "storage"
I have a solution for these incoming, to make them respect the base data directory. I've also been working on some changes to WebsiteDataStoreConfiguration to make it harder to mess up in the future. This is a little awkward because the base data directory and base cache directory concepts are specific to GTK and WPE, and to make this safer and harder to mess up *we* really need to push them into the cross-platform API. I've attempted to do that in a way that is minimally annoying for Apple ports, which don't use them. This will need owner approval. There are three more data types that are not implemented at all: WebsiteDataStore::defaultMediaCacheDirectory, WebsiteDataStore::defaultAlternativeServicesDirectory, and WebsiteDataStore::defaultJavaScriptConfigurationDirectory, all three defined as stubs in WebsiteDataStore.cpp. We have
bug #156369
to decide where to put the media cache directory. Since we never resolved that, I guess nobody cared enough to file bugs for alternative services or JavaScript configuration. I have not investigated to see what bad things happen when these are unset. We also need to either (a) add new public API to WebKitWebsiteDataManager to allow configuring the new paths, or else (b) deprecate all the WebKitWebsiteDataManager path properties except for base-cache-directory and base-data-directory. I favor (b) because this will ensure we do not need to keep updating the public API when new data types are added in the future. In particular, while "storage" is new in 2.36, "deviceidhashsalts" has been around for several years and we simply missed it until now.
Michael Catanzaro
Comment 11
2022-07-06 14:51:46 PDT
(In reply to Michael Catanzaro from
comment #10
)
> I have a solution for these incoming, to make them respect the base data > directory.
Specifically "deviceidhashsalts" and "storages". Couldn't find anything wrong with "database" or "localstorage".
> This is a little awkward because > the base data directory and base cache directory concepts are specific to > GTK and WPE, and to make this safer and harder to mess up *we* really need > to push them into the cross-platform API.
Here I meant to emphasize *really* rather than *we*.
Michael Catanzaro
Comment 12
2022-07-06 17:29:45 PDT
(In reply to Michael Catanzaro from
comment #9
)
> It's never actually created, fortunately, but we should do something about > it anyway.
Um, it is created for prewarmed processes, but that shouldn't hurt as long as it doesn't actually get used to store anything, which seems to never happen. I am worried about this comment in WebProcessProxyGLib.cpp, though: // Prewarmed processes don't have a WebsiteDataStore yet, so use the primary WebsiteDataStore from the WebProcessPool. // The process won't be used if current WebsiteDataStore is different than the WebProcessPool primary one. If that is true, then I wonder how prewarmed processes are ever used, because the primary WebsiteDataStore from WebProcessPool is the default WebsiteDataStore, and WebKitWebContext will *always* configure a non-default WebsiteDataStore. Either the API user creates a WebKitWebsiteDataManager (which wraps WebsiteDataStore), or else WebKitWebContext will create its own. This interaction with prewarmed processes requires further investigation.
Michael Catanzaro
Comment 13
2022-07-06 17:55:31 PDT
(In reply to Michael Catanzaro from
comment #10
)
> We also need to either (a) add new public API to WebKitWebsiteDataManager to > allow configuring the new paths, or else (b) deprecate all the > WebKitWebsiteDataManager path properties except for base-cache-directory and > base-data-directory. I favor (b) because this will ensure we do not need to > keep updating the public API when new data types are added in the future. In > particular, while "storage" is new in 2.36, "deviceidhashsalts" has been > around for several years and we simply missed it until now.
Ah there's a disadvantage of this: MiniBrowser is using it for about:data to show where particular data types are stored. I suppose other applications could do the same. I just noticed this when about to submit my MR, so... I'll go ahead anyway, but we need to consider whether we should really change this or whether it was just a bad idea.
Michael Catanzaro
Comment 14
2022-07-06 18:13:17 PDT
Pull request:
https://github.com/WebKit/WebKit/pull/2147
Michael Catanzaro
Comment 15
2022-07-15 14:01:01 PDT
(In reply to Michael Catanzaro from
comment #12
)
> If that is true, then I wonder how prewarmed processes are ever used, > because the primary WebsiteDataStore from WebProcessPool is the default > WebsiteDataStore, and WebKitWebContext will *always* configure a non-default > WebsiteDataStore.
Maybe I'm misreading the comment. Perhaps it means "always use the default data store here, or prewarming will not work."
EWS
Comment 16
2022-08-26 10:48:39 PDT
Committed
253824@main
(38ffac2857ee): <
https://commits.webkit.org/253824@main
> Reviewed commits have been landed. Closing PR #2147 and removing active labels.
dave
Comment 17
2022-08-26 11:05:38 PDT
Great! Thanks a lot!
dave
Comment 18
2022-09-19 07:00:39 PDT
I can't wait for 2.40 for this bug to be fixed, so I went ahead and applied the given patch to 2.38 (with some modification.) Unfortunately, this problem is NOT yet resolved. I'm now setting the base cache and base data directories and leaving everything else alone, but WebKit just does its own thing and creates a directory styled after the executable name in ~/.cache , which it then begins filling with cached material, instead of putting in ~/.foo where I asked. Picture me banging my head into a wall, repeatedly. To the guy working on this problem, I feel for you. Whoever "designed" this current storage system ought to be taken out back and shot. I've never seen a more convulated, crazy rats nest of logic.
Michael Catanzaro
Comment 19
2022-09-19 07:15:46 PDT
(In reply to dave from
comment #18
)
> Whoever "designed" this current storage system ought to be taken out back and shot.
Do you want to reconsider this statement?
dave
Comment 20
2022-09-19 09:39:31 PDT
Nope. Did you think I was speaking literally? The more time I spend in the WebKit source code, the more it becomes clear that there is something seriously wrong with the methodology used to develop this software. In general the source has become quite crufty over the years. Layer upon layer of complexity has gradually accumulated, sometimes with bad design decisions that were never revisited and resolved--which all contributes to now making it so difficult to resolve a bug that, quite frankly, never should have occurred in the first place. Just like the "stop button doesn't actually stop" bug, for example; another very obvious example of allowing things to go Terribly Wrong for a long, long time. Looking through this datastore code, it's like the left hand doesn't know what the right hand is doing. The same functionality is duplicated in different places. There's often functions with names like ensureThingIsDoneIfNecessary, where it's not clear if Thing is really being done at the right time here, or if this is something that was added in to Just Make Sure, to fix bug #298381 or whatever. Lots of unnecessary/wrong redundancy here and confusing logic. Every different cache or data storage subsystem and/or user of the web store seems to be its own independent private corporation, loosely tied back to a central control scheme, with a few examples (like Inspector, favicon cache, cookie storage, etc) not even tied in at all, or poorly so. The "fix" created here, while well meaning, doesn't go nearly far enough. It's like rearranging the deck chairs on the Titanic. The problem is NOT yet resolved, and won't really be until this entire subsystem is substantially reworked. WebKit as a whole could really use an entire version or two dedicating to nothing more than code cleanup and simplification. There's a LOT of junk that could be removed, or simplified, and you could probably get a 50%+ performance boost out of the deal, besides making bugs way easier to avoid in the future.
dave
Comment 21
2022-09-19 10:46:51 PDT
Excuse me? This bug is NOT resolved. WebKit still does not use the user-specified directories for cached data. Data is still leaked to other WebKit browsers. If you are refusing to fix the problem, then you should instead close it with WONTFIX. Regardless, I will have to implement my own fix locally, as I--the stupid and useless end user--simply need this software to work in a sane manner, to address SERIOUS SECURITY CONCERNS.
Michael Catanzaro
Comment 22
2022-09-19 11:26:36 PDT
I've already fixed all the bugs I was able to find here via
253824@main
. If you want additional changes, then (a) provide code that demonstrates the bug you think you've found, and (b) stop trolling.
dave
Comment 23
2022-09-19 15:44:51 PDT
Here, let me try this again: Your code is broken. It's poorly designed. It does not work properly. I already showed you the example code above. I already explained the exact undesired outcome, which makes logical sense according to how the code is written. Anyone can duplicate the same outcome. This is very simple and basic stuff. At this point, I'm wondering if *I'm* the one being trolled...? So you aren't able to find one single instance in the code of where (after the provided patch) say a WebkitDataStore might be created with WebbkitDataStoreConfiguration::createWithBaseDirectories (NULL, NULL), thus leading to the completely undesired behavior of creating a storage directory under ~/.local/share or ~/.cache named after the executable file (which is generated by the programName function in WebsiteDataStoreGLib.cpp)? Then there's the Inspector, which just goes off in left field creating its own special snowflake 'webkitgtk-4.1' subdirectory for its own personal use (but shared in common with all webkitgtk instances systemwide), with no way to configure that either. Actually, just by running a simple example app on a clean system, there are numerous obvious examples of how various directories are being created in random places, almost none of which are the directory the caller actually specified to be used. Ironically, even with this "fix", the user-specified cache and data directory is hardly used AT ALL. Just a few junk files and directories are created there. Everything else is going into other directories. This is not a "bug" in the sense of bizarre, unpredictable, emergent behavior, as you seem to be attempting to characterize it. "Just this crazy lunatic out here making things up." No, it's clearly exactly what the code is *designed* to do, bizarrely--and obviously incorrectly, if the specified behavior in the WebKit API documentation is to be taken as any kind of standard as to what it *should* be doing. Furthermore, there is no one specified person to whom the fingers can be pointed as the "guilty party" behind this bug (and surely numerous other related one.) This was clearly a group effort of piling cruft on top of cruft for decades until we arrived at this point. Now it's time to really fix the problem, and the solution to such problems usually isn't fun. I actually started to do this work myself, before stopping midway through in absolute disgust. Why should *I* have to drop everything to do *your* work for you? I don't get paid for this. Show some pride in your workmanship, please. Also, if you're having your trouble wrapping your head around the existence of this bug, man, you'd really hate it if I were to ever file that Critical level bug for your Broken Ass JIT which simply does not work on Phenom II architecture, apparently. Loading Javascript on a page results in a WebKit crash with an Illegal Instruction (SIGILL), which I traced back far enough to see it was JIT related. Disabling JIT and using Cloop is the only way I can even get WebKit to work on this PC, and it's slow as hell, of course. And no, it's not Just My PC, or some bizarre untraceable bug specific to me, it's the code doing exactly what it was told to do, which is issue an illegal instruction on my architecture, for reasons unknown. I wonder how many other people have experienced this same problem, and couldn't even get their foot in the door with WebKit. If you can't even get your data/cache directory storage architecture right, can't believe it's busted when it's clearly wrong, and want to throw up nebulous demands for "example code" at me just to confirm what's obviously broken, imagine the shit I'd have to wade through trying to help you fix YOUR JIT also. Not very enticing.
Michael Catanzaro
Comment 24
2022-09-20 12:13:16 PDT
If you can find a specific case where data is stored in the wrong place, then please show it. Otherwise, I'm just going to keep closing this bug again.
> So you aren't able to find one single instance in the code of where (after the provided patch) say a WebkitDataStore might be created with WebbkitDataStoreConfiguration::createWithBaseDirectories (NULL, NULL), thus leading to the completely undesired behavior of creating a storage directory under ~/.local/share or ~/.cache named after the executable file (which is generated by the programName function in WebsiteDataStoreGLib.cpp)?
It can happen if you don't configure a WebKitWebsiteDataManager on your WebKitWebContext, sure. That would be your fault, though... and surely you wouldn't make a mistake like that. ;) The only other cases are WebKitTestRunner and the CookieStorageFile API test, neither of which are reachable for applications. I don't see any other cases where this could happen after
253824@main
, and it certainly doesn't happen for me anymore after that commit. I spent several days working on fixing this, by the way, and you'll notice I tried to make it very difficult to do the wrong thing unintentionally rather than leave things fragile like they were before, so I'm rather surprised by your ungrateful reaction.
> Then there's the Inspector, which just goes off in left field creating its own special snowflake 'webkitgtk-4.1' subdirectory for its own personal use (but shared in common with all webkitgtk instances systemwide), with no way to configure that either.
I don't like what the web inspector is doing, but it is separate and clearly intentional. Applications have no control over the web inspector, so it's not completely unreasonable for it to treat itself as its own separate application. I'd rather it be segregated on a per-app basis, but I'm not going to spend more time on this myself, sorry.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug