Bug 144048 - [SOUP] Use a webkit subdirectory for the disk cache
Summary: [SOUP] Use a webkit subdirectory for the disk cache
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords: Soup
Depends on:
Blocks:
 
Reported: 2015-04-22 10:19 PDT by Carlos Garcia Campos
Modified: 2015-04-24 00:18 PDT (History)
4 users (show)

See Also:


Attachments
Patch (3.61 KB, patch)
2015-04-22 10:22 PDT, Carlos Garcia Campos
no flags Details | Formatted Diff | Diff
Updated patch (7.32 KB, patch)
2015-04-23 01:58 PDT, Carlos Garcia Campos
mrobinson: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Carlos Garcia Campos 2015-04-22 10:19:33 PDT
Recent versions of libsoup remove any file in cache dir not referenced by the index when the cache is loaded to workaround leaked resources when load/dump is unbalanced for whatever reason like a crash. We currently use $XDG_CACHE_HOME/app-name as default disk cache directory, but that directory could be used by apps to cache other things, and the soup cache might end up deleting other stuff. The soup cache assumes the given directory is only for the disk cache, so we should ensure that. See https://bugzilla.gnome.org/show_bug.cgi?id=667682.
Comment 1 Carlos Garcia Campos 2015-04-22 10:22:34 PDT
Created attachment 251335 [details]
Patch
Comment 2 Martin Robinson 2015-04-22 10:39:58 PDT
Comment on attachment 251335 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=251335&action=review

> Source/WebKit2/NetworkProcess/soup/NetworkProcessSoup.cpp:86
> +    String diskCachePath = WebCore::pathByAppendingComponent(parameters.diskCacheDirectory, "webkit");

How do you feel about webkitgtk-disk-cache or something like that?
Comment 3 Michael Catanzaro 2015-04-22 10:48:35 PDT
This patch looks good. It certainly solves the problem going forward.

As a bonus, I think we should also check if the original directory $XDG_CACHE_HOME/app-name contains a soup cache, and if so delete the entire directory so that we automatically clean up the original cache rather than leaving it on disk forever. Simply because it's easy to do, and we had users complaining about rather spectacular cache leaks there (60 GB at [1], 12 GB at [2]). This should not be a problem because an application that stores anything under $XDG_CACHE_HOME should be prepared for it to disappear.

[1] https://lists.webkit.org/pipermail/webkit-gtk/2014-July/0
[2] https://bugzilla.gnome.org/show_bug.cgi?id=667682
Comment 4 Michael Catanzaro 2015-04-22 10:49:01 PDT
Fixed link: https://lists.webkit.org/pipermail/webkit-gtk/2014-July/001980.html
Comment 5 Carlos Garcia Campos 2015-04-23 00:20:21 PDT
(In reply to comment #2)
> Comment on attachment 251335 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=251335&action=review
> 
> > Source/WebKit2/NetworkProcess/soup/NetworkProcessSoup.cpp:86
> > +    String diskCachePath = WebCore::pathByAppendingComponent(parameters.diskCacheDirectory, "webkit");
> 
> How do you feel about webkitgtk-disk-cache or something like that?

I avoided using gtk because this code is common among ports using SOUP and adding an #ifdef just for the name is probably too much noise for little benefit. I considered webkitcache, but it sounded redundant because it's in the $XDG_CACHE_HOME directory. Also, the new cache always creates this extra dir, named WebKitCache, so that would conflict with webkitcache in some file systems.
Comment 6 Carlos Garcia Campos 2015-04-23 00:23:20 PDT
(In reply to comment #3)
> This patch looks good. It certainly solves the problem going forward.
> 
> As a bonus, I think we should also check if the original directory
> $XDG_CACHE_HOME/app-name contains a soup cache, and if so delete the entire
> directory so that we automatically clean up the original cache rather than
> leaving it on disk forever. Simply because it's easy to do, and we had users
> complaining about rather spectacular cache leaks there (60 GB at [1], 12 GB
> at [2]). This should not be a problem because an application that stores
> anything under $XDG_CACHE_HOME should be prepared for it to disappear.

Yes I agree, we can even try to be more conservative and remove only the files containing only numeric characters and all soup.cache*, and leaving subdirectories. I'll update the patch. We will probably need to do the same when we switch to the new cache.

> [1] https://lists.webkit.org/pipermail/webkit-gtk/2014-July/0
> [2] https://bugzilla.gnome.org/show_bug.cgi?id=667682
Comment 7 Carlos Garcia Campos 2015-04-23 01:58:43 PDT
Created attachment 251420 [details]
Updated patch

Clear the previous soup cache if exists
Comment 8 Martin Robinson 2015-04-23 11:32:12 PDT
Comment on attachment 251420 [details]
Updated patch

View in context: https://bugs.webkit.org/attachment.cgi?id=251420&action=review

> Source/WebCore/platform/network/soup/SoupNetworkSession.cpp:170
> +static inline bool strIsNumeric(const char* str)

This should probably be called stringIsInteger.
Comment 9 Carlos Garcia Campos 2015-04-24 00:18:26 PDT
Committed r183255: <http://trac.webkit.org/changeset/183255>