Summary: | [CURL] Add on-disk file cache | |
---|---|---|---
Product: | WebKit | Reporter: | Marco Barisione <marco.barisione>
Component: | WebCore Misc. | Assignee: | Marco Barisione <marco.barisione>
Status: | NEW | |
Severity: | Normal | CC: | a.butenka, ben, bugzilla, christian, denis.cheremisov, ht990332, louis, svillar
Priority: | P2 | Keywords: | Curl
Version: | 528+ (Nightly build) | |
Hardware: | All | |
OS: | All | |
Description
Marco Barisione
2008-09-30 03:59:39 PDT
Created attachment 26675 [details]
Implement on-disk cache
This is an old patch that I didn't have time to finish and propose for review.
In the patch I took care to allow several processes to access the cache at the same time without races, corruption, etc.
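The thread does not say how the patch achieves this. One common technique for keeping concurrent readers from ever seeing a half-written entry is to write to a temporary file in the same directory and then rename it into place. The following is a minimal sketch of that technique only, not the patch's mechanism; the function name `write_cache_entry` is hypothetical.

```python
import os
import tempfile


def write_cache_entry(path, data):
    """Write `data` (bytes) to `path` so readers see the old or the new file, never a partial one.

    Hypothetical helper for illustration; not taken from the attached patch.
    """
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    # The temporary file lives in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace overwrites an existing destination in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

When two processes write the same entry at once, the last rename wins, and a reader gets either complete version.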
From a quick look the obvious problems are:
- Too many .utf8().data() calls; they were only meant as a temporary workaround, as the patch to do a proper conversion was not merged yet.
- Tons of g_print for debugging.
- The cache is saved in the current directory and not in a proper location, for testing reasons.
- The cache should be nested: the cache entry with hash aabbccdd.. should be saved in aa/bbccdd...
- Not completely portable to Windows; I have some ideas on how to do that.
ATM there's no way to know the total size of the cache or to remove files but I had some ideas on how to implement it, so ping me on IRC if you want to discuss it.
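The nested layout from the list above (entry aabbccdd.. stored as aa/bbccdd...) can be sketched as follows. The thread does not name the hash function, so SHA-1 of the URL is an assumption here, and `cache_entry_path` is a hypothetical name.

```python
import hashlib
from pathlib import Path


def cache_entry_path(cache_root, url):
    """Map a URL to <cache_root>/<first 2 hex chars>/<remaining hex chars>.

    Assumption: the key is the SHA-1 hex digest of the URL; the patch may
    use a different hash.
    """
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return Path(cache_root) / digest[:2] / digest[2:]
```

Splitting on the first two hex characters spreads entries over at most 256 subdirectories, which keeps any single directory from growing very large.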
OK, here is an update of Marco's patch. There are some issues which I want to discuss here.

- Too many .utf8().data() calls, they were only meant as a temporary workaround as the patch to do a proper conversion was not merged yet.
I switched most of the String variables to char*.

- Tons of g_print for debugging.
Thanks, the debug output is very nice. The patch still has them. What should we do with them? Add ifdef(DEBUG)?

- The cache is saved in the current directory and not in a proper location for testing reasons
Fixed. The current location is $HOME/.cache/webkit/. Do we need an API so applications can redefine this location? On the one hand, a unified cache directory would be a nice way to handle it: once a user has a few WebKit-based applications, they can all share one cache. On the other hand, it leaves advanced users without a configuration option.

- The cache should be nested, the cache entry with hash aabbccdd.. should be saved in aa/bbccdd...
Fixed.

- Not completely portable to windows, I have some ideas on how to do that.
Sorry, no idea here. I know nothing about Windows development, and I don't have a Windows machine to test on.

- CURL support is broken in my patch. What should we do here? Marco and I are using the GIO way of sending data to the client. Will this be portable to all the OSes that WebKit supports? One option is to move the GIO code out of the soup file into a separate file and reuse it for curl.

- Cookies have an 'Expire' field. Should we honor it and send non-expired files from the cache without confirmation from the server that they are not outdated?

Created attachment 26786 [details]
Updated patch v0.1
The patch is not ready to commit yet, but it would be nice if somebody could check it.
Some things are still missing; I will update it soon.
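On the question raised above about honoring an expiry time and serving unexpired entries without asking the server again: a minimal sketch of such a check, assuming the expiry comes from an HTTP `Expires` response header stored with the cache entry. The function name `is_fresh` is hypothetical, and a full HTTP cache would also consider `Cache-Control` (for example `max-age`, which takes precedence over `Expires`).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(expires_header, now=None):
    """Return True if a cached response may be served without revalidation.

    Assumption: only the Expires header value is consulted. Missing or
    unparsable values are treated as already expired.
    """
    if not expires_header:
        return False
    if now is None:
        now = datetime.now(timezone.utc)
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        return False
    # Some date forms parse to a naive datetime; interpret those as UTC.
    if expires.tzinfo is None:
        expires = expires.replace(tzinfo=timezone.utc)
    return now < expires
```

Treating bad or missing dates as expired errs on the side of contacting the server, so a malformed header cannot cause stale content to be served.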
(In reply to comment #4)
> Created an attachment (id=26786) [review]
> Updated patch v0.1
>
> The patch is not ready to commit yet, but it would be nice if somebody could check it.
> Some things are still missing; I will update it soon.

It was mentioned in discussions on IRC that a cache as a libSoup session feature makes much more sense than a WebKit-internal implementation. We are only going to support libSoup in the future, and being part of the network interface makes it usable outside of WebKit.

Leaving only Curl as keyword, as this may still be useful for Curl; see Christian Dywan's comments regarding GTK+/Soup.

Moving away from WebKitGTK

So, what is the situation with this issue now? I'm using the latest WebKit, and it seems both Epiphany-WebKit and Midori use the disk too intensively (compared to Opera, Firefox, Chromium, etc.).

(In reply to comment #8)
> So, what is the situation with this issue now? I'm using the latest WebKit, and it
> seems both Epiphany-WebKit and Midori use the disk too intensively (compared to
> Opera, Firefox, Chromium, etc.).

Xan is working on a SoupSessionFeature for this, i.e. a cache that is implemented in libSoup and can be used by all WebKit applications.

The location of the cache should be configurable and should not point to $HOME/.cache/webkit/, as this is not a WebKit HTTP cache but a curl/soup cache, and a specific implementation at that. Not to mention that it is really gtkwebkit and not webkit. As it is, the cache patch is not shared with Chromium or Arora (both on Linux and both using WebKit; not to say we couldn't improve our caches to all work together). If in a year someone comes along with a better way to store the curl/soup cache, they might not want to have both caches in the same directory.

On Linux, $HOME/.cache should not be hard-coded; the location should be determined from the environment variable $XDG_CACHE_HOME, falling back to $HOME/.cache when it is unset or empty.
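The lookup order described here ($XDG_CACHE_HOME first, then $HOME/.cache) can be sketched as follows. The function name `cache_dir` and the `app_name` parameter are hypothetical, and the sketch covers only the Linux case; other platforms need their own rules.

```python
import os


def cache_dir(app_name, environ=None):
    """Return the per-application cache directory following the XDG rule.

    Uses $XDG_CACHE_HOME when it is set and non-empty, otherwise
    $HOME/.cache. `app_name` is a hypothetical per-application subdirectory.
    """
    env = os.environ if environ is None else environ
    base = env.get("XDG_CACHE_HOME", "")
    if not base:
        home = env.get("HOME") or os.path.expanduser("~")
        base = os.path.join(home, ".cache")
    return os.path.join(base, app_name)
```

Passing the environment as a parameter makes the function easy to exercise without touching the real process environment, for example `cache_dir("midori", {"HOME": "/home/user"})`.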
See http://standards.freedesktop.org/basedir-spec/basedir-spec-0.6.html

On OS X and Windows, of course, this is also different and not $HOME/.cache.

Lastly, although I do not see it in the XDG spec, looking at my .cache I see that the directory is being used as $XDG_CACHE_HOME/$COMPANY_NAME/$APPLICATION_NAME/, for example $HOME/.cache/midori/ or $HOME/.cache/Trolltech/demobrowser.

Different applications have different needs for a cache. One application, such as an RSS reader, might hit the same few websites for years. For that application it is very valuable that the cache not be deleted, but a web browser's cache usage is different and could change completely every hour.

To make this clear: WebKitGTK+ is not using CURL anymore and hence this feature request is obsolete as far as WebKitGTK+ is concerned.

I guess it's better to close this now that we have https://bugs.webkit.org/show_bug.cgi?id=44261

Shouldn't you leave the old bug with lots of useful information open and close the new duplicate bug that contains one link and move that link here?

(In reply to comment #13)
> Shouldn't you leave the old bug with lots of useful information open and close the new duplicate bug that contains one link and move that link here?

I don't think so, mainly because although they both target the same feature, they are based on very different implementations. The new one is just for discussing that other implementation. As kov said, the curl cache won't be integrated in WebKitGTK+, so I thought it could be useful for other people to know about the new plans. Anyway, this can be left open; I have no strong opinion about it.