This is a work-in-progress patch for implementing an on-disc file cache for the curl network backend.
Created attachment 215165 [details] work in progress patch
Created attachment 216042 [details] work in progress patch 2 Updated patch
Hi, I've implemented a http discache module for WinCE port's default WinInet backend, and later migrated to cURL. The impl. is based on Qt's source code, but later trimmed a lot. The cache file format mainly contains 3 parts: a fixed-size header struct; a http response header, but can be skipped by header-length field of previous struct; the response body. And it also uses Qt's tricks: such as storing cache file first into a temple file, and later renamed to a url-hashed file name. This may maximize the `ACID` property of low-level file operations. Hope these information may help you!
Created attachment 216164 [details] Proposed patch
Comment on attachment 216164 [details] Proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=216164&action=review A very nice start! There seems to be a bit of excess copying going on, as well as a few coding standard violations in this patch. Please revise and resubmit so we can push it forward. > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:47 > +CurlCacheEntry::CurlCacheEntry(String url, String cacheDir) These should be passed by const ref to avoid copying. > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:87 > + return &m_requestHeaders; If the m_requestHeaders member always exists, why do we return as a pointer? Why not return it by reference? > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:90 > +bool CurlCacheEntry::saveCachedData(const char *data, int size) We write this as "const char* data" It seems like size should be a "size_t" type. > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:92 > + PlatformFileHandle contentFile = openFile(m_contentFilename, OpenForWrite); It's too bad we don't have some kind of smart pointer for file operations (to ensure it is closed for any return state). > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:105 > +bool CurlCacheEntry::loadCachedData(ResourceHandle *job) We write this as "ResourceHandle* job" > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:117 > +bool CurlCacheEntry::saveResponseHeaders(ResourceResponse &response) We write this as "ResourceResponse& response". > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:142 > +// load the headers from cache file into memory This comment isn't really needed. > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:149 > + // create strings from buffer This comment doesn't really add anything. > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:167 > +void CurlCacheEntry::setResponseFromCachedHeaders(ResourceResponse &response) We write this as "ResourceResponse& response". > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:173 > + HTTPHeaderMap cachedResponseHeaders = m_cachedResponse.httpHeaderFields(); This is making a copy; is that intentional? > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:207 > +void CurlCacheEntry::generateBaseFilename(CString url) Cstring might be fairly big; shouldn't this be a const reference? > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:220 > +bool CurlCacheEntry::loadFileToBuffer(String filepath, Vector<char> &buffer) Strings can be big; should probably pass by const reference. We write this as "Vector<char>& buffer" > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:265 > +bool CurlCacheEntry::parseResponseHeaders(ResourceResponse &response) We write this as "ResourceResponse& response" > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:271 > + bool maxAgeIsValid = false; In general, we prefer to define values as close to the point they are used as possible. Several of these could be located further down in the method implementation. > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:338 > + Extra space should be removed. > Source/WebCore/platform/network/curl/CurlCacheEntry.h:42 > +private: We put private stuff at the end of our header files, not the top. > Source/WebCore/platform/network/curl/CurlCacheEntry.h:54 > + bool loadFileToBuffer(String filepath, Vector<char> &buffer); See my comments about using const references and proper syntax (e.g., use "Vector<char>& buffer", etc.) > Source/WebCore/platform/network/curl/CurlCacheEntry.h:58 > + CurlCacheEntry(String url, String cacheDir); Ditto const references. > Source/WebCore/platform/network/curl/CurlCacheEntry.h:62 > + bool isInMemory() { return m_headerInMemory; } This should be const. > Source/WebCore/platform/network/curl/CurlCacheEntry.h:65 > + bool saveCachedData(const char* data, int size); It seems like size should be a "size_t" type. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:54 > +void CurlCacheManager::setCacheDirectory(String directory) Pass by const reference. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:90 > + long long filesize = -1; You might be better off using one of the C99 size types, such as "int64_t" to represent a long-long. What should this be for a 64-bit build, for example? Should it be a 128-bit value? > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:100 > + int bufferReadSize = 4096; The "4096" magic number should be declared as a constant somewhere and given a meaningful name. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:158 > +void CurlCacheManager::didReceiveResponse(ResourceHandle *job, ResourceResponse& response) We write this as "ResourceHandle* job". > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:167 > + Remove this blank line. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:182 > + Remove this blank line. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:185 > +void CurlCacheManager::didFinishLoading(String url) Pass by const reference. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:195 > +bool CurlCacheManager::isCached(String url) Pass by const reference. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:210 > +HTTPHeaderMap* CurlCacheManager::requestHeaders(String url) Pass by const reference. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:222 > +void CurlCacheManager::didReceiveData(String url, const char *data, int size) Pass by const reference. We write this as "const char* data" "size" should probably be a size_t, unless you are using -1 as a sentinel value. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:233 > +void CurlCacheManager::saveResponseHeaders(String url, ResourceResponse& response) Pass url as const reference. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:244 > +void CurlCacheManager::invalidateCacheEntry(String url) Pass url as const reference. Why copy a few hundred bytes if you don't need to? > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:256 > +void CurlCacheManager::didFail(String url) Pass by const reference. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:261 > +void CurlCacheManager::loadCachedData(String url, ResourceHandle *job, ResourceResponse& response) Pass by const reference. We write this as "ResourceHandle* job" > Source/WebCore/platform/network/curl/CurlCacheManager.h:40 > +private: Private stuff goes at the bottom of the file. > Source/WebCore/platform/network/curl/CurlCacheManager.h:43 > + CurlCacheManager(CurlCacheManager const &); We write this as "CurlCacheManager const&" > Source/WebCore/platform/network/curl/CurlCacheManager.h:44 > + void operator=(CurlCacheManager const &); Ditto. > Source/WebCore/platform/network/curl/CurlCacheManager.h:55 > + void loadCachedData(String, ResourceHandle*, ResourceResponse&); The above should be passing by const reference. > Source/WebCore/platform/network/curl/CurlCacheManager.h:58 > + static CurlCacheManager & getInstance() This should be written as "CurlCacheManager& getInstance". Move this implementation into the cpp file. You don't want this being emitted into every compilation unit, do you? > Source/WebCore/platform/network/curl/CurlCacheManager.h:64 > + String getCacheDirectory() { return m_cacheDir; } Return as const reference. Make the method const. > Source/WebCore/platform/network/curl/CurlCacheManager.h:65 > + void setCacheDirectory(String); Pass a const reference. > Source/WebCore/platform/network/curl/CurlCacheManager.h:68 > + HTTPHeaderMap* requestHeaders(String); // load headers Pass the above two Strings as const references. > Source/WebCore/platform/network/curl/CurlCacheManager.h:71 > + void didReceiveData(String, const char*, int); // save data Consider making this 'int' type a size_t. Pass String by const reference. > Source/WebCore/platform/network/curl/CurlCacheManager.h:73 > + void didFinishLoading(String); Pass the above strings as const references. > Source/WebCore/platform/network/curl/ResourceHandleManager.cpp:953 > + HTTPHeaderMap* requestHeaders = CurlCacheManager::getInstance().requestHeaders(url); It seems like requestHeaders should be returned by reference.
Created attachment 216378 [details] Proposed patch 2 Thanks for the detailed review! I fixed the issues you've mentioned. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:90 > + long long filesize = -1; I've used "long long" because getFileSize() in FileSystem.h expects a long long variable as parameter: bool getFileSize(const String&, long long& result);
Comment on attachment 216378 [details] Proposed patch 2 View in context: https://bugs.webkit.org/attachment.cgi?id=216378&action=review This is looking really good! I had a couple of more minor things, then I think we're ready to land this. Thanks for doing this work! > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:105 > +bool CurlCacheEntry::loadCachedData(ResourceHandle *job) ResourceHandle* please. > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:165 > +void CurlCacheEntry::setResponseFromCachedHeaders(ResourceResponse &response) ResourceResponse& please. > Source/WebCore/platform/network/curl/CurlCacheEntry.cpp:326 > + DOn't need this line. > Source/WebCore/platform/network/curl/CurlCacheManager.cpp:158 > + CString urlLatin1 = it->key.latin1(); This is making a copy. I think you could use a const reference to avoid that.
Created attachment 216491 [details] Proposed patch 3
Comment on attachment 216491 [details] Proposed patch 3 r=me
Comment on attachment 216491 [details] Proposed patch 3 Clearing flags on attachment: 216491 Committed r159432: <http://trac.webkit.org/changeset/159432>
All reviewed patches have been landed. Closing bug.