Bug 37874 - Provide mechanism to cache metadata for a resource
: Provide mechanism to cache metadata for a resource
Status: RESOLVED FIXED
Product: WebKit
Classification: Unclassified
Component: WebCore Misc.
: 528+ (Nightly build)
: All All
: P2 Enhancement
Assigned To: Nobody
:
Depends on:
Blocks: 38661 38665
  Show dependency treegraph
 
Reported: 2010-04-20 11:12 PDT by Tony Gentilcore
Modified: 2010-05-10 11:14 PDT (History)
11 users (show)

See Also:


Attachments
Rough patch to get design feedback (13.34 KB, patch)
2010-04-20 13:22 PDT, Tony Gentilcore
no flags Details | Formatted Diff | Diff
Added CacheableMetadata struct per abarth's recommendation (19.58 KB, patch)
2010-04-21 21:09 PDT, Tony Gentilcore
no flags Details | Formatted Diff | Diff
Fix compile by overloading rather than specializing in WebData. (19.64 KB, patch)
2010-04-22 13:39 PDT, Tony Gentilcore
abarth: review-
Details | Formatted Diff | Diff
Removed ID and type enum (21.81 KB, patch)
2010-04-23 10:28 PDT, Tony Gentilcore
no flags Details | Formatted Diff | Diff
Provide default, empty implementation of dispatchDidGenerateCacheableMetadata so that each port doesn't need to implement it (21.81 KB, patch)
2010-04-26 12:50 PDT, Tony Gentilcore
no flags Details | Formatted Diff | Diff
Pretty much rewritten to correct interface and more robustly store metadata. (29.03 KB, patch)
2010-05-03 13:28 PDT, Tony Gentilcore
abarth: review-
Details | Formatted Diff | Diff
Get rid of map and fix smart pointers (20.73 KB, patch)
2010-05-04 17:35 PDT, Tony Gentilcore
abarth: review-
Details | Formatted Diff | Diff
Make CachedMetadata own data and reference count it (17.59 KB, patch)
2010-05-06 10:23 PDT, Tony Gentilcore
no flags Details | Formatted Diff | Diff
Add adoptRefs, remove protected section (17.57 KB, patch)
2010-05-06 11:43 PDT, Tony Gentilcore
no flags Details | Formatted Diff | Diff
Patch (19.42 KB, patch)
2010-05-10 10:50 PDT, Tony Gentilcore
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tony Gentilcore 2010-04-20 11:12:59 PDT
I'd like to add the ability to store metadata for a CachedResource.

This metadata will contain data which speeds up the processing of the resource. When it is generated by WebCore, the host application is notified so it may persist it to disk cache. When the host application loads a resource from disk cache, it can provide the cached metadata so that WebCore doesn't have to redo the work.

The immediate use-case is to persist V8's pre compilation data to Chrome's disk cache (http://crbug.com/32407). But it is conceivable that other types of resources could have useful cacheable metadata.
Comment 1 Tony Gentilcore 2010-04-20 13:22:57 PDT
Created attachment 53873 [details]
Rough patch to get design feedback
Comment 2 Tony Gentilcore 2010-04-21 21:09:32 PDT
Created attachment 54021 [details]
Added CacheableMetadata struct per abarth's recommendation
Comment 3 WebKit Review Bot 2010-04-22 11:46:57 PDT
Attachment 54021 [details] did not build on chromium:
Build output: http://webkit-commit-queue.appspot.com/results/1854050
Comment 4 WebKit Review Bot 2010-04-22 11:48:51 PDT
Attachment 54021 [details] did not build on gtk:
Build output: http://webkit-commit-queue.appspot.com/results/1794045
Comment 5 Tony Gentilcore 2010-04-22 13:39:33 PDT
Created attachment 54091 [details]
Fix compile by overloading rather than specializing in WebData.
Comment 6 WebKit Review Bot 2010-04-22 13:43:01 PDT
Attachment 54091 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1
WebKit/chromium/ChangeLog:5:  One or more unexpected \r (^M) found; better to use only a \n  [whitespace/carriage_return] [1]
Suppressing further [whitespace/carriage_return] reports for this file.
Total errors found: 2 in 16 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 7 Adam Barth 2010-04-22 14:13:14 PDT
Comment on attachment 54091 [details]
Fix compile by overloading rather than specializing in WebData.

WebCore/loader/CachedResource.cpp:168
 +  void CachedResource::setCachedMetadata(CacheableMetadata::Type type, PassRefPtr<SharedBuffer> data, FrameLoader* frameLoader)
Why pass a FrameLoader* here?  Usually we pass around Frame*.

WebCore/loader/CachedResource.cpp:175
 +      if (frameLoader && frameLoader->client())
client() is always non-NULL.

WebCore/loader/CachedResource.h:148
 +      void setCachedMetadata(CacheableMetadata::Type, PassRefPtr<SharedBuffer>, FrameLoader*);
Does CachedResource have any other dependencies on Frame?  In general, we want the Frame to be less involved in loading, not more.

WebCore/loader/FrameLoaderClient.h:47
 +      struct CacheableMetadata;
We prefer classes to structs.

WebCore/platform/network/ResourceResponseBase.h:48
 +      long long m_id;
This whole thing is very strangely layered.  In WebKit, there's no necessary connection between the client and the networking layer.  This patch somehow assumes some strange round-tripping between the two.

Why is the client involved in this process at all?  It seems like WebCore should just ask the networking layer to cache these bytes.

WebCore/platform/network/ResourceResponseBase.h:52
 +          V8_SCRIPTDATA
Code in this location isn't allowed to know about V8.  We need a better way of assigning these type numbers.  Also, are these number persisted on disk?  We might need some comments explaining the rules for modifying these bits.
Comment 8 Tony Gentilcore 2010-04-22 15:07:13 PDT
(In reply to comment #7)
> (From update of attachment 54091 [details])
> WebCore/loader/CachedResource.cpp:168
>  +  void CachedResource::setCachedMetadata(CacheableMetadata::Type type,
> PassRefPtr<SharedBuffer> data, FrameLoader* frameLoader)
> Why pass a FrameLoader* here?  Usually we pass around Frame*.

No reason except my unfamiliarity with the code. Done.

Because I have a couple of questions inline below, I'm holding off on uploading a new patch until they are all worked out.

> 
> WebCore/loader/CachedResource.cpp:175
>  +      if (frameLoader && frameLoader->client())
> client() is always non-NULL.

Changed to only test frame. Let me know if loader needs to be tested as well.

> 
> WebCore/loader/CachedResource.h:148
>  +      void setCachedMetadata(CacheableMetadata::Type,
> PassRefPtr<SharedBuffer>, FrameLoader*);
> Does CachedResource have any other dependencies on Frame?  In general, we want
> the Frame to be less involved in loading, not more.

Not really. In the case that the resource is not cached, It has a reference to a DocLoader (which has a reference to Frame). Passing a frame in this way does feel like a strange thing to do, but it was the best I could come up with. Do you have any alternate ideas?

> 
> WebCore/loader/FrameLoaderClient.h:47
>  +      struct CacheableMetadata;
> We prefer classes to structs.

Good, me too. I took your previous comment to add a struct too literally. Fixed.

> 
> WebCore/platform/network/ResourceResponseBase.h:48
>  +      long long m_id;
> This whole thing is very strangely layered.  In WebKit, there's no necessary
> connection between the client and the networking layer.  This patch somehow
> assumes some strange round-tripping between the two.
> 
> Why is the client involved in this process at all?  It seems like WebCore
> should just ask the networking layer to cache these bytes.

That does sound nice. But I'm clearly missing something about the overall picture of the architecture. As far as I've been able to discover, routing through the FrameLoaderClient is the only way to get a message back to the embedding application. Can you point me to the interface where WebCore can send a notification to the networking layer with this data.

> 
> WebCore/platform/network/ResourceResponseBase.h:52
>  +          V8_SCRIPTDATA
> Code in this location isn't allowed to know about V8.  We need a better way of
> assigning these type numbers.  Also, are these number persisted on disk?  We
> might need some comments explaining the rules for modifying these bits.

Yes, it does seem odd. That's why I had initially not passed a type along with the metadata. That leaves it in the hands of each implementer of metadata to determine if the data belongs to them or not. In this case, V8Proxy would just see if the data starts with its magic number or not. An enum, by definition, means that we are naming each type of metadata that can be stored -- be it for JSC, V8, or something else -- and that we need a registry of those types. If you want to keep a type, but lose the enum, we could basically extract the magic number out of the stream and make the type a GUID that each implementer chooses (relying on the size of the space for uniqueness).
Comment 9 Tony Gentilcore 2010-04-23 08:40:47 PDT
(In reply to comment #8)
> (In reply to comment #7)
> > (From update of attachment 54091 [details] [details])
> > WebCore/loader/CachedResource.cpp:168
> >  +  void CachedResource::setCachedMetadata(CacheableMetadata::Type type,
> > PassRefPtr<SharedBuffer> data, FrameLoader* frameLoader)
> > Why pass a FrameLoader* here?  Usually we pass around Frame*.
> 
> No reason except my unfamiliarity with the code. Done.
> 
> Because I have a couple of questions inline below, I'm holding off on uploading
> a new patch until they are all worked out.
> 
> > 
> > WebCore/loader/CachedResource.cpp:175
> >  +      if (frameLoader && frameLoader->client())
> > client() is always non-NULL.
> 
> Changed to only test frame. Let me know if loader needs to be tested as well.
> 
> > 
> > WebCore/loader/CachedResource.h:148
> >  +      void setCachedMetadata(CacheableMetadata::Type,
> > PassRefPtr<SharedBuffer>, FrameLoader*);
> > Does CachedResource have any other dependencies on Frame?  In general, we want
> > the Frame to be less involved in loading, not more.
> 
> Not really. In the case that the resource is not cached, It has a reference to
> a DocLoader (which has a reference to Frame). Passing a frame in this way does
> feel like a strange thing to do, but it was the best I could come up with. Do
> you have any alternate ideas?
> 
> > 
> > WebCore/loader/FrameLoaderClient.h:47
> >  +      struct CacheableMetadata;
> > We prefer classes to structs.
> 
> Good, me too. I took your previous comment to add a struct too literally.
> Fixed.
> 
> > 
> > WebCore/platform/network/ResourceResponseBase.h:48
> >  +      long long m_id;
> > This whole thing is very strangely layered.  In WebKit, there's no necessary
> > connection between the client and the networking layer.  This patch somehow
> > assumes some strange round-tripping between the two.
> > 
> > Why is the client involved in this process at all?  It seems like WebCore
> > should just ask the networking layer to cache these bytes.
> 
> That does sound nice. But I'm clearly missing something about the overall
> picture of the architecture. As far as I've been able to discover, routing
> through the FrameLoaderClient is the only way to get a message back to the
> embedding application. Can you point me to the interface where WebCore can send
> a notification to the networking layer with this data.

FYI -- I found a way to get rid of the ID. Just need to figure out how to handle type now.

> 
> > 
> > WebCore/platform/network/ResourceResponseBase.h:52
> >  +          V8_SCRIPTDATA
> > Code in this location isn't allowed to know about V8.  We need a better way of
> > assigning these type numbers.  Also, are these number persisted on disk?  We
> > might need some comments explaining the rules for modifying these bits.
> 
> Yes, it does seem odd. That's why I had initially not passed a type along with
> the metadata. That leaves it in the hands of each implementer of metadata to
> determine if the data belongs to them or not. In this case, V8Proxy would just
> see if the data starts with its magic number or not. An enum, by definition,
> means that we are naming each type of metadata that can be stored -- be it for
> JSC, V8, or something else -- and that we need a registry of those types. If
> you want to keep a type, but lose the enum, we could basically extract the
> magic number out of the stream and make the type a GUID that each implementer
> chooses (relying on the size of the space for uniqueness).
Comment 10 Tony Gentilcore 2010-04-23 10:28:36 PDT
Created attachment 54170 [details]
Removed ID and type enum

- Regarding ID:
I realized that the fact that chrome needs a response time is an implementation detail of the embedding application. Also, response time was already in ResourceLoaderBridge::ResponseInfo, so all I had to do was add it to WebURLResponse and then the ID could be removed from CacheableData.

- Regarding Type:
I removed the enum in favor of a GUID-style scheme. Let me know if you like that better or if you have another idea. An additional alternative is just to go back to the original approach where the magic number is in the metadata.
Comment 11 WebKit Review Bot 2010-04-23 12:15:32 PDT
Attachment 54170 [details] did not build on gtk:
Build output: http://webkit-commit-queue.appspot.com/results/1892004
Comment 12 Tony Gentilcore 2010-04-26 12:50:47 PDT
Created attachment 54323 [details]
Provide default, empty implementation of dispatchDidGenerateCacheableMetadata so that each port doesn't need to implement it
Comment 13 Tony Gentilcore 2010-05-03 13:28:52 PDT
Created attachment 54957 [details]
Pretty much rewritten to correct interface and more robustly store metadata.

Per our offline discussion, I've basically rewritten this patch.
 - Uses ResourceHandle interface to request that platform cache data.
 - Uses didReceiveCachedMetadata to receive cached data from platform.
 - Has map of metadata keyed by type.
Comment 14 Adam Barth 2010-05-03 14:07:56 PDT
Comment on attachment 54957 [details]
Pretty much rewritten to correct interface and more robustly store metadata.

Structurally, this looks a lot better.  There's some confusing about how RefPtr/PassRefPtr/OwnPtr/PassOwnPtr works.  If you haven't read http://webkit.org/coding/RefPtr.html, it can be helpful in explaining how these classes are meant to be used.

I don't like the fancy template / subclass / reinterpret_cast way of serializing metadata.  It's a bit hard to tell from this patch how this machinery will be used, but I'd start with something dead simple and we can make it more complicated if we need to.  The simplest thing is probably a static registry of IDs and only allowing for kind of cached metadata per resource.  That will get rid of all the mystical code about subclasses being forbidden from adding data members.  In any case, it looks like the clients of this system need to covert their metadata into byte arrays already.

WebCore/loader/CachedMetadata.h:5
 +      modify it under the terms of the GNU Library General Public
Google usually contributes under the BSD license.  This license is fine, but just a little outside the norm.

WebCore/loader/CachedMetadata.h:38
 +  // Subclasses must not contain non-static member variables because they would
Why not just make the destructor virtual?

WebCore/loader/CachedMetadataMap.cpp:32
 +  // static
We usually skip these comments in WebKit.

WebCore/loader/CachedMetadata.h:42
 +      static CachedMetadata* create(const char* data, size_t size)
This should probably return a PassOwnPtr
If you haven't read http://webkit.org/coding/RefPtr.html, you might find it helpful.
WebCore/loader/CachedMetadataMap.cpp:57
 +      return buffer;
return buffer.release();

WebCore/loader/CachedMetadataMap.cpp:76
 +              // The dataSize has lied to us which means the format is corrupt.
Do we want to signal this error condition somehow instead of failing silently?

WebCore/loader/CachedMetadataMap.cpp:79
 +              CachedMetadata* cachedMetadata = CachedMetadata::create(reinterpret_cast<const char*>(buffer + bufferPosition), dataSize);
Why do we need this reinterpret_cast here?  Isn't buffer already a const char*?

WebCore/loader/CachedMetadataMap.cpp:91
 +  void CachedMetadataMap::appendValueToBuffer(const T* value, PassRefPtr<SharedBuffer> buffer) const
This should be a raw SharedBuffer*.  We use PassRefPtr to signal that we're passing ownership of the refcounted object.

WebCore/loader/CachedMetadataMap.cpp:93
 +      buffer->append(reinterpret_cast<const char*>(value), sizeof(T));
Yuck.  This isn't strictly ANSI compliant.

WebCore/loader/CachedMetadataMap.cpp:108
 +      *bufferPosition += readSize;
WebKit usually uses non-const reference for output parameters instead of pointers.

WebCore/loader/CachedMetadataMap.h:59
 +          // the static_casts to work properly. This means no member variables.
Oh, I see.  Hum...  There's got to be a better design.  This is too much low-level memory poking.
I feel like you're trying to use the C++ type system to do a bunch of work that it's not very good at doing.  Why do you have a separate subtype for each kind of thing you might want to cache?  It feels like you just want to have some sort of static registry or something to avoid conflicts.

A lot of this feels like duplication of Pickle.  I guess we don't have that here, do we...
WebCore/loader/CachedResource.cpp:172
 +          m_cachedMetadataMap.set(CachedMetadataMap::create().release());
I don't think this release() is correct.

WebCore/loader/CachedResource.cpp:407
 +      ASSERT(m_cachedMetadataMap->size() == 1);
Hum...  This mechanism seems like drastic overkill if there's only ever one thing in the map.  Maybe make the simpler thing first and we can elaborate it as needed later.

WebCore/loader/CachedResource.h:166
 +              return m_cachedMetadataMap->get<T>();
I'm not really sure what this templatization is buying you.

WebCore/platform/network/ResourceHandle.cpp:148
 +  void ResourceHandle::cacheMetadata(const ResourceResponse&, PassRefPtr<SharedBuffer>)
This PassRefPtr is confusing here.  Are we taking ownership of the SharedBuffer?  If so, is it ok to do nothing here?

WebCore/platform/network/ResourceHandle.h:115
 +      static void cacheMetadata(const ResourceResponse&, PassRefPtr<SharedBuffer>);
I suspect this should be a SharedBuffer*, but I'm not sure.

WebKit/chromium/tests/CachedMetadataMapTest.cpp:13
 +   * distribution.
This is the more usual license for Google contributions.

WebKit/chromium/tests/CachedMetadataMapTest.cpp:32
 +  // original KURL's.
This comment seems wrong.
Comment 15 Tony Gentilcore 2010-05-04 17:34:26 PDT
(In reply to comment #14)
> (From update of attachment 54957 [details])
> Structurally, this looks a lot better.  There's some confusing about how
> RefPtr/PassRefPtr/OwnPtr/PassOwnPtr works.  If you haven't read
> http://webkit.org/coding/RefPtr.html, it can be helpful in explaining how these
> classes are meant to be used.

Thanks, I hadn't seen that doc. It was extremely helpful. You are right, I didn't understand how the WTF smart pointers worked.

Would it be controversial if I mailed a patch to add a comment pointing to the documentation in RefPtr.h?

> 
> I don't like the fancy template / subclass / reinterpret_cast way of
> serializing metadata.  It's a bit hard to tell from this patch how this
> machinery will be used, but I'd start with something dead simple and we can
> make it more complicated if we need to.  The simplest thing is probably a
> static registry of IDs and only allowing for kind of cached metadata per
> resource.  That will get rid of all the mystical code about subclasses being
> forbidden from adding data members.  In any case, it looks like the clients of
> this system need to covert their metadata into byte arrays already.

Okay, I've tossed the map and gone back to the assumption that there is one type of metadata per resource. This simplifies things a great deal, but still requires each WebCore generator of metadata to choose a GUID-like ID. I can't think of any way to have a static registry unless you are okay with a single bottle-neck file somewhere that has ID names like "V8Metadata", "JSCMetadata", etc.

> 
> WebCore/loader/CachedMetadata.h:5
>  +      modify it under the terms of the GNU Library General Public
> Google usually contributes under the BSD license.  This license is fine, but
> just a little outside the norm.

Oops. I just copied the license from the nearest file not realizing that files within WebKit have different licences. Hopefully I've got the right one now.

> 
> WebCore/loader/CachedMetadata.h:38
>  +  // Subclasses must not contain non-static member variables because they
> would
> Why not just make the destructor virtual?

Moot.

> 
> WebCore/loader/CachedMetadataMap.cpp:32
>  +  // static
> We usually skip these comments in WebKit.

Moot.

> 
> WebCore/loader/CachedMetadata.h:42
>  +      static CachedMetadata* create(const char* data, size_t size)
> This should probably return a PassOwnPtr
> If you haven't read http://webkit.org/coding/RefPtr.html, you might find it
> helpful.

Thanks again for the doc. You are right.

> WebCore/loader/CachedMetadataMap.cpp:57
>  +      return buffer;
> return buffer.release();

Moot.

> 
> WebCore/loader/CachedMetadataMap.cpp:76
>  +              // The dataSize has lied to us which means the format is
> corrupt.
> Do we want to signal this error condition somehow instead of failing silently?

Moot.

> 
> WebCore/loader/CachedMetadataMap.cpp:79
>  +              CachedMetadata* cachedMetadata =
> CachedMetadata::create(reinterpret_cast<const char*>(buffer + bufferPosition),
> dataSize);
> Why do we need this reinterpret_cast here?  Isn't buffer already a const char*?

Moot.

> 
> WebCore/loader/CachedMetadataMap.cpp:91
>  +  void CachedMetadataMap::appendValueToBuffer(const T* value,
> PassRefPtr<SharedBuffer> buffer) const
> This should be a raw SharedBuffer*.  We use PassRefPtr to signal that we're
> passing ownership of the refcounted object.

I'm now passing a const Vector<char>&. Is that ideal?

> 
> WebCore/loader/CachedMetadataMap.cpp:93
>  +      buffer->append(reinterpret_cast<const char*>(value), sizeof(T));
> Yuck.  This isn't strictly ANSI compliant.
> 

Are you referring to ANSI C (i.e. char could be another size) or ANSI unicode (i.e. doesn't handle UTF-8)?

If C: I could switch to uint8_t/uint32_t instead of char/unsigned. But I'd still have to cast at the API level to make this useful.

If Unicode: Keep in mind that I'm expecting this to be a serialization of bytes of metadata. It is not a string.

If something else, please explain.

> WebCore/loader/CachedMetadataMap.cpp:108
>  +      *bufferPosition += readSize;
> WebKit usually uses non-const reference for output parameters instead of
> pointers.

Moot.

> 
> WebCore/loader/CachedMetadataMap.h:59
>  +          // the static_casts to work properly. This means no member
> variables.
> Oh, I see.  Hum...  There's got to be a better design.  This is too much
> low-level memory poking.
> I feel like you're trying to use the C++ type system to do a bunch of work that
> it's not very good at doing.  Why do you have a separate subtype for each kind
> of thing you might want to cache?  It feels like you just want to have some
> sort of static registry or something to avoid conflicts.
> 
> A lot of this feels like duplication of Pickle.  I guess we don't have that
> here, do we...

Okay. I was hoping to make the API just slightly cleaner by exposing:
cachedResource.get<MyDataType>();
instead of:
cachedResource.get(myDataTypeID);

But in the interest of simplicity, I've gotten rid of all of the templates.

Yes, if Pickle were available, I'd be able to use that. WebCore/bindings/v8/SerializedScriptValue.cpp has similar serialization logic. But since my format is so simple (an unsigned followed by a byte bucket), I've just rolled my own. Let me know if you have a better idea for serialization.

> WebCore/loader/CachedResource.cpp:172
>  +          m_cachedMetadataMap.set(CachedMetadataMap::create().release());
> I don't think this release() is correct.

Moot.

> 
> WebCore/loader/CachedResource.cpp:407
>  +      ASSERT(m_cachedMetadataMap->size() == 1);
> Hum...  This mechanism seems like drastic overkill if there's only ever one
> thing in the map.  Maybe make the simpler thing first and we can elaborate it
> as needed later.

Done.

> 
> WebCore/loader/CachedResource.h:166
>  +              return m_cachedMetadataMap->get<T>();
> I'm not really sure what this templatization is buying you.

See previous comment. I thought it made the API more natural. It is gone now.

> 
> WebCore/platform/network/ResourceHandle.cpp:148
>  +  void ResourceHandle::cacheMetadata(const ResourceResponse&,
> PassRefPtr<SharedBuffer>)
> This PassRefPtr is confusing here.  Are we taking ownership of the
> SharedBuffer?  If so, is it ok to do nothing here?

I didn't understand PassRefPtr properly before reading that document. It is now a raw pointer.

> 
> WebCore/platform/network/ResourceHandle.h:115
>  +      static void cacheMetadata(const ResourceResponse&,
> PassRefPtr<SharedBuffer>);
> I suspect this should be a SharedBuffer*, but I'm not sure.

Yep, done.

> 
> WebKit/chromium/tests/CachedMetadataMapTest.cpp:13
>  +   * distribution.
> This is the more usual license for Google contributions.
> 

I've used this license throughout.

> WebKit/chromium/tests/CachedMetadataMapTest.cpp:32
>  +  // original KURL's.
> This comment seems wrong.

Yep, unintentionally branched from another test. Removed.
Comment 16 Tony Gentilcore 2010-05-04 17:35:52 PDT
Created attachment 55079 [details]
Get rid of map and fix smart pointers
Comment 17 Adam Barth 2010-05-05 09:59:04 PDT
> Would it be controversial if I mailed a patch to add a comment pointing to the
> documentation in RefPtr.h?

Please do.  Not enough people read that doc.  :)

> > WebCore/loader/CachedMetadataMap.cpp:91
> >  +  void CachedMetadataMap::appendValueToBuffer(const T* value,
> > PassRefPtr<SharedBuffer> buffer) const
> > This should be a raw SharedBuffer*.  We use PassRefPtr to signal that we're
> > passing ownership of the refcounted object.
> 
> I'm now passing a const Vector<char>&. Is that ideal?

That's probably good.  I'll look.

> > WebCore/loader/CachedMetadataMap.cpp:93
> >  +      buffer->append(reinterpret_cast<const char*>(value), sizeof(T));
> > Yuck.  This isn't strictly ANSI compliant.
> 
> Are you referring to ANSI C (i.e. char could be another size) or ANSI unicode
> (i.e. doesn't handle UTF-8)?
> 
> If C: I could switch to uint8_t/uint32_t instead of char/unsigned. But I'd
> still have to cast at the API level to make this useful.
> 
> If Unicode: Keep in mind that I'm expecting this to be a serialization of bytes
> of metadata. It is not a string.
> 
> If something else, please explain.

I meant I don't think it's a strictly legal reinterpret_cast according to ANSI C.  Just my way of complaining about low-level memory poking.

Looking at the new patch now.
Comment 18 Adam Barth 2010-05-05 10:11:09 PDT
Comment on attachment 55079 [details]
Get rid of map and fix smart pointers

This is getting close.  I'd like to see this code used by something so it's not dead code though.  :)

WebCore/loader/CachedMetadataStore.h:108
 +      static const size_t unsignedSize = sizeof(unsigned);
This constant makes things harder to read.  Just use sizeof(unsigned).  That's perfectly clear.

WebCore/loader/CachedMetadataStore.h:107
 +      static const size_t dataTypeIDStart = 0;
This constant also doesn't really help.

WebCore/loader/CachedMetadataStore.h:54
 +      static PassOwnPtr<CachedMetadataStore> deserialize(const char* data, size_t size)
Why two parameters instead of a const Vector<char>&  ?

WebCore/loader/CachedMetadata.h:39
 +  class CachedMetadata : public RefCounted<CachedMetadata> {
I don't understand the role this class plays.  How is it different than Vector<char> ?  Maybe because it doesn't take ownership of the bytes?

WebCore/loader/CachedResource.h:148
 +      // Sets the serialized metadata retrieved from the platform's cache.
If this is a setter, maybe it should have "set" in its name?

WebCore/loader/CachedResource.h:232
 +      OwnPtr<CachedMetadataStore> m_cachedMetadataStore; // Lazy
Comment not needed.

WebCore/loader/CachedResource.h:153
 +      void setCachedMetadata(unsigned dataTypeID, CachedMetadata* cachedMetadata);
cachedMetadata parameter name not needed.

WebCore/loader/SubresourceLoader.cpp:181
 +      // So don't deliver any data to the loader yet.
I don't understand this comment.
Comment 19 Tony Gentilcore 2010-05-05 11:04:47 PDT
I have a couple of questions inline below. I'll hold off on uploading the next patch until those are answered.

By the way, do you think that reinterpret_cast is okay or should I ask around/research it a little more?

(In reply to comment #18)
> (From update of attachment 55079 [details])
> This is getting close.  I'd like to see this code used by something so it's not
> dead code though.  :)

This patch is hairy enough that I don't think I should add to it. I'll upload a separate patch for bindings/v8 now. Should it use this bug or a new bug?

> 
> WebCore/loader/CachedMetadataStore.h:108
>  +      static const size_t unsignedSize = sizeof(unsigned);
> This constant makes things harder to read.  Just use sizeof(unsigned).  That's
> perfectly clear.

Done.

> 
> WebCore/loader/CachedMetadataStore.h:107
>  +      static const size_t dataTypeIDStart = 0;
> This constant also doesn't really help.

I disagree. I think it is crucial for readability to have the offsets of the format in one place. If I just inline it, then I'm left with code that says readUnsigned(0) instead of readUnsigned(dataTypeIDStart). If you still feel strongly then I'll defer to you and inline it.

> 
> WebCore/loader/CachedMetadataStore.h:54
>  +      static PassOwnPtr<CachedMetadataStore> deserialize(const char* data,
> size_t size)
> Why two parameters instead of a const Vector<char>&  ?
> 

This is a side-effect of the main API. It looks like:


Load:     ResourceLoader::didReceiveCachedMetadata(const char*, int)
Persist:  ResourceHandle::cacheMetadata(const ResourceResponse&, const Vector<char>&)

They should probably both use Vectors or both use char*/length. My thought is to remove the Vector in favor of char*/length. But I wanted to confirm with you before making the change.

> WebCore/loader/CachedMetadata.h:39
>  +  class CachedMetadata : public RefCounted<CachedMetadata> {
> I don't understand the role this class plays.  How is it different than
> Vector<char> ?  Maybe because it doesn't take ownership of the bytes?
> 

It is basically a class to point to data without owning it (to avoid a copy). This allows CachedResource::cachedMetadata(unsigned) to return one object rather than the much uglier CachedResource::cachedMetadata(unsigned, const char**, int*). I couldn't find a suitable generic container in WebKit. Do you have a suggestion for a better way to return a pointer/length combination without a copy?

> WebCore/loader/CachedResource.h:148
>  +      // Sets the serialized metadata retrieved from the platform's cache.
> If this is a setter, maybe it should have "set" in its name?

Done. setSerializedCachedMetadata().

> 
> WebCore/loader/CachedResource.h:232
>  +      OwnPtr<CachedMetadataStore> m_cachedMetadataStore; // Lazy
> Comment not needed.

Done.

> 
> WebCore/loader/CachedResource.h:153
>  +      void setCachedMetadata(unsigned dataTypeID, CachedMetadata*
> cachedMetadata);
> cachedMetadata parameter name not needed.

Done.

> 
> WebCore/loader/SubresourceLoader.cpp:181
>  +      // So don't deliver any data to the loader yet.
> I don't understand this comment.

Oops, that was branched from didReceiveData, but it doesn't make sense here since its a one-shot. Removed.
Comment 20 Adam Barth 2010-05-05 11:34:33 PDT
(In reply to comment #19)
> I have a couple of questions inline below. I'll hold off on uploading the next
> patch until those are answered.
> 
> By the way, do you think that reinterpret_cast is okay or should I ask
> around/research it a little more?

I'd rather we didn't have to reinterpret_cast, but I'm not sure where's a way around it.

> > This is getting close.  I'd like to see this code used by something so it's not
> > dead code though.  :)
> 
> This patch is hairy enough that I don't think I should add to it. I'll upload a
> separate patch for bindings/v8 now. Should it use this bug or a new bug?

Good idea.  Please use a separate bug.  We can cross-link them if we need to using the "blocks" field.

> > WebCore/loader/CachedMetadataStore.h:107
> >  +      static const size_t dataTypeIDStart = 0;
> > This constant also doesn't really help.
> 
> I disagree. I think it is crucial for readability to have the offsets of the
> format in one place. If I just inline it, then I'm left with code that says
> readUnsigned(0) instead of readUnsigned(dataTypeIDStart). If you still feel
> strongly then I'll defer to you and inline it.

Ok.

> > WebCore/loader/CachedMetadataStore.h:54
> >  +      static PassOwnPtr<CachedMetadataStore> deserialize(const char* data,
> > size_t size)
> > Why two parameters instead of a const Vector<char>&  ?
> 
> This is a side-effect of the main API. It looks like:
> 
> Load:     ResourceLoader::didReceiveCachedMetadata(const char*, int)
> Persist:  ResourceHandle::cacheMetadata(const ResourceResponse&, const
> Vector<char>&)
> 
> They should probably both use Vectors or both use char*/length. My thought is
> to remove the Vector in favor of char*/length. But I wanted to confirm with you
> before making the change.

Using Vector is generally better because it's clear who owns the memory.  The main reason to use char* + length is if the malloc in Vector is too expensive or there's a critical memcpy that can't be removed.

> > WebCore/loader/CachedMetadata.h:39
> >  +  class CachedMetadata : public RefCounted<CachedMetadata> {
> > I don't understand the role this class plays.  How is it different than
> > Vector<char> ?  Maybe because it doesn't take ownership of the bytes?
> 
> It is basically a class to point to data without owning it (to avoid a copy).
> This allows CachedResource::cachedMetadata(unsigned) to return one object
> rather than the much uglier CachedResource::cachedMetadata(unsigned, const
> char**, int*). I couldn't find a suitable generic container in WebKit. Do you
> have a suggestion for a better way to return a pointer/length combination
> without a copy?

I guess my main concern is that the ownship of the memory is unclear.  How long can the caller hold on to this object before it explodes?  I think this is the problem that SharedBuffer is trying to solve.
Comment 21 Tony Gentilcore 2010-05-06 10:23:26 PDT
Created attachment 55255 [details]
Make CachedMetadata own data and reference count it
Comment 22 Adam Barth 2010-05-06 11:16:54 PDT
Comment on attachment 55255 [details]
Make CachedMetadata own data and reference count it

Looks good.  Thanks for iterating on this patch.  Please address the minor nits below before committing.  If you'd like the commit bot to land your patch, you can upload a new patch with these comments addressed.

WebCore/loader/CachedMetadata.h:66
 +          return new CachedMetadata(dataTypeID, data, size);
adoptRef?

WebCore/loader/CachedMetadata.h:71
 +          return new CachedMetadata(data, size);
adoptRef?

WebCore/loader/CachedMetadata.h:62
 +      friend class CachedResource;
You can just make these methods public.  WebKit isn't super tight about hiding interfaces.

WebCore/loader/ResourceLoader.h:101
 +          virtual void didReceiveCachedMetadata(ResourceHandle*, const char* data, int length) { didReceiveCachedMetadata(data, length); }
These methods don't seem to fit the pattern of the other methods.
Comment 23 Tony Gentilcore 2010-05-06 11:41:54 PDT
(In reply to comment #22)
> (From update of attachment 55255 [details])
> Looks good.  Thanks for iterating on this patch.  Please address the minor nits
> below before committing.  If you'd like the commit bot to land your patch, you
> can upload a new patch with these comments addressed.
> 
> WebCore/loader/CachedMetadata.h:66
>  +          return new CachedMetadata(dataTypeID, data, size);
> adoptRef?

Done.

> 
> WebCore/loader/CachedMetadata.h:71
>  +          return new CachedMetadata(data, size);
> adoptRef?

Done.

> 
> WebCore/loader/CachedMetadata.h:62
>  +      friend class CachedResource;
> You can just make these methods public.  WebKit isn't super tight about hiding
> interfaces.

Done.

> 
> WebCore/loader/ResourceLoader.h:101
>  +          virtual void didReceiveCachedMetadata(ResourceHandle*, const char*
> data, int length) { didReceiveCachedMetadata(data, length); }
> These methods don't seem to fit the pattern of the other methods.

The diff didn't show enough context. There are other methods a couple of lines below which have a one-line forward inlined like this. Do you want me to move it to the cpp file or is it fine to leave like the others?
Comment 24 Tony Gentilcore 2010-05-06 11:43:09 PDT
Created attachment 55279 [details]
Add adoptRefs, remove protected section
Comment 25 WebKit Commit Bot 2010-05-08 05:58:47 PDT
Comment on attachment 55279 [details]
Add adoptRefs, remove protected section

Clearing flags on attachment: 55279

Committed r59023: <http://trac.webkit.org/changeset/59023>
Comment 26 WebKit Commit Bot 2010-05-08 05:58:55 PDT
All reviewed patches have been landed.  Closing bug.
Comment 27 WebKit Review Bot 2010-05-08 06:34:12 PDT
http://trac.webkit.org/changeset/59023 might have broken Chromium Win Release
Comment 28 Eric Seidel 2010-05-08 09:58:50 PDT
Rolled out, due to chromium link failures.  webkit-patch rollout hit a bug and failed to re-open this one, sorry.  Fixing the webkit-patch bug now.
Comment 29 Eric Seidel 2010-05-08 11:10:17 PDT
https://bugs.webkit.org/show_bug.cgi?id=38803 was the webkit-patch rollout bug.  Fix posted.
Comment 30 Eric Seidel 2010-05-08 23:08:14 PDT
Comment on attachment 55255 [details]
Make CachedMetadata own data and reference count it

Cleared Adam Barth's review+ from obsolete attachment 55255 [details] so that this bug does not appear in http://webkit.org/pending-commit.
Comment 31 Tony Gentilcore 2010-05-10 10:50:45 PDT
Created attachment 55568 [details]
Patch
Comment 32 Adam Barth 2010-05-10 10:53:29 PDT
Comment on attachment 55568 [details]
Patch

ok
Comment 33 WebKit Commit Bot 2010-05-10 11:14:21 PDT
Comment on attachment 55568 [details]
Patch

Clearing flags on attachment: 55568

Committed r59087: <http://trac.webkit.org/changeset/59087>
Comment 34 WebKit Commit Bot 2010-05-10 11:14:30 PDT
All reviewed patches have been landed.  Closing bug.