Bug 37874 - Provide mechanism to cache metadata for a resource
: Provide mechanism to cache metadata for a resource
Status: RESOLVED FIXED
: WebKit
WebCore Misc.
: 528+ (Nightly build)
: All All
: P2 Enhancement
Assigned To:
:
:
:
: 38661 38665
  Show dependency treegraph
 
Reported: 2010-04-20 11:12 PST by
Modified: 2010-05-10 11:14 PST (History)


Attachments
Rough patch to get design feedback (13.34 KB, patch)
2010-04-20 13:22 PST, Tony Gentilcore
no flags Review Patch | Details | Formatted Diff | Diff
Added CacheableMetadata struct per abarth's recommendation (19.58 KB, patch)
2010-04-21 21:09 PST, Tony Gentilcore
no flags Review Patch | Details | Formatted Diff | Diff
Fix compile by overloading rather than specializing in WebData. (19.64 KB, patch)
2010-04-22 13:39 PST, Tony Gentilcore
abarth: review-
Review Patch | Details | Formatted Diff | Diff
Removed ID and type enum (21.81 KB, patch)
2010-04-23 10:28 PST, Tony Gentilcore
no flags Review Patch | Details | Formatted Diff | Diff
Provide default, empty implementation of dispatchDidGenerateCacheableMetadata so that each port doesn't need to implement it (21.81 KB, patch)
2010-04-26 12:50 PST, Tony Gentilcore
no flags Review Patch | Details | Formatted Diff | Diff
Pretty much rewritten to correct interface and more robustly store metadata. (29.03 KB, patch)
2010-05-03 13:28 PST, Tony Gentilcore
abarth: review-
Review Patch | Details | Formatted Diff | Diff
Get rid of map and fix smart pointers (20.73 KB, patch)
2010-05-04 17:35 PST, Tony Gentilcore
abarth: review-
Review Patch | Details | Formatted Diff | Diff
Make CachedMetadata own data and reference count it (17.59 KB, patch)
2010-05-06 10:23 PST, Tony Gentilcore
no flags Review Patch | Details | Formatted Diff | Diff
Add adoptRefs, remove protected section (17.57 KB, patch)
2010-05-06 11:43 PST, Tony Gentilcore
no flags Review Patch | Details | Formatted Diff | Diff
Patch (19.42 KB, patch)
2010-05-10 10:50 PST, Tony Gentilcore
no flags Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2010-04-20 11:12:59 PST
I'd like to add the ability to store metadata for a CachedResource.

This metadata will contain data which speeds up the processing of the resource. When it is generated by WebCore, the host application is notified so it may persist it to disk cache. When the host application loads a resource from disk cache, it can provide the cached metadata so that WebCore doesn't have to redo the work.

The immediate use-case is to persist V8's pre compilation data to Chrome's disk cache (http://crbug.com/32407). But it is conceivable that other types of resources could have useful cacheable metadata.
------- Comment #1 From 2010-04-20 13:22:57 PST -------
Created an attachment (id=53873) [details]
Rough patch to get design feedback
------- Comment #2 From 2010-04-21 21:09:32 PST -------
Created an attachment (id=54021) [details]
Added CacheableMetadata struct per abarth's recommendation
------- Comment #3 From 2010-04-22 11:46:57 PST -------
Attachment 54021 [details] did not build on chromium:
Build output: http://webkit-commit-queue.appspot.com/results/1854050
------- Comment #4 From 2010-04-22 11:48:51 PST -------
Attachment 54021 [details] did not build on gtk:
Build output: http://webkit-commit-queue.appspot.com/results/1794045
------- Comment #5 From 2010-04-22 13:39:33 PST -------
Created an attachment (id=54091) [details]
Fix compile by overloading rather than specializing in WebData.
------- Comment #6 From 2010-04-22 13:43:01 PST -------
Attachment 54091 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1
WebKit/chromium/ChangeLog:5:  One or more unexpected \r (^M) found; better to use only a \n  [whitespace/carriage_return] [1]
Suppressing further [whitespace/carriage_return] reports for this file.
Total errors found: 2 in 16 files


If any of these errors are false positives, please file a bug against check-webkit-style.
------- Comment #7 From 2010-04-22 14:13:14 PST -------
(From update of attachment 54091 [details])
WebCore/loader/CachedResource.cpp:168
 +  void CachedResource::setCachedMetadata(CacheableMetadata::Type type, PassRefPtr<SharedBuffer> data, FrameLoader* frameLoader)
Why pass a FrameLoader* here?  Usually we pass around Frame*.

WebCore/loader/CachedResource.cpp:175
 +      if (frameLoader && frameLoader->client())
client() is always non-NULL.

WebCore/loader/CachedResource.h:148
 +      void setCachedMetadata(CacheableMetadata::Type, PassRefPtr<SharedBuffer>, FrameLoader*);
Does CachedResource have any other dependencies on Frame?  In general, we want the Frame to be less involved in loading, not more.

WebCore/loader/FrameLoaderClient.h:47
 +      struct CacheableMetadata;
We prefer classes to structs.

WebCore/platform/network/ResourceResponseBase.h:48
 +      long long m_id;
This whole thing is very strangely layered.  In WebKit, there's no necessary connection between the client and the networking layer.  This patch somehow assumes some strange round-tripping between the two.

Why is the client involved in this process at all?  It seems like WebCore should just ask the networking layer to cache these bytes.

WebCore/platform/network/ResourceResponseBase.h:52
 +          V8_SCRIPTDATA
Code in this location isn't allowed to know about V8.  We need a better way of assigning these type numbers.  Also, are these number persisted on disk?  We might need some comments explaining the rules for modifying these bits.
------- Comment #8 From 2010-04-22 15:07:13 PST -------
(In reply to comment #7)
> (From update of attachment 54091 [details] [details])
> WebCore/loader/CachedResource.cpp:168
>  +  void CachedResource::setCachedMetadata(CacheableMetadata::Type type,
> PassRefPtr<SharedBuffer> data, FrameLoader* frameLoader)
> Why pass a FrameLoader* here?  Usually we pass around Frame*.

No reason except my unfamiliarity with the code. Done.

Because I have a couple of questions inline below, I'm holding off on uploading a new patch until they are all worked out.

> 
> WebCore/loader/CachedResource.cpp:175
>  +      if (frameLoader && frameLoader->client())
> client() is always non-NULL.

Changed to only test frame. Let me know if loader needs to be tested as well.

> 
> WebCore/loader/CachedResource.h:148
>  +      void setCachedMetadata(CacheableMetadata::Type,
> PassRefPtr<SharedBuffer>, FrameLoader*);
> Does CachedResource have any other dependencies on Frame?  In general, we want
> the Frame to be less involved in loading, not more.

Not really. In the case that the resource is not cached, It has a reference to a DocLoader (which has a reference to Frame). Passing a frame in this way does feel like a strange thing to do, but it was the best I could come up with. Do you have any alternate ideas?

> 
> WebCore/loader/FrameLoaderClient.h:47
>  +      struct CacheableMetadata;
> We prefer classes to structs.

Good, me too. I took your previous comment to add a struct too literally. Fixed.

> 
> WebCore/platform/network/ResourceResponseBase.h:48
>  +      long long m_id;
> This whole thing is very strangely layered.  In WebKit, there's no necessary
> connection between the client and the networking layer.  This patch somehow
> assumes some strange round-tripping between the two.
> 
> Why is the client involved in this process at all?  It seems like WebCore
> should just ask the networking layer to cache these bytes.

That does sound nice. But I'm clearly missing something about the overall picture of the architecture. As far as I've been able to discover, routing through the FrameLoaderClient is the only way to get a message back to the embedding application. Can you point me to the interface where WebCore can send a notification to the networking layer with this data.

> 
> WebCore/platform/network/ResourceResponseBase.h:52
>  +          V8_SCRIPTDATA
> Code in this location isn't allowed to know about V8.  We need a better way of
> assigning these type numbers.  Also, are these number persisted on disk?  We
> might need some comments explaining the rules for modifying these bits.

Yes, it does seem odd. That's why I had initially not passed a type along with the metadata. That leaves it in the hands of each implementer of metadata to determine if the data belongs to them or not. In this case, V8Proxy would just see if the data starts with its magic number or not. An enum, by definition, means that we are naming each type of metadata that can be stored -- be it for JSC, V8, or something else -- and that we need a registry of those types. If you want to keep a type, but lose the enum, we could basically extract the magic number out of the stream and make the type a GUID that each implementer chooses (relying on the size of the space for uniqueness).
------- Comment #9 From 2010-04-23 08:40:47 PST -------
(In reply to comment #8)
> (In reply to comment #7)
> > (From update of attachment 54091 [details] [details] [details])
> > WebCore/loader/CachedResource.cpp:168
> >  +  void CachedResource::setCachedMetadata(CacheableMetadata::Type type,
> > PassRefPtr<SharedBuffer> data, FrameLoader* frameLoader)
> > Why pass a FrameLoader* here?  Usually we pass around Frame*.
> 
> No reason except my unfamiliarity with the code. Done.
> 
> Because I have a couple of questions inline below, I'm holding off on uploading
> a new patch until they are all worked out.
> 
> > 
> > WebCore/loader/CachedResource.cpp:175
> >  +      if (frameLoader && frameLoader->client())
> > client() is always non-NULL.
> 
> Changed to only test frame. Let me know if loader needs to be tested as well.
> 
> > 
> > WebCore/loader/CachedResource.h:148
> >  +      void setCachedMetadata(CacheableMetadata::Type,
> > PassRefPtr<SharedBuffer>, FrameLoader*);
> > Does CachedResource have any other dependencies on Frame?  In general, we want
> > the Frame to be less involved in loading, not more.
> 
> Not really. In the case that the resource is not cached, It has a reference to
> a DocLoader (which has a reference to Frame). Passing a frame in this way does
> feel like a strange thing to do, but it was the best I could come up with. Do
> you have any alternate ideas?
> 
> > 
> > WebCore/loader/FrameLoaderClient.h:47
> >  +      struct CacheableMetadata;
> > We prefer classes to structs.
> 
> Good, me too. I took your previous comment to add a struct too literally.
> Fixed.
> 
> > 
> > WebCore/platform/network/ResourceResponseBase.h:48
> >  +      long long m_id;
> > This whole thing is very strangely layered.  In WebKit, there's no necessary
> > connection between the client and the networking layer.  This patch somehow
> > assumes some strange round-tripping between the two.
> > 
> > Why is the client involved in this process at all?  It seems like WebCore
> > should just ask the networking layer to cache these bytes.
> 
> That does sound nice. But I'm clearly missing something about the overall
> picture of the architecture. As far as I've been able to discover, routing
> through the FrameLoaderClient is the only way to get a message back to the
> embedding application. Can you point me to the interface where WebCore can send
> a notification to the networking layer with this data.

FYI -- I found a way to get rid of the ID. Just need to figure out how to handle type now.

> 
> > 
> > WebCore/platform/network/ResourceResponseBase.h:52
> >  +          V8_SCRIPTDATA
> > Code in this location isn't allowed to know about V8.  We need a better way of
> > assigning these type numbers.  Also, are these number persisted on disk?  We
> > might need some comments explaining the rules for modifying these bits.
> 
> Yes, it does seem odd. That's why I had initially not passed a type along with
> the metadata. That leaves it in the hands of each implementer of metadata to
> determine if the data belongs to them or not. In this case, V8Proxy would just
> see if the data starts with its magic number or not. An enum, by definition,
> means that we are naming each type of metadata that can be stored -- be it for
> JSC, V8, or something else -- and that we need a registry of those types. If
> you want to keep a type, but lose the enum, we could basically extract the
> magic number out of the stream and make the type a GUID that each implementer
> chooses (relying on the size of the space for uniqueness).
------- Comment #10 From 2010-04-23 10:28:36 PST -------
Created an attachment (id=54170) [details]
Removed ID and type enum

- Regarding ID:
I realized that the fact that chrome needs a response time is an implementation detail of the embedding application. Also, response time was already in ResourceLoaderBridge::ResponseInfo, so all I had to do was add it to WebURLResponse and then the ID could be removed from CacheableData.

- Regarding Type:
I removed the enum in favor of a GUID-style scheme. Let me know if you like that better or if you have another idea. An additional alternative is just to go back to the original approach where the magic number is in the metadata.
------- Comment #11 From 2010-04-23 12:15:32 PST -------
Attachment 54170 [details] did not build on gtk:
Build output: http://webkit-commit-queue.appspot.com/results/1892004
------- Comment #12 From 2010-04-26 12:50:47 PST -------
Created an attachment (id=54323) [details]
Provide default, empty implementation of dispatchDidGenerateCacheableMetadata so that each port doesn't need to implement it
------- Comment #13 From 2010-05-03 13:28:52 PST -------
Created an attachment (id=54957) [details]
Pretty much rewritten to correct interface and more robustly store metadata.

Per our offline discussion, I've basically rewritten this patch.
 - Uses ResourceHandle interface to request that platform cache data.
 - Uses didReceiveCachedMetadata to receive cached data from platform.
 - Has map of metadata keyed by type.
------- Comment #14 From 2010-05-03 14:07:56 PST -------
(From update of attachment 54957 [details])
Structurally, this looks a lot better.  There's some confusing about how RefPtr/PassRefPtr/OwnPtr/PassOwnPtr works.  If you haven't read http://webkit.org/coding/RefPtr.html, it can be helpful in explaining how these classes are meant to be used.

I don't like the fancy template / subclass / reinterpret_cast way of serializing metadata.  It's a bit hard to tell from this patch how this machinery will be used, but I'd start with something dead simple and we can make it more complicated if we need to.  The simplest thing is probably a static registry of IDs and only allowing for kind of cached metadata per resource.  That will get rid of all the mystical code about subclasses being forbidden from adding data members.  In any case, it looks like the clients of this system need to covert their metadata into byte arrays already.

WebCore/loader/CachedMetadata.h:5
 +      modify it under the terms of the GNU Library General Public
Google usually contributes under the BSD license.  This license is fine, but just a little outside the norm.

WebCore/loader/CachedMetadata.h:38
 +  // Subclasses must not contain non-static member variables because they would
Why not just make the destructor virtual?

WebCore/loader/CachedMetadataMap.cpp:32
 +  // static
We usually skip these comments in WebKit.

WebCore/loader/CachedMetadata.h:42
 +      static CachedMetadata* create(const char* data, size_t size)
This should probably return a PassOwnPtr
If you haven't read http://webkit.org/coding/RefPtr.html, you might find it helpful.
WebCore/loader/CachedMetadataMap.cpp:57
 +      return buffer;
return buffer.release();

WebCore/loader/CachedMetadataMap.cpp:76
 +              // The dataSize has lied to us which means the format is corrupt.
Do we want to signal this error condition somehow instead of failing silently?

WebCore/loader/CachedMetadataMap.cpp:79
 +              CachedMetadata* cachedMetadata = CachedMetadata::create(reinterpret_cast<const char*>(buffer + bufferPosition), dataSize);
Why do we need this reinterpret_cast here?  Isn't buffer already a const char*?

WebCore/loader/CachedMetadataMap.cpp:91
 +  void CachedMetadataMap::appendValueToBuffer(const T* value, PassRefPtr<SharedBuffer> buffer) const
This should be a raw SharedBuffer*.  We use PassRefPtr to signal that we're passing ownership of the refcounted object.

WebCore/loader/CachedMetadataMap.cpp:93
 +      buffer->append(reinterpret_cast<const char*>(value), sizeof(T));
Yuck.  This isn't strictly ANSI compliant.

WebCore/loader/CachedMetadataMap.cpp:108
 +      *bufferPosition += readSize;
WebKit usually uses non-const reference for output parameters instead of pointers.

WebCore/loader/CachedMetadataMap.h:59
 +          // the static_casts to work properly. This means no member variables.
Oh, I see.  Hum...  There's got to be a better design.  This is too much low-level memory poking.
I feel like you're trying to use the C++ type system to do a bunch of work that it's not very good at doing.  Why do you have a separate subtype for each kind of thing you might want to cache?  It feels like you just want to have some sort of static registry or something to avoid conflicts.

A lot of this feels like duplication of Pickle.  I guess we don't have that here, do we...
WebCore/loader/CachedResource.cpp:172
 +          m_cachedMetadataMap.set(CachedMetadataMap::create().release());
I don't think this release() is correct.

WebCore/loader/CachedResource.cpp:407
 +      ASSERT(m_cachedMetadataMap->size() == 1);
Hum...  This mechanism seems like drastic overkill if there's only ever one thing in the map.  Maybe make the simpler thing first and we can elaborate it as needed later.

WebCore/loader/CachedResource.h:166
 +              return m_cachedMetadataMap->get<T>();
I'm not really sure what this templatization is buying you.

WebCore/platform/network/ResourceHandle.cpp:148
 +  void ResourceHandle::cacheMetadata(const ResourceResponse&, PassRefPtr<SharedBuffer>)
This PassRefPtr is confusing here.  Are we taking ownership of the SharedBuffer?  If so, is it ok to do nothing here?

WebCore/platform/network/ResourceHandle.h:115
 +      static void cacheMetadata(const ResourceResponse&, PassRefPtr<SharedBuffer>);
I suspect this should be a SharedBuffer*, but I'm not sure.

WebKit/chromium/tests/CachedMetadataMapTest.cpp:13
 +   * distribution.
This is the more usual license for Google contributions.

WebKit/chromium/tests/CachedMetadataMapTest.cpp:32
 +  // original KURL's.
This comment seems wrong.
------- Comment #15 From 2010-05-04 17:34:26 PST -------
(In reply to comment #14)
> (From update of attachment 54957 [details] [details])
> Structurally, this looks a lot better.  There's some confusing about how
> RefPtr/PassRefPtr/OwnPtr/PassOwnPtr works.  If you haven't read
> http://webkit.org/coding/RefPtr.html, it can be helpful in explaining how these
> classes are meant to be used.

Thanks, I hadn't seen that doc. It was extremely helpful. You are right, I didn't understand how the WTF smart pointers worked.

Would it be controversial if I mailed a patch to add a comment pointing to the documentation in RefPtr.h?

> 
> I don't like the fancy template / subclass / reinterpret_cast way of
> serializing metadata.  It's a bit hard to tell from this patch how this
> machinery will be used, but I'd start with something dead simple and we can
> make it more complicated if we need to.  The simplest thing is probably a
> static registry of IDs and only allowing for kind of cached metadata per
> resource.  That will get rid of all the mystical code about subclasses being
> forbidden from adding data members.  In any case, it looks like the clients of
> this system need to covert their metadata into byte arrays already.

Okay, I've tossed the map and gone back to the assumption that there is one type of metadata per resource. This simplifies things a great deal, but still requires each WebCore generator of metadata to choose a GUID-like ID. I can't think of any way to have a static registry unless you are okay with a single bottle-neck file somewhere that has ID names like "V8Metadata", "JSCMetadata", etc.

> 
> WebCore/loader/CachedMetadata.h:5
>  +      modify it under the terms of the GNU Library General Public
> Google usually contributes under the BSD license.  This license is fine, but
> just a little outside the norm.

Oops. I just copied the license from the nearest file not realizing that files within WebKit have different licences. Hopefully I've got the right one now.

> 
> WebCore/loader/CachedMetadata.h:38
>  +  // Subclasses must not contain non-static member variables because they
> would
> Why not just make the destructor virtual?

Moot.

> 
> WebCore/loader/CachedMetadataMap.cpp:32
>  +  // static
> We usually skip these comments in WebKit.

Moot.

> 
> WebCore/loader/CachedMetadata.h:42
>  +      static CachedMetadata* create(const char* data, size_t size)
> This should probably return a PassOwnPtr
> If you haven't read http://webkit.org/coding/RefPtr.html, you might find it
> helpful.

Thanks again for the doc. You are right.

> WebCore/loader/CachedMetadataMap.cpp:57
>  +      return buffer;
> return buffer.release();

Moot.

> 
> WebCore/loader/CachedMetadataMap.cpp:76
>  +              // The dataSize has lied to us which means the format is
> corrupt.
> Do we want to signal this error condition somehow instead of failing silently?

Moot.

> 
> WebCore/loader/CachedMetadataMap.cpp:79
>  +              CachedMetadata* cachedMetadata =
> CachedMetadata::create(reinterpret_cast<const char*>(buffer + bufferPosition),
> dataSize);
> Why do we need this reinterpret_cast here?  Isn't buffer already a const char*?

Moot.

> 
> WebCore/loader/CachedMetadataMap.cpp:91
>  +  void CachedMetadataMap::appendValueToBuffer(const T* value,
> PassRefPtr<SharedBuffer> buffer) const
> This should be a raw SharedBuffer*.  We use PassRefPtr to signal that we're
> passing ownership of the refcounted object.

I'm now passing a const Vector<char>&. Is that ideal?

> 
> WebCore/loader/CachedMetadataMap.cpp:93
>  +      buffer->append(reinterpret_cast<const char*>(value), sizeof(T));
> Yuck.  This isn't strictly ANSI compliant.
> 

Are you referring to ANSI C (i.e. char could be another size) or ANSI unicode (i.e. doesn't handle UTF-8)?

If C: I could switch to uint8_t/uint32_t instead of char/unsigned. But I'd still have to cast at the API level to make this useful.

If Unicode: Keep in mind that I'm expecting this to be a serialization of bytes of metadata. It is not a string.

If something else, please explain.

> WebCore/loader/CachedMetadataMap.cpp:108
>  +      *bufferPosition += readSize;
> WebKit usually uses non-const reference for output parameters instead of
> pointers.

Moot.

> 
> WebCore/loader/CachedMetadataMap.h:59
>  +          // the static_casts to work properly. This means no member
> variables.
> Oh, I see.  Hum...  There's got to be a better design.  This is too much
> low-level memory poking.
> I feel like you're trying to use the C++ type system to do a bunch of work that
> it's not very good at doing.  Why do you have a separate subtype for each kind
> of thing you might want to cache?  It feels like you just want to have some
> sort of static registry or something to avoid conflicts.
> 
> A lot of this feels like duplication of Pickle.  I guess we don't have that
> here, do we...

Okay. I was hoping to make the API just slightly cleaner by exposing:
cachedResource.get<MyDataType>();
instead of:
cachedResource.get(myDataTypeID);

But in the interest of simplicity, I've gotten rid of all of the templates.

Yes, if Pickle were available, I'd be able to use that. WebCore/bindings/v8/SerializedScriptValue.cpp has similar serialization logic. But since my format is so simple (an unsigned followed by a byte bucket), I've just rolled my own. Let me know if you have a better idea for serialization.

> WebCore/loader/CachedResource.cpp:172
>  +          m_cachedMetadataMap.set(CachedMetadataMap::create().release());
> I don't think this release() is correct.

Moot.

> 
> WebCore/loader/CachedResource.cpp:407
>  +      ASSERT(m_cachedMetadataMap->size() == 1);
> Hum...  This mechanism seems like drastic overkill if there's only ever one
> thing in the map.  Maybe make the simpler thing first and we can elaborate it
> as needed later.

Done.

> 
> WebCore/loader/CachedResource.h:166
>  +              return m_cachedMetadataMap->get<T>();
> I'm not really sure what this templatization is buying you.

See previous comment. I thought it made the API more natural. It is gone now.

> 
> WebCore/platform/network/ResourceHandle.cpp:148
>  +  void ResourceHandle::cacheMetadata(const ResourceResponse&,
> PassRefPtr<SharedBuffer>)
> This PassRefPtr is confusing here.  Are we taking ownership of the
> SharedBuffer?  If so, is it ok to do nothing here?

I didn't understand PassRefPtr properly before reading that document. It is now a raw pointer.

> 
> WebCore/platform/network/ResourceHandle.h:115
>  +      static void cacheMetadata(const ResourceResponse&,
> PassRefPtr<SharedBuffer>);
> I suspect this should be a SharedBuffer*, but I'm not sure.

Yep, done.

> 
> WebKit/chromium/tests/CachedMetadataMapTest.cpp:13
>  +   * distribution.
> This is the more usual license for Google contributions.
> 

I've used this license throughout.

> WebKit/chromium/tests/CachedMetadataMapTest.cpp:32
>  +  // original KURL's.
> This comment seems wrong.

Yep, unintentionally branched from another test. Removed.
------- Comment #16 From 2010-05-04 17:35:52 PST -------
Created an attachment (id=55079) [details]
Get rid of map and fix smart pointers
------- Comment #17 From 2010-05-05 09:59:04 PST -------
> Would it be controversial if I mailed a patch to add a comment pointing to the
> documentation in RefPtr.h?

Please do.  Not enough people read that doc.  :)

> > WebCore/loader/CachedMetadataMap.cpp:91
> >  +  void CachedMetadataMap::appendValueToBuffer(const T* value,
> > PassRefPtr<SharedBuffer> buffer) const
> > This should be a raw SharedBuffer*.  We use PassRefPtr to signal that we're
> > passing ownership of the refcounted object.
> 
> I'm now passing a const Vector<char>&. Is that ideal?

That's probably good.  I'll look.

> > WebCore/loader/CachedMetadataMap.cpp:93
> >  +      buffer->append(reinterpret_cast<const char*>(value), sizeof(T));
> > Yuck.  This isn't strictly ANSI compliant.
> 
> Are you referring to ANSI C (i.e. char could be another size) or ANSI unicode
> (i.e. doesn't handle UTF-8)?
> 
> If C: I could switch to uint8_t/uint32_t instead of char/unsigned. But I'd
> still have to cast at the API level to make this useful.
> 
> If Unicode: Keep in mind that I'm expecting this to be a serialization of bytes
> of metadata. It is not a string.
> 
> If something else, please explain.

I meant I don't think it's a strictly legal reinterpret_cast according to ANSI C.  Just my way of complaining about low-level memory poking.

Looking at the new patch now.
------- Comment #18 From 2010-05-05 10:11:09 PST -------
(From update of attachment 55079 [details])
This is getting close.  I'd like to see this code used by something so it's not dead code though.  :)

WebCore/loader/CachedMetadataStore.h:108
 +      static const size_t unsignedSize = sizeof(unsigned);
This constant makes things harder to read.  Just use sizeof(unsigned).  That's perfectly clear.

WebCore/loader/CachedMetadataStore.h:107
 +      static const size_t dataTypeIDStart = 0;
This constant also doesn't really help.

WebCore/loader/CachedMetadataStore.h:54
 +      static PassOwnPtr<CachedMetadataStore> deserialize(const char* data, size_t size)
Why two parameters instead of a const Vector<char>&  ?

WebCore/loader/CachedMetadata.h:39
 +  class CachedMetadata : public RefCounted<CachedMetadata> {
I don't understand the role this class plays.  How is it different than Vector<char> ?  Maybe because it doesn't take ownership of the bytes?

WebCore/loader/CachedResource.h:148
 +      // Sets the serialized metadata retrieved from the platform's cache.
If this is a setter, maybe it should have "set" in its name?

WebCore/loader/CachedResource.h:232
 +      OwnPtr<CachedMetadataStore> m_cachedMetadataStore; // Lazy
Comment not needed.

WebCore/loader/CachedResource.h:153
 +      void setCachedMetadata(unsigned dataTypeID, CachedMetadata* cachedMetadata);
cachedMetadata parameter name not needed.

WebCore/loader/SubresourceLoader.cpp:181
 +      // So don't deliver any data to the loader yet.
I don't understand this comment.
------- Comment #19 From 2010-05-05 11:04:47 PST -------
I have a couple of questions inline below. I'll hold off on uploading the next patch until those are answered.

By the way, do you think that reinterpret_cast is okay or should I ask around/research it a little more?

(In reply to comment #18)
> (From update of attachment 55079 [details] [details])
> This is getting close.  I'd like to see this code used by something so it's not
> dead code though.  :)

This patch is hairy enough that I don't think I should add to it. I'll upload a separate patch for bindings/v8 now. Should it use this bug or a new bug?

> 
> WebCore/loader/CachedMetadataStore.h:108
>  +      static const size_t unsignedSize = sizeof(unsigned);
> This constant makes things harder to read.  Just use sizeof(unsigned).  That's
> perfectly clear.

Done.

> 
> WebCore/loader/CachedMetadataStore.h:107
>  +      static const size_t dataTypeIDStart = 0;
> This constant also doesn't really help.

I disagree. I think it is crucial for readability to have the offsets of the format in one place. If I just inline it, then I'm left with code that says readUnsigned(0) instead of readUnsigned(dataTypeIDStart). If you still feel strongly then I'll defer to you and inline it.

> 
> WebCore/loader/CachedMetadataStore.h:54
>  +      static PassOwnPtr<CachedMetadataStore> deserialize(const char* data,
> size_t size)
> Why two parameters instead of a const Vector<char>&  ?
> 

This is a side-effect of the main API. It looks like:


Load:     ResourceLoader::didReceiveCachedMetadata(const char*, int)
Persist:  ResourceHandle::cacheMetadata(const ResourceResponse&, const Vector<char>&)

They should probably both use Vectors or both use char*/length. My thought is to remove the Vector in favor of char*/length. But I wanted to confirm with you before making the change.

> WebCore/loader/CachedMetadata.h:39
>  +  class CachedMetadata : public RefCounted<CachedMetadata> {
> I don't understand the role this class plays.  How is it different than
> Vector<char> ?  Maybe because it doesn't take ownership of the bytes?
> 

It is basically a class to point to data without owning it (to avoid a copy). This allows CachedResource::cachedMetadata(unsigned) to return one object rather than the much uglier CachedResource::cachedMetadata(unsigned, const char**, int*). I couldn't find a suitable generic container in WebKit. Do you have a suggestion for a better way to return a pointer/length combination without a copy?

> WebCore/loader/CachedResource.h:148
>  +      // Sets the serialized metadata retrieved from the platform's cache.
> If this is a setter, maybe it should have "set" in its name?

Done. setSerializedCachedMetadata().

> 
> WebCore/loader/CachedResource.h:232
>  +      OwnPtr<CachedMetadataStore> m_cachedMetadataStore; // Lazy
> Comment not needed.

Done.

> 
> WebCore/loader/CachedResource.h:153
>  +      void setCachedMetadata(unsigned dataTypeID, CachedMetadata*
> cachedMetadata);
> cachedMetadata parameter name not needed.

Done.

> 
> WebCore/loader/SubresourceLoader.cpp:181
>  +      // So don't deliver any data to the loader yet.
> I don't understand this comment.

Oops, that was branched from didReceiveData, but it doesn't make sense here since its a one-shot. Removed.
------- Comment #20 From 2010-05-05 11:34:33 PST -------
(In reply to comment #19)
> I have a couple of questions inline below. I'll hold off on uploading the next
> patch until those are answered.
> 
> By the way, do you think that reinterpret_cast is okay or should I ask
> around/research it a little more?

I'd rather we didn't have to reinterpret_cast, but I'm not sure where's a way around it.

> > This is getting close.  I'd like to see this code used by something so it's not
> > dead code though.  :)
> 
> This patch is hairy enough that I don't think I should add to it. I'll upload a
> separate patch for bindings/v8 now. Should it use this bug or a new bug?

Good idea.  Please use a separate bug.  We can cross-link them if we need to using the "blocks" field.

> > WebCore/loader/CachedMetadataStore.h:107
> >  +      static const size_t dataTypeIDStart = 0;
> > This constant also doesn't really help.
> 
> I disagree. I think it is crucial for readability to have the offsets of the
> format in one place. If I just inline it, then I'm left with code that says
> readUnsigned(0) instead of readUnsigned(dataTypeIDStart). If you still feel
> strongly then I'll defer to you and inline it.

Ok.

> > WebCore/loader/CachedMetadataStore.h:54
> >  +      static PassOwnPtr<CachedMetadataStore> deserialize(const char* data,
> > size_t size)
> > Why two parameters instead of a const Vector<char>&  ?
> 
> This is a side-effect of the main API. It looks like:
> 
> Load:     ResourceLoader::didReceiveCachedMetadata(const char*, int)
> Persist:  ResourceHandle::cacheMetadata(const ResourceResponse&, const
> Vector<char>&)
> 
> They should probably both use Vectors or both use char*/length. My thought is
> to remove the Vector in favor of char*/length. But I wanted to confirm with you
> before making the change.

Using Vector is generally better because it's clear who owns the memory.  The main reason to use char* + length is if the malloc in Vector is too expensive or there's a critical memcpy that can't be removed.

> > WebCore/loader/CachedMetadata.h:39
> >  +  class CachedMetadata : public RefCounted<CachedMetadata> {
> > I don't understand the role this class plays.  How is it different than
> > Vector<char> ?  Maybe because it doesn't take ownership of the bytes?
> 
> It is basically a class to point to data without owning it (to avoid a copy).
> This allows CachedResource::cachedMetadata(unsigned) to return one object
> rather than the much uglier CachedResource::cachedMetadata(unsigned, const
> char**, int*). I couldn't find a suitable generic container in WebKit. Do you
> have a suggestion for a better way to return a pointer/length combination
> without a copy?

I guess my main concern is that the ownship of the memory is unclear.  How long can the caller hold on to this object before it explodes?  I think this is the problem that SharedBuffer is trying to solve.
------- Comment #21 From 2010-05-06 10:23:26 PST -------
Created an attachment (id=55255) [details]
Make CachedMetadata own data and reference count it
------- Comment #22 From 2010-05-06 11:16:54 PST -------
(From update of attachment 55255 [details])
Looks good.  Thanks for iterating on this patch.  Please address the minor nits below before committing.  If you'd like the commit bot to land your patch, you can upload a new patch with these comments addressed.

WebCore/loader/CachedMetadata.h:66
 +          return new CachedMetadata(dataTypeID, data, size);
adoptRef?

WebCore/loader/CachedMetadata.h:71
 +          return new CachedMetadata(data, size);
adoptRef?

WebCore/loader/CachedMetadata.h:62
 +      friend class CachedResource;
You can just make these methods public.  WebKit isn't super tight about hiding interfaces.

WebCore/loader/ResourceLoader.h:101
 +          virtual void didReceiveCachedMetadata(ResourceHandle*, const char* data, int length) { didReceiveCachedMetadata(data, length); }
These methods don't seem to fit the pattern of the other methods.
------- Comment #23 From 2010-05-06 11:41:54 PST -------
(In reply to comment #22)
> (From update of attachment 55255 [details] [details])
> Looks good.  Thanks for iterating on this patch.  Please address the minor nits
> below before committing.  If you'd like the commit bot to land your patch, you
> can upload a new patch with these comments addressed.
> 
> WebCore/loader/CachedMetadata.h:66
>  +          return new CachedMetadata(dataTypeID, data, size);
> adoptRef?

Done.

> 
> WebCore/loader/CachedMetadata.h:71
>  +          return new CachedMetadata(data, size);
> adoptRef?

Done.

> 
> WebCore/loader/CachedMetadata.h:62
>  +      friend class CachedResource;
> You can just make these methods public.  WebKit isn't super tight about hiding
> interfaces.

Done.

> 
> WebCore/loader/ResourceLoader.h:101
>  +          virtual void didReceiveCachedMetadata(ResourceHandle*, const char*
> data, int length) { didReceiveCachedMetadata(data, length); }
> These methods don't seem to fit the pattern of the other methods.

The diff didn't show enough context. There are other methods a couple of lines below which have a one-line forward inlined like this. Do you want me to move it to the cpp file or is it fine to leave like the others?
------- Comment #24 From 2010-05-06 11:43:09 PST -------
Created an attachment (id=55279) [details]
Add adoptRefs, remove protected section
------- Comment #25 From 2010-05-08 05:58:47 PST -------
(From update of attachment 55279 [details])
Clearing flags on attachment: 55279

Committed r59023: <http://trac.webkit.org/changeset/59023>
------- Comment #26 From 2010-05-08 05:58:55 PST -------
All reviewed patches have been landed.  Closing bug.
------- Comment #27 From 2010-05-08 06:34:12 PST -------
http://trac.webkit.org/changeset/59023 might have broken Chromium Win Release
------- Comment #28 From 2010-05-08 09:58:50 PST -------
Rolled out, due to chromium link failures.  webkit-patch rollout hit a bug and failed to re-open this one, sorry.  Fixing the webkit-patch bug now.
------- Comment #29 From 2010-05-08 11:10:17 PST -------
https://bugs.webkit.org/show_bug.cgi?id=38803 was the webkit-patch rollout bug.  Fix posted.
------- Comment #30 From 2010-05-08 23:08:14 PST -------
(From update of attachment 55255 [details])
Cleared Adam Barth's review+ from obsolete attachment 55255 [details] so that this bug does not appear in http://webkit.org/pending-commit.
------- Comment #31 From 2010-05-10 10:50:45 PST -------
Created an attachment (id=55568) [details]
Patch
------- Comment #32 From 2010-05-10 10:53:29 PST -------
(From update of attachment 55568 [details])
ok
------- Comment #33 From 2010-05-10 11:14:21 PST -------
(From update of attachment 55568 [details])
Clearing flags on attachment: 55568

Committed r59087: <http://trac.webkit.org/changeset/59087>
------- Comment #34 From 2010-05-10 11:14:30 PST -------
All reviewed patches have been landed.  Closing bug.