Bug 27777

Summary: ImageSourceCG makes bad data refs (race condition causes blank images)
Product: WebKit Reporter: Avi Drissman <avi>
Component: PlatformAssignee: David Levin <levin>
Status: RESOLVED FIXED    
Severity: Normal CC: andersca, commit-queue, darin, dglazkov, hyatt, levin, mitz, rsesek
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: OS X 10.5   
Attachments:
Description Flags
Patch to fix
none
Patch to fix (ref/deref backing data)
darin: review+
Patch with Darin's cleanup applied
none
Patch with platform fix and "static"
darin: review+
Patch to fix crash
darin: review-
Revised patch.
darin: review-
With header file changes
none
Quick style patch none

Description Avi Drissman 2009-07-28 13:23:23 PDT
Created attachment 33665 [details]
Patch to fix

When you create a CGImageRef, it takes the CFDataRef that holds its backing data, and retains it for its entire lifetime.

The problem here is that ImageSourceCG uses CFDataCreateWithBytesNoCopy to create the CFDataRef containing the image data. When you use CFDataCreateWithBytesNoCopy, it's up to you to ensure that the backing store that you pass to it stays valid through the lifetime of the created CFDataRef.

Since the lifetime of the CFDataRef is the lifetime of the CGImageRef, ImageSourceCG makes a promise that it can't keep. The SharedBuffer is passed in as a parameter to setData--who knows if it will live longer than the created CGImageRef?
Comment 1 Dimitri Glazkov (Google) 2009-07-28 13:32:57 PDT
I am not the right guy to review this. Mitz, hyatt, darin, can you look at this?
Comment 2 Darin Adler 2009-07-28 13:33:22 PDT
Comment on attachment 33665 [details]
Patch to fix

This will create a substantial memory use regression because it requires copying the data for every image every time. I believe an alternate fix can be done that will tie the lifetime of the SharedBuffer to the lifetime of the CFData instead.

Please do not check in this fix without further discussion.
Comment 3 Avi Drissman 2009-07-28 13:35:22 PDT
(In reply to comment #2)
> I believe an alternate fix can be
> done that will tie the lifetime of the SharedBuffer to the lifetime of the
> CFData instead.

OK, let me work on that.
Comment 4 Darin Adler 2009-07-28 13:36:26 PDT
(In reply to comment #3)
> OK, let me work on that.

One thing we may need to consider to get that right is that the CFData could be deallocated on a non-main thread.
Comment 5 Avi Drissman 2009-07-28 14:45:40 PDT
Created attachment 33674 [details]
Patch to fix (ref/deref backing data)
Comment 6 Avi Drissman 2009-07-28 15:08:33 PDT
(In reply to comment #4)
> (In reply to comment #3)
> > OK, let me work on that.
> 
> One thing we may need to consider to get that right is that the CFData could be
> deallocated on a non-main thread.

Are images passed between threads?
Comment 7 Darin Adler 2009-07-28 15:12:21 PDT
Comment on attachment 33674 [details]
Patch to fix (ref/deref backing data)

> +void SharedBufferDerefCallback(void*, void* info)

Function names should start with a lowercase letter. Since this is private to this source file it should use "static" to get internal linkage.

> +    SharedBuffer* sharedBuffer = reinterpret_cast<SharedBuffer*>(info);

This should be static_cast, which can convert a void* to a specific pointer type.

> +    data->ref();
> +    CFAllocatorContext context = {0, data, 0, 0, 0, 0, 0, SharedBufferDerefCallback, 0};
> +    CFAllocatorRef derefAllocator = CFAllocatorCreate(kCFAllocatorDefault, &context);
> +    CFDataRef cfData = CFDataCreateWithBytesNoCopy(0, reinterpret_cast<const UInt8*>(data->data()), data->size(), derefAllocator);
> +    CFRelease(derefAllocator);

Creating a unique allocator each time is unfortunate, but I can't think of any better idea.

r=me but the style issues I mention above should be fixed
Comment 8 Darin Adler 2009-07-28 15:13:30 PDT
(In reply to comment #6)
> (In reply to comment #4)
> > (In reply to comment #3)
> > > OK, let me work on that.
> > 
> > One thing we may need to consider to get that right is that the CFData could be
> > deallocated on a non-main thread.
> 
> Are images passed between threads?

I believe CGImageRef is free to call release on its data from any thread it likes.
Comment 9 Avi Drissman 2009-07-28 15:30:56 PDT
> I believe CGImageRef is free to call release on its data from any thread it
> likes.

All I know is that when the CGImageRef is released, it releases its providers, which release their data, and so on down the chain. I've seen no documentation that speaks of thread safety in this regard. I can't say one way or another.

In the latest patch, trying to add "static" to the function causes tons of errors. Will an anonymous namespace do instead?
Comment 10 Avi Drissman 2009-07-28 15:31:42 PDT
Created attachment 33677 [details]
Patch with Darin's cleanup applied
Comment 11 Avi Drissman 2009-07-28 15:51:45 PDT
Created attachment 33679 [details]
Patch with platform fix and "static"

That should do it. Compiles with both PLATFORM(MAC) and not.
Comment 12 David Levin 2009-07-29 00:33:12 PDT
Assign to levin for landing.
Comment 13 David Levin 2009-07-29 00:36:03 PDT
Please add the bug title and bug link to the change log in the future.

If you do prepare-ChangeLog --bug #, it will be done for you.
Comment 14 David Levin 2009-07-29 01:08:40 PDT
Committed as http://trac.webkit.org/changeset/46527
Comment 15 Avi Drissman 2009-11-18 11:42:11 PST
This fix is wrong.

ImageSourceCG.cpp:119:
RetainPtr<CFDataRef> cfData(AdoptCF, CFDataCreateWithBytesNoCopy(0, reinterpret_cast<const UInt8*>(data->data()), data->size(), derefAllocator.get()));

The data pointer obtained from a SharedBuffer is valid only for the moment. If anyone adds data to the SharedBuffer, and that causes the internal Vector<char> to be reallocated, the data pointer will become stale, and WebCore will crash the next time it tries to access the data of the created CFData.

Writing a fix to create a CGDataProviderRef that wraps a SharedBuffer and that can properly access its bytes.

Patch to come soon.
Comment 16 Avi Drissman 2009-11-18 14:30:04 PST
Created attachment 43462 [details]
Patch to fix crash

This ensures that when CG tries to access the data backing the image, it always gets something valid.

BTW, technically-speaking, PDFDocumentImage::dataChanged has the same bug since it uses CFDataCreateWithBytesNoCopy. Mitigating it is the fact that it owns its data (it's a subclass of Image) and therefore holds the ref, and it only creates the CFDataRef once the transfer is complete, ensuring that the underlying buffer won't be growing further and invalidating the data pointer.
Comment 17 Darin Adler 2009-11-18 16:39:06 PST
Comment on attachment 43462 [details]
Patch to fix crash

Looks like a great fix!

Is there any way to test this? How did you discover this mistake?

> +        (WebCore::):

Please remove lines like this from the ChangeLog when prepare-ChangeLog puts them in.

> +    ssize_t amount = std::min(static_cast<off_t>(count), sharedBuffer->size() - position);

We normally use "using namespace std" at the top of the file and call functions as just "min" rather than "std::min".

I don't think ssize_t exists on Windows. I suggest just using off_t for the local variable.

You can use min<off_t> instead of casting count to off_t if you like.

> +    if (amount <= 0)
> +        return 0;

This seems unnecessary. The memcpy function already does the right thing when passed a zero. There is no chance of a negative number here, is there?

If we really do need to handle the case where position is greater than the buffer size, I think we should check that specifically by comparing size and position before substracting. Then we can use the type size_t within the function and avoid the signed arithmetic in the min expression.

> +// We use the GetBytesAtPosition callback rather than the GetBytePointer one because SharedBuffer
> +// does not provide a way to lock down the byte pointer and guarantee that it won't move, which
> +// is a requirement for using the GetBytePointer callback.
> +static const CGDataProviderDirectCallbacks kProviderCallbacks = {0, NULL, NULL, sharedBufferGetBytesAtPosition, sharedBufferRelease};

Can we just put this inside the function, instead of at namespace scope? There's no need for it to be a global. The old idiom used for the allocator seems cleaner to me.

We don't use "k" for constants in WebKit.

I know that the callbacks in the Core Foundation library itself use this style with the "k", but we don't.

We use "0" rather than NULL in WebKit.

We normally put spaces after the first "{" and before the last "}".

Would you be willing to do the same fix in PDFDocumentImage?

review- mainly because of the minor coding style issues
Comment 18 Avi Drissman 2009-11-18 16:57:22 PST
(In reply to comment #17)
> Looks like a great fix!

Thanks!

> Is there any way to test this? How did you discover this mistake?

The repro case seems to be a large image fractured. The first chunk arrives, an ImageSourceCG is created, someone (not sure how yet) adds more data, then the image is drawn.

It's clear that the code is wrong on a theoretical level, and I have crash reports from Chrome in the field that implicate this code (see http://crbug.com/12728 ). Whether or not this fix solves the problem, though, I can't say.

This was discovered after describing the problem to a colleague. I've stared at the code on and off for three months now; but telling it to her, we realized the problem in a few minutes :)

> If we really do need to handle the case where position is greater than the
> buffer size

Yes, we do. We've no good guarantee that the size of the buffer hasn't changed since we created the source. Now, it's unlikely that it's shrunk, but I'd rather be careful than pass a negative (or wrapped) size to memcpy.

> Then we can use the type size_t within the
> function and avoid the signed arithmetic in the min expression.

Good idea.

> Would you be willing to do the same fix in PDFDocumentImage?

Sure thing.

Revised patch to come.
Comment 19 Avi Drissman 2009-11-18 19:34:47 PST
Created attachment 43479 [details]
Revised patch.

With fixes. Is the sharing of the function OK to do that way?
Comment 20 Darin Adler 2009-11-19 09:44:21 PST
Comment on attachment 43479 [details]
Revised patch.

Looks great. Two style things that should be fixed before landing this.

> +    size_t source_size = sharedBuffer->size();

WebKit style uses sourceSize, not source_size, for a local variable like this.

> +    size_t amount = min<size_t>(count, source_size - position);

I presume you tried this without <size_t>. If it worked that way we would want to leave it out.

> +    const CGDataProviderDirectCallbacks providerCallbacks = { 0, 0, 0, sharedBufferGetBytesAtPosition, sharedBufferRelease };

The const on this line isn't helpful. In general, there are many local variables that could be marked const, but as a general rule we do not do it since the extra verbiage doesn't help clarify things.

We could decide to go the other way and start putting in const for local variables whenever possible, but I'd prefer not to do it on a case by case basis.

> +#if !PLATFORM(MAC)
> +// From ImageSourceCG
> +extern size_t sharedBufferGetBytesAtPosition(void* info, void* buffer, off_t position, size_t count);
> +#endif

This needs to go into a header file, ImageSourceCG.h. We don't put externs into other files like this. I believe that you can even add a header file without modifying any of the project files, although it's customary to reference it in the Mac and Windows projects.
Comment 21 Avi Drissman 2009-11-19 12:43:19 PST
(In reply to comment #20)
> > +    size_t amount = min<size_t>(count, source_size - position);
> 
> I presume you tried this without <size_t>. If it worked that way we would want
> to leave it out.

offset is an off_t which is signed, so we do need the specialization.

> The const on this line isn't helpful.

If that's how you feel, that's fine. I was imitating the array declarations in
imageSourceOptions().

> This needs to go into a header file, ImageSourceCG.h.

OK.

Revving patch now.
Comment 22 Avi Drissman 2009-11-19 14:48:54 PST
Created attachment 43525 [details]
With header file changes

That should do it.
Comment 23 Darin Adler 2009-11-19 15:00:15 PST
Comment on attachment 43525 [details]
With header file changes

>  #include "GraphicsContext.h"
>  #include "ImageObserver.h"
> +#if !PLATFORM(MAC)
> +#include "ImageSourceCG.h"
> +#endif
>  #include <wtf/MathExtras.h>

Conditional includes are supposed to be in their own separate paragraph, not in with the rest of the includes.
Comment 24 Avi Drissman 2009-11-19 15:08:02 PST
> Conditional includes are supposed to be in their own separate paragraph, not in
> with the rest of the includes.

Do you want a followup patch?
Comment 25 Darin Adler 2009-11-19 15:12:47 PST
(In reply to comment #24)
> > Conditional includes are supposed to be in their own separate paragraph, not in
> > with the rest of the includes.
> 
> Do you want a followup patch?

Sure, would be nice to clean that up. I didn't want to hold you up just for that though.
Comment 26 WebKit Commit Bot 2009-11-19 15:13:09 PST
Comment on attachment 43525 [details]
With header file changes

Clearing flags on attachment: 43525

Committed r51207: <http://trac.webkit.org/changeset/51207>
Comment 27 WebKit Commit Bot 2009-11-19 15:13:19 PST
All reviewed patches have been landed.  Closing bug.
Comment 28 Avi Drissman 2009-11-19 15:53:36 PST
Created attachment 43531 [details]
Quick style patch
Comment 29 Avi Drissman 2009-11-19 15:54:17 PST
(In reply to comment #28)
> Created an attachment (id=43531) [details]
> Quick style patch
Comment 30 WebKit Commit Bot 2009-11-19 16:47:17 PST
Comment on attachment 43531 [details]
Quick style patch

Clearing flags on attachment: 43531

Committed r51221: <http://trac.webkit.org/changeset/51221>
Comment 31 WebKit Commit Bot 2009-11-19 16:47:30 PST
All reviewed patches have been landed.  Closing bug.