Bug 65063 - REGRESSION(r91628): 3 canvas tests crash on Chromium Linux and one test fail on Chromium Mac
: REGRESSION(r91628): 3 canvas tests crash on Chromium Linux and one test fail ...
Status: RESOLVED FIXED
: WebKit
Platform
: 528+ (Nightly build)
: Unspecified Unspecified
: P1 Normal
Assigned To:
:
:
:
:
  Show dependency treegraph
 
Reported: 2011-07-22 21:01 PST by
Modified: 2011-08-19 14:53 PST (History)


Attachments
Patch (6.31 KB, patch)
2011-08-03 12:41 PST, Tom Hudson
no flags Review Patch | Details | Formatted Diff | Diff
Patch (6.31 KB, patch)
2011-08-03 12:48 PST, Tom Hudson
no flags Review Patch | Details | Formatted Diff | Diff
Patch (5.84 KB, patch)
2011-08-18 21:26 PST, James Robinson
no flags Review Patch | Details | Formatted Diff | Diff
Patch (5.84 KB, patch)
2011-08-18 21:42 PST, James Robinson
no flags Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2011-07-22 21:01:57 PST
Following tests started crashing on Chromium Linux after r91628:
fast/canvas/webgl/drawingbuffer-test.html
fast/canvas/canvas-bg-multiple-removal.html
fast/canvas/canvas-as-image-incremental-repaint.html

http://build.webkit.org/results/Chromium%20Linux%20Release%20(Tests)/r91629%20(21411)/results.html

In addition, css2.1/t0905-c414-flt-01-d-g.html is failing with image differences on Chromium Mac.
------- Comment #1 From 2011-07-22 21:11:06 PST -------
This bug is definitely a P1 since this is a crash regression.
------- Comment #2 From 2011-07-23 21:36:59 PST -------
Hm, I saw this crash in testing, and thought I fixed it. I won't be able to fix until next week though.
------- Comment #3 From 2011-07-25 10:41:55 PST -------
*** Bug 65121 has been marked as a duplicate of this bug. ***
------- Comment #4 From 2011-07-25 10:42:41 PST -------
(In reply to comment #3)
> *** Bug 65121 has been marked as a duplicate of this bug. ***

http://test-results.appspot.com/dashboards/flakiness_dashboard.html#showExpectations=true&group=%40ToT%20GPU%20Mesa%20-%20chromium.org&tests=fast%2Fcanvas%2Fwebgl%2Fgl-enable-enum-test.html

Also add fast/canvas/webgl/gl-enable-enum-test.html to that list.
------- Comment #5 From 2011-07-25 12:06:44 PST -------
Do we have a crash log?
------- Comment #6 From 2011-07-31 14:07:24 PST -------
(In reply to comment #5)
> Do we have a crash log?

I don't; I don't have access to Chromium Linux build.
------- Comment #7 From 2011-08-01 10:17:30 PST -------
Here's a stack from my chromium linux build.  This doesn't look related to Simon's checkin:


[3660:3660:5964953942181:ERROR:process_util_posix.cc(134)] Received signal 11
        base::debug::StackTrace::StackTrace() [0x8a5c0a]
        base::(anonymous namespace)::StackDumpSignalHandler() [0x86df1d]
        0x7f65eeb7bc20
        SkRefCnt::getRefCnt() [0x8ea3fc]
        SkBitmap::validate() [0x8ea044]
        SkBitmap::~SkBitmap() [0x8e5986]
        WebCore::NativeImageSkia::~NativeImageSkia() [0x104ae16]
        WebCore::BitmapImageSingleFrameSkia::~BitmapImageSingleFrameSkia() [0x104a95e]
        WTF::RefCounted<>::deref() [0x4ed5d8]
        WTF::derefIfNotNull<>() [0x4ecbb9]
        WTF::RefPtr<>::~RefPtr() [0x4ec355]
        WebCore::HTMLCanvasElement::~HTMLCanvasElement() [0xe847dd]
        WebCore::TreeShared<>::removedLastRef() [0xd908d8]
        WebCore::TreeShared<>::deref() [0x47116d]
        WTF::derefIfNotNull<>() [0xdcf977]
        WTF::RefPtr<>::~RefPtr() [0xdc7abb]
        std::pair<>::~pair() [0xdca072]
        WTF::HashTable<>::deallocateTable() [0xdca0c9]
        WTF::HashTable<>::clear() [0xdca532]
        WTF::HashMap<>::clear() [0xdc415e]
        WebCore::Document::removedLastRef() [0xdac9a9]
        WebCore::TreeShared<>::deref() [0x47116d]
        WebCore::DOMDataStore::weakNodeCallback() [0x1565726]
        v8::internal::GlobalHandles::Node::PostGarbageCollectionProcessing() [0xa41dca]
        v8::internal::GlobalHandles::PostGarbageCollectionProcessing() [0xa40735]
        v8::internal::Heap::PerformGarbageCollection() [0xa5298c]
        v8::internal::Heap::CollectGarbage() [0xa51f45]
        v8::internal::Heap::CollectGarbage() [0xa0bbe1]
        v8::internal::Heap::CollectAllGarbage() [0xa51d7f]
        v8::internal::Heap::IdleNotification() [0xa5bdb7]
        v8::internal::V8::IdleNotification() [0xc223a3]
        v8::V8::IdleNotification() [0x9bf0f4]
        WebCore::V8GCForContextDispose::pseudoIdleTimerFired() [0x110cac7]
        WebCore::Timer<>::fired() [0x110cc0c]
        WebCore::ThreadTimers::sharedTimerFiredInternal() [0xf9ed9c]
        WebCore::ThreadTimers::sharedTimerFired() [0xf9ecd3]
        webkit_glue::WebKitClientImpl::DoTimeout() [0x195ab7e]
        DispatchToMethod<>() [0x195b283]
        base::BaseTimer<>::TimerTask::Run() [0x195b1d8]
        base::subtle::TaskClosureAdapter::Run() [0x88720f]
        base::internal::Invoker1<>::DoInvoke() [0x8515be]
        base::Callback<>::Run() [0x8503d7]
        MessageLoop::RunTask() [0x84e055]
        MessageLoop::DeferOrRunPendingTask() [0x84e15d]
        MessageLoop::DoWork() [0x84e973]
        base::MessagePumpGlib::RunWithDispatcher() [0x89c075]
        base::MessagePumpGlib::Run() [0x89c456]
        MessageLoop::RunInternal() [0x84de49]
        MessageLoop::RunHandler() [0x84dcfc]
        MessageLoop::Run() [0x84d717]
        webkit_support::RunMessageLoop() [0x5bc37b]
        TestShell::waitTestFinished() [0x45d9b4]
        TestShell::runFileTest() [0x4566a4]
        runTest() [0x42c56c]
        main [0x42d040]
        0x7f65eeb66d8e
        0x41b5d9
------- Comment #8 From 2011-08-01 10:18:16 PST -------
To repro, I ran the following:
./out/Debug/DumpRenderTree --pixel-tests /src/WebKit/LayoutTests/fast/canvas/webgl/*.html
------- Comment #9 From 2011-08-01 10:36:46 PST -------
This looks like a Chromium or Skia-specific issue.
------- Comment #10 From 2011-08-01 14:47:57 PST -------
With Chromium Linux 15.0.841.0 (WebKit r92135), drawingbuffer-test, canvas-bg-multiple-removal, and gl-enable-enum-test all run fine.

canvas-as-image-incremental-repaint.html crashes.

HTMLCanvasElement::didDraw() is attempting to call m_copiedImage.clear(). When it arrives at SkBitmap::~SkBitmap, the bitmap appears to be corrupt, with fColorTable = 0x1.

m_copiedImage is set validly in a call from CSSCanvasValue::image(). At this point, the Bitmap looks valid: 300x300, 1200 row bytes, NULL fColorTable, NULL fPixels, Config 6, BytesPerPixel 4.

BitmapImageSingleFrameSkia appears to have *two* SkBitmaps, named Native and Resized. When we enter HTMLCanvasElement::didDraw(), the m_nativeImage is unchanged, but the m_resizedImage - formerly all NULL - now has a fColorTable = 0x1.
------- Comment #11 From 2011-08-03 08:52:49 PST -------
This appears to be a regression of http://code.google.com/p/chromium/issues/detail?id=79739 / https://bugs.webkit.org/show_bug.cgi?id=58821. However, the changes to GraphicsContext3DSkia::getImageData() made then seem to be intact, suggesting it's a similar bad cast in some other place.

At least three callers are using Image::isBitmapImage() as RTTI and casting an Image to a BitmapImage so they can call the functions on it. Although BitmapImageSingleFrameSkia is implemented as a bitmap, it doesn't have all the other baggage that has accreted around BitmapImage (multiple animation frames).
------- Comment #12 From 2011-08-03 12:41:03 PST -------
Created an attachment (id=102814) [details]
Patch
------- Comment #13 From 2011-08-03 12:45:10 PST -------
(From update of attachment 102814 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=102814&action=review

> Source/WebCore/ChangeLog:6
> +        REGRESSION(r91628): 3 canvas tests crash on Chromium Linux and one test fail on Chromium Mac
> +        Remove casts of BitmapImageSingleFrameSkia to WebCore::BitmapImage;
> +        add virtual functions to WebCore::Image to make casts unnecessary.
> +        https://bugs.webkit.org/show_bug.cgi?id=65063

Wrong format. The first two lines should be the bug title and the bug url.

> Source/WebCore/ChangeLog:10
> +        No new tests; was caught by preexisting tests.

You need to have a long description here below "Reviewed by" line and a blank line.
------- Comment #14 From 2011-08-03 12:46:32 PST -------
(From update of attachment 102814 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=102814&action=review

> Source/WebCore/platform/graphics/cg/GraphicsContext3DCG.cpp:103
> +    bool hasAlpha = image->hasTransparency();

This now seems to have slightly different semantics:  it checks if currentFrameHasAlpha(), rather than frame 0. Is that intentional?  (Maybe this is actually more correct?)
------- Comment #15 From 2011-08-03 12:48:42 PST -------
Created an attachment (id=102817) [details]
Patch
------- Comment #16 From 2011-08-03 12:55:53 PST -------
(From update of attachment 102814 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=102814&action=review

> Source/WebCore/platform/graphics/BitmapImage.h:163
> +    virtual bool hasTransparency() { return currentFrameHasAlpha(); }

It’s rarely helpful to have an inline definition of a function that is also virtual. If this is going to be virtual, I suggest putting the implementation into BitmapImage.cpp rather than putting it here in the header.

> Source/WebCore/platform/graphics/BitmapImage.h:-169
> -#if !ASSERT_DISABLED
> -    bool notSolidColor()
> +    virtual bool notSolidColor()
>      {
>          return size().width() != 1 || size().height() != 1 || frameCount() > 1;
>      }
> -#endif

It’s rarely helpful to have an inline definition of a function that is also virtual. If this is going to be virtual, I suggest putting the implementation into BitmapImage.cpp rather than putting it here in the header.

Why is this now outside ASSERT_DISABLED #if? I ask because the imprecision of this before was partly justified by the fact that it’s for assertions only.

There is no rationale for this change in the change log. In fact, the change log says nothing.

> Source/WebCore/platform/graphics/Image.h:92
> +    // This is not RTTI and may not indicate that the object can
> +    // be safely cast to WebCore::BitmapImage on all ports.
>      virtual bool isBitmapImage() const { return false; }

This comment indicates that we need to rename the function; not add a comment. It’s highly likely someone will use this function wrong and the comment won’t necessarily help.

Many people are not familiar with the RTTI acronym and so it would be better not to use it.

> Source/WebCore/platform/graphics/Image.h:163
> +    virtual bool hasTransparency () { return true; }
> +    virtual bool notSolidColor () { return true; }

I don’t like either of these function names. The reason is that both don’t do what they say. There are many images that have no transparency at all that will still return true for hasTransparency. There are many images that are a solid color that will return true for notSolidColor. Ideally the functions would have precise names rather than imprecise ones. What does it mean if these functions return false? We should name them so their behavior actually matches what they are named, even if that means a longer name.

These are formatted wrong with spaces before braces.

Also, it’s rarely helpful to have an inline definition of a function that is also virtual. If this is going to be virtual, I suggest putting the implementation into Image.cpp rather than putting it here in the header.
------- Comment #17 From 2011-08-03 12:56:35 PST -------
(From update of attachment 102817 [details])
My comments all apply to this new patch as well as the old one.
------- Comment #18 From 2011-08-03 13:02:25 PST -------
(In reply to comment #14)
> (From update of attachment 102814 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=102814&action=review
> 
> > Source/WebCore/platform/graphics/cg/GraphicsContext3DCG.cpp:103
> > +    bool hasAlpha = image->hasTransparency();
> 
> This now seems to have slightly different semantics:  it checks if currentFrameHasAlpha(), rather than frame 0. Is that intentional?  (Maybe this is actually more correct?)

(In reply to comment #14)
> (From update of attachment 102814 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=102814&action=review
> 
> > Source/WebCore/platform/graphics/cg/GraphicsContext3DCG.cpp:103
> > +    bool hasAlpha = image->hasTransparency();
> 
> This now seems to have slightly different semantics:  it checks if currentFrameHasAlpha(), rather than frame 0. Is that intentional?  (Maybe this is actually more correct?)

Oh, good point. I think we need somebody familiar with the CG port here - it looks like the subsequent code SOMETIMES operates on frame 0, and SOMETIMES on the current frame?

This may mean that the attempt to define "isBitmapImage()" as "semantically bitmap-like" rather than "fake WebCore::BitmapImage RTTI" isn't going to work, and we'll have to make one of the other fixes discussed in email.

I will be out of the office for 2 of the next 3 weeks; if somebody wants to fix this bug before then, other routes forward we've considered are:

1. Return false from BitmapImageSingleFrameSkia::isBitmapImage(); add a new function (hasRaster()? hasBitmapSemantics()?) and use that in RenderLayerBacking::isDirectlyCompositedImage(),
ImageQualityController::shouldPaintAtLowQuality(), and possibly CachedImage::destroyDecodedData().

2. Have BitmapImageSingleFrameSkia inherit from WebCore::BitmapImage, and stub out all of the functionality there we don't want.

3. Separate out AnimatedBitmapImage from BitmapImage; make BitmapImageSingleFrameSkia inherit from BitmapImage. This feels to me like it leaves the source code in the cleanest state, but has a heavy impact on all the other ports.
------- Comment #19 From 2011-08-03 13:28:56 PST -------
Darin,

Thanks for the review. 

(In reply to comment #16)
> Why is this now outside ASSERT_DISABLED #if? I ask because the imprecision of this before was partly justified by the fact that it’s for assertions only.

It looks like this is debugging code left over from testing m_isSolidColor. Can it be removed safely and the unit tests relied on? If not, and we decide to keep this function, I can push the #if !ASSERT_DISABLED to other classes that implement it.

> > Source/WebCore/platform/graphics/Image.h:92
> > +    // This is not RTTI and may not indicate that the object can
> > +    // be safely cast to WebCore::BitmapImage on all ports.
> >      virtual bool isBitmapImage() const { return false; }
> 
> This comment indicates that we need to rename the function; not add a comment. It’s highly likely someone will use this function wrong and the comment won’t necessarily help.

The function appears to be being used in two different ways:
 1 - can I cast to a BitmapImage
 2 - does this thing have the semantics of a bitmap

This patch was attempting to remove any uses of case #1 (a previous patch for bug 55821 only addressed a single caller). I'd be happy to add a second function that directly addresses case #2, but it seems to me that #1 is common in WebKit and is difficult to stop people from doing, thus the other approaches we considered that I mentioned in comment #18 above.

> > Source/WebCore/platform/graphics/Image.h:163
> > +    virtual bool hasTransparency () { return true; }
> > +    virtual bool notSolidColor () { return true; }
> 
> I don’t like either of these function names. The reason is that both don’t do what they say. There are many images that have no transparency at all that will still return true for hasTransparency. There are many images that are a solid color that will return true for notSolidColor. Ideally the functions would have precise names rather than imprecise ones. What does it mean if these functions return false? We should name them so their behavior actually matches what they are named, even if that means a longer name.

notSolidColor() was preexisting on BitmapImage; I agree that the name is not descriptive. isSinglePixelConstantColor() (with the sense reversed)?

Do you have alternate name suggestions for hasTransparency()? mayBeTransparent() / isNotCertainlyOpaque(), perhaps?

> Also, it’s rarely helpful to have an inline definition of a function that is also virtual. If this is going to be virtual, I suggest putting the implementation into Image.cpp rather than putting it here in the header.

Inline virtuals are absolutely endemic to these class headers. Should I go through and move the rest into their cpp files? As a new WebKit contributor I was trying to follow established style regardless of whether it conflicted with best practices I'd learned elsewhere.
------- Comment #20 From 2011-08-18 21:04:16 PST -------
This still crashing, we're doing a bad cast on Image* to BitmapImage* and then calling the wrong member functions.  Totally not cool to leave a memory stomping crash in the tree for so long :(

I'll try to pick this up and fix it, seems like the patch stalled out mostly over naming and style concerns.
------- Comment #21 From 2011-08-18 21:11:27 PST -------
(In reply to comment #11)
> This appears to be a regression of http://code.google.com/p/chromium/issues/detail?id=79739 / https://bugs.webkit.org/show_bug.cgi?id=58821. However, the changes to GraphicsContext3DSkia::getImageData() made then seem to be intact, suggesting it's a similar bad cast in some other place.
> 
> At least three callers are using Image::isBitmapImage() as RTTI and casting an Image to a BitmapImage so they can call the functions on it. Although BitmapImageSingleFrameSkia is implemented as a bitmap, it doesn't have all the other baggage that has accreted around BitmapImage (multiple animation frames).

I think this is a really serious bug that we need to fix with BitmapImageSingleFrameSkia.  That's an incredibly fragile pattern and very different from the rest of how WebKit works.  That said, the more important thing right now is to fix the crashes, and then later fix the design flaws that lead to it being so easy to write crashers like this.
------- Comment #22 From 2011-08-18 21:26:52 PST -------
Created an attachment (id=104455) [details]
Patch
------- Comment #23 From 2011-08-18 21:28:29 PST -------
Updated the patch.  Darin Adler's feedback is valid, but right now we have a crash on ToT and crashes on Chrome builds in the wild.  I haven't changed any function names in this patch, so we can improve them later, but it's imperative that we fix the crash regression ASAP.
------- Comment #24 From 2011-08-18 21:42:49 PST -------
Created an attachment (id=104457) [details]
Patch
------- Comment #25 From 2011-08-19 14:50:19 PST -------
(From update of attachment 104457 [details])
Clearing flags on attachment: 104457

Committed r93441: <http://trac.webkit.org/changeset/93441>
------- Comment #26 From 2011-08-19 14:50:27 PST -------
All reviewed patches have been landed.  Closing bug.
------- Comment #27 From 2011-08-19 14:53:08 PST -------
Filed https://bugs.webkit.org/show_bug.cgi?id=66597 to discuss fixing the design issue here.  The Image/BitmapImage code could also use some general cleanup (like moving the definition of virtual functions to the .cpp, some of the function names are pretty poor) - I feel those should be handled with separate bugs.