Bug 65063 - REGRESSION(r91628): 3 canvas tests crash on Chromium Linux and one test fail on Chromium Mac
Summary: REGRESSION(r91628): 3 canvas tests crash on Chromium Linux and one test fail ...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Platform (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P1 Normal
Assignee: James Robinson
URL:
Keywords:
: 65121 (view as bug list)
Depends on:
Blocks:
 
Reported: 2011-07-22 21:01 PDT by Ryosuke Niwa
Modified: 2011-08-19 14:53 PDT (History)
9 users (show)

See Also:


Attachments
Patch (6.31 KB, patch)
2011-08-03 12:41 PDT, Tom Hudson
no flags Details | Formatted Diff | Diff
Patch (6.31 KB, patch)
2011-08-03 12:48 PDT, Tom Hudson
no flags Details | Formatted Diff | Diff
Patch (5.84 KB, patch)
2011-08-18 21:26 PDT, James Robinson
no flags Details | Formatted Diff | Diff
Patch (5.84 KB, patch)
2011-08-18 21:42 PDT, James Robinson
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ryosuke Niwa 2011-07-22 21:01:57 PDT
Following tests started crashing on Chromium Linux after r91628:
fast/canvas/webgl/drawingbuffer-test.html
fast/canvas/canvas-bg-multiple-removal.html
fast/canvas/canvas-as-image-incremental-repaint.html

http://build.webkit.org/results/Chromium%20Linux%20Release%20(Tests)/r91629%20(21411)/results.html

In addition, css2.1/t0905-c414-flt-01-d-g.html is failing with image differences on Chromium Mac.
Comment 1 Ryosuke Niwa 2011-07-22 21:11:06 PDT
This bug is definitely a P1 since this is a crash regression.
Comment 2 Simon Fraser (smfr) 2011-07-23 21:36:59 PDT
Hm, I saw this crash in testing, and thought I fixed it. I won't be able to fix until next week though.
Comment 3 Adrienne Walker 2011-07-25 10:41:55 PDT
*** Bug 65121 has been marked as a duplicate of this bug. ***
Comment 4 Adrienne Walker 2011-07-25 10:42:41 PDT
(In reply to comment #3)
> *** Bug 65121 has been marked as a duplicate of this bug. ***

http://test-results.appspot.com/dashboards/flakiness_dashboard.html#showExpectations=true&group=%40ToT%20GPU%20Mesa%20-%20chromium.org&tests=fast%2Fcanvas%2Fwebgl%2Fgl-enable-enum-test.html

Also add fast/canvas/webgl/gl-enable-enum-test.html to that list.
Comment 5 Simon Fraser (smfr) 2011-07-25 12:06:44 PDT
Do we have a crash log?
Comment 6 Ryosuke Niwa 2011-07-31 14:07:24 PDT
(In reply to comment #5)
> Do we have a crash log?

I don't; I don't have access to Chromium Linux build.
Comment 7 Tony Chang 2011-08-01 10:17:30 PDT
Here's a stack from my chromium linux build.  This doesn't look related to Simon's checkin:


[3660:3660:5964953942181:ERROR:process_util_posix.cc(134)] Received signal 11
        base::debug::StackTrace::StackTrace() [0x8a5c0a]
        base::(anonymous namespace)::StackDumpSignalHandler() [0x86df1d]
        0x7f65eeb7bc20
        SkRefCnt::getRefCnt() [0x8ea3fc]
        SkBitmap::validate() [0x8ea044]
        SkBitmap::~SkBitmap() [0x8e5986]
        WebCore::NativeImageSkia::~NativeImageSkia() [0x104ae16]
        WebCore::BitmapImageSingleFrameSkia::~BitmapImageSingleFrameSkia() [0x104a95e]
        WTF::RefCounted<>::deref() [0x4ed5d8]
        WTF::derefIfNotNull<>() [0x4ecbb9]
        WTF::RefPtr<>::~RefPtr() [0x4ec355]
        WebCore::HTMLCanvasElement::~HTMLCanvasElement() [0xe847dd]
        WebCore::TreeShared<>::removedLastRef() [0xd908d8]
        WebCore::TreeShared<>::deref() [0x47116d]
        WTF::derefIfNotNull<>() [0xdcf977]
        WTF::RefPtr<>::~RefPtr() [0xdc7abb]
        std::pair<>::~pair() [0xdca072]
        WTF::HashTable<>::deallocateTable() [0xdca0c9]
        WTF::HashTable<>::clear() [0xdca532]
        WTF::HashMap<>::clear() [0xdc415e]
        WebCore::Document::removedLastRef() [0xdac9a9]
        WebCore::TreeShared<>::deref() [0x47116d]
        WebCore::DOMDataStore::weakNodeCallback() [0x1565726]
        v8::internal::GlobalHandles::Node::PostGarbageCollectionProcessing() [0xa41dca]
        v8::internal::GlobalHandles::PostGarbageCollectionProcessing() [0xa40735]
        v8::internal::Heap::PerformGarbageCollection() [0xa5298c]
        v8::internal::Heap::CollectGarbage() [0xa51f45]
        v8::internal::Heap::CollectGarbage() [0xa0bbe1]
        v8::internal::Heap::CollectAllGarbage() [0xa51d7f]
        v8::internal::Heap::IdleNotification() [0xa5bdb7]
        v8::internal::V8::IdleNotification() [0xc223a3]
        v8::V8::IdleNotification() [0x9bf0f4]
        WebCore::V8GCForContextDispose::pseudoIdleTimerFired() [0x110cac7]
        WebCore::Timer<>::fired() [0x110cc0c]
        WebCore::ThreadTimers::sharedTimerFiredInternal() [0xf9ed9c]
        WebCore::ThreadTimers::sharedTimerFired() [0xf9ecd3]
        webkit_glue::WebKitClientImpl::DoTimeout() [0x195ab7e]
        DispatchToMethod<>() [0x195b283]
        base::BaseTimer<>::TimerTask::Run() [0x195b1d8]
        base::subtle::TaskClosureAdapter::Run() [0x88720f]
        base::internal::Invoker1<>::DoInvoke() [0x8515be]
        base::Callback<>::Run() [0x8503d7]
        MessageLoop::RunTask() [0x84e055]
        MessageLoop::DeferOrRunPendingTask() [0x84e15d]
        MessageLoop::DoWork() [0x84e973]
        base::MessagePumpGlib::RunWithDispatcher() [0x89c075]
        base::MessagePumpGlib::Run() [0x89c456]
        MessageLoop::RunInternal() [0x84de49]
        MessageLoop::RunHandler() [0x84dcfc]
        MessageLoop::Run() [0x84d717]
        webkit_support::RunMessageLoop() [0x5bc37b]
        TestShell::waitTestFinished() [0x45d9b4]
        TestShell::runFileTest() [0x4566a4]
        runTest() [0x42c56c]
        main [0x42d040]
        0x7f65eeb66d8e
        0x41b5d9
Comment 8 Tony Chang 2011-08-01 10:18:16 PDT
To repro, I ran the following:
./out/Debug/DumpRenderTree --pixel-tests /src/WebKit/LayoutTests/fast/canvas/webgl/*.html
Comment 9 Simon Fraser (smfr) 2011-08-01 10:36:46 PDT
This looks like a Chromium or Skia-specific issue.
Comment 10 Tom Hudson 2011-08-01 14:47:57 PDT
With Chromium Linux 15.0.841.0 (WebKit r92135), drawingbuffer-test, canvas-bg-multiple-removal, and gl-enable-enum-test all run fine.

canvas-as-image-incremental-repaint.html crashes.

HTMLCanvasElement::didDraw() is attempting to call m_copiedImage.clear(). When it arrives at SkBitmap::~SkBitmap, the bitmap appears to be corrupt, with fColorTable = 0x1.

m_copiedImage is set validly in a call from CSSCanvasValue::image(). At this point, the Bitmap looks valid: 300x300, 1200 row bytes, NULL fColorTable, NULL fPixels, Config 6, BytesPerPixel 4.

BitmapImageSingleFrameSkia appears to have *two* SkBitmaps, named Native and Resized. When we enter HTMLCanvasElement::didDraw(), the m_nativeImage is unchanged, but the m_resizedImage - formerly all NULL - now has a fColorTable = 0x1.
Comment 11 Tom Hudson 2011-08-03 08:52:49 PDT
This appears to be a regression of http://code.google.com/p/chromium/issues/detail?id=79739 / https://bugs.webkit.org/show_bug.cgi?id=58821. However, the changes to GraphicsContext3DSkia::getImageData() made then seem to be intact, suggesting it's a similar bad cast in some other place.

At least three callers are using Image::isBitmapImage() as RTTI and casting an Image to a BitmapImage so they can call the functions on it. Although BitmapImageSingleFrameSkia is implemented as a bitmap, it doesn't have all the other baggage that has accreted around BitmapImage (multiple animation frames).
Comment 12 Tom Hudson 2011-08-03 12:41:03 PDT
Created attachment 102814 [details]
Patch
Comment 13 Ryosuke Niwa 2011-08-03 12:45:10 PDT
Comment on attachment 102814 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=102814&action=review

> Source/WebCore/ChangeLog:6
> +        REGRESSION(r91628): 3 canvas tests crash on Chromium Linux and one test fail on Chromium Mac
> +        Remove casts of BitmapImageSingleFrameSkia to WebCore::BitmapImage;
> +        add virtual functions to WebCore::Image to make casts unnecessary.
> +        https://bugs.webkit.org/show_bug.cgi?id=65063

Wrong format. The first two lines should be the bug title and the bug url.

> Source/WebCore/ChangeLog:10
> +        No new tests; was caught by preexisting tests.

You need to have a long description here below "Reviewed by" line and a blank line.
Comment 14 Stephen White 2011-08-03 12:46:32 PDT
Comment on attachment 102814 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=102814&action=review

> Source/WebCore/platform/graphics/cg/GraphicsContext3DCG.cpp:103
> +    bool hasAlpha = image->hasTransparency();

This now seems to have slightly different semantics:  it checks if currentFrameHasAlpha(), rather than frame 0. Is that intentional?  (Maybe this is actually more correct?)
Comment 15 Tom Hudson 2011-08-03 12:48:42 PDT
Created attachment 102817 [details]
Patch
Comment 16 Darin Adler 2011-08-03 12:55:53 PDT
Comment on attachment 102814 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=102814&action=review

> Source/WebCore/platform/graphics/BitmapImage.h:163
> +    virtual bool hasTransparency() { return currentFrameHasAlpha(); }

It’s rarely helpful to have an inline definition of a function that is also virtual. If this is going to be virtual, I suggest putting the implementation into BitmapImage.cpp rather than putting it here in the header.

> Source/WebCore/platform/graphics/BitmapImage.h:-169
> -#if !ASSERT_DISABLED
> -    bool notSolidColor()
> +    virtual bool notSolidColor()
>      {
>          return size().width() != 1 || size().height() != 1 || frameCount() > 1;
>      }
> -#endif

It’s rarely helpful to have an inline definition of a function that is also virtual. If this is going to be virtual, I suggest putting the implementation into BitmapImage.cpp rather than putting it here in the header.

Why is this now outside ASSERT_DISABLED #if? I ask because the imprecision of this before was partly justified by the fact that it’s for assertions only.

There is no rationale for this change in the change log. In fact, the change log says nothing.

> Source/WebCore/platform/graphics/Image.h:92
> +    // This is not RTTI and may not indicate that the object can
> +    // be safely cast to WebCore::BitmapImage on all ports.
>      virtual bool isBitmapImage() const { return false; }

This comment indicates that we need to rename the function; not add a comment. It’s highly likely someone will use this function wrong and the comment won’t necessarily help.

Many people are not familiar with the RTTI acronym and so it would be better not to use it.

> Source/WebCore/platform/graphics/Image.h:163
> +    virtual bool hasTransparency () { return true; }
> +    virtual bool notSolidColor () { return true; }

I don’t like either of these function names. The reason is that both don’t do what they say. There are many images that have no transparency at all that will still return true for hasTransparency. There are many images that are a solid color that will return true for notSolidColor. Ideally the functions would have precise names rather than imprecise ones. What does it mean if these functions return false? We should name them so their behavior actually matches what they are named, even if that means a longer name.

These are formatted wrong with spaces before braces.

Also, it’s rarely helpful to have an inline definition of a function that is also virtual. If this is going to be virtual, I suggest putting the implementation into Image.cpp rather than putting it here in the header.
Comment 17 Darin Adler 2011-08-03 12:56:35 PDT
Comment on attachment 102817 [details]
Patch

My comments all apply to this new patch as well as the old one.
Comment 18 Tom Hudson 2011-08-03 13:02:25 PDT
(In reply to comment #14)
> (From update of attachment 102814 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=102814&action=review
> 
> > Source/WebCore/platform/graphics/cg/GraphicsContext3DCG.cpp:103
> > +    bool hasAlpha = image->hasTransparency();
> 
> This now seems to have slightly different semantics:  it checks if currentFrameHasAlpha(), rather than frame 0. Is that intentional?  (Maybe this is actually more correct?)

(In reply to comment #14)
> (From update of attachment 102814 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=102814&action=review
> 
> > Source/WebCore/platform/graphics/cg/GraphicsContext3DCG.cpp:103
> > +    bool hasAlpha = image->hasTransparency();
> 
> This now seems to have slightly different semantics:  it checks if currentFrameHasAlpha(), rather than frame 0. Is that intentional?  (Maybe this is actually more correct?)

Oh, good point. I think we need somebody familiar with the CG port here - it looks like the subsequent code SOMETIMES operates on frame 0, and SOMETIMES on the current frame?

This may mean that the attempt to define "isBitmapImage()" as "semantically bitmap-like" rather than "fake WebCore::BitmapImage RTTI" isn't going to work, and we'll have to make one of the other fixes discussed in email.

I will be out of the office for 2 of the next 3 weeks; if somebody wants to fix this bug before then, other routes forward we've considered are:

1. Return false from BitmapImageSingleFrameSkia::isBitmapImage(); add a new function (hasRaster()? hasBitmapSemantics()?) and use that in RenderLayerBacking::isDirectlyCompositedImage(),
ImageQualityController::shouldPaintAtLowQuality(), and possibly CachedImage::destroyDecodedData().

2. Have BitmapImageSingleFrameSkia inherit from WebCore::BitmapImage, and stub out all of the functionality there we don't want.

3. Separate out AnimatedBitmapImage from BitmapImage; make BitmapImageSingleFrameSkia inherit from BitmapImage. This feels to me like it leaves the source code in the cleanest state, but has a heavy impact on all the other ports.
Comment 19 Tom Hudson 2011-08-03 13:28:56 PDT
Darin,

Thanks for the review. 

(In reply to comment #16)
> Why is this now outside ASSERT_DISABLED #if? I ask because the imprecision of this before was partly justified by the fact that it’s for assertions only.

It looks like this is debugging code left over from testing m_isSolidColor. Can it be removed safely and the unit tests relied on? If not, and we decide to keep this function, I can push the #if !ASSERT_DISABLED to other classes that implement it.

> > Source/WebCore/platform/graphics/Image.h:92
> > +    // This is not RTTI and may not indicate that the object can
> > +    // be safely cast to WebCore::BitmapImage on all ports.
> >      virtual bool isBitmapImage() const { return false; }
> 
> This comment indicates that we need to rename the function; not add a comment. It’s highly likely someone will use this function wrong and the comment won’t necessarily help.

The function appears to be being used in two different ways:
 1 - can I cast to a BitmapImage
 2 - does this thing have the semantics of a bitmap

This patch was attempting to remove any uses of case #1 (a previous patch for bug 55821 only addressed a single caller). I'd be happy to add a second function that directly addresses case #2, but it seems to me that #1 is common in WebKit and is difficult to stop people from doing, thus the other approaches we considered that I mentioned in comment #18 above.

> > Source/WebCore/platform/graphics/Image.h:163
> > +    virtual bool hasTransparency () { return true; }
> > +    virtual bool notSolidColor () { return true; }
> 
> I don’t like either of these function names. The reason is that both don’t do what they say. There are many images that have no transparency at all that will still return true for hasTransparency. There are many images that are a solid color that will return true for notSolidColor. Ideally the functions would have precise names rather than imprecise ones. What does it mean if these functions return false? We should name them so their behavior actually matches what they are named, even if that means a longer name.

notSolidColor() was preexisting on BitmapImage; I agree that the name is not descriptive. isSinglePixelConstantColor() (with the sense reversed)?

Do you have alternate name suggestions for hasTransparency()? mayBeTransparent() / isNotCertainlyOpaque(), perhaps?

> Also, it’s rarely helpful to have an inline definition of a function that is also virtual. If this is going to be virtual, I suggest putting the implementation into Image.cpp rather than putting it here in the header.

Inline virtuals are absolutely endemic to these class headers. Should I go through and move the rest into their cpp files? As a new WebKit contributor I was trying to follow established style regardless of whether it conflicted with best practices I'd learned elsewhere.
Comment 20 James Robinson 2011-08-18 21:04:16 PDT
This still crashing, we're doing a bad cast on Image* to BitmapImage* and then calling the wrong member functions.  Totally not cool to leave a memory stomping crash in the tree for so long :(

I'll try to pick this up and fix it, seems like the patch stalled out mostly over naming and style concerns.
Comment 21 James Robinson 2011-08-18 21:11:27 PDT
(In reply to comment #11)
> This appears to be a regression of http://code.google.com/p/chromium/issues/detail?id=79739 / https://bugs.webkit.org/show_bug.cgi?id=58821. However, the changes to GraphicsContext3DSkia::getImageData() made then seem to be intact, suggesting it's a similar bad cast in some other place.
> 
> At least three callers are using Image::isBitmapImage() as RTTI and casting an Image to a BitmapImage so they can call the functions on it. Although BitmapImageSingleFrameSkia is implemented as a bitmap, it doesn't have all the other baggage that has accreted around BitmapImage (multiple animation frames).

I think this is a really serious bug that we need to fix with BitmapImageSingleFrameSkia.  That's an incredibly fragile pattern and very different from the rest of how WebKit works.  That said, the more important thing right now is to fix the crashes, and then later fix the design flaws that lead to it being so easy to write crashers like this.
Comment 22 James Robinson 2011-08-18 21:26:52 PDT
Created attachment 104455 [details]
Patch
Comment 23 James Robinson 2011-08-18 21:28:29 PDT
Updated the patch.  Darin Adler's feedback is valid, but right now we have a crash on ToT and crashes on Chrome builds in the wild.  I haven't changed any function names in this patch, so we can improve them later, but it's imperative that we fix the crash regression ASAP.
Comment 24 James Robinson 2011-08-18 21:42:49 PDT
Created attachment 104457 [details]
Patch
Comment 25 James Robinson 2011-08-19 14:50:19 PDT
Comment on attachment 104457 [details]
Patch

Clearing flags on attachment: 104457

Committed r93441: <http://trac.webkit.org/changeset/93441>
Comment 26 James Robinson 2011-08-19 14:50:27 PDT
All reviewed patches have been landed.  Closing bug.
Comment 27 James Robinson 2011-08-19 14:53:08 PDT
Filed https://bugs.webkit.org/show_bug.cgi?id=66597 to discuss fixing the design issue here.  The Image/BitmapImage code could also use some general cleanup (like moving the definition of virtual functions to the .cpp, some of the function names are pretty poor) - I feel those should be handled with separate bugs.