Bug 33230 - svg/W3C-SVG-1.1/animate-elem-30-t.svg crashing occasionally on Leopard Build Bot
Summary: svg/W3C-SVG-1.1/animate-elem-30-t.svg crashing occasionally on Leopard Build Bot
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on: 14861 32294
Blocks:
  Show dependency treegraph
 
Reported: 2010-01-05 14:33 PST by Eric Seidel (no email)
Modified: 2010-01-21 16:29 PST (History)
4 users (show)

See Also:


Attachments
crash log (30.39 KB, text/plain)
2010-01-06 23:46 PST, Eric Seidel (no email)
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2010-01-05 14:33:09 PST
svg/W3C-SVG-1.1/coords-coord-01-t.svg crashing occasionally on Leopard Build Bot

http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r52820%20(8894)/
Tests that caused the DumpRenderTree tool to crash:
svg/W3C-SVG-1.1/coords-coord-01-t.svg

http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r52818%20(8892)/
Tests that caused the DumpRenderTree tool to crash:
svg/W3C-SVG-1.1/animate-elem-30-t.svg

ASSERTION FAILED: !m_frame->view() || !m_frame->view()->needsLayout()
(/Volumes/Big/WebKit-BuildSlave/leopard-intel-debug/build/WebCore/editing/SelectionController.cpp:929 bool WebCore::SelectionController::recomputeCaretRect())

Both times:
Tests where results did not match expected results:
svg/W3C-SVG-1.1/filters-conv-01-f.svg

I wonder if this could be caused/revealed by http://trac.webkit.org/changeset/52778?
Comment 1 Eric Seidel (no email) 2010-01-05 14:34:16 PST
The filter failure is covered by bug 32294.
Comment 2 Darin Adler 2010-01-05 14:35:20 PST
Yes, this assertion is new.

We need a backtrace from the crash to figure out who is calling recomputeCaretRect without triggering layout.
Comment 3 Eric Seidel (no email) 2010-01-05 14:36:57 PST
Yet another case where bug 14861 would be useful.
Comment 4 Nikolas Zimmermann 2010-01-06 14:37:19 PST
svg/W3C-SVG-1.1/coords-coord-01-t.svg is not crashing, it was only the animate-elem-30-t.svg testcase. With the last SVG marker code rewrite, a SVGUseElement fix has been included, which may affect this bug (aka. fix it). Let's see if this ever happens again.
Comment 5 Eric Seidel (no email) 2010-01-06 20:17:51 PST
This happened again:
http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r52891%20(8951)/results.html
Comment 6 Eric Seidel (no email) 2010-01-06 20:19:06 PST
To be clear, specifically svg/W3C-SVG-1.1/animate-elem-30-t.svg crashed again on Leopard Debug.  See above link.

ASSERTION FAILED: !m_frame->view() || !m_frame->view()->needsLayout()
(/Volumes/Big/WebKit-BuildSlave/leopard-intel-debug/build/WebCore/editing/SelectionController.cpp:929 bool WebCore::SelectionController::recomputeCaretRect())
Comment 7 Eric Seidel (no email) 2010-01-06 21:12:00 PST
This is (not surprisingly) also crashing on the Snow Leopard Debug bot:
http://build.webkit.org/results/SnowLeopard%20Intel%20Leaks/r52898%20(3248)/svg/W3C-SVG-1.1/animate-elem-30-t-stderr.txt
Comment 8 Eric Seidel (no email) 2010-01-06 21:29:07 PST
Saw this again:
http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r52896%20(8955)/results.html

This appears to be our most common crasher on the Leopard Bot.

I've not yet tried reproducing this locally.
Comment 9 Eric Seidel (no email) 2010-01-06 23:14:37 PST
Yup.  Worst failure on leopard bot right now:
http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r52902%20(8961)/results.html
Comment 10 Eric Seidel (no email) 2010-01-06 23:27:30 PST
run-webkit-tests --debug svg/W3C-SVG-1.1/animate-elem-30-t.svg

reproduces more than half the time for me.  you can run it with --iterations=100 to be sure to get a crash. :)

I've not bothered catching it in the debugger yet, but I have a crash report and will attach it shortly.
Comment 11 Eric Seidel (no email) 2010-01-06 23:46:45 PST
Created attachment 46025 [details]
crash log

I'm not sure the full stack trace is actually very helpful.  I suspect we'll have to catch this in the debugger to understand what's going on.
Comment 12 Dirk Schulze 2010-01-07 01:43:09 PST
I was not able to reproduce this crahs once on gtk :-(. Even with 100 iterations. Trying the debug build later.
Comment 13 Nikolas Zimmermann 2010-01-07 06:22:32 PST
I checked this again, it's definately not related to any <use> problem, it must be directly related to Darins patch, adding the new SelectionController assertions. Maybe they expose a bug in SVG text handling regarding selection state (aka. that we don't update anything correctly like HTML text does).
Darin, can you enlighten us, what may be triggering these assertions? Thanks in advance!
Comment 14 Darin Adler 2010-01-07 09:16:25 PST
(In reply to comment #13)
> Darin, can you enlighten us, what may be triggering these assertions? Thanks in
> advance!

The new assertions are in functions that assume layout. In general, functions can't get at the render tree unless they know layout has been performed.

In the past we were sloppy and imprecise about which functions would trigger layout if needed, and which required that layout already is done. This led to various crashes, even ones with security impact.

If the assertion fires, it means a function that depends on up-to-date layout was called when layout is needed but has not been done. There are two major ways to fix that:

    1) Look at the backtrace and figure out a higher level that can guarantee layout, usually by adding a call to Document::updateLayout. The design question is which functions depend on the caller to make sure the layout is up to date, and which take care of it themselves.

    2) Change the call with the assertion to perform layout instead of requiring it already be done. This can be tricky because there may be multiple call sites, and adding extra calls to layout can cause problems/crashes or slower performance.

I added a small number of assertions to the selection code, but generally we need more of these assertions in more places in the code.

The critical issue is figuring out who the caller is and what the backtrace is for these assertions. This probably requires a reproducible case, a bot that logs backtraces for crashes, or some inspired guesswork.
Comment 15 Eric Seidel (no email) 2010-01-07 11:40:31 PST
(In reply to comment #14)
> The critical issue is figuring out who the caller is and what the backtrace is
> for these assertions. This probably requires a reproducible case, a bot that
> logs backtraces for crashes, or some inspired guesswork.

The test case is reproducible, see comment 10.

A crash log including the full backtrace is attached, see comment 11 or attachment 46025 [details].
Comment 16 Eric Seidel (no email) 2010-01-07 11:41:23 PST
Is the problem that we're inside a layout already?
0   com.apple.WebCore             	0x05a578b3 WebCore::SelectionController::recomputeCaretRect() + 145 (SelectionController.cpp:929)
1   com.apple.WebCore             	0x0540c31b WebCore::Frame::selectionLayoutChanged() + 89 (Frame.cpp:642)
2   com.apple.WebCore             	0x054317eb WebCore::FrameView::layout(bool) + 2505 (FrameView.cpp:695)
3   com.apple.WebCore             	0x05431b41 WebCore::FrameView::forceLayout(bool) + 31 (FrameView.cpp:1809)
4   com.apple.WebKit              	0x00af4879 -[WebHTMLView layoutToMinimumPageWidth:maximumPageWidth:adjustingViewSize:] + 259 (WebHTMLView.mm:3012)
5   com.apple.WebKit              	0x00ae5128 -[WebHTMLView layout] + 68 (WebHTMLView.mm:3026)
Comment 17 Darin Adler 2010-01-07 12:50:22 PST
(In reply to comment #16)
> Is the problem that we're inside a layout already?
> 0   com.apple.WebCore                 0x05a578b3
> WebCore::SelectionController::recomputeCaretRect() + 145
> (SelectionController.cpp:929)
> 1   com.apple.WebCore                 0x0540c31b
> WebCore::Frame::selectionLayoutChanged() + 89 (Frame.cpp:642)
> 2   com.apple.WebCore                 0x054317eb
> WebCore::FrameView::layout(bool) + 2505 (FrameView.cpp:695)
> 3   com.apple.WebCore                 0x05431b41
> WebCore::FrameView::forceLayout(bool) + 31 (FrameView.cpp:1809)
> 4   com.apple.WebKit                  0x00af4879 -[WebHTMLView
> layoutToMinimumPageWidth:maximumPageWidth:adjustingViewSize:] + 259
> (WebHTMLView.mm:3012)
> 5   com.apple.WebKit                  0x00ae5128 -[WebHTMLView layout] + 68
> (WebHTMLView.mm:3026)

At that point in the layout process, needsLayout should already be false. I'm guessing that there's some situation where SVG code creates the need for a new layout while performing the current layout. That's a bug in SVG. But it may be hard to track down.

This does reflect a real bug. Inside recomputeCaretRect it's going to create a VisiblePosition and this will trigger a new layout! Extremely unpleasant.

I am not certain what the best thing to do is. We could remove the assertion for now to un-flakify the bots, which is sad for all the non-SVG cases and tests we could otherwise catch. We could add more targeted assertions to catch code trigger new layout during the layout process to try to pinpoint where the SVG code goes wrong.
Comment 18 Eric Seidel (no email) 2010-01-07 21:27:50 PST
I'm confused.  The call stack does not have SVG in it at all.  See the attached crash log.
Comment 19 Nikolas Zimmermann 2010-01-21 15:41:32 PST
Reran svg/W3C-SVG-1.1/animate-elem-30-t.svg with --repeat-each 100, can not reproduce it anymore. I've fixed the problem with the intermediate layouts during the <use> rewrite.

Actually Darins commented forced me to rewrite <use>, we shouldn't ever see crashes like this again :-)
Comment 20 Darin Adler 2010-01-21 16:03:50 PST
(In reply to comment #18)
> I'm confused. The call stack does not have SVG in it at all.

If the FrameView was doing a layout and during that, the code to lay out SVG elements triggered the need for additional layout, after that you could crash like this. The troublesome SVG code would have already run to completion so there would be no SVG code in the backtrace.
Comment 21 Eric Seidel (no email) 2010-01-21 16:06:03 PST
(In reply to comment #20)
> (In reply to comment #18)
> > I'm confused. The call stack does not have SVG in it at all.
> 
> If the FrameView was doing a layout and during that, the code to lay out SVG
> elements triggered the need for additional layout, after that you could crash
> like this. The troublesome SVG code would have already run to completion so
> there would be no SVG code in the backtrace.

So is it not safe to trigger additional layout during a layout()?

SVG does this:
http://trac.webkit.org/browser/trunk/WebCore/rendering/SVGRenderSupport.cpp#L253
Comment 22 Darin Adler 2010-01-21 16:09:41 PST
(In reply to comment #21)
> So is it not safe to trigger additional layout during a layout()?

Correct.

> SVG does this:
> http://trac.webkit.org/browser/trunk/WebCore/rendering/SVGRenderSupport.cpp#L253

Has always been incorrect and is a serious design problem.
Comment 23 Nikolas Zimmermann 2010-01-21 16:11:07 PST
(In reply to comment #22)
> (In reply to comment #21)
> > So is it not safe to trigger additional layout during a layout()?
> 
> Correct.
> 
> > SVG does this:
> > http://trac.webkit.org/browser/trunk/WebCore/rendering/SVGRenderSupport.cpp#L253
> 
> Has always been incorrect and is a serious design problem.

Huh?

Every RenderBlock calls child->layoutIfNeeded(), from it's own layout() method.
Where is the problem here? We're only layouting the subtree, no one retriggers a root layout from the middle of a layout.
Comment 24 Darin Adler 2010-01-21 16:29:39 PST
(In reply to comment #23)
> (In reply to comment #22)
> > > SVG does this:
> > > http://trac.webkit.org/browser/trunk/WebCore/rendering/SVGRenderSupport.cpp#L253
> > 
> > Has always been incorrect and is a serious design problem.
> 
> Huh?
> 
> Every RenderBlock calls child->layoutIfNeeded(), from it's own layout() method.
> Where is the problem here? We're only layouting the subtree, no one retriggers
> a root layout from the middle of a layout.

Sorry, I was in a rush did not look at the code and probably said the wrong thing.

It's wrong to trigger additional layout that will still not be complete at the end of the layout process. The code above is likely correct if it fits into the normal layout process.