Bug 94277

Summary: REGRESSION(when?) WK2 tests have 500+ failures causing an early exit
Product: WebKit
Reporter: Brady Eidson <beidson>
Component: Tools / Tests
Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED
Severity: Normal
CC: ap, cmuppala, dpranke, kbalazs, ossy, rniwa, sam, thorton, tony
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 94505, 94517    
Bug Blocks: 94507    

Description Brady Eidson 2012-08-16 17:57:03 PDT
REGRESSION:  WK2 tests have 500+ failures causing an early exit

Much discussion on #webkit.

We think it was http://trac.webkit.org/changeset/124958 or http://trac.webkit.org/changeset/124581 or both.
Comment 1 Dirk Pranke 2012-08-16 18:03:10 PDT
To add more context from the #webkit discussion:

It looks like all the wk2 bots are failing fairly reliably (though not always) with 500+ failures. The wk1 bots seem happy.

Runs before ~r124581 (where we changed to passing --pixel-tests per-test for reftests to work) looked pretty happy.

A partial revert of that change, however, did not seem to fix things.

In r124958, I changed errors from ImageDiff to be treated as test failures (previously we would ignore the failure and treat things as if tests were passing, i.e., false positives).

As reported in https://bugs.webkit.org/show_bug.cgi?id=81962, I get lots of ImageDiff warnings *only when running wk2* (don't know why yet), so I suspect that the change in r124958 has pushed the problems over the edge, i.e., r124958 has made bug 81962 a lot more serious (arguably, as it should be).
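For illustration only, the behavior change described above can be sketched roughly as follows. The `DiffResult` type, the `reftest_passed` function, and the `treat_errors_as_failures` flag are hypothetical names made up for this sketch; they are not NRWT's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiffResult:
    # Hypothetical result of an image comparison; not NRWT's real type.
    images_differ: bool
    error: Optional[str] = None  # set when the diff tool itself failed


def reftest_passed(result: DiffResult, treat_errors_as_failures: bool) -> bool:
    """Decide whether a pixel comparison counts as a pass."""
    if result.error is not None:
        # Old behavior (flag off): swallow the tool error and report a pass.
        # New behavior (flag on): a comparison that could not run is a failure.
        return not treat_errors_as_failures
    return not result.images_differ


broken = DiffResult(images_differ=False, error="ImageDiff crashed")
print(reftest_passed(broken, treat_errors_as_failures=False))  # True
print(reftest_passed(broken, treat_errors_as_failures=True))   # False
```

Under these assumptions, any pre-existing diff-tool errors that used to be hidden would surface as failures all at once.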

I am continuing to do more testing.
Comment 2 Dirk Pranke 2012-08-16 18:15:30 PDT
So, with my ImageDiff change reverted, I still get 200+ failures near tip-of-tree using WK2 (Release) on Lion. 

If you actually look at the failures, many (most?) of them appear to be render trees for what should be text-only tests, as if dumpAsText() is having no effect. I don't know how that could be happening, or if there were any changes to WTR recently that might cause that?
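One rough way to count how many failures fit this pattern is sketched below. It assumes that a render tree dump starts with a `layer at (` line and that expected and actual output are available as strings per test; the function names and the sample data are hypothetical and not part of any WebKit tool.

```python
from typing import Dict, List, Tuple


def looks_like_render_tree(output: str) -> bool:
    """Heuristic: the first non-blank line of a render tree dump
    is assumed to start with 'layer at ('."""
    for line in output.splitlines():
        if line.strip():
            return line.startswith("layer at (")
    return False


def text_tests_with_render_tree_output(
    failures: Dict[str, Tuple[str, str]]
) -> List[str]:
    """failures maps a test name to (expected_text, actual_text).
    Returns tests whose expectation is plain text but whose actual
    output looks like a render tree dump."""
    return sorted(
        name
        for name, (expected, actual) in failures.items()
        if not looks_like_render_tree(expected) and looks_like_render_tree(actual)
    )


# Made-up sample data, for illustration only.
sample = {
    "fast/example/text-only.html": (
        "PASS\n",
        "layer at (0,0) size 800x600\n  RenderView at (0,0) size 800x600\n",
    ),
    "fast/example/real-diff.html": ("PASS\n", "FAIL\n"),
}
print(text_tests_with_render_tree_output(sample))
```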
Comment 3 Dirk Pranke 2012-08-16 18:30:05 PDT
Also, it's worth trying to repro these issues when running the tests serially (I'm trying this now, but it's obviously much slower so I don't have results yet), to see if WTR is just interfering w/ itself.
Comment 4 Brady Eidson 2012-08-17 10:58:08 PDT
(In reply to comment #2)
> So, with my ImageDiff change reverted, I still get 200+ failures near tip-of-tree using WK2 (Release) on Lion. 
> 
> If you actually look at the failures, many (most?) of them appear to be render trees for what should be text-only tests, as if dumpAsText() is having no effect. I don't know how that could be happening, or if there were any changes to WTR recently that might cause that?

That's terrible...  I hope to look at the log to WTR today and see if I can spot anything out of the ordinary.

(In reply to comment #3)
> Also, it's worth trying to repro these issues when running the tests serially (I'm trying this now, but it's obviously much slower so I don't have results yet), to see if WTR is just interfering w/ itself.

Any word on that effort?
Comment 5 Dirk Pranke 2012-08-17 11:11:06 PDT
(In reply to comment #4)
> (In reply to comment #2)
> > So, with my ImageDiff change reverted, I still get 200+ failures near tip-of-tree using WK2 (Release) on Lion. 
> > 
> > If you actually look at the failures, many (most?) of them appear to be render trees for what should be text-only tests, as if dumpAsText() is having no effect. I don't know how that could be happening, or if there were any changes to WTR recently that might cause that?
> 
> That's terrible...  I hope to look at the log to WTR today and see if I can spot anything out of the blue.
> 
> (In reply to comment #3)
> > Also, it's worth trying to repro these issues when running the tests serially (I'm trying this now, but it's obviously much slower so I don't have results yet), to see if WTR is just interfering w/ itself.
> 
> Any word on that effort?

Well, running serially took an hour and produced ~400 failures at r124580. I don't know why this is so much worse than what I saw on the bots around that time range, unless there's something wrong with my local configuration.

My best guess at this point is that there are multiple serious issues with WTR:
1) dumpAsText isn't working in some cases
2) something is broken with how WTR is generating pixel dumps that is causing ImageDiff to fail and as a result, we're failing a *lot* of reftests (which do pixel compares even when pixel tests are disabled).
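The point in 2) that reftests compare pixels regardless of the global setting can be sketched as a per-test decision. This is a simplified illustration with hypothetical names, not the actual NRWT logic.

```python
def should_compare_pixels(is_reftest: bool, pixel_tests_enabled: bool) -> bool:
    """A reftest always needs a pixel comparison against its reference;
    other tests only get one when pixel tests are enabled."""
    return is_reftest or pixel_tests_enabled


# With pixel tests disabled, only reftests still depend on the pixel path,
# so a broken pixel dump would fail reftests but not plain text tests.
for is_reftest in (True, False):
    print(is_reftest, should_compare_pixels(is_reftest, pixel_tests_enabled=False))
```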

I don't see anything obviously wrong with NRWT; I could revert (or disable) r124958, but that would seem to just ignore the real problem.
Comment 6 Dirk Pranke 2012-08-17 11:26:47 PDT
Also, for what it's worth, since I don't know all that much about WTR (and as I have other tasks) it probably is more useful for someone else to pick up the ball and start running. If I can be of further help please let me know :).
Comment 7 Dirk Pranke 2012-08-17 21:20:27 PDT
So, I did a little test bisecting, and it looks like the following command seems to run reasonably well for me:

rwt -2 -i fast/repaint -i fast/canvas -i fast/inspector-support -i accessibility -i compositing -i css3 -i http/tests/inspector -i inspector -i http/tests/inspector-enabled

in theory we reverted a change to the inspector that was causing all the inspector failures, so you're left with hunting down something causing instability in repaint, compositing, filters, and/or canvas. I imagine further bisecting should track things down pretty quickly, but I'm done for the night/weekend :).
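A minimal sketch of how that further narrowing could be automated: re-enable one ignored directory at a time and see whether the run becomes unstable again. The `rwt -2 -i` flags are copied from the command above; `build_command`, `find_unstable_dirs`, and the `run_is_stable` callback are hypothetical helpers, and actually running the tests is left to the caller.

```python
from typing import Callable, List

# Directories ignored in the command above.
IGNORED = [
    "fast/repaint", "fast/canvas", "fast/inspector-support", "accessibility",
    "compositing", "css3", "http/tests/inspector", "inspector",
    "http/tests/inspector-enabled",
]


def build_command(ignored: List[str]) -> List[str]:
    """Build the argument list, one '-i <dir>' pair per ignored directory."""
    cmd = ["rwt", "-2"]
    for directory in ignored:
        cmd += ["-i", directory]
    return cmd


def find_unstable_dirs(
    ignored: List[str], run_is_stable: Callable[[List[str]], bool]
) -> List[str]:
    """Re-enable each directory in turn; report those that break the run.
    run_is_stable is supplied by the caller (hypothetical callback)."""
    unstable = []
    for candidate in ignored:
        remaining = [d for d in ignored if d != candidate]
        if not run_is_stable(build_command(remaining)):
            unstable.append(candidate)
    return unstable


# Stand-in for a real test run: pretend two directories cause instability.
def fake_run(cmd: List[str]) -> bool:
    return "compositing" in cmd and "fast/canvas" in cmd


print(find_unstable_dirs(IGNORED, fake_run))  # ['fast/canvas', 'compositing']
```

Testing one directory at a time costs a full run per directory, but unlike a plain binary search it still works when more than one directory is responsible.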
Comment 8 Brady Eidson 2012-08-20 11:12:29 PDT
We've concretely identified an off-by-one issue in comparing test results. It is tracked by https://bugs.webkit.org/show_bug.cgi?id=94505, which is now a blocking subtask here.  (It might be the only issue.)
Comment 9 Csaba Osztrogonác 2014-09-16 01:28:33 PDT
Fixed in http://trac.webkit.org/changeset/126418 a long time ago.