Bug 72001 - [Chromium] only run media GPU layout tests on platforms supporting accelerated compositing
Summary: [Chromium] only run media GPU layout tests on platforms supporting accelerate...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Andrew Scherkus
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-11-09 23:39 PST by Andrew Scherkus
Modified: 2011-11-10 18:05 PST (History)
7 users (show)

See Also:


Attachments
Patch (1.78 KB, patch)
2011-11-09 23:41 PST, Andrew Scherkus
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andrew Scherkus 2011-11-09 23:39:08 PST
[Chromium] only run media GPU layout tests on platforms supporting accelerated compositing
Comment 1 Andrew Scherkus 2011-11-09 23:41:46 PST
Created attachment 114443 [details]
Patch
Comment 2 Andrew Scherkus 2011-11-09 23:44:13 PST
Reason being that the following line in test_expectations.txt:
WONTFIX SKIP GPU GPU-CG LEOPARD : media = PASS TIMEOUT FAIL

...doesn't seem to do the trick with respect to skipping the tests. Furthermore if I've been debugging the layout test results correctly it also looks like it's marking all media tests on 10.5 (non-GPU) as PASS/FAIL.
Comment 3 Tony Chang 2011-11-10 09:36:33 PST
(In reply to comment #2)
> Reason being that the following line in test_expectations.txt:
> WONTFIX SKIP GPU GPU-CG LEOPARD : media = PASS TIMEOUT FAIL
> 
> ...doesn't seem to do the trick with respect to skipping the tests. Furthermore if I've been debugging the layout test results correctly it also looks like it's marking all media tests on 10.5 (non-GPU) as PASS/FAIL.

Can we figure out why the above doesn't work?
Comment 4 Andrew Scherkus 2011-11-10 10:33:04 PST
(In reply to comment #3)
> (In reply to comment #2)
> > Reason being that the following line in test_expectations.txt:
> > WONTFIX SKIP GPU GPU-CG LEOPARD : media = PASS TIMEOUT FAIL
> > 
> > ...doesn't seem to do the trick with respect to skipping the tests. Furthermore if I've been debugging the layout test results correctly it also looks like it's marking all media tests on 10.5 (non-GPU) as PASS/FAIL.
> 
> Can we figure out why the above doesn't work?

Turns out it _does_ work but that more-specific expectations will cause the test to run on 10.5 GPU.

Take a look at flakiness results for media/audio-controls-rendering.html:
http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=%40ToT%20GPU%20Mesa%20-%20chromium.org&tests=media%2Faudio-controls-rendering&showExpectations=true

It was previously marked as:
BUGWK70748 CPU : media/audio-controls-rendering.html = IMAGE

...which caused it to override the SKIP and start running on Mac 10.5 GPU. After rebaselining and removing that expectation the test is now skipped again.

Anyone know if that's the intended behaviour?
Comment 5 Ojan Vafai 2011-11-10 10:37:32 PST
It is certainly the intended behavior that more specific listings override less specific ones. We could take a queue from CSS and add something like "!important" lol.
Comment 6 Andrew Scherkus 2011-11-10 10:40:08 PST
(In reply to comment #5)
> It is certainly the intended behavior that more specific listings override less specific ones. We could take a queue from CSS and add something like "!important" lol.

Yeah that's what I figured and certainly makes sense to me!

So... are we happy with my patch or should I investigate a different avenue?

Having the tests potentially run isn't fun because it generates noise on the flakiness dashboard as well as increases test cycle time.
Comment 7 Tony Chang 2011-11-10 10:40:26 PST
This specific case seems like a bug since the SKIP line refers to GPU and the more specific line refers to CPU.  I wouldn't expect
  CPU : media/audio-controls-rendering.html = IMAGE
to override
  WONTFIX SKIP GPU GPU-CG LEOPARD : media = PASS TIMEOUT FAIL
Comment 8 Andrew Scherkus 2011-11-10 10:44:29 PST
(In reply to comment #7)
> This specific case seems like a bug since the SKIP line refers to GPU and the more specific line refers to CPU.  I wouldn't expect
>   CPU : media/audio-controls-rendering.html = IMAGE
> to override
>   WONTFIX SKIP GPU GPU-CG LEOPARD : media = PASS TIMEOUT FAIL

Check out the diff:
https://trac.webkit.org/changeset/99789/trunk/LayoutTests/platform/chromium/test_expectations.txt

It looks like it was listed as:
BUGWK70748 CPU : media/audio-controls-rendering.html = IMAGE 
BUGWK70748 GPU : media/audio-controls-rendering.html = IMAGE 

So in this case I guess GPU was being overridden.
Comment 9 Tony Chang 2011-11-10 10:48:28 PST
(In reply to comment #8)
> (In reply to comment #7)
> > This specific case seems like a bug since the SKIP line refers to GPU and the more specific line refers to CPU.  I wouldn't expect
> >   CPU : media/audio-controls-rendering.html = IMAGE
> > to override
> >   WONTFIX SKIP GPU GPU-CG LEOPARD : media = PASS TIMEOUT FAIL
> 
> Check out the diff:
> https://trac.webkit.org/changeset/99789/trunk/LayoutTests/platform/chromium/test_expectations.txt
> 
> It looks like it was listed as:
> BUGWK70748 CPU : media/audio-controls-rendering.html = IMAGE 
> BUGWK70748 GPU : media/audio-controls-rendering.html = IMAGE 
> 
> So in this case I guess GPU was being overridden.

If you changed:
  BUGWK70748 GPU : media/audio-controls-rendering.html = IMAGE
to:
  BUGWK70748 GPU SNOWLEOPARD WIN LINUX : media/audio-controls-rendering.html = IMAGE

I think it would fix your problem since it doesn't overlap the previous SKIP line.
Comment 10 Andrew Scherkus 2011-11-10 10:54:28 PST
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > This specific case seems like a bug since the SKIP line refers to GPU and the more specific line refers to CPU.  I wouldn't expect
> > >   CPU : media/audio-controls-rendering.html = IMAGE
> > > to override
> > >   WONTFIX SKIP GPU GPU-CG LEOPARD : media = PASS TIMEOUT FAIL
> > 
> > Check out the diff:
> > https://trac.webkit.org/changeset/99789/trunk/LayoutTests/platform/chromium/test_expectations.txt
> > 
> > It looks like it was listed as:
> > BUGWK70748 CPU : media/audio-controls-rendering.html = IMAGE 
> > BUGWK70748 GPU : media/audio-controls-rendering.html = IMAGE 
> > 
> > So in this case I guess GPU was being overridden.
> 
> If you changed:
>   BUGWK70748 GPU : media/audio-controls-rendering.html = IMAGE
> to:
>   BUGWK70748 GPU SNOWLEOPARD WIN LINUX : media/audio-controls-rendering.html = IMAGE
> 
> I think it would fix your problem since it doesn't overlap the previous SKIP line.

With updating test_expectations.txt being a collaborative effort is it reasonable to expect folks to notice situations like this and write fine-grained expectations?

The person who added this expectation would have to be aware that compositing isn't not enabled on LEOPARD.
Comment 11 Ojan Vafai 2011-11-10 10:57:08 PST
(In reply to comment #10)
> With updating test_expectations.txt being a collaborative effort is it reasonable to expect folks to notice situations like this and write fine-grained expectations?
> 
> The person who added this expectation would have to be aware that compositing isn't not enabled on LEOPARD.

Not at all reasonable IMO. I don't see a better solution though other than generally reducing the complexity of the test_expectations format. I guess your proposal here to bake it into run-webkit-tests is one solution. That carries the problem of making it harder to know why a test is/isn't being run because now you need to know both how test_expectations works and what run-webkit-tests hard-codes.

Definitely open to suggestions on how to make this easier to maintain/understand. It's crazy confusing right now.
Comment 12 Andrew Scherkus 2011-11-10 10:59:39 PST
(In reply to comment #11)
> (In reply to comment #10)
> > With updating test_expectations.txt being a collaborative effort is it reasonable to expect folks to notice situations like this and write fine-grained expectations?
> > 
> > The person who added this expectation would have to be aware that compositing isn't not enabled on LEOPARD.
> 
> Not at all reasonable IMO. I don't see a better solution though other than generally reducing the complexity of the test_expectations format. I guess your proposal here to bake it into run-webkit-tests is one solution. That carries the problem of making it harder to know why a test is/isn't being run because now you need to know both how test_expectations works and what run-webkit-tests hard-codes.
> 
> Definitely open to suggestions on how to make this easier to maintain/understand. It's crazy confusing right now.

To be clear I'm not 100% thrilled with it either -- it took me a good deal of grepping just to find out how media tests were getting run on GPU bots in the first place!

GPU tests are definitely a little strange as they only run small subset but share the test_expectations.txt file.
Comment 13 Tony Chang 2011-11-10 11:05:41 PST
From offline discussion with Ojan:

Part of the problem is it's hard to figure out which lines in test_expectations.txt apply to a specific config.  One solution would be to linkify the expectations on the flakiness dashboard and clicking on it would show you which lines apply to that test on that builder.
Comment 14 Dirk Pranke 2011-11-10 16:02:17 PST
The fact that we are subsetting the list of tests running on the GPU bots inside the chromium_gpu.py code is arguably a misfeature, but it does solve the problem you've run into (and may be why I've left that in until now rather than just WONTFIX SKIPping all of the directories).

I certainly agree that the test_expectations files are fragile and hard to maintain by hand, but I suspect that any format that supports all of the different uses will be at least as complicated. 

One possible option would be to make any directories marked as SKIP override expected results for specific files inside the directory; I am not aware of any case in practice where we want to skip all but X files in a dir, and if we did, we could support that by requiring the user to explicitly list all of the tests to skip.

We could also build up better tools for explaining what tests run where and what the expectations are. It is actually possible to extract that from NRWT now given the appropriate command line flags, but it's not easy to do so (using the --print trace* options)

Maybe we need a "webkit-patch print-expectations" command that duplicates what we print with --print trace and iterates over all of the ports or something?
Comment 15 Dirk Pranke 2011-11-10 16:03:16 PST
Comment on attachment 114443 [details]
Patch

R+'ing this change; it sounds like it will produce the intended result most easily. Scherkus, it's up to you if you want to land it or just fall back to expectations-based skipping.
Comment 16 Ojan Vafai 2011-11-10 16:10:09 PST
(In reply to comment #14)
> One possible option would be to make any directories marked as SKIP override expected results for specific files inside the directory; I am not aware of any case in practice where we want to skip all but X files in a dir, and if we did, we could support that by requiring the user to explicitly list all of the tests to skip.

This makes sense to me. "SKIP cannot be overridden" seems reasonable. It does further complicate things though. :(
Comment 17 Andrew Scherkus 2011-11-10 17:46:11 PST
Comment on attachment 114443 [details]
Patch

Clearing flags on attachment: 114443

Committed r99920: <http://trac.webkit.org/changeset/99920>
Comment 18 Andrew Scherkus 2011-11-10 17:46:16 PST
All reviewed patches have been landed.  Closing bug.
Comment 19 Andrew Scherkus 2011-11-10 17:47:39 PST
Thanks for the review Dirk!

Do you want me to file follow up bugs for some of the ideas discussed here, or should we agree on a reasonable path forward first?
Comment 20 Tony Chang 2011-11-10 17:48:40 PST
We probably want to remove the SKIP line from test_expectations to avoid confusion.
Comment 21 Dirk Pranke 2011-11-10 17:53:58 PST
(In reply to comment #19)
> Thanks for the review Dirk!
> 
> Do you want me to file follow up bugs for some of the ideas discussed here, or should we agree on a reasonable path forward first?

Having a 'webkit-patch print-expectations' seems like a good idea; please feel free to file a bug for that.

I'm less certain about the SKIP overrides more-specific-paths, but feel free to file a bug to track better ideas for that.