Bug 46452 - [chromium] Updated test expectations to match the bots using new auto-update script.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs
Version: 528+ (Nightly build)
Hardware: Other OS X 10.5
Importance: P2 Normal
Assignee: James Kozianski
 
Reported: 2010-09-23 23:53 PDT by James Kozianski
Modified: 2011-01-18 22:02 PST

Attachments
Patch (4.07 KB, patch)
2010-09-23 23:56 PDT, James Kozianski

Description James Kozianski 2010-09-23 23:53:35 PDT
[chromium] Updated test expectations to match the bots using new auto-update script.
Comment 1 James Kozianski 2010-09-23 23:56:49 PDT
Created attachment 68656
Patch
Comment 2 WebKit Commit Bot 2010-09-24 02:13:33 PDT
Comment on attachment 68656
Patch

Clearing flags on attachment: 68656

Committed r68244: <http://trac.webkit.org/changeset/68244>
Comment 3 WebKit Commit Bot 2010-09-24 02:13:38 PDT
All reviewed patches have been landed.  Closing bug.
Comment 4 Tony Chang 2010-09-24 11:00:20 PDT
Where can I learn more about this script?  What will the workflow be for converting BUG_AUTO into bugs?

As a feature request, maybe it can guess when to use SLOW if the test is only failing in debug.
Comment 5 James Kozianski 2010-09-24 13:24:26 PDT
Ojan and I have been working on improving the webkit/tools/layout_tests/webkitpy/layout_tests/update_expectations_from_dashboard.py script. Its usage is undocumented, but in short it takes as input JSON data generated from http://test-results.appspot.com/dashboards/flakiness_dashboard.html#expectationsUpdate=true and uses that data to modify test_expectations.txt.

The modifications to the script are unreviewed, but once committed I'll add a section to the wiki explaining how to use it at http://dev.chromium.org/developers/testing/flakiness-dashboard.

> What will the workflow be for converting BUG_AUTO into bugs?

I'm not sure - Ojan, can you chime in?


> As a feature request, maybe it can guess when to use SLOW if the test is only failing in debug.

Yep, that sounds like a good idea. Could you provide a more specific heuristic?
Comment 6 Tony Chang 2010-09-24 14:16:27 PDT
(In reply to comment #5)
> > As a feature request, maybe it can guess when to use SLOW if the test is only failing in debug.
> 
> Yep, that sounds like a good idea. Could you provide a more specific heuristic?

I'm not sure.  I just bring it up because it looks like the last 5 entries to test_expectations.txt are just slow tests.  But maybe not.  One of the tests is for Release builds.
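
For illustration only, a minimal sketch of the "only failing in debug" guess. The thread does not describe the dashboard's JSON format, so the input shape, the "DEBUG"/"RELEASE" keys, the result strings, and the function name suggest_slow are all assumptions made for this example, not the script's actual behavior.

```python
def suggest_slow(results_by_config):
    """Guess whether a test is a candidate for the SLOW modifier.

    results_by_config is a hypothetical mapping from a build configuration
    name ("DEBUG" or "RELEASE") to a list of recent result strings such as
    "PASS" or "TIMEOUT". This shape is assumed for the sketch.
    """
    debug_results = results_by_config.get("DEBUG", [])
    release_results = results_by_config.get("RELEASE", [])

    # The test counts as failing in a configuration if any run did not pass.
    fails_in_debug = any(result != "PASS" for result in debug_results)
    fails_in_release = any(result != "PASS" for result in release_results)

    # Suggest SLOW only when the failures are confined to Debug builds.
    return fails_in_debug and not fails_in_release
```

As noted above, one of the recent entries is for Release builds, so a check like this would not catch every slow test.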
Comment 7 Ojan Vafai 2010-09-25 14:02:00 PDT
(In reply to comment #4)
> Where can I learn more about this script?  

The working version is still not checked in. We'll document it once it's usable. This was our first pass at actually using it.

> What will the workflow be for converting BUG_AUTO into bugs?

That's a good question. I don't have a good answer. I was planning on bringing this up on chromium-dev soon. I'm open to suggestions.

> As a feature request, maybe it can guess when to use SLOW if the test is only failing in debug.

I've been meaning to get rid of SLOW. I think it's too complicated. Instead, we should just have a long timeout but give a short timeout to tests that we expect to time out. How does that sound?
Comment 8 Tony Chang 2010-09-27 09:29:46 PDT
(In reply to comment #7)
> (In reply to comment #4)
> > What will the workflow be for converting BUG_AUTO into bugs?
> 
> That's a good question. I don't have a good answer. I was planning on bringing this up on chromium-dev soon. I'm open to suggestions.

I think whoever runs the tool should fill in bug numbers, right?  We shouldn't check in expectations without bugs filed.

> > As a feature request, maybe it can guess when to use SLOW if the test is only failing in debug.
> 
> I've been meaning to get rid of SLOW. I think it's too complicated. Instead, we should just have a long timeout but give a short timeout to tests that we expect to time out. How does that sound?

I suspect that over time, more tests will time out (the long timeout) and the full test run will gradually get slower.  I prefer fast by default with exceptions to make things slower.

Maybe you're hoping that the auto-update script will detect and mark tests as slow?  I'm not sure how easy that will be to do.
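
To make the two timeout policies in this exchange concrete, here is a rough sketch. The timeout values, the function names, and the way expectations and modifiers are passed in as sets of strings are placeholders assumed for the example; they are not taken from the test harness.

```python
# Placeholder values for illustration, not the harness's real timeouts.
SHORT_TIMEOUT_MS = 1000
LONG_TIMEOUT_MS = 10000


def timeout_long_by_default(expectations):
    """Policy from comment 7: long timeout for everything, except a short
    timeout for tests that are already expected to time out."""
    if "TIMEOUT" in expectations:
        return SHORT_TIMEOUT_MS
    return LONG_TIMEOUT_MS


def timeout_fast_by_default(modifiers):
    """Policy from comment 8: short timeout for everything, except a long
    timeout for tests explicitly marked SLOW."""
    if "SLOW" in modifiers:
        return LONG_TIMEOUT_MS
    return SHORT_TIMEOUT_MS
```

Under the first policy an unlisted test that gradually gets slower keeps passing until it exceeds the long timeout; under the second it starts failing at the short timeout and has to be marked SLOW explicitly.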