32954 – have run-webkit-tests output the JSON files for tracking layout test flakiness

RESOLVED WONTFIX 32954

have run-webkit-tests output the JSON files for tracking layout test flakiness

https://bugs.webkit.org/show_bug.cgi?id=32954

Summary have run-webkit-tests output the JSON files for tracking layout test flakiness

Ojan Vafai

Reported 2009-12-26 12:57:04 PST

There have been a number of requests to add support for the webkit builders to the layout test tracking dashboard (http://src.chromium.org/viewvc/chrome/trunk/src/webkit/tools/layout_tests/flakiness_dashboard.html). It was intentionally written in a generic way. So 95% of the work to add support here is to have run-webkit-tests output the JSON files that the dashboard reads in. I've written a mini design doc at http://sites.google.com/a/chromium.org/dev/developers/design-documents/layout-tests-results-dashboard. It mostly just outlines the format of the JSON files. It also outlines the 6 steps I can think of that need doing to make it all work. Eric volunteered to do the first task (the run-webkit-tests changes). I'm happy to do the rest once that's done.

Attachments
Add attachment proposed patch, testcase, etc.

Adam Roben (:aroben)

Comment 1 2011-03-23 08:49:40 PDT

*** Bug 56852 has been marked as a duplicate of this bug. ***

Adam Roben (:aroben)

Comment 2 2011-03-23 08:55:28 PDT

See bug 56852 comment 2 for a good overview of what we need to do to fix this.

Eric Seidel (no email)

Comment 3 2011-03-23 10:15:20 PDT

The webkit-patch python code already knows how to post-process a layout-test-results directory into the json objects. But it doesn't have all the data that a NRWT JSON would. See: http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/net/layouttestresults.py I think it would be easiest to just write a webkit-patch command which spit out a JSON file of the apporpriate format from that parsed information. It would also be possible to change ORWT to spit out the json directly, but I expect that would be harder.

Adam Roben (:aroben)

Comment 4 2011-03-23 10:16:44 PDT

Reusing existing code seems best. Thanks for the pointer, Eric! One bit of information used by the JSON files that's not currently stored in layout-test-results is how long each test took to run. There may be other bits, too.

Eric Seidel (no email)

Comment 5 2011-03-23 10:22:00 PDT

Yes, that would definitely be the advantage of spitting out the JSON from ORWT directly. Ojan and I talked a long time ago about moving NRWT off of using a static results.html file and instead using a standard results.html template which populated itself from the JSON. That would make it super-easy to move ORWT to the same model, and allow us to tweak the look of the output pages w/o caring which RWT we're running. This migration-to-a-dynamic-page-for-results.html would be an excuse to rip out all of the ugly results.html output in ORWT and replace it with some simpler JSON output, including stuff needed for the flakiness dashboard.

Ojan Vafai

Comment 6 2011-03-23 15:18:15 PDT

The idea to reuse code like this is great. I was a little loathe to have two implementations that would need to be kept in sync. I think it's only the test runtimes that can't be retrieved later. That means the flakiness dashboard can't show test runtimes, but that doesn't seem like it justifies completely rewriting the code. For the sake of the treemap dashboard though, we'll need full_results.json with test runtimes. Luckily, that's a much simpler format and is the one we were intending to use for the dynamic results.html page. I think it would be fine to have old-run-webkit-tests output full_results.json and have a webkit-patch command for outputting incremental_results.json and expectations.json. FYI, incremental_results.json has a bunch of metadata (e.g. builder name, build number, etc) that we'll need to pass as arguments to the webkit-patch command.

Adam Roben (:aroben)

Comment 7 2011-03-30 20:58:15 PDT

(In reply to comment #6) > I think it would be fine to have old-run-webkit-tests output full_results.json and have a webkit-patch command for outputting incremental_results.json and expectations.json. Do we already have code that knows how to generate those two files given full_results.json?

Ojan Vafai

Comment 8 2011-03-30 20:59:52 PDT

(In reply to comment #7) > (In reply to comment #6) > > I think it would be fine to have old-run-webkit-tests output full_results.json and have a webkit-patch command for outputting incremental_results.json and expectations.json. > > Do we already have code that knows how to generate those two files given full_results.json? No. They are a totally different format with different data. The file names are a bit unfortunate. :(

Adam Roben (:aroben)

Comment 9 2011-03-30 21:55:41 PDT

Some info gleaned via IRC: <aroben> ojan: I think my main problem is not understanding the format of incremental_results.json or how it's generated <ojan> aroben: yeah...it's confusing... <ojan> aroben: you could get a small version of incremental_results.json... <ojan> aroben: with a local test run... <ojan> aroben: let me get you the right commandline to run <aroben> ojan: I guess that's true! <ojan> aroben: new-run-webkit-tests editing/selection --build-number 5 --builder-name MyBuilderName <ojan> aroben: puts the files in WebKitBuild/Debug/layout-test-results <ojan> aroben: there's 4 json files it spits out <ojan> aroben: you can ignore unexpected_results.json for now <ojan> aroben: it has the same format as full_results.json fwiw <ojan> aroben: one catch, it will only include individual tests there if the test actually fails <ojan> aroben: so you'll want to force a few tests to fail/timeout/crash to get a better sense of what the format is <aroben> ojan: which file are you referring to? <ojan> aroben: incremental_results.json <aroben> ojan: how accurate is https://sites.google.com/a/chromium.org/dev/developers/design-documents/layout-tests-results-dashboard ? <ojan> aroben: it was accurate when i wrote it almost a year ago, but some things have changed <ojan> aroben: it should be 100% accurate for expectations.json <ojan> aroben: but for incremental_results.json it's a bit outdated... <ojan> aroben: incremtnal_results.json didn't exist when i wrote that <ojan> aroben: it's the same format, but it only contains the data for one run <aroben> ojan: same as results.json, you mean? <ojan> then, ont he server-side it gets merged into results.json <ojan> aroben: and we'll need to figure out what the right values are for the *Count values, but that can be done later. <aroben> ojan: so incremental_results.json contains information about tests that didn't match expectations for the current test run? <aroben> ojan: is that a fair description? <ojan> aroben: exactly <ojan> aroben: + a little extra metadata not specific to individual tests <ojan> aroben: e.g. the webkitRevision, buildNumber, time of the test run, etc <ojan> aroben: in an ideal world, we would synthesize all the arguments needed by http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/layout_tests/layout_package/json_layout_results_generator.py from webkit-patch <ojan> aroben: so we could just use that code <aroben> ojan: seems like that should be possible <ojan> aroben: yeah...i tthink that'd be easiest <ojan> aroben: most of those arguments are straightforward <aroben> ojan: (though I might make it a separate command rather than reusing webkit-patch; this isn't very "patch"-related) <ojan> aroben: yeah, makes sense <ojan> aroben: some of the arguments will need to be passed on the commandline obviously since they depend on the bot (e.g. builder_name, build_number, etc) <aroben> ojan: build_number is just the buildbot build number? <ojan> yeah <ojan> the confusing one is builder_name vs build_name <ojan> :( <aroben> ojan: does NRWT always spit out these JSON files, even when an engineer runs the tests? <ojan> aroben: yes <ojan> aroben: but it doesn't do anything with them unless you pass a test-results-server argument <ojan> aroben: except "webkit-patch rebaseline-server" uses unexpected_results.json <aroben> ojan: so what's build_name? <ojan> aroben: lemme figure that out...i always need to look at the code to remember <ojan> aroben: for the build.webkit.org bots it might be the same a builder_name <ojan> aroben: for reasons i don't udnerstand, the chromium bots store the layout test results at a path with a different name than the builder_name <ojan> aroben: e.g. http://build.chromium.org/f/chromium/layout_test_results/Webkit_Win__deps__dbg__1_/79958/ <ojan> is for Webkit Win (deps)(dbg)(1) <aroben> ojan: I see <ojan> aroben: but it looks like the build.webkit.org bots don't have that distinction <aroben> ojan: ok, we'll leave it the same for now and see what happens <ojan> aroben: yup :) <aroben> ojan: there are other differences between build.webkit.org and build.chromium.org's URLs <aroben> ojan: I wonder if they will matter <ojan> aroben: they won't matter for the JSON data <ojan> aroben: we'll need to fix a bunch of assumptions in the dashboard itself <aroben> ojan: where does that code live? <ojan> aroben: the dashboard code? <aroben> ojan: yeah <ojan> aroben: it's in the chromium repo <ojan> aroben: http://src.chromium.org/viewvc/chrome/trunk/tools/dashboards/ <ojan> aroben: the dashboard is just a handful of static html/js files that pull in the JSON files, so there's no server really <ojan> aroben: test-results.appspot.com is just a dumb file server except for the fact that it knows how to merge incremental_results.json files <ojan> aroben: fwiw, i don't think anyone would be opposed to moving the dashboard code into the webkit tree as long as it's still OK for it to have some support for the chromium tests <ojan> aroben: the test-results.appspot.com code is in the webkit tree already <aroben> ojan: I don't really care where it lives, as long as we can get this all working <aroben> ojan: huh? <ojan> aroben: http://trac.webkit.org/browser/trunk/Tools/TestResultServer <aroben> ojan: oh, but the dashboard code is separate <ojan> yeah <ojan> aroben: the dashboard code could be served up from anywhere <aroben> ojan: seems like it would make sense to have it all in one place <ojan> makes it easy to "run a local server" since you just load the html file from your local checkout in a browser :) <ojan> aroben: yeah...there just wasn't much motivation since only the chromium bots were supported anyways <ojan> aroben: we should prob move the code once we start hooking bits together <ojan> aroben: but i don't really care either way :) <aroben> ojan: I only care if I personally need to make changes to the dashboard <aroben> ojan: in which case svn.webkit.org is much easier for me <aroben> ojan: what's "fixable" in full_results.json? <aroben> ojan: oh, your docs page explains it <aroben> ojan: for us it will just be the number of failures I think <ojan> aroben: yes <ojan> aroben: and most of the *Count values will always be 0 for you <ojan> aroben: yeah, i think fixableCount is the only Count value that makes any sense for non-Chromium ports

Adam Roben (:aroben)

Comment 10 2011-03-30 21:58:29 PDT

I wanted to get a handle on how json_layout_results_generator.py works, so I: 1) Put a breakpoint in JSONLayoutTestResultsGenerator.__init__ 2) Modified css1/basic/class_as_selector.html and css1/basic/comments.html to time out 3) Ran this command: new-run-webkit-tests css1/basic --build-number 5 --builder-name MyBuilderName Then I printed the arguments passed to __init__: (Pdb) p port <port.mac.MacPort object at 0x10ddb12d0> (Pdb) p builder_name 'MyBuilderName' (Pdb) p build_name 'DUMMY_BUILD_NAME' (Pdb) p build_number '5' (Pdb) p results_file_base_path u'/Users/aroben/dev/BuildProducts/Debug/layout-test-results' (Pdb) p builder_base_url 'http://build.chromium.org/buildbot/layout_test_results/' (Pdb) p test_timings [<webkitpy.layout_tests.layout_package.test_results.TestResult object at 0x10df09e90>, <webkitpy.layout_tests.layout_package.test_results.TestResult object at 0x10df09fd0>, <webkitpy.layout_tests.layout_package.test_results.TestResult object at 0x10df09bd0>, <webkitpy.layout_tests.layout_package.test_results.TestResult object at 0x10df09ed0>, <webkitpy.layout_tests.layout_package.test_results.TestResult object at 0x10df0f110>, <webkitpy.layout_tests.layout_package.test_results.TestResult object at 0x10df0f050>, <webkitpy.layout_tests.layout_package.test_results.TestResult object at 0x10df0f090>] (Pdb) p test_timings[0] <webkitpy.layout_tests.layout_package.test_results.TestResult object at 0x10df09e90> (Pdb) p expectations <webkitpy.layout_tests.layout_package.test_expectations.TestExpectations instance at 0x10ddc2b48> (Pdb) p result_summary <webkitpy.layout_tests.layout_package.result_summary.ResultSummary object at 0x10ddc6450> (Pdb) p all_tests ['/Users/aroben/dev/WebKit/OpenSource/LayoutTests/css1/basic/class_as_selector.html', '/Users/aroben/dev/WebKit/OpenSource/LayoutTests/css1/basic/comments.html', '/Users/aroben/dev/WebKit/OpenSource/LayoutTests/css1/basic/containment.html', '/Users/aroben/dev/WebKit/OpenSource/LayoutTests/css1/basic/contextual_selectors.html', '/Users/aroben/dev/WebKit/OpenSource/LayoutTests/css1/basic/grouping.html', '/Users/aroben/dev/WebKit/OpenSource/LayoutTests/css1/basic/id_as_selector.html', '/Users/aroben/dev/WebKit/OpenSource/LayoutTests/css1/basic/inheritance.html'] (Pdb) p test_results_server '' (Pdb) p test_type 'layout-tests' (Pdb) p master_name None

Ojan Vafai

Comment 11 2011-03-30 22:07:45 PDT

FWIW, test_results_server and master_name will only be needed when we decide to upload these. At that point the values for the build.webkit.org bots should be: test_results_server=test-results.appspot.com master_name=webkit.org builder_base_url is unused. unfortunately, removing the argument is a bit tricky since there in calling code on the chromium-side that also needs to be modified.

Adam Roben (:aroben)

Comment 12 2011-07-21 15:48:40 PDT

We're pretty close to having all bots using NRWT now, at which point this won't be needed. I don't think it's worth the effort to do this at this point.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution WONTFIX

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS All

Product WebKit

Component Tools / Tests

Assignee

Nobody

Reported

2009-12-26 12:57 PST

Modified

2011-07-21 15:48 PDT History

CC List

5 users Show

URL

Keywords

Depends on

Blocks

56852

Dependencies

tree graph