RESOLVED FIXED 199804
[ews-build] Parse full_results.json for layout-tests
https://bugs.webkit.org/show_bug.cgi?id=199804
Summary [ews-build] Parse full_results.json for layout-tests
Aakash Jain
Reported 2019-07-15 13:57:26 PDT
Parse full_results.json for layout-tests in EWS. These results (from multiple layout-test runs) will help in determining whether the failures are introduced by the patch, or are pre-existing.
Attachments
WIP (96.85 KB, patch)
2019-07-15 14:25 PDT, Aakash Jain
no flags
WIP (12.11 KB, patch)
2019-07-16 17:38 PDT, Aakash Jain
no flags
Patch for landing (17.36 KB, patch)
2019-07-17 09:10 PDT, Aakash Jain
no flags
Aakash Jain
Comment 1 2019-07-15 14:25:21 PDT
EWS Watchlist
Comment 2 2019-07-15 14:26:44 PDT
Attachment 374149 [details] did not pass style-queue: ERROR: Tools/BuildSlaveSupport/ews-build/steps_unittest.py:1013: missing whitespace after ',' [pep8/E231] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps_unittest.py:1011: [TestRunWebKitTests.test_parse_results_json] Passing unexpected keyword argument 'state_string' in function call [pylint/E1123] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps_unittest.py:1011: [TestRunWebKitTests.test_parse_results_json] No value passed for parameter 'status_text' in function call [pylint/E1120] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps_unittest.py:1013: [TestRunWebKitTests.test_parse_results_json] Instance of 'TestRunWebKitTests' has no 'assertEqual' member [pylint/E1101] [5] ERROR: Tools/BuildSlaveSupport/ews-build/resultsjsonparser.py:33: multiple imports on one line [pep8/E401] [5] ERROR: Tools/BuildSlaveSupport/ews-build/test_expectations.py:227: at least two spaces before inline comment [pep8/E261] [5] ERROR: Tools/BuildSlaveSupport/ews-build/test_expectations.py:893: at least two spaces before inline comment [pep8/E261] [5] Total errors found: 7 in 8 files If any of these errors are false positives, please file a bug against check-webkit-style.
Aakash Jain
Comment 3 2019-07-15 14:28:44 PDT
Jonathan Bedard
Comment 4 2019-07-16 13:55:27 PDT
I know we're trying to port old code, but in this case, I think we're better off just re-writing this, because the old code seems way more complicated than it should be. full_results.json is a dictionary that looks like this: { "date": "12:56PM on July 16, 2019", "fixable": 31587, "has_pretty_patch": true, "interrupted": true, "layout_tests_dir": "/Volumes/Shared/CheckoutAlpha/OpenSource/LayoutTests", "num_flaky": 0, "num_missing": 0, "num_passes": 23742, "num_regressions": 10, "other_crashes": { "com.apple.WebKit.WebContent.Development-43082": {}, "com.apple.WebKit.WebContent.Development-45177": {}, "com.apple.WebKit.WebContent.Development-45177-1": {}, "com.apple.WebKit.WebContent.Development-45177-2": {}, "com.apple.WebKit.WebContent.Development-45177-3": {}, "com.apple.WebKit.WebContent.Development-45296": {}, "com.apple.WebKit.WebContent.Development-45296-1": {}, "com.apple.WebKit.WebContent.Development-45296-2": {}, "com.apple.WebKit.WebContent.Development-45296-3": {} }, "pixel_tests_enabled": false, "skipped": 31577, "tests": { "css3": { "filters": { "backdrop": { "backdrop-filter-with-border-radius-and-reflection-remove.html": { "actual": "IMAGE", "expected": "PASS", "image_diff_percent": 0.01, "reftest_type": [ "==" ], "report": "REGRESSION" } } } }, "http": { "tests": { "media": { "modern-media-controls": { "macos-fullscreen-media-controls": { "macos-fullscreen-media-controls-live-broadcast.html": { "actual": "TIMEOUT", "expected": "PASS", "has_stderr": true, "report": "REGRESSION" } }, "pip-support": { "pip-support-live-broadcast.html": { "actual": "TIMEOUT", "expected": "PASS", "has_stderr": true, "report": "REGRESSION" } }, "skip-back-support": { "skip-back-support-button-click.html": { "actual": "TIMEOUT", "expected": "PASS", "has_stderr": true, "report": "REGRESSION" }, "skip-back-support-live-broadcast.html": { "actual": "TIMEOUT", "expected": "PASS", "has_stderr": true, "report": "REGRESSION" } }, "status-support": { "status-support-live-broadcast.html": { "actual": "TIMEOUT", "expected": "PASS", "has_stderr": true, "report": "REGRESSION" }, "status-support-loading.html": { "actual": "TIMEOUT", "expected": "PASS", "has_stderr": true, "report": "REGRESSION" } }, "time-control": { "1-to-10-hours.html": { "actual": "TIMEOUT", "expected": "PASS", "has_stderr": true, "report": "REGRESSION" }, "10-hours-or-more.html": { "actual": "TIMEOUT", "expected": "PASS", "has_stderr": true, "report": "REGRESSION" } } } }, "misc": { "webtiming-ssl.php": { "actual": "TEXT", "expected": "PASS", "report": "REGRESSION" } } } } }, "uses_expectations_file": true, "version": 4 } Since I think we are just trying to extract the number of REGRESSION tags in this output, we have a few options. The first (and simplest) is if we don't actually care about which tests regressed. In that case, we just pull out 'num_regressions' and we're done. The second option is where we pull out which tests specifically failed. In that case, we need to unpack the trie (I had to look that up the first time I came across the term, apparently it's like a formal thing <https://en.wikipedia.org/wiki/Trie>). I've seen some overly complicated code in webkitpy that unpacks these Tries, all you really need is a recursive function that accepts a callback and sort of looks like this: def unpack_trie(callback, tree, name=''): if not tree or 'actual' in tree: return callback(name, tree) for key, value in tree.iteritems(): unpack_trie(callback, value, '{}.{}'.format(name, key) if name else key) I don't think we need any of the other logic that we're copying from webkitpy. I think that we can make this change in maybe ~300-500 lines of code...we certainly don't need the 2500-3k lines we're copying from webkitpy. * I think we might also need to strip the ADD_RESULTS(...); call from full_results.json since, despite it's name, full_results.json actually is not json :(.
Aakash Jain
Comment 5 2019-07-16 17:38:45 PDT
EWS Watchlist
Comment 6 2019-07-16 17:41:39 PDT Comment hidden (obsolete)
Aakash Jain
Comment 7 2019-07-16 17:49:28 PDT
(In reply to Jonathan Bedard from comment #4) > I know we're trying to port old code, but in this case, I think we're better off just re-writing this, because the old code seems way more complicated than it should be. Agree. webkitpy code is just too convoluted. Re-wrote the core logic in updated patch. > The second option is where we pull out which tests specifically failed. Yes, we need the test names in order to compare if same tests failed on clean tree as well. >In that case, we need to unpack the trie. I've seen some overly complicated code in webkitpy that unpacks these Tries, all you really need is a recursive function that accepts a callback Done. > I don't think we need any of the other logic that we're copying from webkitpy. I think that we can make this change in maybe ~300-500 lines of code...we certainly don't need the 2500-3k lines we're copying from webkitpy. Yup, after re-writing the core logic, the change is back to ~100 lines as you expected. > * I think we might also need to strip the ADD_RESULTS(...) Done
Aakash Jain
Comment 8 2019-07-17 09:10:08 PDT
Created attachment 374299 [details] Patch for landing
Aakash Jain
Comment 9 2019-07-17 09:11:16 PDT
Added few more unit-tests. Sample run: https://ews-build.webkit-uat.org/#/builders/40/builds/314
Aakash Jain
Comment 10 2019-07-17 09:11:47 PDT
Radar WebKit Bug Importer
Comment 11 2019-07-17 09:12:15 PDT
Note You need to log in before you can comment on or make changes to this bug.