Bug 199804 - [ews-build] Parse full_results.json for layout-tests
Summary: [ews-build] Parse full_results.json for layout-tests
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Aakash Jain
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-07-15 13:57 PDT by Aakash Jain
Modified: 2019-07-17 09:12 PDT (History)
4 users (show)

See Also:


Attachments
WIP (96.85 KB, patch)
2019-07-15 14:25 PDT, Aakash Jain
no flags Details | Formatted Diff | Diff
WIP (12.11 KB, patch)
2019-07-16 17:38 PDT, Aakash Jain
no flags Details | Formatted Diff | Diff
Patch for landing (17.36 KB, patch)
2019-07-17 09:10 PDT, Aakash Jain
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Aakash Jain 2019-07-15 13:57:26 PDT
Parse full_results.json for layout-tests in EWS. These results (from multiple layout-test runs) will help in determining whether the failures are introduced by the patch, or are pre-existing.
Comment 1 Aakash Jain 2019-07-15 14:25:21 PDT
Created attachment 374149 [details]
WIP
Comment 2 Build Bot 2019-07-15 14:26:44 PDT
Attachment 374149 [details] did not pass style-queue:


ERROR: Tools/BuildSlaveSupport/ews-build/steps_unittest.py:1013:  missing whitespace after ','  [pep8/E231] [5]
ERROR: Tools/BuildSlaveSupport/ews-build/steps_unittest.py:1011:  [TestRunWebKitTests.test_parse_results_json] Passing unexpected keyword argument 'state_string' in function call  [pylint/E1123] [5]
ERROR: Tools/BuildSlaveSupport/ews-build/steps_unittest.py:1011:  [TestRunWebKitTests.test_parse_results_json] No value passed for parameter 'status_text' in function call  [pylint/E1120] [5]
ERROR: Tools/BuildSlaveSupport/ews-build/steps_unittest.py:1013:  [TestRunWebKitTests.test_parse_results_json] Instance of 'TestRunWebKitTests' has no 'assertEqual' member  [pylint/E1101] [5]
ERROR: Tools/BuildSlaveSupport/ews-build/resultsjsonparser.py:33:  multiple imports on one line  [pep8/E401] [5]
ERROR: Tools/BuildSlaveSupport/ews-build/test_expectations.py:227:  at least two spaces before inline comment  [pep8/E261] [5]
ERROR: Tools/BuildSlaveSupport/ews-build/test_expectations.py:893:  at least two spaces before inline comment  [pep8/E261] [5]
Total errors found: 7 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Aakash Jain 2019-07-15 14:28:44 PDT
Sample run: https://ews-build.webkit-uat.org/#/builders/40/builds/282
Comment 4 Jonathan Bedard 2019-07-16 13:55:27 PDT
I know we're trying to port old code, but in this case, I think we're better off just re-writing this, because the old code seems way more complicated than it should be. full_results.json is a dictionary that looks like this:

{
    "date": "12:56PM on July 16, 2019",
    "fixable": 31587,
    "has_pretty_patch": true,
    "interrupted": true,
    "layout_tests_dir": "/Volumes/Shared/CheckoutAlpha/OpenSource/LayoutTests",
    "num_flaky": 0,
    "num_missing": 0,
    "num_passes": 23742,
    "num_regressions": 10,
    "other_crashes": {
        "com.apple.WebKit.WebContent.Development-43082": {},
        "com.apple.WebKit.WebContent.Development-45177": {},
        "com.apple.WebKit.WebContent.Development-45177-1": {},
        "com.apple.WebKit.WebContent.Development-45177-2": {},
        "com.apple.WebKit.WebContent.Development-45177-3": {},
        "com.apple.WebKit.WebContent.Development-45296": {},
        "com.apple.WebKit.WebContent.Development-45296-1": {},
        "com.apple.WebKit.WebContent.Development-45296-2": {},
        "com.apple.WebKit.WebContent.Development-45296-3": {}
    },
    "pixel_tests_enabled": false,
    "skipped": 31577,
    "tests": {
        "css3": {
            "filters": {
                "backdrop": {
                    "backdrop-filter-with-border-radius-and-reflection-remove.html": {
                        "actual": "IMAGE",
                        "expected": "PASS",
                        "image_diff_percent": 0.01,
                        "reftest_type": [
                            "=="
                        ],
                        "report": "REGRESSION"
                    }
                }
            }
        },
        "http": {
            "tests": {
                "media": {
                    "modern-media-controls": {
                        "macos-fullscreen-media-controls": {
                            "macos-fullscreen-media-controls-live-broadcast.html": {
                                "actual": "TIMEOUT",
                                "expected": "PASS",
                                "has_stderr": true,
                                "report": "REGRESSION"
                            }
                        },
                        "pip-support": {
                            "pip-support-live-broadcast.html": {
                                "actual": "TIMEOUT",
                                "expected": "PASS",
                                "has_stderr": true,
                                "report": "REGRESSION"
                            }
                        },
                        "skip-back-support": {
                            "skip-back-support-button-click.html": {
                                "actual": "TIMEOUT",
                                "expected": "PASS",
                                "has_stderr": true,
                                "report": "REGRESSION"
                            },
                            "skip-back-support-live-broadcast.html": {
                                "actual": "TIMEOUT",
                                "expected": "PASS",
                                "has_stderr": true,
                                "report": "REGRESSION"
                            }
                        },
                        "status-support": {
                            "status-support-live-broadcast.html": {
                                "actual": "TIMEOUT",
                                "expected": "PASS",
                                "has_stderr": true,
                                "report": "REGRESSION"
                            },
                            "status-support-loading.html": {
                                "actual": "TIMEOUT",
                                "expected": "PASS",
                                "has_stderr": true,
                                "report": "REGRESSION"
                            }
                        },
                        "time-control": {
                            "1-to-10-hours.html": {
                                "actual": "TIMEOUT",
                                "expected": "PASS",
                                "has_stderr": true,
                                "report": "REGRESSION"
                            },
                            "10-hours-or-more.html": {
                                "actual": "TIMEOUT",
                                "expected": "PASS",
                                "has_stderr": true,
                                "report": "REGRESSION"
                            }
                        }
                    }
                },
                "misc": {
                    "webtiming-ssl.php": {
                        "actual": "TEXT",
                        "expected": "PASS",
                        "report": "REGRESSION"
                    }
                }
            }
        }
    },
    "uses_expectations_file": true,
    "version": 4
}

Since I think we are just trying to extract the number of REGRESSION tags in this output, we have a few options. The first (and simplest) is if we don't actually care about which tests regressed. In that case, we just pull out 'num_regressions' and we're done.

The second option is where we pull out which tests specifically failed. In that case, we need to unpack the trie (I had to look that up the first time I came across the term, apparently it's like a formal thing <https://en.wikipedia.org/wiki/Trie>). I've seen some overly complicated code in webkitpy that unpacks these Tries, all you really need is a recursive function that accepts a callback and sort of looks like this:

def unpack_trie(callback, tree, name=''):
    if not tree or 'actual' in tree:
        return callback(name, tree)

    for key, value in tree.iteritems():
        unpack_trie(callback, value, '{}.{}'.format(name, key) if name else key)

I don't think we need any of the other logic that we're copying from webkitpy. I think that we can make this change in maybe ~300-500 lines of code...we certainly don't need the 2500-3k lines we're copying from webkitpy.

* I think we might also need to strip the ADD_RESULTS(...); call from full_results.json since, despite it's name, full_results.json actually is not json :(.
Comment 5 Aakash Jain 2019-07-16 17:38:45 PDT
Created attachment 374268 [details]
WIP
Comment 6 Build Bot 2019-07-16 17:41:39 PDT Comment hidden (obsolete)
Comment 7 Aakash Jain 2019-07-16 17:49:28 PDT
(In reply to Jonathan Bedard from comment #4)
> I know we're trying to port old code, but in this case, I think we're better off just re-writing this, because the old code seems way more complicated than it should be.
Agree. webkitpy code is just too convoluted. Re-wrote the core logic in updated patch.


> The second option is where we pull out which tests specifically failed. 
Yes, we need the test names in order to compare if same tests failed on clean tree as well.

>In that case, we need to unpack the trie. I've seen some overly complicated code in webkitpy that unpacks these Tries, all you really need is a recursive function that accepts a callback
Done.

> I don't think we need any of the other logic that we're copying from webkitpy. I think that we can make this change in maybe ~300-500 lines of code...we certainly don't need the 2500-3k lines we're copying from webkitpy.
Yup, after re-writing the core logic, the change is back to ~100 lines as you expected.

> * I think we might also need to strip the ADD_RESULTS(...)
Done
Comment 8 Aakash Jain 2019-07-17 09:10:08 PDT
Created attachment 374299 [details]
Patch for landing
Comment 9 Aakash Jain 2019-07-17 09:11:16 PDT
Added few more unit-tests. 

Sample run: https://ews-build.webkit-uat.org/#/builders/40/builds/314
Comment 10 Aakash Jain 2019-07-17 09:11:47 PDT
Committed r247515: <https://trac.webkit.org/changeset/247515>
Comment 11 Radar WebKit Bug Importer 2019-07-17 09:12:15 PDT
<rdar://problem/53205637>