Bug 137353 - Add --json flag to test-webkitpy to emit JSON formatted test results on stdout
Summary: Add --json flag to test-webkitpy to emit JSON formatted test results on stdout
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: 528+ (Nightly build)
Hardware: Unspecified
OS: Unspecified
Importance: P2 Normal
Assignee: Jake Nielsen
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-10-02 12:27 PDT by Jake Nielsen
Modified: 2014-10-15 13:56 PDT
CC List: 4 users

See Also:


Attachments
Adds the appropriate mechanisms to allow test-webkitpy to generate a json file of its results in a specified directory (18.35 KB, patch)
2014-10-02 18:49 PDT, Jake Nielsen
no flags
Fixes help message for --results-dir option, and addresses stylebot's issues (18.57 KB, patch)
2014-10-02 19:01 PDT, Jake Nielsen
no flags
Removes the TestResult(s) class. (11.64 KB, patch)
2014-10-03 13:43 PDT, Jake Nielsen
dbates: review-
dbates: commit-queue-
Adds the results.py file back (with significant changes) and adds results_unittest.py (23.38 KB, patch)
2014-10-05 19:58 PDT, Jake Nielsen
jake.nielsen.webkit: review-
jake.nielsen.webkit: commit-queue-
Adds Config object logic, and changes runtest.py to make sure it doesn't accidentally circumvent mock injection (20.96 KB, patch)
2014-10-06 11:57 PDT, Jake Nielsen
dbates: review-
dbates: commit-queue-
Changes to the --json flag solution to the problem, as discussed in person. (45.76 KB, patch)
2014-10-10 17:44 PDT, Jake Nielsen
dbates: review-
dbates: commit-queue-
Changes to use the _print_results_as_json method, as discussed in person. (4.83 KB, patch)
2014-10-13 12:58 PDT, Jake Nielsen
dbates: review+
dbates: commit-queue-
Fixes as per comments. (18.57 KB, patch)
2014-10-13 16:46 PDT, Jake Nielsen
jake.nielsen.webkit: review-
jake.nielsen.webkit: commit-queue-
The patch I intended to post last time... (5.66 KB, patch)
2014-10-13 16:50 PDT, Jake Nielsen
dbates: review+
Removes redundant test names from the list of tests (5.70 KB, patch)
2014-10-15 10:46 PDT, Jake Nielsen
no flags
Fixes style issues (5.76 KB, patch)
2014-10-15 10:51 PDT, Jake Nielsen
dbates: review+
Dictionary key change (5.76 KB, patch)
2014-10-15 12:46 PDT, Jake Nielsen
no flags

Description Jake Nielsen 2014-10-02 12:27:33 PDT
test-webkitpy can optionally take in --results-dir and put a json file outlining its results into the specified directory
Comment 1 Jake Nielsen 2014-10-02 18:49:29 PDT
Created attachment 239171 [details]
Adds the appropriate mechanisms to allow test-webkitpy to generate a json file of its results in a specified directory
Comment 2 WebKit Commit Bot 2014-10-02 18:52:02 PDT
Attachment 239171 [details] did not pass style-queue:


ERROR: Tools/Scripts/webkitpy/test/results.py:26:  expected 2 blank lines, found 1  [pep8/E302] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:27:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:28:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:29:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:31:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:32:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:33:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:34:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:35:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:36:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:38:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:39:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:40:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:41:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:43:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:44:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:46:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:47:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:49:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:50:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:54:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:55:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:56:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:57:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:59:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:60:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:61:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:62:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:64:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:65:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:67:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:68:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:70:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:71:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:73:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:74:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:76:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:77:  indentation contains tabs  [pep8/W191] [5]
ERROR: Tools/Scripts/webkitpy/test/results.py:78:  blank line at end of file  [pep8/W391] [5]
Total errors found: 39 in 9 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Jake Nielsen 2014-10-02 19:01:29 PDT
Created attachment 239173 [details]
Fixes help message for --results-dir option, and addresses stylebot's issues
Comment 4 Daniel Bates 2014-10-03 09:50:39 PDT
Comment on attachment 239173 [details]
Fixes help message for --results-dir option, and addresses stylebot's issues

I'm unclear on the motivation for this change from reading the bug description (comment #0) and change log entry. Can you elaborate on it, including the intended audience of the JSON file?
Comment 5 Jake Nielsen 2014-10-03 11:42:59 PDT
(In reply to comment #4)
> (From update of attachment 239173 [details])
> I'm unclear on the motivation for this change from reading the bug description (comment #0) and change log entry. Can you elaborate on it, including the intended audience of the JSON file?

Sure. We want the webkit buildbot page (and in particular results.html) to be able to display the results of the python unit tests. We want EWS bubbles for them as well. 

JSC and webkit-perl tests are going to follow suit as well.
Comment 6 Jake Nielsen 2014-10-03 13:43:47 PDT
Created attachment 239227 [details]
Removes the TestResult(s) class.
Comment 7 Daniel Bates 2014-10-03 14:23:24 PDT
Comment on attachment 239227 [details]
Removes the TestResult(s) class.

View in context: https://bugs.webkit.org/attachment.cgi?id=239227&action=review

> Tools/ChangeLog:7
> +

We should add a description to this change log entry that describes the reasoning behind this change, based on your remark in comment #5, because the bug title doesn't explain the reasoning for this change.

> Tools/Scripts/webkitpy/common/webkit_finder.py:39
> +    def webkit_results_dir(self):
> +        path_to_build_dir = self._filesystem.join(self.webkit_base(), "WebKitBuild")
> +        configuration_dir = self._filesystem.read_text_file(self._filesystem.join(path_to_build_dir, "Configuration"))
> +        return self._filesystem.join(path_to_build_dir, configuration_dir)
> +

This will not work for people that specify a custom build directory path using the environment variable WEBKIT_OUTPUTDIR (see second paragraph on <http://www.webkit.org/building/build.html>). Notice we have existing logic to compute the correct build directory in Config.build_directory(). Can we make use of this functionality?

> Tools/Scripts/webkitpy/test/main.py:108
> +        parser.add_option('-r', '--results-dir', action='store', default="",

I don't see the value in abbreviating the word "directory" as "dir". Moreover, --results-dir is inconsistent with the naming of the command line option --results-directory used in existing tools for a similar purpose, including run-webkit-tests.

> Tools/Scripts/webkitpy/test/main.py:109
> +                          help='specifies a directory in which to generate a json file that contains the results of the tests')

Nit: json => JSON

(JSON is an acronym for JavaScript Object Notation).
Comment 8 Jake Nielsen 2014-10-05 19:58:54 PDT
Created attachment 239311 [details]
Adds the results.py file back (with significant changes) and adds results_unittest.py 

After some discussion with Dan in person, we decided that since the class in results.py will be reused very shortly, it's worth including in this patch.
Comment 9 Jake Nielsen 2014-10-05 19:59:57 PDT
Comment on attachment 239311 [details]
Adds the results.py file back (with significant changes) and adds results_unittest.py 

Didn't address all of Dan's comments.
Comment 10 Jake Nielsen 2014-10-05 21:43:00 PDT
Constructing this Config object is taking a lot longer than I thought it would. I'll come back to this tomorrow.
Comment 11 Jake Nielsen 2014-10-06 11:57:32 PDT
Created attachment 239345 [details]
Adds Config object logic, and changes runtest.py to make sure it doesn't accidentally circumvent mock injection
Comment 12 Daniel Bates 2014-10-08 01:40:34 PDT
Comment on attachment 239345 [details]
Adds Config object logic, and changes runtest.py to make sure it doesn't accidentally circumvent mock injection

View in context: https://bugs.webkit.org/attachment.cgi?id=239345&action=review

> Tools/ChangeLog:8
> +        The webkit buildbot page's results.html file should contain results

Nit: webkit => WebKit

> Tools/ChangeLog:9
> +        for the python unit tests (and soon to be others), as should the EWS

Nit: python => Python

> Tools/ChangeLog:31
> +        in an equivelant TestSuiteResults object.

Nit: equivelant => equivalent

> Tools/Scripts/webkitpy/test/main.py:78
> +        self._filesystem = filesystem or FileSystem()

Notice that Finder stores the passed FileSystem object in the public instance variable, Finder.filesystem. It seems sufficient to go through Finder to get the FileSystem object instead of explicitly storing a reference to it in a private instance variable (i.e. self.finder.filesystem).

> Tools/Scripts/webkitpy/test/main.py:107
> +        parser.add_option('-r', '--results-directory', action='store', default="",

Nit: " => '
(double quote) to (single quote) for the empty string literal defined as the default value of parameter default.

> Tools/Scripts/webkitpy/test/main.py:170
> +            self._filesystem.write_text_file(self._filesystem.join(self._options.results_directory, "results.json"), test_runner.test_suite_results.test_suite_results_to_json_string())

Nit: " => '

(double quote) to (single quotes) for the string literal 'results.json'.

> Tools/Scripts/webkitpy/test/results.py:26
> +class TestSuiteResults(object):

This class is OK as-is. Having said that it seems sufficient to have non-member functions that deserialize and serialize test result data as opposed to defining a class with such functionality.

> Tools/Scripts/webkitpy/test/results.py:44
> +            test_result_dict['did_pass'] = test_result_tuple[1]

I take it that did_pass cannot be inferred from the values of failure_messages and error_messages?

> Tools/Scripts/webkitpy/test/results.py:54
> +            self.list_of_test_results = initial_list_of_test_results

Would it make sense to make self.list_of_test_results a private instance variable? Or remove list_of_failed_test_results() and list_of_errored_test_results()? I mean, it seems unnecessary to both expose all the tests and provide convenience methods to filter the tests as a client can perform the filtering themselves.

> Tools/Scripts/webkitpy/test/runner.py:59
> +        self.test_suite_results.add_test_result(test_name=test_name, did_pass=not bool(failures + errors), failure_messages=failures, error_messages=errors)

Nit: I suggest removing the parameter name for the first argument (test_name) as its purpose is clear from the name of its argument (the local variable named test_name):

self.test_suite_results.add_test_result(test_name, did_pass=not bool(failures + errors), failure_messages=failures, error_messages=errors)

See my question about did_pass in TestSuiteResults.test_suite_results_to_json_string(). If did_pass can be inferred from the values of failure_messages and error_messages then we can further simplify this line of code.

> Tools/Scripts/webkitpy/tool/steps/runtests.py:73
> +            filesystem = self._tool.filesystem
> +            port_name = self._tool.deprecated_port().name()
> +
> +            if not type(port_name) is str:
> +                port_name = port_name.__name__
> +
> +            config = Config(self._tool.executive, filesystem, port_name)
> +
> +            python_unittests_results_dir = filesystem.join(config.build_directory(config.default_configuration()), "python-unittest-results")
> +            filesystem.maybe_make_directory(python_unittests_results_dir)
>  
>              python_unittests_command = self._tool.deprecated_port().run_python_unittests_command()
> +            python_unittests_command.append("--results-directory")
> +            python_unittests_command.append(python_unittests_results_dir)
> +

I envisioned calling Config.build_directory() through an instance of class Port. In particular, I envisioned adding a method to class Port similar to Port.default_results_directory(), say Port.python_unittest_results_directory(), that returned the path to the Python unit tests results directory:

def python_unittest_results_directory(self):
    return self._build_path('python-unittest-results')

Then we can write this code to have the form:

...
if not self._options.non_interactive:
    filesystem = self._tool.filesystem
    python_unittest_results_directory = self._tool.port_factory.get().python_unittest_results_directory()
    filesystem.maybe_make_directory(python_unittests_results_dir)

    # FIXME: We should teach the commit-queue and the EWS how to run these tests.

    python_unittests_command = self._tool.deprecated_port().run_python_unittests_command()
    python_unittests_command.extend(['--results-directory', python_unittest_results_directory])
...

If we have Tools/Scripts/webkitpy/test/main.py create the specified results directory (if it does not already exist) then we can remove the call to FileSystem.maybe_make_directory() and simplify this code to have the form:

...
if not self._options.non_interactive:
    # FIXME: We should teach the commit-queue and the EWS how to run these tests.

    python_unittests_command = self._tool.deprecated_port().run_python_unittests_command()
    python_unittests_command.extend(['--results-directory', self._tool.port_factory.get().python_unittest_results_directory()])
...
Comment 13 Daniel Bates 2014-10-08 14:49:13 PDT
I thought about the approach proposed by this patch some more. Notice that the Python unit test runner has an existing abstraction (webkitpy.test.Printer) for formatting and writing test result output. Can we make use of the class webkitpy.test.Printer to support emitting JSON output? In particular, can we either extend the class webkitpy.test.Printer to support emitting JSON-formatted output or define a subclass of it that does such formatting?
Comment 14 Jake Nielsen 2014-10-10 17:44:53 PDT
Created attachment 239662 [details]
Changes to the --json flag solution to the problem, as discussed in person.
Comment 15 Daniel Bates 2014-10-10 20:45:12 PDT
Comment on attachment 239662 [details]
Changes to the --json flag solution to the problem, as discussed in person.

View in context: https://bugs.webkit.org/attachment.cgi?id=239662&action=review

> Tools/Scripts/webkitpy/test/main.py:120
> +            self.printer = JSONPrinter(self._stream_override or sys.stdout, self._options)

How did you come to the decision to emit the JSON output to standard output by default given that we emit the human-readable output to standard error by default?

> Tools/Scripts/webkitpy/test/main.py:124
> +        self.printer.print_started_test_suite("Python Unit Tests")

How did you come to the decision to emit this text?

> Tools/Scripts/webkitpy/test/printers/human_readable_printer.py:111
> +        elif not test_did_pass:
> +            lines = []
> +            suffix = ' failed'
> +            self.num_failures += 1

We will never execute this code since we handle the cases when failures and errors are non-empty lists (above) and "not test_did_pass" only evaluates to true if either failures or errors is a non-empty list.

> Tools/Scripts/webkitpy/test/printers/json_printer.py:76
> +    def print_started_test_suite(self, test_suite_name):
> +        self.stream.write("{")
> +        self.stream.write("\"suite_name\" : \"%s\",\n" % test_suite_name)
> +        self.stream.write("\"list_of_tests\" : [\n")
> +
> +    def write_update(self, msg):
> +        pass
> +
> +    def print_started_test(self, source, test_name):
> +        pass
> +
> +    def print_finished_test(self, source, test_name, test_did_pass, test_time, failures, errors):
> +        self._number_of_tests_run += 1
> +        if not self._first_test_flag:
> +            self.stream.write(",\n")
> +
> +        test_dict = {}
> +        test_dict["test_name"] = test_name
> +        test_dict["test_did_pass"] = test_did_pass
> +        test_dict["failures"] = failures
> +        test_dict["errors"] = errors
> +
> +        self.stream.write(json.dumps(test_dict))
> +
> +        self._first_test_flag = False
> +
> +    def print_result(self, run_time):
> +        self.stream.write("],\n\"test_run_time\" : %s,\n\"number_of_tests_run\" : %s}" % (run_time, self._number_of_tests_run))

How did you come to the decision to piecewise write the JSON data to the stream? I suggest that we build the test_dict structure in memory and then implement print_result() to write the result of json.dumps(test_dict) to the stream self.stream. The benefit of this approach is that we do not need to explicitly emit any JSON markup ourselves in print_started_test_suite() or print_finished_test(), which is brittle and error prone.
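For reference, a minimal sketch of that accumulate-then-dump approach (method names and signatures follow the quoted diff; this is illustrative only, since the landed patch ultimately replaced JSONPrinter with a _print_results_as_json helper):

import json

class JSONPrinter(object):
    def __init__(self, stream):
        self.stream = stream
        self._results = {'tests': []}

    def print_started_test_suite(self, test_suite_name):
        # Record the suite name; no JSON fragments are written here.
        self._results['suite_name'] = test_suite_name

    def print_started_test(self, source, test_name):
        pass

    def print_finished_test(self, source, test_name, test_did_pass, test_time, failures, errors):
        # Accumulate each result in memory rather than streaming JSON piecewise.
        self._results['tests'].append({
            'test_name': test_name,
            'test_did_pass': test_did_pass,
            'failures': failures,
            'errors': errors,
        })

    def print_result(self, run_time):
        # Serialize everything at once; json.dump guarantees well-formed output.
        self._results['test_run_time'] = run_time
        self._results['number_of_tests_run'] = len(self._results['tests'])
        json.dump(self._results, self.stream)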

> Tools/Scripts/webkitpy/test/runner.py:65
> +        self.printer.print_finished_test(source, test_name, not bool(failures + errors), delay, failures, errors)

I don't see the value of the new argument, tests_did_pass, given that the printer is passed the list of failures and errors and can compute tests_did_pass itself.
Comment 16 Jake Nielsen 2014-10-13 10:00:45 PDT
(In reply to comment #15)
> (From update of attachment 239662 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=239662&action=review
> 
> > Tools/Scripts/webkitpy/test/main.py:120
> > +            self.printer = JSONPrinter(self._stream_override or sys.stdout, self._options)
> 
> How did you come to the decision to emit the JSON output to standard output by default given that we emit the human-readable output to standard error by default?

This is a pretty standard choice for command line tools that you expect to be redirecting to files or piping to other programs. It allows errors to still be reported on stderr without malforming the JSON.
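For example (hypothetical shell session; the flag and script names are the ones from this bug):

$ test-webkitpy --json > results.json
# The JSON written to stdout is captured cleanly in results.json, while
# progress messages and errors still reach the terminal via stderr.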
> 
> > Tools/Scripts/webkitpy/test/main.py:124
> > +        self.printer.print_started_test_suite("Python Unit Tests")
> 
> How did you come to the decision to emit this text?
The JSON file should have some notion of what test suite it originated from. This is the way I chose to do it. In the results.py solution this was part of the constructor.
> 
> > Tools/Scripts/webkitpy/test/printers/human_readable_printer.py:111
> > +        elif not test_did_pass:
> > +            lines = []
> > +            suffix = ' failed'
> > +            self.num_failures += 1
> 
> We will never execute this code since we handle the cases when failures and errors are non-empty lists (above) and "not test_did_pass" only evaluates to true if either failures or errors is a non-empty list.

We won't here, but we will in the bindings test. The bindings test has no notion of failures and errors, only whether a test passed or not.
> 
> > Tools/Scripts/webkitpy/test/printers/json_printer.py:76
> > +    def print_started_test_suite(self, test_suite_name):
> > +        self.stream.write("{")
> > +        self.stream.write("\"suite_name\" : \"%s\",\n" % test_suite_name)
> > +        self.stream.write("\"list_of_tests\" : [\n")
> > +
> > +    def write_update(self, msg):
> > +        pass
> > +
> > +    def print_started_test(self, source, test_name):
> > +        pass
> > +
> > +    def print_finished_test(self, source, test_name, test_did_pass, test_time, failures, errors):
> > +        self._number_of_tests_run += 1
> > +        if not self._first_test_flag:
> > +            self.stream.write(",\n")
> > +
> > +        test_dict = {}
> > +        test_dict["test_name"] = test_name
> > +        test_dict["test_did_pass"] = test_did_pass
> > +        test_dict["failures"] = failures
> > +        test_dict["errors"] = errors
> > +
> > +        self.stream.write(json.dumps(test_dict))
> > +
> > +        self._first_test_flag = False
> > +
> > +    def print_result(self, run_time):
> > +        self.stream.write("],\n\"test_run_time\" : %s,\n\"number_of_tests_run\" : %s}" % (run_time, self._number_of_tests_run))
> 
> How did you come to the decision to piecewise write the JSON data to the stream? I suggest that we build the test_dict structure in memory and then implement print_result() to write the result of json.dumps(test_dict) to the stream self.stream. The benefit of this approach is that we do not need to explicitly emit any JSON markup ourselves in print_started_test_suite() or print_finished_test(), which is brittle and error prone.

I'll go change that.
> 
> > Tools/Scripts/webkitpy/test/runner.py:65
> > +        self.printer.print_finished_test(source, test_name, not bool(failures + errors), delay, failures, errors)
> 
> I don't see the value of the new argument, tests_did_pass, given that the printer is passed the list of failures and errors and can compute tests_did_pass itself.

In this test, you're correct, it's redundant. In the bindings tests it's not.
Comment 17 Jake Nielsen 2014-10-13 12:58:46 PDT
Created attachment 239739 [details]
Changes to use the _print_results_as_json method, as discussed in person.
Comment 18 Daniel Bates 2014-10-13 15:27:23 PDT
Comment on attachment 239739 [details]
Changes to use the _print_results_as_json method, as discussed in person.

View in context: https://bugs.webkit.org/attachment.cgi?id=239739&action=review

> Tools/ChangeLog:4
> +        on stdout

Nit: on => to

> Tools/ChangeLog:11
> +        Adds the read method to MockFile to allow the stdout of test_webkitpy
> +        to be redirected without causing erroneous test failures.

For your consideration, I suggest adding a remark about the test(s) that failed without this change.

> Tools/Scripts/webkitpy/port/server_process_unittest.py:71
> +        # This means EOF

Although it can be implied that this comment is with respect to the line above it given the length of the body of this function, it would be better to be explicit about this relationship and either move the comment to actually be above the return statement (line 70) or inline the comment on the same line as the return statement. You may also want to consider writing out End of file instead of abbreviating it as EOF to be explicit even though I suspect our audience is familiar with the abbreviation of EOF to mean End of file.

> Tools/Scripts/webkitpy/test/main.py:36
>  import time
>  import traceback
>  import unittest
> +import operator
> +import json

Please order the imports in this file such that they are sorted according to the output of the UNIX sort command.

> Tools/Scripts/webkitpy/test/main.py:108
> +                          help='emit JSON formatted test results on stdout')

Nit: on => to

The word "emit" in this help message can be taken as a synonym for the verb "output". I suggest that we use the word "write" instead of the word "emit" to refer to the action of writing JSON output so as to be consistent with the language suggested for the word "output" on p.114 of the Apple Style Guide (April 2013), <https://help.apple.com/asg/mac/2013/ASG_2013.pdf>.

> Tools/Scripts/webkitpy/test/main.py:128
> +    def _print_results_as_json(self, stream, test_names, failures, errors):

It is sufficient to make this a non-member function. I don't see the need to make this a member function that takes a parameter called self as this method does not make use of any instance data.

> Tools/Scripts/webkitpy/test/main.py:135
> +        results['tests'] = sorted(test_names)

:( It is OK to include a list of the names of all the tests we ran assuming we address the inefficiency of this line (see my remark on line 176 for more details). I wish we could avoid including such a list in the JSON output as it guarantees that the size of the JSON file is at least proportional to the total number of tests run. In contrast, the JSON output for the layout test results, full_results.json, is proportional to the number of tests that failed (including flaky tests). I know you mentioned in person today (10/13) that you felt that knowing the names of all tests that ran is necessary to identify flaky tests. It would be great if we could find a way to identify flaky tests without having to include the names of all the tests that ran in the JSON output. Can we derive this data from the list of tests that failed (failures), the list of the tests that had a Python error (errors), and/or other metrics? Notice that we have logic to identify flaky tests when generating the layout test results JSON output without inserting a list of the names of all the tests that ran in the JSON output. You mentioned you were unhappy with this approach, but I don't recall the reason. It would be beneficial to understand how the shortcomings (what are they?) of the test flakiness detection we use for layout test results and future flakiness detection for Python unit tests (do we know of any flaky tests?) are overcome by including in the JSON output a list of the names of all the tests that ran.

> Tools/Scripts/webkitpy/test/main.py:136
> +

Nit: I don't feel that this empty line improves the readability of this function. I suggest removing it.

> Tools/Scripts/webkitpy/test/main.py:137
> +        stream.write(json.dumps(results))

Notice that json.dump() (intentional omission of the 's' in the word "dumps") takes a File-like object as output (*). I suggest we take advantage of this convenience API and write this line as:

json.dump(results, stream)

On another note, you may want to consider specifying the keyword argument separators=(',', ':') towards generating more compact JSON output (since we envision the primary consumer being another computer program/web page as opposed to a human being). This would also make the separators used in this JSON output consistent with the separators used in the JSON output for the layout test results, full_results.json.

(*) See <https://docs.python.org/2/library/json.html#basic-usage> for more details on the usage of json.dump().

> Tools/Scripts/webkitpy/test/main.py:176
> +            self._print_results_as_json(sys.stdout, parallel_tests + serial_tests, test_runner.failures, test_runner.errors)

This is inefficient. In particular, it is space and time inefficient to concatenate the lists parallel_tests and serial_tests, which allocates a new list, given we are going to sort this concatenated list using sorted(), which also allocates a new list. Either we should pass an iterator (i.e. itertools.chain(parallel_tests, serial_tests)) as the second argument to _print_results_as_json() or we should modify _print_results_as_json() so that it sorts test_names in-place (i.e. use test_names.sort()). If we choose the latter approach, then I suggest we add a docstring (**) that explains the purpose of _print_results_as_json() and documents that it mutates the argument test_names.

(**) See <http://legacy.python.org/dev/peps/pep-0257/> for more details on docstrings.
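Concretely, the first alternative could look like this at the call site (a sketch reusing names from the quoted diff, not the exact patch):

import itertools
import sys

# Pass a lazy iterator instead of allocating parallel_tests + serial_tests;
# sorted() inside _print_results_as_json still materializes one sorted list.
all_test_names = itertools.chain(parallel_tests, serial_tests)
_print_results_as_json(sys.stdout, all_test_names, test_runner.failures, test_runner.errors)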
Comment 19 Jake Nielsen 2014-10-13 16:22:09 PDT
(In reply to comment #18)
> (From update of attachment 239739 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=239739&action=review
> 
> > Tools/ChangeLog:4
> > +        on stdout
> 
> Nit: on => to
> 
> > Tools/ChangeLog:11
> > +        Adds the read method to MockFile to allow the stdout of test_webkitpy
> > +        to be redirected without causing erroneous test failures.
> 
> For your consideration, I suggest adding a remark about the test(s) that failed without this change.
> 
> > Tools/Scripts/webkitpy/port/server_process_unittest.py:71
> > +        # This means EOF
> 
> Although it can be implied that this comment is with respect to the line above it given the length of the body of this function, it would be better to be explicit about this relationship and either move the comment to actually be above the return statement (line 70) or inline the comment on the same line as the return statement. You may also want to consider writing out End of file instead of abbreviating it as EOF to be explicit even though I suspect our audience is familiar with the abbreviation of EOF to mean End of file.
> 
> > Tools/Scripts/webkitpy/test/main.py:36
> >  import time
> >  import traceback
> >  import unittest
> > +import operator
> > +import json
> 
> Please order the imports in this file such that they are sorted according to the output of the UNIX sort command.
> 
> > Tools/Scripts/webkitpy/test/main.py:108
> > +                          help='emit JSON formatted test results on stdout')
> 
> Nit: on => to
> 
> The word "emit" in this help message can be taken as a synonym for the verb "output". I suggest that we use the word "write" instead of the word "emit" to refer to the action of writing JSON output so as to be consistent with the language suggested for the word "output" on p.114 of the Apple Style Guide (April 2013), <https://help.apple.com/asg/mac/2013/ASG_2013.pdf>.
> 
> > Tools/Scripts/webkitpy/test/main.py:128
> > +    def _print_results_as_json(self, stream, test_names, failures, errors):
> 
> It is sufficient to make this a non-member function. I don't see the need to make this a member function that takes a parameter called self as this method does not make use of any instance data.
> 
> > Tools/Scripts/webkitpy/test/main.py:135
> > +        results['tests'] = sorted(test_names)
> 
> :( It is OK to include a list of the names of all the tests we ran assuming we address the inefficiency of this line (see my remark on line 176 for more details). I wish we could avoid including such a list in the JSON output as it guarantees that the size of the JSON file is at least proportional to the total number of tests run. In contrast, the JSON output for the layout test results, full_results.json, is proportional to the number of tests that failed (including flaky tests). I know you mentioned in person today (10/13) that you felt that knowing the names of all tests that ran is necessary to identify flaky tests. It would be great if we could find a way to identify flaky tests without having to include the names of all the tests that ran in the JSON output. Can we derive this data from the list of tests that failed (failures), the list of the tests that had a Python error (errors), and/or other metrics? Notice that we have logic to identify flaky tests when generating the layout test results JSON output without inserting a list of the names of all the tests that ran in the JSON output. You mentioned you were unhappy with this approach, but I don't recall the reason. It would be beneficial to understand how the shortcomings (what are they?) of the test flakiness detection we use for layout test results and future flakiness detection for Python unit tests (do we know of any flaky tests?) are overcome by including in the JSON output a list of the names of all the tests that ran.
I think decimating the data irreversibly is a mistake. It's impossible to know when a particular test was added to the suite unless you have more than just a list of failures. The simplest way we could do this is simply by storing the list of test names. Furthermore, the layout-test-results directory, which gets archived in its entirety every time the buildbots run layout tests, is 73 megabytes. In contrast, the complete JSON file containing the list of tests of test-webkitpy is 0.124 megabytes. That's a mere 0.17% change in the archive size. I really don't think this is worth fighting about.
> 
> > Tools/Scripts/webkitpy/test/main.py:136
> > +
> 
> Nit: I don't feel that this empty line improves the readability of this function. I suggest removing it.
> 
> > Tools/Scripts/webkitpy/test/main.py:137
> > +        stream.write(json.dumps(results))
> 
> Notice that json.dump() (intentional omission of the 's' in the word "dumps") takes a File-like object as output (*). I suggest we take advantage of this convenience API and write this line as:
> 
> json.dump(results, stream)
> 
> On another note, you may want to consider specifying the keyword argument separators=(',', ':') towards generating more compact JSON output (since we envision the primary consumer being another computer program/web page as opposed to a human being). This would also make the separators used in this JSON output consistent with the separators used in the JSON output for the layout test results, full_results.json.
> 
> (*) See <https://docs.python.org/2/library/json.html#basic-usage> for more details on the usage of json.dump().
> 
> > Tools/Scripts/webkitpy/test/main.py:176
> > +            self._print_results_as_json(sys.stdout, parallel_tests + serial_tests, test_runner.failures, test_runner.errors)
> 
> This is inefficient. In particular, it is space and time inefficient to concatenate the lists parallel_tests and serial_tests, which allocates a new list, given we are going to sort this concatenated list using sorted(), which also allocates a new list. Either we should pass an iterator (i.e. itertools.chain(parallel_tests, serial_tests)) as the second argument to _print_results_as_json() or we should modify _print_results_as_json() so that it sorts test_names in-place (i.e. use test_names.sort()). If we choose the latter approach, then I suggest we add a docstring (**) that explains the purpose of _print_results_as_json() and documents that it mutates the argument test_names.
> 
> (**) See <http://legacy.python.org/dev/peps/pep-0257/> for more details on docstrings.

Comment 20 Jake Nielsen 2014-10-13 16:46:52 PDT
Created attachment 239759 [details]
Fixes as per comments.
Comment 21 Jake Nielsen 2014-10-13 16:48:12 PDT
Sigh, disregard that last patch. My bad.
Comment 22 Jake Nielsen 2014-10-13 16:50:12 PDT
Created attachment 239760 [details]
The patch I intended to post last time...
Comment 23 Daniel Bates 2014-10-13 20:55:51 PDT
(In reply to comment #19)
> > > Tools/Scripts/webkitpy/test/main.py:135
> > > +        results['tests'] = sorted(test_names)
> > 
> > :( It is OK to include a list of the names of all the tests we ran assuming we address the inefficiency of this line (see my remark on line 176 for more details). I wish we could avoid including such a list in the JSON output as it guarantees that the size of the JSON file is at least proportional to the total number of tests run. In contrast, the JSON output for the layout test results, full_results.json, is proportional to the number of tests that failed (including flaky tests). I know you mentioned in person today (10/13) that you felt that knowing the names of all tests that ran is necessary to identify flaky tests. It would be great if we could find a way to identify flaky tests without having to include the names of all the tests that ran in the JSON output. Can we derive this data from the list of tests that failed (failures), the list of the tests that had a Python error (errors), and/or other metrics? Notice that we have logic to identify flaky tests when generating the layout test results JSON output without inserting a list of the names of all the tests that ran in the JSON output. You mentioned you were unhappy with this approach, but I don't recall the reason. It would be beneficial to understand how the shortcomings (what are they?) of the test flakiness detection we use for layout test results and future flakiness detection for Python unit tests (do we know of any flaky tests?) are overcome by including in the JSON output a list of the names of all the tests that ran.
> I think decimating the data irreversibly is a mistake. It's impossible to know when a particular test was added to the suite unless you have more than just a list of failures. The simplest way we could do this is simply by storing the list of test names.

Can the JSON output be composed of the following: the tests that passed, the tests that failed (expected result differs from actual result), and the tests that failed because of a Python error? Or will it make it more straightforward to implement the JavaScript front-end code if the JSON output is composed of the tests that failed (expected result differs from actual result), the tests that failed because of a Python error, and all the tests that ran? Notice that the list of all the tests that ran includes the tests that failed (expected result differs from actual result) and the tests that failed because of a Python error.
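For illustration, the output shape under discussion might look like this (hypothetical test names; failure entries shown as [name, message] pairs, and a 'passes' key as the patch eventually settled on in comment 28):

{"passes":["webkitpy.example_unittest.ExampleTest.test_a"],"failures":[["webkitpy.example_unittest.ExampleTest.test_b","AssertionError: ..."]],"errors":[]}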
Comment 24 Daniel Bates 2014-10-13 21:01:20 PDT
Comment on attachment 239760 [details]
The patch I intended to post last time...

Thank you for updating the patch.
Comment 25 Jake Nielsen 2014-10-15 10:46:25 PDT
Created attachment 239877 [details]
Removes redundant test names from the list of tests

To address any efficiency concerns, I ran some tests with randomly generated test names and a random subset of failures and errors: 10,000 test names, with 1,000 of them errors/failures. For comparison, there are presently 1430 unit tests in the test-webkitpy test suite.
Here are the results (time measured in seconds):

$ python ./JsonPerformanceTest.py
Average time: 0.022092668
$ python ./JsonPerformanceTest.py
Average time: 0.02275959
$ python ./JsonPerformanceTest.py
Average time: 0.0233933
Comment 26 WebKit Commit Bot 2014-10-15 10:48:48 PDT
Attachment 239877 [details] did not pass style-queue:


ERROR: Tools/Scripts/webkitpy/test/main.py:77:  expected 2 blank lines, found 1  [pep8/E302] [5]
Total errors found: 1 in 3 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 27 Jake Nielsen 2014-10-15 10:51:37 PDT
Created attachment 239878 [details]
Fixes style issues
Comment 28 Daniel Bates 2014-10-15 12:42:02 PDT
Comment on attachment 239878 [details]
Fixes style issues

View in context: https://bugs.webkit.org/attachment.cgi?id=239878&action=review

> Tools/Scripts/webkitpy/test/main.py:85
> +    results['tests'] = sorted(set(all_test_names) - set(map(operator.itemgetter(0), failures)) - set(map(operator.itemgetter(0), errors)))

I suggest we name the dictionary key 'passes' since it now represents the list of tests that pass.

For completeness, there's probably a more space and time efficient way to compute the list of passing tests. Regardless, this approach to computing the list of passing tests should be sufficient for our purposes.
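A minimal sketch of the passes computation under review, assuming failures and errors are lists of (test_name, message) pairs as the itemgetter(0) calls imply:

import json
import operator

def _print_results_as_json(stream, all_test_names, failures, errors):
    # A test counts as a pass if it appears in neither the failure nor the error list.
    failed = set(map(operator.itemgetter(0), failures))
    errored = set(map(operator.itemgetter(0), errors))
    results = {
        'failures': failures,
        'errors': errors,
        'passes': sorted(set(all_test_names) - failed - errored),
    }
    # Compact separators keep the output consistent with full_results.json.
    json.dump(results, stream, separators=(',', ':'))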
Comment 29 Jake Nielsen 2014-10-15 12:46:14 PDT
Created attachment 239884 [details]
Dictionary key change
Comment 30 WebKit Commit Bot 2014-10-15 13:56:17 PDT
Comment on attachment 239884 [details]
Dictionary key change

Clearing flags on attachment: 239884

Committed r174743: <http://trac.webkit.org/changeset/174743>
Comment 31 WebKit Commit Bot 2014-10-15 13:56:22 PDT
All reviewed patches have been landed.  Closing bug.