190188 – Add retry for test groups with failed build requests.

RESOLVED FIXED 190188

Add retry for test groups with failed build requests.

https://bugs.webkit.org/show_bug.cgi?id=190188

Summary Add retry for test groups with failed build requests.

dewei_zhu

Reported 2018-10-01 23:52:24 PDT

Add retry for test groups with failed build requests.

Attachments
Patch (51.25 KB, patch) 2018-10-02 00:56 PDT, dewei_zhu	no flags	Details Formatted Diff Diff
Patch (63.64 KB, patch) 2018-10-03 00:49 PDT, dewei_zhu	no flags	Details Formatted Diff Diff
Patch (67.79 KB, patch) 2018-10-03 20:56 PDT, dewei_zhu	rniwa: review+	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

dewei_zhu

Comment 1 2018-10-02 00:56:45 PDT

Created attachment 351353 [details] Patch

Ryosuke Niwa

Comment 2 2018-10-02 01:34:24 PDT

Comment on attachment 351353 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=351353&action=review > Websites/perf.webkit.org/ChangeLog:18 > + ADD COLUMN testgroup_expected_successful_build_requests integer DEFAULT NULL, > + ADD COLUMN testgroup_may_need_additional_build_requests boolean DEFAULT FALSE; These names are too long. How about just: testgroup_iteration_count & testgroup_may_need_more_requests. > Websites/perf.webkit.org/public/api/build-requests.php:77 > + if (count($test_groups_may_need_additional_build_requests)) > + $db->query('UPDATE analysis_test_groups SET testgroup_may_need_additional_build_requests = TRUE WHERE testgroup_id = ANY($1)', > + array('{' . implode(',', array_keys($test_groups_may_need_additional_build_requests)) . '}')); I don't think we need to coerce queries like this. Just issue a SQL query per build request. We can certainly have a map which stores the set of test groups we've already set this flag. > Websites/perf.webkit.org/public/api/test-groups.php:25 > - } elseif ($path[0] == 'ready-for-notification') { > + } elseif ($path[0] == 'ready-for-further-processing') { I think we should just add a new API surface instead of replacing an existing one. > Websites/perf.webkit.org/public/privileged-api/update-test-group.php:19 > + if (array_key_exists('name', $data) || array_key_exists('hidden', $data) || array_key_exists('mayNeedAdditionalBuildRequests', $data)) > + exit_with_error('TooManyArgumentsProvided'); Why don't we add a new dedicated API instead? I think that'd be cleaner than re-using this API and having restrictions like these. > Websites/perf.webkit.org/public/privileged-api/update-test-group.php:28 > $values['hidden'] = Database::to_database_boolean($data['hidden']); > + $values['may_need_additional_build_requests'] = FALSE; I think we probably want another check at where we create build requests that we haven't recently canceled this test group. > Websites/perf.webkit.org/public/privileged-api/update-test-group.php:58 > + for ($i = 0; $i < $additional_build_request_count; $i++) { > + $processed_commit_set = array(); Why don't we split the part to find the list of commit sets in this group into a separate helper function? > Websites/perf.webkit.org/public/v3/models/test-group.js:14 > + this._mayNeedAdditionalBuildRequests = object.mayNeedAdditionalBuildRequests; > + this._expectedSuccessfulBuildRequests = object.expectedSuccessfulBuildRequests; Let's also shorten these names. > Websites/perf.webkit.org/public/v3/models/test-group.js:231 > + async addBuildRequests(buildRequestCount) Just call it "count". > Websites/perf.webkit.org/public/v3/models/test-group.js:239 > + async clearNeedAdditionalBuildRequestsFlag() No need to say "flag". > Websites/perf.webkit.org/server-tests/resources/mock-data.js:67 > + addMockData: function (db, statusList, needs_notification=true) Use camelCase. > Websites/perf.webkit.org/server-tests/resources/mock-data.js:82 > + db.insert('analysis_test_groups', {id: 600, task: 500, name: 'some test group', expected_successful_build_requests: 4, needs_notification}), Ditto here and elsewhere in this test. > Websites/perf.webkit.org/tools/run-analysis.js:19 > + {name: '--max-retry-ratio', type: parseFloat, default: 2.5}, Maybe even 3 might be okay. > Websites/perf.webkit.org/tools/run-analysis.js:67 > +async function processTestGroupsNeedAdditionalBuildRequests(testGroups, maximumRetryRatio) I don't think we want to use vague terms like "processing". Let's just say createAdditionalBuildRequestsForTestGroupsWithFailedRequests or something. We can use a verbose name here. > Websites/perf.webkit.org/tools/run-analysis.js:80 > + let maxMissingSuccessfulBuildRequestCount = 0; > + for (const commitSet of this.requestedCommitSet()) { > + const buildRequests = testGroup.requestsForCommitSet(commitSet); > + maxMissingSuccessfulBuildRequestCount = Math.max(maxMissingSuccessfulBuildRequestCount, buildRequests.filter((buildRequest) => !buildRequest.hasCompleted()).length); > + } I don't think this logic is quite right. buildRequest.hasCompleted() returns false for pending & canceled requests. r- because of this. In fact, we probably want to make sure this code handles the case when some request are still pending, scheduled, or canceled.

dewei_zhu

Comment 3 2018-10-03 00:49:11 PDT

Created attachment 351487 [details] Patch

Ryosuke Niwa

Comment 4 2018-10-03 01:37:56 PDT

Comment on attachment 351487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=351487&action=review > Websites/perf.webkit.org/public/api/build-requests.php:70 > + $db->update_row('analysis_test_groups', 'testgroup', array('id' => $test_group_id), array('may_need_more_requests' => TRUE)); > + $test_groups_may_need_more_requests[$request_row['request_group']] = TRUE; Let's move this code out of the if-else to share the code between two cases. > Websites/perf.webkit.org/public/api/build-requests.php:80 > + if ($status != 'failed' || array_key_exists($test_group_id, $test_groups_may_need_more_requests)) > + continue; And keep this continue when $status != 'failed' > Websites/perf.webkit.org/public/privileged-api/add-build-requests.php:10 > + $test_group_id = array_get($data, 'group'); We need to bail out if the test group had been marked as "hidden" here. > Websites/perf.webkit.org/public/privileged-api/add-build-requests.php:19 > + > + > + Extra blank lines. > Websites/perf.webkit.org/public/privileged-api/add-build-requests.php:23 > + if ($current_order < 0) > + exit_with_error('NoBuildRequestOfTestType'); I think we should return NoTestingBuildRequests. > Websites/perf.webkit.org/public/privileged-api/add-build-requests.php:29 > + for ($i = 0; $i < $additional_build_request_count; $i++) { > + $processed_commit_set = array(); > + foreach ($existing_build_requests as $build_request) { I think we need to split this into two separate loops: The first one which finds an ordered list of commit sets (1), and the second loop which generates N build requests for each commit set we found in (1). Otherwise, this algorithm is really O(N^2). $commit_sets = array(); $processed_commit_set = array(); foreach ($existing_build_requests as $build_request) { if ($build_request['request_order'] < 0) continue; $requested_commit_set = $build_request['request_commit_set']; if (array_key_exists($requested_commit_set, $processed_commit_set)) continue; $processed_commit_set[$requested_commit_set] = &$build_request; array_push($commit_sets, $requested_commit_set); } for ($i = 0; $i < $additional_build_request_count; $i++) { foreach ($commit_sets as $commit_set) {...} } > Websites/perf.webkit.org/public/v3/models/test-group.js:231 > + async addExtraBuildRequests(count) This should be called addMoreBuildRequests > Websites/perf.webkit.org/public/v3/models/test-group.js:239 > + async clearNeedAdditionalBuildRequests() This should be called clearMayNeedMoreBuildRequests > Websites/perf.webkit.org/public/v3/models/test-group.js:297 > + static fetchAllMayNeedMoreRequests() fetchAllThatMayNeedMoreRequests / fetchAllWhichMayNeedMoreRequests? > Websites/perf.webkit.org/tools/js/retry-failed-build-requests.js:7 > + if (testGroup.isHidden()) { This early bail out is useless because we filter the results in API anyway. Also we need to have an API test for this. > Websites/perf.webkit.org/tools/js/retry-failed-build-requests.js:18 > + hasCompletedBuildRequestForEachCommitSet &= (completedBuildRequestCount > 0); I think this logic doesn't work when one side all failed. > Websites/perf.webkit.org/tools/js/retry-failed-build-requests.js:29 > + await testGroup.clearNeedAdditionalBuildRequests(); I think it's not sound to clear the flag in a separate API call when when we're creating more build requests. In fact, clearing the flag may is also racy but let's not worry about that for now. We should, however, clear the flag automatically in /add-build-requests API. > Websites/perf.webkit.org/tools/run-analysis.js:20 > + {name: '--max-retry-ratio', type: parseFloat, default: 3}, We should probably call this a factor since it's not really a ratio. > Websites/perf.webkit.org/unit-tests/retry-failed-build-requests-tests.js:144 > + it('should add 1 more build request when one failed build request found', async () => { This is rather confusing. Just say "should add one more build request when one of the existing requests failed". > Websites/perf.webkit.org/unit-tests/retry-failed-build-requests-tests.js:145 > + const data = sampleTestGroup(false, 3, true, false, ["completed", "completed", "completed", "completed", "completed", "failed"]); Please use a dictionary to pass options to sampleTestGroup. e.g. sampleTestGroup({needsNotification: true, iterationCount: 3, mayNeedMoreRequests: true, hidden: false}, [~]) > Websites/perf.webkit.org/unit-tests/retry-failed-build-requests-tests.js:174 > + it('should not schedule more build requests when no completed build request for a commit set', async () => { should read "when all requests for a commit set had failed". > Websites/perf.webkit.org/unit-tests/retry-failed-build-requests-tests.js:192 > + it('should not schedule more build requests when no completed build request is hidden', async () => { "should not schedule more build requests when the test group is hidden"? > Websites/perf.webkit.org/unit-tests/retry-failed-build-requests-tests.js:203 > + it('should not schedule more build request while exceeding maximum retry', async () => { when* we've already hit the maximum retry count.

Ryosuke Niwa

Comment 5 2018-10-03 01:38:28 PDT

Comment on attachment 351487 [details] Patch r- because there are enough problems with this patch. Let's go through another iteration but it's getting close!

Ryosuke Niwa

Comment 6 2018-10-03 01:42:06 PDT

Comment on attachment 351487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=351487&action=review > Websites/perf.webkit.org/init-database.sql:284 > + testgroup_iteration_count integer NOT NULL, We should probably call this testgroup_initial_repetition_count or something since the actual number of repetitions can be higher.

dewei_zhu

Comment 7 2018-10-03 15:59:03 PDT

Comment on attachment 351487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=351487&action=review >> Websites/perf.webkit.org/tools/js/retry-failed-build-requests.js:7 >> + if (testGroup.isHidden()) { > > This early bail out is useless because we filter the results in API anyway. > > Also we need to have an API test for this. Can we remove if condition that checks testGroup.mayNeedMoreRequests() as well as the API also did that? More general question, should we make assumptions on the testGroups provided in the argument?

dewei_zhu

Comment 8 2018-10-03 20:56:45 PDT

Created attachment 351579 [details] Patch

Ryosuke Niwa

Comment 9 2018-10-03 21:44:18 PDT

Comment on attachment 351579 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=351579&action=review > Websites/perf.webkit.org/public/api/build-requests.php:64 > + $test_group_id = $request_row['request_group']; Do this outside of if-else. > Websites/perf.webkit.org/public/api/build-requests.php:79 > + $test_groups_may_need_more_requests[$request_row['request_group']] = TRUE; Use $test_group_id once you've done that. > Websites/perf.webkit.org/public/privileged-api/add-build-requests.php:10 > + $additional_build_request_count = array_get($data, 'count'); Maybe we should call this addCount? > Websites/perf.webkit.org/public/privileged-api/add-build-requests.php:45 > + foreach($commit_sets as $commit_set) { NIt: Need a space between foreach and (. > Websites/perf.webkit.org/public/privileged-api/add-build-requests.php:51 > + insert_build_request_for_configuration($db, $configuration, ++$current_order, $build_request['request_triggerable'], > + $build_request['request_platform'], $build_request['request_test'], $build_request['request_group']); Just use $db->insert_row directly here. I don't think we get any benefit from using this helper function. > Websites/perf.webkit.org/public/privileged-api/add-build-requests.php:54 > + $db->update_row('analysis_test_groups', 'testgroup', array('id' => $test_group_id), array('may_need_more_requests' => FALSE)); Hm... on my second thought maybe it's better to clear this flag in a separate API call so that we can allow addition of more build requests without clearing of this flag. > Websites/perf.webkit.org/server-tests/api-test-groups.js:94 > + assert.ok(!data['testGroups'][0].mayNeedMoreRequests); Maybe use assert.equals to be more explicit about true/false? > Websites/perf.webkit.org/server-tests/privileged-api-update-test-group-tests.js:208 > + it('should be created with "may_need_more_requests" field defaults to false', async () => { This should read: "should create a test group with" > Websites/perf.webkit.org/server-tests/privileged-api-update-test-group-tests.js:212 > + let result = await PrivilegedAPI.sendRequest('create-test-group', Use const? > Websites/perf.webkit.org/server-tests/privileged-api-update-test-group-tests.js:229 > + let result = await PrivilegedAPI.sendRequest('create-test-group', Use const? > Websites/perf.webkit.org/server-tests/privileged-api-update-test-group-tests.js:252 > + it('should clear "may_need_more_requests" while hiding test group', async () => { while -> when, hiding *a* test group. > Websites/perf.webkit.org/server-tests/privileged-api-update-test-group-tests.js:256 > + let result = await PrivilegedAPI.sendRequest('create-test-group', Use const? > Websites/perf.webkit.org/tools/js/retry-failed-build-requests.js:16 > + const buildRequests = testGroup.requestsForCommitSet(commitSet); We should probably filter out non-testing build requests here. > Websites/perf.webkit.org/tools/js/retry-failed-build-requests.js:19 > + hasCompletedBuildRequestForEachCommitSet &= (completedBuildRequestCount > 0); This math is rather confusing. Just do something like: if (!completedBuildRequestCount) hasCompletedBuildRequestForEachCommitSet = false; > Websites/perf.webkit.org/tools/js/retry-failed-build-requests.js:21 > + maxMissingBuildRequestCount = Math.max(maxMissingBuildRequestCount, testGroup.initialRepetitionCount() - completedBuildRequestCount - unfinishedBuildRequestCount); Let's add an assertion that: unfinishedBuildRequestCount <= testGroup.initialRepetitionCount() and declare a local variable like: potentiallySuccessfulCount = completedBuildRequestCount + unfinishedBuildRequestCount; > Websites/perf.webkit.org/tools/js/retry-failed-build-requests.js:22 > + hasUnfinishedBuildRequest = hasUnfinishedBuildRequest || (unfinishedBuildRequestCount > 0); Ditto about using if. > Websites/perf.webkit.org/tools/js/retry-failed-build-requests.js:24 > + const willExceedMaximumRetry = (testGroup.repetitionCount() + maxMissingBuildRequestCount) > maximumRetryFactor * testGroup.initialRepetitionCount(); No need for parenthesis. > Websites/perf.webkit.org/tools/js/retry-failed-build-requests.js:27 > + if (maxMissingBuildRequestCount && !willExceedMaximumRetry && hasUnfinishedBuildRequest && !hasCompletedBuildRequestForEachCommitSet) > + continue; I think a cleaner way to do this might be that: if (!hasCompletedBuildRequestForEachCommitSet) { if (maxMissingBuildRequestCount && hasUnfinishedBuildRequest) await testGroup.clearMayNeedMoreBuildRequests(); continue; } > Websites/perf.webkit.org/tools/js/retry-failed-build-requests.js:29 > + if (maxMissingBuildRequestCount && !willExceedMaximumRetry && hasCompletedBuildRequestForEachCommitSet) Then we don't have to check hasCompletedBuildRequestForEachCommitSet here. > Websites/perf.webkit.org/unit-tests/retry-failed-build-requests-tests.js:150 > + const testGroupConfig = {needsNotification:false, initialRepetitionCount: 3, mayNeedMoreRequests: true, hidden: false, Nit: Need a space between : and false, and ditto elsewhere in this file.

dewei_zhu

Comment 10 2018-10-04 18:32:11 PDT

Landed in r236861.

dewei_zhu

Comment 11 2018-10-08 11:40:41 PDT

rdar://problem/34471207

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component New Bugs

Assignee

Nobody

Reported

2018-10-01 23:52 PDT

Modified

2018-10-08 11:40 PDT History

CC List

2 users Show

URL

Keywords InRadar

Depends on

Blocks