232529 – Move detection of flaky JSC tests to run-jsc-stress-tests

RESOLVED FIXED 232529

Move detection of flaky JSC tests to run-jsc-stress-tests

https://bugs.webkit.org/show_bug.cgi?id=232529

Summary Move detection of flaky JSC tests to run-jsc-stress-tests

Angelos Oikonomopoulos

Reported 2021-10-30 03:40:30 PDT

Move detection of flaky JSC tests to run-jsc-stress-tests

Attachments
Patch (195.75 KB, patch) 2021-10-30 03:56 PDT, Angelos Oikonomopoulos	no flags	Details Formatted Diff Diff
Patch (203.89 KB, patch) 2021-11-04 05:49 PDT, Angelos Oikonomopoulos	no flags	Details Formatted Diff Diff
Patch (205.56 KB, patch) 2021-11-07 03:23 PST, Angelos Oikonomopoulos	no flags	Details Formatted Diff Diff
Patch (119.83 KB, patch) 2021-11-08 09:16 PST, Angelos Oikonomopoulos	no flags	Details Formatted Diff Diff
Patch (120.21 KB, patch) 2021-11-09 03:37 PST, Angelos Oikonomopoulos	no flags	Details Formatted Diff Diff
Patch (120.92 KB, patch) 2021-11-21 03:03 PST, Angelos Oikonomopoulos	no flags	Details Formatted Diff Diff
Patch (120.92 KB, patch) 2021-11-21 03:04 PST, Angelos Oikonomopoulos	no flags	Details Formatted Diff Diff
Show Obsolete (6) View All Add attachment proposed patch, testcase, etc.

Angelos Oikonomopoulos

Comment 1 2021-10-30 03:56:21 PDT

Created attachment 442909 [details] Patch

Angelos Oikonomopoulos

Comment 2 2021-11-04 05:49:50 PDT

Created attachment 443296 [details] Patch

Angelos Oikonomopoulos

Comment 3 2021-11-04 06:15:19 PDT

Stephan, the changes in the playstation code are minimal but could still use some review and/or testing.

Stephan Szabo

Comment 4 2021-11-04 10:38:32 PDT

A quick test with these changes on a subset of a couple hundred tests (with no failures) was okay on device. I'm running a larger (about 8000 test) test set, but it'll likely take a few hours.

Stephan Szabo

Comment 5 2021-11-04 12:14:24 PDT

Okay - I ran a larger set, then ran a failing test with --filter and --treat-failing-as-flaky=0.6,10,200 but got D:/github/WebKit/Tools/Scripts/webkitruby/jsc-stress-test/executor.rb:48:in `refreshExecution': undefined method `prepareTestRunner' for #<NormalTestsExecutor:0x0000000008b92ca8 @runlist=[#<Plan:0x0000000006f826a8 @directory=#<Pathname:.tests/ChakraCore.yaml/ChakraCore/test/es5>, @arguments=["--useFTLJIT=false", "--useFunctionDotArguments=true", "--validateExceptionChecks=true", "--useDollarVM=true", "--maxPerThreadStackUsage=1572864", "--thresholdForJITAfterWarmUp=10", "--thresholdForJITSoon=10", "--thresholdForOptimizeAfterWarmUp=20", "--thresholdForOptimizeAfterLongWarmUp=20", "--thresholdForOptimizeSoon=20", "--thresholdForFTLOptimizeAfterWarmUp=20", "--thresholdForFTLOptimizeSoon=20", "--thresholdForOMGOptimizeAfterWarmUp=20", "--thresholdForOMGOptimizeSoon=20", "--maximumEvalCacheableSourceLength=150000", "--useEagerCodeBlockJettisonTiming=true", "--repatchBufferingCountdown=0", "jsc-lib.js", "hasItem.js"], @family="ChakraCore.yaml/ChakraCore/test/es5/hasItem.js", @name="ChakraCore.yaml/ChakraCore/test/es5/hasItem.js.default", @outputHandler=#<Proc:0x0000000006f82928 D:/github/WebKit/Tools/Scripts/webkitruby/jsc-stress-test/writer-playstation.rb:56>, @errorHandler=#<Proc:0x0000000006f82978 D:/github/WebKit/Tools/Scripts/webkitruby/jsc-stress-test/writer-playstation.rb:161>, @flaky=#<Flaky:0x00000000091bc4d0 @passPercentage=0.6, @maxTries=10>, @additionalEnv=[], @index=47>], @serialPlans=#<Set: {}>, @remoteHosts=nil, @treatFailingAsFlaky=#<OpenStruct passPercentage=0.6, maxTries=10, maxFailing=200>, @infraIterationsFloor=3, @iterationsCeiling=10, @maxIterations=10, @logStream=#<IO:<STDERR>>, @testRunner=#<TestRunnerMake:0x0000000008b92d48 @testRunnerType=:make, @runnerDir=#<Pathname:D:/github/WebKit/WebKitBuild/Release/bin/jsc-stress-results/.runner>>> (NoMethodError) It looks like the prepareTestRunner method was removed in the patch but is still referenced in BaseTestsExecutor's refreshExecution.

Angelos Oikonomopoulos

Comment 6 2021-11-04 14:59:59 PDT

Oops, looks like my last refactor to allow keeping state between the prepare*Runner and testRunnerCommand invalidated all my testing. Will have to submit a new patch after trying all combinations with fault injection again. Thanks for taking the time!

Radar WebKit Bug Importer

Comment 7 2021-11-06 03:41:19 PDT

<rdar://problem/85099219>

Angelos Oikonomopoulos

Comment 8 2021-11-07 03:23:53 PST

Created attachment 443506 [details] Patch

Angelos Oikonomopoulos

Comment 9 2021-11-07 05:17:55 PST

This should have fixed it. Stephan, if you could try out your larger test set, that'd be great, still need to properly exercise the playstation paths.

Stephan Szabo

Comment 10 2021-11-08 03:45:53 PST

It looks like my previous change for the non-find based code path for going through the result files breaks this probably for non-cygwin windows hosts (probably just wincairo and playstation in practice) because the file is open so the unlink in prepareScripts fails. In getStatusMap we're not properly closing the file in that path, there's probably a more idiomatic way, but I was just hacking it to see. - line = File.open(name).first + f = File.open(name) + line = f.first + f.close() With that the test of a single failing test on playstation with a treat-failing-as-flaky did retry successfully. I'm now running the large set, but given the small ones passed, things seem likely to work.

Angelos Oikonomopoulos

Comment 11 2021-11-08 05:08:18 PST

(In reply to Stephan Szabo from comment #10) > It looks like my previous change for the non-find based code path for going > through the result files breaks this probably for non-cygwin windows hosts > (probably just wincairo and playstation in practice) because the file is > open so the unlink in prepareScripts fails. > > In getStatusMap we're not properly closing the file in that path, there's > probably a more idiomatic way, but I was just hacking it to see. > > - line = File.open(name).first > + f = File.open(name) > + line = f.first > + f.close() > > With that the test of a single failing test on playstation with a > treat-failing-as-flaky did retry successfully. Oops. Glad we caught that. FWIW I think a more idiomatic fix would be File.open(name, 'r') { |f| line = f.first processStatusLine(map, "./#{name} #{line}") } I guess it's better if you submit something along those lines as a separate fix? > I'm now running the large set, but given the small ones passed, things seem > likely to work. Thanks, that really helps.

Jonathan Bedard

Comment 12 2021-11-08 08:37:11 PST

Comment on attachment 443506 [details] Patch Is it possible for us to separate the move parts of this patch from the code changes? Difficult to review the ruby changes since bugzilla isn't representing the move very well.

Angelos Oikonomopoulos

Comment 13 2021-11-08 08:50:02 PST

(In reply to Jonathan Bedard from comment #12) > Comment on attachment 443506 [details] > Patch > > Is it possible for us to separate the move parts of this patch from the code > changes? Difficult to review the ruby changes since bugzilla isn't > representing the move very well. Oops. Yah, I guess I can upload a new patch (though I'll need to repeat some of the testing). FWIW it should look much better if you apply-attachment and use git diff (which does rename detection).

Jonathan Bedard

Comment 14 2021-11-08 09:10:42 PST

(In reply to Angelos Oikonomopoulos from comment #13) > (In reply to Jonathan Bedard from comment #12) > > Comment on attachment 443506 [details] > > Patch > > > > Is it possible for us to separate the move parts of this patch from the code > > changes? Difficult to review the ruby changes since bugzilla isn't > > representing the move very well. > > Oops. Yah, I guess I can upload a new patch (though I'll need to repeat some > of the testing). FWIW it should look much better if you apply-attachment and > use git diff (which does rename detection). Other possibility: although we don't have EWS hooked up yet, `git-webkit pr` will create a GitHub pull-request, and the diff might be easier to read there.

Angelos Oikonomopoulos

Comment 15 2021-11-08 09:16:29 PST

Created attachment 443557 [details] Patch

Angelos Oikonomopoulos

Comment 16 2021-11-08 09:21:20 PST

New patch skips the file renaming.

Stephan Szabo

Comment 17 2021-11-08 15:04:09 PST

As a note on earlier patch version plus changed mention, the playstation run of the about 8000 tests worked fine. Looking to do a version of the unlink fix now. (Sorry, semi-off right now, so response time's a bit slow)

Stephan Szabo

Comment 18 2021-11-08 15:18:43 PST

Submitted https://bugs.webkit.org/show_bug.cgi?id=232851 for result file handling update.

Jonathan Bedard

Comment 19 2021-11-08 16:15:42 PST

Comment on attachment 443557 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=443557&action=review The buildbot and run-javascriptcore-tests bits look good, I'm not familiar with the architecture of our stress tests or Ruby, though > Tools/CISupport/ews-build/steps.py:1887 > + AnalyzeJSCTestsResults()]) We need to coordinate landing this change with an EWS restart. > Tools/Scripts/run-javascriptcore-tests:287 > + --treat-failing-as-flaky Treat failing stress tests as flaky This option is quite a bit more complicated than the help message indicates. I would like something like what you outline in the Changelog about passPercentage,maxTries,maxFailing

Angelos Oikonomopoulos

Comment 20 2021-11-09 03:37:20 PST

Created attachment 443676 [details] Patch

Angelos Oikonomopoulos

Comment 21 2021-11-09 03:43:37 PST

(In reply to Jonathan Bedard from comment #19) > Comment on attachment 443557 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=443557&action=review > > The buildbot and run-javascriptcore-tests bits look good, I'm not familiar > with the architecture of our stress tests or Ruby, though Yah, according to git blame only a few people have touched these parts of the file in the last few years. > > Tools/CISupport/ews-build/steps.py:1887 > > + AnalyzeJSCTestsResults()]) > > We need to coordinate landing this change with an EWS restart. Possibly, though AFAIU things /should/ work fine until the EWS restart. The EWS buildbots will not be passing in `--treat-failing-as-flaky`, so the script's behavior should be exactly the same. > > Tools/Scripts/run-javascriptcore-tests:287 > > + --treat-failing-as-flaky Treat failing stress tests as flaky > > This option is quite a bit more complicated than the help message indicates. > I would like something like what you outline in the Changelog about > passPercentage,maxTries,maxFailing A very valid point. Uploaded a new version of the patch with an extended help message for the option.

Angelos Oikonomopoulos

Comment 22 2021-11-09 06:10:42 PST

(In reply to Angelos Oikonomopoulos from comment #21) > (In reply to Jonathan Bedard from comment #19) > > Comment on attachment 443557 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=443557&action=review > > > > The buildbot and run-javascriptcore-tests bits look good, I'm not familiar > > with the architecture of our stress tests or Ruby, though > > Yah, according to git blame only a few people have touched these parts of > the file in the last few years. CC'ing msaboff and fpizlo for the run-jsc-stress-tests parts.

Angelos Oikonomopoulos

Comment 23 2021-11-15 06:34:58 PST

Ping. Would be good to improve the flaky handling for JSC.

Angelos Oikonomopoulos

Comment 24 2021-11-19 09:55:35 PST

Calling all rubyists.

Jonathan Bedard

Comment 25 2021-11-19 17:37:02 PST

Comment on attachment 443676 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=443676&action=review I touched base with Angelos about this patch today. With most Apple folks out next week and no JSC folks commenting, I don't think it makes sense to continue blocking this patch. I'm not a Ruby expert, but looking at this patch, combined with EWS and the fact that it shouldn't change behavior until builedbot is deployed, I'm r+ing with the expectation that Angelos will land Monday, Nov 22nd and handle any fallout next week. > Tools/Scripts/run-jsc-stress-tests:620 > + @flaky = flaky This name doesn't indicate well what expect "flaky" to be. I'm assuming it's the Flaky object? Might be good to indicate that in the initializer by calling it flakyObject. I think it's fine to keep "flaky" as is in the actual class. > Tools/Scripts/run-jsc-stress-tests:845 > +class Flaky Flaky seems like the wrong name for this object. It looks to like it's more accurately "Tolerance" or "RetryParameters". If we rename it, we should also rename any member variables which contain it.

Angelos Oikonomopoulos

Comment 26 2021-11-21 03:03:41 PST

Created attachment 444921 [details] Patch

Angelos Oikonomopoulos

Comment 27 2021-11-21 03:04:51 PST

Created attachment 444922 [details] Patch

Angelos Oikonomopoulos

Comment 28 2021-11-21 03:08:49 PST

(In reply to Jonathan Bedard from comment #25) [...] > > Tools/Scripts/run-jsc-stress-tests:620 > > + @flaky = flaky > > This name doesn't indicate well what expect "flaky" to be. I'm assuming it's > the Flaky object? Might be good to indicate that in the initializer by > calling it flakyObject. I think it's fine to keep "flaky" as is in the > actual class. > > > Tools/Scripts/run-jsc-stress-tests:845 > > +class Flaky > > Flaky seems like the wrong name for this object. It looks to like it's more > accurately "Tolerance" or "RetryParameters". If we rename it, we should also > rename any member variables which contain it. RetryParameters is much more accurate. Updated the patch and added comments to explain that Plans that have @retryParameters set are treated as flaky.

EWS

Comment 29 2021-11-22 00:46:11 PST

Committed r286109 (244496@main): <https://commits.webkit.org/244496@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 444922 [details].

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component New Bugs

Assignee

Angelos Oikonomopoulos

Reported

2021-10-30 03:40 PDT

Modified

2021-11-22 00:46 PST History

CC List

11 users Show

URL

Keywords InRadar

Depends on

Blocks