Bug 232529 - Move detection of flaky JSC tests to run-jsc-stress-tests
Summary: Move detection of flaky JSC tests to run-jsc-stress-tests
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Angelos Oikonomopoulos
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-10-30 03:40 PDT by Angelos Oikonomopoulos
Modified: 2021-11-22 00:46 PST (History)
11 users (show)

See Also:


Attachments
Patch (195.75 KB, patch)
2021-10-30 03:56 PDT, Angelos Oikonomopoulos
no flags Details | Formatted Diff | Diff
Patch (203.89 KB, patch)
2021-11-04 05:49 PDT, Angelos Oikonomopoulos
no flags Details | Formatted Diff | Diff
Patch (205.56 KB, patch)
2021-11-07 03:23 PST, Angelos Oikonomopoulos
no flags Details | Formatted Diff | Diff
Patch (119.83 KB, patch)
2021-11-08 09:16 PST, Angelos Oikonomopoulos
no flags Details | Formatted Diff | Diff
Patch (120.21 KB, patch)
2021-11-09 03:37 PST, Angelos Oikonomopoulos
no flags Details | Formatted Diff | Diff
Patch (120.92 KB, patch)
2021-11-21 03:03 PST, Angelos Oikonomopoulos
no flags Details | Formatted Diff | Diff
Patch (120.92 KB, patch)
2021-11-21 03:04 PST, Angelos Oikonomopoulos
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Angelos Oikonomopoulos 2021-10-30 03:40:30 PDT
Move detection of flaky JSC tests to run-jsc-stress-tests
Comment 1 Angelos Oikonomopoulos 2021-10-30 03:56:21 PDT
Created attachment 442909 [details]
Patch
Comment 2 Angelos Oikonomopoulos 2021-11-04 05:49:50 PDT
Created attachment 443296 [details]
Patch
Comment 3 Angelos Oikonomopoulos 2021-11-04 06:15:19 PDT
Stephan, the changes in the playstation code are minimal but could still use some review and/or testing.
Comment 4 Stephan Szabo 2021-11-04 10:38:32 PDT
A quick test with these changes on a subset of a couple hundred tests (with no failures) was okay on device. I'm running a larger (about 8000 test) test set, but it'll likely take a few hours.
Comment 5 Stephan Szabo 2021-11-04 12:14:24 PDT
Okay - I ran a larger set, then ran a failing test with --filter and --treat-failing-as-flaky=0.6,10,200 but got

D:/github/WebKit/Tools/Scripts/webkitruby/jsc-stress-test/executor.rb:48:in `refreshExecution': undefined method `prepareTestRunner' for #<NormalTestsExecutor:0x0000000008b92ca8 @runlist=[#<Plan:0x0000000006f826a8 @directory=#<Pathname:.tests/ChakraCore.yaml/ChakraCore/test/es5>, @arguments=["--useFTLJIT=false", "--useFunctionDotArguments=true", "--validateExceptionChecks=true", "--useDollarVM=true", "--maxPerThreadStackUsage=1572864", "--thresholdForJITAfterWarmUp=10", "--thresholdForJITSoon=10", "--thresholdForOptimizeAfterWarmUp=20", "--thresholdForOptimizeAfterLongWarmUp=20", "--thresholdForOptimizeSoon=20", "--thresholdForFTLOptimizeAfterWarmUp=20", "--thresholdForFTLOptimizeSoon=20", "--thresholdForOMGOptimizeAfterWarmUp=20", "--thresholdForOMGOptimizeSoon=20", "--maximumEvalCacheableSourceLength=150000", "--useEagerCodeBlockJettisonTiming=true", "--repatchBufferingCountdown=0", "jsc-lib.js", "hasItem.js"], @family="ChakraCore.yaml/ChakraCore/test/es5/hasItem.js", @name="ChakraCore.yaml/ChakraCore/test/es5/hasItem.js.default", @outputHandler=#<Proc:0x0000000006f82928 D:/github/WebKit/Tools/Scripts/webkitruby/jsc-stress-test/writer-playstation.rb:56>, @errorHandler=#<Proc:0x0000000006f82978 D:/github/WebKit/Tools/Scripts/webkitruby/jsc-stress-test/writer-playstation.rb:161>, @flaky=#<Flaky:0x00000000091bc4d0 @passPercentage=0.6, @maxTries=10>, @additionalEnv=[], @index=47>], @serialPlans=#<Set: {}>, @remoteHosts=nil, @treatFailingAsFlaky=#<OpenStruct passPercentage=0.6, maxTries=10, maxFailing=200>, @infraIterationsFloor=3, @iterationsCeiling=10, @maxIterations=10, @logStream=#<IO:<STDERR>>, @testRunner=#<TestRunnerMake:0x0000000008b92d48 @testRunnerType=:make, @runnerDir=#<Pathname:D:/github/WebKit/WebKitBuild/Release/bin/jsc-stress-results/.runner>>> (NoMethodError)

It looks like the prepareTestRunner method was removed in the patch but is still referenced in BaseTestsExecutor's refreshExecution.
Comment 6 Angelos Oikonomopoulos 2021-11-04 14:59:59 PDT
Oops, looks like my last refactor to allow keeping state between the prepare*Runner and testRunnerCommand invalidated all my testing. Will have to submit a new patch after trying all combinations with fault injection again.

Thanks for taking the time!
Comment 7 Radar WebKit Bug Importer 2021-11-06 03:41:19 PDT
<rdar://problem/85099219>
Comment 8 Angelos Oikonomopoulos 2021-11-07 03:23:53 PST
Created attachment 443506 [details]
Patch
Comment 9 Angelos Oikonomopoulos 2021-11-07 05:17:55 PST
This should have fixed it. Stephan, if you could try out your larger test set, that'd be great, still need to properly exercise the playstation paths.
Comment 10 Stephan Szabo 2021-11-08 03:45:53 PST
It looks like my previous change for the non-find based code path for going through the result files breaks this probably for non-cygwin windows hosts (probably just wincairo and playstation in practice) because the file is open so the unlink in prepareScripts fails.

In getStatusMap we're not properly closing the file in that path, there's probably a more idiomatic way, but I was just hacking it to see.

-                    line = File.open(name).first
+                    f = File.open(name)
+                    line = f.first
+                    f.close()

With that the test of a single failing test on playstation with a treat-failing-as-flaky did retry successfully.

I'm now running the large set, but given the small ones passed, things seem likely to work.
Comment 11 Angelos Oikonomopoulos 2021-11-08 05:08:18 PST
(In reply to Stephan Szabo from comment #10)
> It looks like my previous change for the non-find based code path for going
> through the result files breaks this probably for non-cygwin windows hosts
> (probably just wincairo and playstation in practice) because the file is
> open so the unlink in prepareScripts fails.
> 
> In getStatusMap we're not properly closing the file in that path, there's
> probably a more idiomatic way, but I was just hacking it to see.
> 
> -                    line = File.open(name).first
> +                    f = File.open(name)
> +                    line = f.first
> +                    f.close()
> 
> With that the test of a single failing test on playstation with a
> treat-failing-as-flaky did retry successfully.

Oops. Glad we caught that. FWIW I think a more idiomatic fix would be

File.open(name, 'r') { |f|
  line = f.first
  processStatusLine(map, "./#{name} #{line}")
}

I guess it's better if you submit something along those lines as a separate fix?

> I'm now running the large set, but given the small ones passed, things seem
> likely to work.

Thanks, that really helps.
Comment 12 Jonathan Bedard 2021-11-08 08:37:11 PST
Comment on attachment 443506 [details]
Patch

Is it possible for us to separate the move parts of this patch from the code changes? Difficult to review the ruby changes since bugzilla isn't representing the move very well.
Comment 13 Angelos Oikonomopoulos 2021-11-08 08:50:02 PST
(In reply to Jonathan Bedard from comment #12)
> Comment on attachment 443506 [details]
> Patch
> 
> Is it possible for us to separate the move parts of this patch from the code
> changes? Difficult to review the ruby changes since bugzilla isn't
> representing the move very well.

Oops. Yah, I guess I can upload a new patch (though I'll need to repeat some of the testing). FWIW it should look much better if you apply-attachment and use git diff (which does rename detection).
Comment 14 Jonathan Bedard 2021-11-08 09:10:42 PST
(In reply to Angelos Oikonomopoulos from comment #13)
> (In reply to Jonathan Bedard from comment #12)
> > Comment on attachment 443506 [details]
> > Patch
> > 
> > Is it possible for us to separate the move parts of this patch from the code
> > changes? Difficult to review the ruby changes since bugzilla isn't
> > representing the move very well.
> 
> Oops. Yah, I guess I can upload a new patch (though I'll need to repeat some
> of the testing). FWIW it should look much better if you apply-attachment and
> use git diff (which does rename detection).

Other possibility: although we don't have EWS hooked up yet, `git-webkit pr` will create a GitHub pull-request, and the diff might be easier to read there.
Comment 15 Angelos Oikonomopoulos 2021-11-08 09:16:29 PST
Created attachment 443557 [details]
Patch
Comment 16 Angelos Oikonomopoulos 2021-11-08 09:21:20 PST
New patch skips the file renaming.
Comment 17 Stephan Szabo 2021-11-08 15:04:09 PST
As a note on earlier patch version plus changed mention, the playstation run of the about 8000 tests worked fine.

Looking to do a version of the unlink fix now. (Sorry, semi-off right now, so response time's a bit slow)
Comment 18 Stephan Szabo 2021-11-08 15:18:43 PST
Submitted https://bugs.webkit.org/show_bug.cgi?id=232851 for result file handling update.
Comment 19 Jonathan Bedard 2021-11-08 16:15:42 PST
Comment on attachment 443557 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=443557&action=review

The buildbot and run-javascriptcore-tests bits look good, I'm not familiar with the architecture of our stress tests or Ruby, though

> Tools/CISupport/ews-build/steps.py:1887
> +                                                AnalyzeJSCTestsResults()])

We need to coordinate landing this change with an EWS restart.

> Tools/Scripts/run-javascriptcore-tests:287
> +  --treat-failing-as-flaky      Treat failing stress tests as flaky

This option is quite a bit more complicated than the help message indicates. I would like something like what you outline in the Changelog about passPercentage,maxTries,maxFailing
Comment 20 Angelos Oikonomopoulos 2021-11-09 03:37:20 PST
Created attachment 443676 [details]
Patch
Comment 21 Angelos Oikonomopoulos 2021-11-09 03:43:37 PST
(In reply to Jonathan Bedard from comment #19)
> Comment on attachment 443557 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=443557&action=review
> 
> The buildbot and run-javascriptcore-tests bits look good, I'm not familiar
> with the architecture of our stress tests or Ruby, though

Yah, according to git blame only a few people have touched these parts of the file in the last few years.

> > Tools/CISupport/ews-build/steps.py:1887
> > +                                                AnalyzeJSCTestsResults()])
> 
> We need to coordinate landing this change with an EWS restart.

Possibly, though AFAIU things /should/ work fine until the EWS restart. The EWS buildbots will not be passing in `--treat-failing-as-flaky`, so the script's behavior should be exactly the same.

> > Tools/Scripts/run-javascriptcore-tests:287
> > +  --treat-failing-as-flaky      Treat failing stress tests as flaky
> 
> This option is quite a bit more complicated than the help message indicates.
> I would like something like what you outline in the Changelog about
> passPercentage,maxTries,maxFailing

A very valid point. Uploaded a new version of the patch with an extended help message for the option.
Comment 22 Angelos Oikonomopoulos 2021-11-09 06:10:42 PST
(In reply to Angelos Oikonomopoulos from comment #21)
> (In reply to Jonathan Bedard from comment #19)
> > Comment on attachment 443557 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=443557&action=review
> > 
> > The buildbot and run-javascriptcore-tests bits look good, I'm not familiar
> > with the architecture of our stress tests or Ruby, though
> 
> Yah, according to git blame only a few people have touched these parts of
> the file in the last few years.

CC'ing msaboff and fpizlo for the run-jsc-stress-tests parts.
Comment 23 Angelos Oikonomopoulos 2021-11-15 06:34:58 PST
Ping. Would be good to improve the flaky handling for JSC.
Comment 24 Angelos Oikonomopoulos 2021-11-19 09:55:35 PST
Calling all rubyists.
Comment 25 Jonathan Bedard 2021-11-19 17:37:02 PST
Comment on attachment 443676 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=443676&action=review

I touched base with Angelos about this patch today. With most Apple folks out next week and no JSC folks commenting, I don't think it makes sense to continue blocking this patch. I'm not a Ruby expert, but looking at this patch, combined with EWS and the fact that it shouldn't change behavior until builedbot is deployed, I'm r+ing with the expectation that Angelos will land Monday, Nov 22nd and handle any fallout next week.

> Tools/Scripts/run-jsc-stress-tests:620
> +        @flaky = flaky

This name doesn't indicate well what expect "flaky" to be. I'm assuming it's the Flaky object? Might be good to indicate that in the initializer by calling it flakyObject. I think it's fine to keep "flaky" as is in the actual class.

> Tools/Scripts/run-jsc-stress-tests:845
> +class Flaky

Flaky seems like the wrong name for this object. It looks to like it's more accurately "Tolerance" or "RetryParameters". If we rename it, we should also rename any member variables which contain it.
Comment 26 Angelos Oikonomopoulos 2021-11-21 03:03:41 PST
Created attachment 444921 [details]
Patch
Comment 27 Angelos Oikonomopoulos 2021-11-21 03:04:51 PST
Created attachment 444922 [details]
Patch
Comment 28 Angelos Oikonomopoulos 2021-11-21 03:08:49 PST
(In reply to Jonathan Bedard from comment #25)
[...]
> > Tools/Scripts/run-jsc-stress-tests:620
> > +        @flaky = flaky
> 
> This name doesn't indicate well what expect "flaky" to be. I'm assuming it's
> the Flaky object? Might be good to indicate that in the initializer by
> calling it flakyObject. I think it's fine to keep "flaky" as is in the
> actual class.
> 
> > Tools/Scripts/run-jsc-stress-tests:845
> > +class Flaky
> 
> Flaky seems like the wrong name for this object. It looks to like it's more
> accurately "Tolerance" or "RetryParameters". If we rename it, we should also
> rename any member variables which contain it.

RetryParameters is much more accurate. Updated the patch and added comments to explain that Plans that have @retryParameters set are treated as flaky.
Comment 29 EWS 2021-11-22 00:46:11 PST
Committed r286109 (244496@main): <https://commits.webkit.org/244496@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 444922 [details].