RESOLVED FIXED Bug 222601
[JSC] detect infrastructure failure for remote stress tests
https://bugs.webkit.org/show_bug.cgi?id=222601
Summary [JSC] detect infrastructure failure for remote stress tests
Angelos Oikonomopoulos
Reported 2021-03-02 08:32:11 PST
[JSC] detect infrastructure failure for remote stress tests
Attachments
Patch (18.40 KB, patch)
2021-03-02 08:35 PST, Angelos Oikonomopoulos
no flags
Patch (18.36 KB, patch)
2021-03-02 09:12 PST, Angelos Oikonomopoulos
no flags
Patch (20.28 KB, patch)
2021-03-03 06:12 PST, Angelos Oikonomopoulos
no flags
Patch (20.29 KB, patch)
2021-03-03 06:44 PST, Angelos Oikonomopoulos
no flags
Patch (26.66 KB, patch)
2021-04-29 06:27 PDT, Angelos Oikonomopoulos
no flags
Patch (26.70 KB, patch)
2021-05-05 01:20 PDT, Angelos Oikonomopoulos
no flags
Patch (26.68 KB, patch)
2021-05-06 04:56 PDT, Angelos Oikonomopoulos
no flags
Angelos Oikonomopoulos
Comment 1 2021-03-02 08:35:58 PST
Angelos Oikonomopoulos
Comment 2 2021-03-02 08:38:29 PST
The current patch only changes jsc-stress-test-writer-default.rb. Uploading this for early comments, will update the other 2 writers if things look good.
Angelos Oikonomopoulos
Comment 3 2021-03-02 09:12:34 PST
Angelos Oikonomopoulos
Comment 4 2021-03-03 06:12:09 PST
Angelos Oikonomopoulos
Comment 5 2021-03-03 06:44:57 PST
Radar WebKit Bug Importer
Comment 6 2021-03-09 08:33:14 PST
Angelos Oikonomopoulos
Comment 7 2021-03-10 07:36:16 PST
Any objections in principle?
Yusuke Suzuki
Comment 8 2021-04-10 13:01:17 PDT
Comment on attachment 422073 [details] Patch r=me
EWS
Comment 9 2021-04-10 13:14:28 PDT
Committed r275801 (236372@main): <https://commits.webkit.org/236372@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 422073 [details].
WebKit Commit Bot
Comment 10 2021-04-11 22:34:11 PDT
Re-opened since this is blocked by bug 224423
Yusuke Suzuki
Comment 11 2021-04-11 22:35:00 PDT
@Mark found that it seems that run-jsc-stress-tests output includes a lot of texts now. So I've reverted it. 0/75193 .............................Could not open file: /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list Could not open file: /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list Could not open file: /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list Could not open file: /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list ... Could not open file: /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list 2/75193 ............................Could not open file: /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list Could not open file: /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list 3/75193 .............................Could not open file: /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list Could not open file: /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list ...
Yusuke Suzuki
Comment 12 2021-04-11 22:35:16 PDT
@Angelos can you take a look?
Angelos Oikonomopoulos
Comment 13 2021-04-11 23:42:16 PDT
(In reply to Yusuke Suzuki from comment #12) > @Angelos can you take a look? Absolutely. Note that, as stated in an earlier comment, I'll need to update the other two test writers before this can go in. Sorry that wasn't clear! I'll submit a new version when I figure out the issues that cropped up. Thanks for taking a look!
Angelos Oikonomopoulos
Comment 14 2021-04-13 07:31:27 PDT
(In reply to Yusuke Suzuki from comment #11) > @Mark found that it seems that run-jsc-stress-tests output includes a lot of > texts now. So I've reverted it. > > 0/75193 .............................Could not open file: > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > Could not open file: > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > Could not open file: > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > Could not open file: > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > ... > Could not open file: > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > 2/75193 ............................Could not open file: > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > Could not open file: > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > 3/75193 .............................Could not open file: > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > Could not open file: > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > ... The above to be messages from some non-public usage of funfuzz? I'm unclear as to how that uses run-jsc-stress-tests or where the regression-tests.list file comes from and what run-jsc-stress-tests has got to do with it. What additional output does run-jsc-stress-tests produce with this patch? The above messages don't come from any script generated by it AFAIK. (Does @Mark refer to Mark Lam? If so, please let me know and I can ping them directly.)
Mark Lam
Comment 15 2021-04-13 08:07:06 PDT
(In reply to Angelos Oikonomopoulos from comment #14) > (In reply to Yusuke Suzuki from comment #11) > > @Mark found that it seems that run-jsc-stress-tests output includes a lot of > > texts now. So I've reverted it. > > > > 0/75193 .............................Could not open file: > > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > > Could not open file: > > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > > Could not open file: > > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > > Could not open file: > > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > > ... > > Could not open file: > > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > > 2/75193 ............................Could not open file: > > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > > Could not open file: > > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > > 3/75193 .............................Could not open file: > > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > > Could not open file: > > /Users/cyril/mozilla_funfuzz/wtmp16/regression-tests.list > > ... > > The above to be messages from some non-public usage of funfuzz? I'm unclear > as to how that uses run-jsc-stress-tests or where the regression-tests.list > file comes from and what run-jsc-stress-tests has got to do with it. > > What additional output does run-jsc-stress-tests produce with this patch? > > The above messages don't come from any script generated by it AFAIK. (Does > @Mark refer to Mark Lam? If so, please let me know and I can ping them > directly.) Yes, Yusuke was referring to me. These messages came from internal tests. Before your patch, running the script did not produce this output. After your patch, it does. I suspect that the tests were always printing these output. However, your change somehow expose them. Do you have any idea where your patch enables more verbose stderr/stdout output to be shown?
Angelos Oikonomopoulos
Comment 16 2021-04-13 08:39:06 PDT
(In reply to Mark Lam from comment #15) > (In reply to Angelos Oikonomopoulos from comment #14) > > (In reply to Yusuke Suzuki from comment #11) [...] > > The above messages don't come from any script generated by it AFAIK. (Does > > @Mark refer to Mark Lam? If so, please let me know and I can ping them > > directly.) > > Yes, Yusuke was referring to me. These messages came from internal tests. > Before your patch, running the script did not produce this output. After > your patch, it does. I suspect that the tests were always printing these > output. However, your change somehow expose them. Do you have any idea > where your patch enables more verbose stderr/stdout output to be shown? I've taken a second look and I don't see any instances where the patch changes where stdout/stderr go, but it's tricky code. I can't see any extra output in my testing of course (I just checked again both with local and --remote runs). webkitruby/jsc-stress-test-writer-default.rb defines a number of handlers that the patch modifies. Are you at liberty to share what test kind is producing the extraneous output? I.e. which run* function is invoked in run-jsc-stress-tests? If so, then I could look at a diff of the generated code in detail.
Angelos Oikonomopoulos
Comment 17 2021-04-29 06:17:51 PDT
FWIW I think I've tracked down the issue with the uncaptured stderr that Mark ran into. I've updated the ruby test writer too, which was straightforward. My remaining concern is this piece of code: outp.puts JSON.generate({ name: @name, outputDir: $runnerDir, baseDir: baseDir, env: $envVars + @additionalEnv, outputName: @name.gsub(/(\\|\/)/, '_'), checkScript: filename, args: @arguments, failFile: "#{failFile}" }) in the playstation test writer. My patch removes the failFile and replaces it with a statusFile which contains a single line: $runid $exitcode $pf, where $pf is either P(ass) or F(ail). Stephan, presumably something checks that failFile on your end? Would you be ok with adapting it to the changes in this patch?
Angelos Oikonomopoulos
Comment 18 2021-04-29 06:27:42 PDT
Stephan Szabo
Comment 19 2021-04-29 10:09:29 PDT
(In reply to Angelos Oikonomopoulos from comment #17) > FWIW I think I've tracked down the issue with the uncaptured stderr that > Mark ran into. I've updated the ruby test writer too, which was > straightforward. > > My remaining concern is this piece of code: > > outp.puts JSON.generate({ > name: @name, > outputDir: $runnerDir, > baseDir: baseDir, > env: $envVars + @additionalEnv, > outputName: @name.gsub(/(\\|\/)/, '_'), > checkScript: filename, > args: @arguments, > failFile: "#{failFile}" > }) > > in the playstation test writer. My patch removes the failFile and replaces > it with a statusFile which contains a single line: $runid $exitcode $pf, > where $pf is either P(ass) or F(ail). > > Stephan, presumably something checks that failFile on your end? Would you be > ok with adapting it to the changes in this patch? Yes, that'd be fine. Actually, because of the weirdness for filesystem access for our case, I believe our driver is actually are checking the status and then making the fail file for the rest of the test harness, so I should be able to easily have it generate the statusFile in that format instead.
Angelos Oikonomopoulos
Comment 20 2021-05-05 01:20:11 PDT
Angelos Oikonomopoulos
Comment 21 2021-05-06 04:56:22 PDT
Angelos Oikonomopoulos
Comment 22 2021-05-06 06:45:21 PDT
(In reply to Stephan Szabo from comment #19) > (In reply to Angelos Oikonomopoulos from comment #17) [...] > > Stephan, presumably something checks that failFile on your end? Would you be > > ok with adapting it to the changes in this patch? > > Yes, that'd be fine. Actually, because of the weirdness for filesystem > access for our case, I believe our driver is actually are checking the > status and then making the fail file for the rest of the test harness, so I > should be able to easily have it generate the statusFile in that format > instead. Great! As I can't update your driver, at the moment the patch removes the failFile field from the json file. Do you prefer to handle this some other way?
Stephan Szabo
Comment 23 2021-05-06 09:54:10 PDT
(In reply to Angelos Oikonomopoulos from comment #22) > (In reply to Stephan Szabo from comment #19) > > (In reply to Angelos Oikonomopoulos from comment #17) > [...] > > > Stephan, presumably something checks that failFile on your end? Would you be > > > ok with adapting it to the changes in this patch? > > > > Yes, that'd be fine. Actually, because of the weirdness for filesystem > > access for our case, I believe our driver is actually are checking the > > status and then making the fail file for the rest of the test harness, so I > > should be able to easily have it generate the statusFile in that format > > instead. > > Great! As I can't update your driver, at the moment the patch removes the > failFile field from the json file. Do you prefer to handle this some other > way? I'll probably add statusFile to the json file in the end, but I think the above is fine for the current patch, since I may need to do some other related changes in practice.
Angelos Oikonomopoulos
Comment 24 2021-05-07 05:36:21 PDT
(In reply to Stephan Szabo from comment #23) [...] > I'll probably add statusFile to the json file in the end, but I think the > above is fine for the current patch, since I may need to do some other > related changes in practice. Alright, thanks! @Mark, the change to fix the stderr redirection was - cmd = "{ { { { #{shellCommand}; echo $? >&3; } | { #{outputHandler.call(@name)} ;} >&4; } 3>&1; } | { read xs; exit $xs; } } 4>&1\nexitCode=$?\n" + cmd = "{ { { { #{shellCommand} 2>&1; echo $? >&3; } | { #{outputHandler.call(@name)} ;} >&4; } 3>&1; } | { read xs; exit $xs; } } 4>&1\nexitCode=$?\n" Do you feel comfortable trying to land the patch now?
Angelos Oikonomopoulos
Comment 25 2021-05-07 05:38:54 PDT
(In reply to Angelos Oikonomopoulos from comment #24) [...] > @Mark, the change to fix the stderr redirection was > > - cmd = "{ { { { #{shellCommand}; echo $? >&3; } | { > #{outputHandler.call(@name)} ;} >&4; } 3>&1; } | { read xs; exit $xs; } } > 4>&1\nexitCode=$?\n" > + cmd = "{ { { { #{shellCommand} 2>&1; echo $? >&3; } | { > #{outputHandler.call(@name)} ;} >&4; } 3>&1; } | { read xs; exit $xs; } } > 4>&1\nexitCode=$?\n" > > Do you feel comfortable trying to land the patch now? Note, I also took the opportunity to clarify the above pipeline with an ascii diagram in a preceding comment: # +--------------------------------------------------------------------+ # | +-----------------------------------------------+ | # | | +--------------+ +-------------------+ | | # | | | cmd 1 ----> 1|---> |0 --> outH 1 ---> 4|-> 4|---------------> 1| # | | | 2 / | +-------------------+ | +-----------+ | # | | |echo $? 0 -> 3|---------------------------> 1|-> |0 read xs | | # | | +--------------+ | | exit $xs | | # | | | +-----------+ | # | +-----------------------------------------------+ | # +--------------------------------------------------------------------+
Mark Lam
Comment 26 2021-05-10 10:31:07 PDT
Comment on attachment 427872 [details] Patch Looks reasonable to me.
EWS
Comment 27 2021-05-11 01:34:15 PDT
Committed r277318 (237577@main): <https://commits.webkit.org/237577@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 427872 [details].
Note You need to log in before you can comment on or make changes to this bug.