ASSIGNED 214257
Delete old ews script DumpRenderTreeWatchDog.py
https://bugs.webkit.org/show_bug.cgi?id=214257
Summary Delete old ews script DumpRenderTreeWatchDog.py
Aakash Jain
Reported 2020-07-13 08:18:27 PDT
We should delete old ews script DumpRenderTreeWatchDog.py. This script was used on Windows bots in old EWS to terminate hanging DRT processes. In new EWS, buildbot terminate the build automatically after 20 minute of inactivity. Further, in new EWS builds, we also run kill-old-processes which kills DumpRenderTree.exe and imagediff.exe Therefore, we don't need this script anymore and it should be deleted.
Attachments
Patch (3.16 KB, patch)
2020-07-13 08:20 PDT, Aakash Jain
darin: review+
Radar WebKit Bug Importer
Comment 1 2020-07-13 08:19:48 PDT
Aakash Jain
Comment 2 2020-07-13 08:20:02 PDT
Aakash Jain
Comment 3 2020-07-13 08:29:51 PDT
This script (dumprendertreewatchdog.py) killed "ImageDiff.exe", while kill-old-processes kills "imagediff.exe" (https://trac.webkit.org/browser/webkit/trunk/Tools/BuildSlaveSupport/kill-old-processes#L45). Not sure if the capitalization matters.
Per Arne Vollan
Comment 4 2020-07-13 09:14:17 PDT
(In reply to Aakash Jain from comment #3) > This script (dumprendertreewatchdog.py) killed "ImageDiff.exe", while > kill-old-processes kills "imagediff.exe" > (https://trac.webkit.org/browser/webkit/trunk/Tools/BuildSlaveSupport/kill- > old-processes#L45). Not sure if the capitalization matters. I think this script is still in use on the EWS bots, or am I mistaken?
Aakash Jain
Comment 5 2020-07-13 09:19:55 PDT
(In reply to Per Arne Vollan from comment #4) > I think this script is still in use on the EWS bots, or am I mistaken? Some EWS bots were using it by mistake, I just stopped all that. See <rdar://problem/65142015>. Is there a reason that this script is required on EWS bots? Buildbot already terminates any hanging build, and we use kill-old-processes to kill DumpRenderTree.exe in the beginning of every build.
Per Arne Vollan
Comment 6 2020-07-13 09:27:36 PDT
(In reply to Aakash Jain from comment #5) > (In reply to Per Arne Vollan from comment #4) > > I think this script is still in use on the EWS bots, or am I mistaken? > Some EWS bots were using it by mistake, I just stopped all that. See > <rdar://problem/65142015>. > > Is there a reason that this script is required on EWS bots? Buildbot already > terminates any hanging build, and we use kill-old-processes to kill > DumpRenderTree.exe in the beginning of every build. Yes, I believe this is still required. The script will kill hanging processes during the test run, making sure the test run completes. I believe we have had cases where the test run never finishes, which is fixed by having this script running. I don't believe the script kill-old-processes supports this.
Darin Adler
Comment 7 2020-07-13 09:32:50 PDT
Then it seems like a bad idea to stop running it, and a bad idea to delete it.
Jonathan Bedard
Comment 8 2020-07-14 17:25:45 PDT
If we think such a script is needed, why don't we also need it on build.webkit.org? One of the purposes of the push to move EWS to buildbot was so that EWS was running with the same configuration as trunk testing. I'm pretty sure trunk deals with the hanging problem by leaning on run-webkit-tests to take care of it and, as a last resort, buildbot timing out.
Alexey Proskuryakov
Comment 9 2020-07-14 22:56:43 PDT
I think that the new behavior is: 1. Buildbot will terminate unresponsive steps. 2. kill-old-processes will be executed before the next test to clean up. So we should be fine? In any case, this is dead code now that Aakash removed the scheduled task from all bots where we still had it.
Per Arne Vollan
Comment 10 2020-07-15 13:04:12 PDT
(In reply to Jonathan Bedard from comment #8) > If we think such a script is needed, why don't we also need it on > build.webkit.org? > > One of the purposes of the push to move EWS to buildbot was so that EWS was > running with the same configuration as trunk testing. I'm pretty sure trunk > deals with the hanging problem by leaning on run-webkit-tests to take care > of it and, as a last resort, buildbot timing out. Yes, we should also have it on build.webkit.org (or a similar feature), since the same issue will be present there. However, it is more important on EWS, since without it, Windows EWS will often start lagging behind, and we loose test coverage.
Per Arne Vollan
Comment 11 2020-07-15 13:22:19 PDT
(In reply to Alexey Proskuryakov from comment #9) > I think that the new behavior is: > > 1. Buildbot will terminate unresponsive steps. > 2. kill-old-processes will be executed before the next test to clean up. > > So we should be fine? In any case, this is dead code now that Aakash removed > the scheduled task from all bots where we still had it. The script is able to unblock the waiting parent process of a stuck DumpRenderTree/ImageDiff while running the layout tests, making the test run succeed, although with some test failures. It seems buildbot will terminate the whole step, which means the test run would fail, or am I mistaken? Anyway, I see that this is dead code now after moving to buildbot. If we still need this feature, maybe it should be part of the run-webkit-tests script? Also, I haven't observed hangs after the buildbot move, which makes me think that this feature is not super critical at this point, since the Windows EWS bots are keeping up with the queue. Great job!
Jonathan Bedard
Comment 12 2020-07-15 14:04:04 PDT
(In reply to Per Arne Vollan from comment #11) > (In reply to Alexey Proskuryakov from comment #9) > > I think that the new behavior is: > > > > 1. Buildbot will terminate unresponsive steps. > > 2. kill-old-processes will be executed before the next test to clean up. > > > > So we should be fine? In any case, this is dead code now that Aakash removed > > the scheduled task from all bots where we still had it. > > The script is able to unblock the waiting parent process of a stuck > DumpRenderTree/ImageDiff while running the layout tests, making the test run > succeed, although with some test failures. It seems buildbot will terminate > the whole step, which means the test run would fail, or am I mistaken? > Anyway, I see that this is dead code now after moving to buildbot. If we > still need this feature, maybe it should be part of the run-webkit-tests > script? Also, I haven't observed hangs after the buildbot move, which makes > me think that this feature is not super critical at this point, since the > Windows EWS bots are keeping up with the queue. Great job! Yes, I would expect such a script to be part of run-webkit-tests, if we needed it. Seems like it would basically turn an error that broke the entire test suite into one that just broke a few tests. Although such behavior would definitely cause flakey test failures.
Note You need to log in before you can comment on or make changes to this bug.