RESOLVED FIXED 140819
Fix the false positive build failures on the Windows buildbots
https://bugs.webkit.org/show_bug.cgi?id=140819
Summary Fix the false positive build failures on the Windows buildbots
Csaba Osztrogonác
Reported 2015-01-23 00:43:00 PST
Examples:
https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/66798
https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/66807
https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/66832
https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/66849

The problem is that buildbot kills the compile step when there has been no output for 20 minutes. If a change is too big, the build takes more than 20 minutes on the Windows bots, and Visual Studio Express doesn't write anything to stdout while building.

The only fix here is to increase the timeout of this build step.
Attachments
Patch (1.29 KB, patch)
2015-01-23 00:44 PST, Csaba Osztrogonác
no flags
Patch (1.41 KB, patch)
2015-01-23 01:09 PST, Csaba Osztrogonác
no flags
Patch (1.82 KB, patch)
2015-01-23 09:17 PST, Csaba Osztrogonác
no flags
Csaba Osztrogonác
Comment 1 2015-01-23 00:44:48 PST
Csaba Osztrogonác
Comment 2 2015-01-23 01:09:40 PST
Brent Fulgham
Comment 3 2015-01-23 08:48:50 PST
Comment on attachment 245217 Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=245217&action=review

I'm in favor of this change, but I think some of the tools people should have final say. It certainly takes longer than 10 minutes to build on Windows from a clean slate; as things currently stand I think we hit this timeout any time we have to rebuild most of WebKit.

> Tools/BuildSlaveSupport/build.webkit.org-config/master.cfg:203
> + kwargs['timeout'] = 60 * 60

It certainly can take an hour or so on Windows for a clean build, but I think most of our Mac bots do it in less time. Is there any way to configure this so that Windows has a long timeout and perhaps leave others alone?

Or do other bots have a similar timeout issue and we should just change it across the board?
David Kilzer (:ddkilzer)
Comment 4 2015-01-23 08:56:28 PST
Comment on attachment 245217 Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=245217&action=review

>> Tools/BuildSlaveSupport/build.webkit.org-config/master.cfg:203
>> + kwargs['timeout'] = 60 * 60
>
> It certainly can take an hour or so on Windows for a clean build, but I think most of our Mac bots do it in less time. Is there any way to configure this so that Windows has a long timeout and perhaps leave others alone?
>
> Or do other bots have a similar timeout issue and we should just change it across the board?

If it's only Windows that has this issue, we should only set the timeout to be 2 hours on Windows bots.

Can we use self.getProperty('platform') here (or kwargs['platform']) to only set this for Windows bots?
Brent Fulgham
Comment 5 2015-01-23 08:58:30 PST
(In reply to comment #0)
> Examples:
> https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/66798
> https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/66807
> https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/66832
> https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/66849
>
> The problem is that buildbot kills the compile step when there has been no output for 20 minutes.
> If a change is too big, the build takes more than 20 minutes on the Windows bots,
> and Visual Studio Express doesn't write anything to stdout while building.
>
> The only fix here is to increase the timeout of this build step.

I wonder if there's any way to output to stdout from the build system as another way to avoid this.
Csaba Osztrogonác
Comment 6 2015-01-23 08:59:55 PST
(In reply to comment #3)
> It certainly can take an hour or so on Windows for a clean build, but I
> think most of our Mac bots do it in less time. Is there any way to configure
> this so that Windows has a long timeout and perhaps leave others alone?
>
> Or do other bots have a similar timeout issue and we should just change it
> across the board?

The default 20-minute timeout doesn't mean that the build has to finish within 20 minutes. The buildmaster kills the build only if it produces no output for 20 minutes. That isn't a problem for the Linux and Mac builders, but Visual Studio Express doesn't produce any output during the build.
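To make the distinction between total build time and idle time concrete, here is a minimal sketch of the rule as described above. The function name and the sample durations are illustrative assumptions; this is not buildbot's actual implementation.

```python
def killed_by_idle_timeout(output_times, build_duration, idle_timeout=20 * 60):
    """Return the second at which the step would be killed, or None.

    output_times: seconds since step start at which the process wrote output.
    build_duration: seconds until the process would exit on its own.
    idle_timeout: maximum allowed gap without output (20 minutes here).
    """
    last_output = 0
    for t in sorted(output_times):
        if t > build_duration:
            break
        if t - last_output > idle_timeout:
            return last_output + idle_timeout
        last_output = t
    # Check the final stretch between the last output and process exit.
    if build_duration - last_output > idle_timeout:
        return last_output + idle_timeout
    return None


# Hypothetical long build that prints a line every minute: never killed.
chatty = killed_by_idle_timeout(range(60, 44 * 60, 60), 44 * 60)
# Hypothetical shorter build that prints nothing: killed at the 20-minute mark.
silent = killed_by_idle_timeout([], 27 * 60 + 32)
print(chatty, silent)  # prints: None 1200
```

The longer build survives because it keeps talking, while the shorter, silent one is killed, which matches the false failures seen on the Windows bots.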
Csaba Osztrogonác
Comment 7 2015-01-23 09:04:02 PST
(In reply to comment #4)
> Comment on attachment 245217
> If it's only Windows that has this issue, we should only set the timeout to
> be 2 hours on Windows bots.
>
> Can we use self.getProperty('platform') here (or kwargs['platform']) to only
> set this for Windows bots?

Unfortunately it isn't that easy, because properties aren't accessible when the build step is instantiated in BuildFactory. But maybe we can pass the platform to CompileWebKit(), or instantiate it with a timeout if platform == "win". Let me try it.
Csaba Osztrogonác
Comment 8 2015-01-23 09:17:23 PST
Created attachment 245230
Patch

I checked: CompileWebKit(timeout=xxxx) works and platform is passed to BuildFactory.__init__, so we can set the timeout for Windows only.
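The approach can be sketched roughly as follows. The two classes are stand-ins written for illustration, not the real buildbot or master.cfg classes, and the 60 * 60 value is borrowed from the earlier patch under review; the value in the final patch may differ.

```python
DEFAULT_IDLE_TIMEOUT = 20 * 60  # the default discussed in this bug, in seconds


class CompileWebKit:
    """Stand-in for the real build step; it only records its timeout."""

    def __init__(self, **kwargs):
        self.timeout = kwargs.get('timeout', DEFAULT_IDLE_TIMEOUT)


class BuildFactory:
    """Stand-in factory that receives the platform in its constructor."""

    def __init__(self, platform):
        self.steps = []
        if platform == 'win':
            # Assumed value: Visual Studio prints nothing while building,
            # so allow a full hour of silence on Windows only.
            self.steps.append(CompileWebKit(timeout=60 * 60))
        else:
            self.steps.append(CompileWebKit())


print(BuildFactory('win').steps[0].timeout)  # prints: 3600
print(BuildFactory('mac').steps[0].timeout)  # prints: 1200
```

Deciding in the factory constructor avoids needing build properties, which are not yet available when the step is instantiated.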
Alexey Proskuryakov
Comment 9 2015-01-23 10:09:06 PST
That's an amazing find! Can't wait for these failures to be a thing of the past.

Two comments:

1. Why are Windows builds so slow? We have decent hardware as far as I know; is there some sort of misconfiguration by any chance?

   This comment obviously doesn't block reviewing or landing the fix.

2. I think that a better way to fix this would be to actually pipe the logs to output as they come. This way, one could watch progress in real time on a web page, like we do for non-Windows builds. And the bot wouldn't remain stuck for a longer time when an actual freeze occurs. Is that possible to implement with Visual Studio?

   I think that we should fix it this way if possible.
Brent Fulgham
Comment 10 2015-01-23 13:30:36 PST
(In reply to comment #9)
> That's an amazing find! Can't wait for these failures to be a thing of the
> past.
>
> Two comments:
>
> 1. Why are Windows builds so slow? We have decent hardware as far as I know,
> is there some sort of misconfiguration by any chance?
>
> This comment obviously doesn't block reviewing or landing the fix.

The build itself is no slower than on Mac. The problem is that the way the VS build runs, it does not output any information to stdout; therefore the script thinks the build is hung.

I spent about 5 minutes looking for a setting in Visual Studio to output the "Output Window" text to stdout, but didn't find anything.

Another option might be to switch to using MSBuild directly, rather than driving it via Visual Studio. This would give us stdout logging like we have using xcodebuild on Mac. The only downside would be having to write an MSBuild input file to control things, but that's not a huge problem.

> 2. I think that a better way to fix this would be to actually pipe the logs
> to output as they come. This way, one could watch progress in real time on a
> web page, like we do for non-Windows builds. And the bot wouldn't remain
> stuck for a longer time when an actual freeze occurs. Is that possible to
> implement with Visual Studio?
>
> I think that we should fix it this way if possible.

I'll do a little more digging before giving up. Alternatively, we could land this patch now and revise things later if we figure out how to do it. If we leave things as they stand, we get false build failures from time to time. With ossy's proposed patch, that wouldn't happen anymore.
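For reference, piping a child's output through as it arrives could look roughly like the sketch below. It assumes the build tool writes to its own stdout in the first place, which is exactly what is missing with Visual Studio today; the child command here is a placeholder Python one-liner, not a real build invocation, and run_and_stream is a hypothetical helper name.

```python
import subprocess
import sys


def run_and_stream(command):
    """Run command and echo each output line as soon as it arrives."""
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so nothing is lost
        text=True,
        bufsize=1,  # line-buffered reads in text mode
    )
    lines = []
    for line in process.stdout:
        sys.stdout.write(line)
        sys.stdout.flush()  # keep the watcher's idle timer from expiring
        lines.append(line.rstrip('\n'))
    return process.wait(), lines


# Placeholder child process standing in for the real build command.
child = [sys.executable, '-c', "print('building WebCore'); print('done')"]
exit_code, lines = run_and_stream(child)
```

With a wrapper like this, the idle timeout would only fire on a genuine hang, so it could stay short.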
Alexey Proskuryakov
Comment 11 2015-01-23 13:49:10 PST
> The build itself is no slower than on Mac

A clean OS X build takes 44 minutes on a Mac mini (see <https://build.webkit.org/builders/Apple%20Yosemite%20Release%20%28Build%29/builds/2387>).

I don't know how long each target takes, but even WebCore is likely under 20 minutes. Also, Mac mini is not all that fast.
Brent Fulgham
Comment 12 2015-01-23 13:51:50 PST
(In reply to comment #11)
> > The build itself is no slower than on Mac
>
> A clean OS X build takes 44 minutes on a Mac mini (see
> <https://build.webkit.org/builders/Apple%20Yosemite%20Release%20%28Build%29/builds/2387>).
>
> I don't know how long each target takes, but even WebCore is likely under 20
> minutes. Also, Mac mini is not all that fast.

Yes, but you are getting build output during that period. The Visual Studio process that build-webkit is watching produces no output until the entire build (consisting of all projects) has completed.
Alexey Proskuryakov
Comment 13 2015-01-23 13:57:49 PST
I understand how not getting the output breaks the build. What I'm saying is that it shouldn't be taking that long in the first place. And as previously mentioned, this question doesn't block the patch at all.
Brent Fulgham
Comment 14 2015-01-23 16:04:20 PST
Comment on attachment 245230 Patch

r=me. I'm filing a separate bug to deal with this more permanently.
WebKit Commit Bot
Comment 15 2015-01-23 16:49:08 PST
Comment on attachment 245230 Patch

Clearing flags on attachment: 245230

Committed r179043: <http://trac.webkit.org/changeset/179043>
WebKit Commit Bot
Comment 16 2015-01-23 16:49:14 PST
All reviewed patches have been landed. Closing bug.
Csaba Osztrogonác
Comment 17 2015-01-27 05:38:45 PST
Just out of curiosity I checked the clean build time on the bots:
- release bot: 27 mins, 32 secs - https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/66907
- debug bot: 28 mins, 6 secs - https://build.webkit.org/builders/Apple%20Win%20Debug%20%28Build%29/builds/85014
Alexey Proskuryakov
Comment 18 2015-01-27 09:50:54 PST
This is not fast, but reasonable. Is there no stdout output until all the targets are built? We may not need to pump the output in real time if we can make each target dump the results once done.
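A rough way to see why per-target output might be enough: the longest silent stretch would shrink from the whole build to the slowest single target. The per-target durations below are invented for illustration (only their total is chosen to match the 27 min 32 s release build above), and longest_silence is a hypothetical helper.

```python
def longest_silence(target_seconds, report_per_target):
    """Longest stretch without output for a sequential build."""
    durations = list(target_seconds)
    if report_per_target:
        # Output appears after each target, so the gap is one target long.
        return max(durations)
    # Output appears only at the very end, so the gap is the whole build.
    return sum(durations)


# Hypothetical split of a 27 min 32 s build across targets (seconds).
targets = {'WTF': 120, 'JavaScriptCore': 420, 'WebCore': 900, 'WebKit': 212}

print(longest_silence(targets.values(), report_per_target=False))  # prints: 1652
print(longest_silence(targets.values(), report_per_target=True))   # prints: 900
```

Under these assumed numbers the silent gap drops below the 20-minute (1200 s) limit; whether that holds in practice depends on how long the slowest real target takes.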