I found a strange thing today. Qt EWS tries to process a security patch (134942 ) again and again. Of course it can't, because the EWS isn't the member of the security group. But the problem is that after it can't process the attachment, it says that queue is empty (but it isn't!!!) and it sleeps 2 minutes and push the security patch to the end of the queue: Fetching: https://bugs.webkit.org/attachment.cgi?id=134942&action=edit webkitpy.common.net.statusserver: [INFO] Releasing work item 134942 from qt-ews No work item. Sleeping until 2012-04-03 09:06:02 (2 mins). #1 135335 #2 135336 #3 135337 #4 135338 #5 134942 When it finished processing the other patches and the security patch comes again it will sleep 2 minutes again and push it to the end of the queue ...
Hmmmm ... It seems it is unrelated to security bugs, but related to releasing work items somehow: Fetching: https://bugs.webkit.org/attachment.cgi?id=134942&action=edit webkitpy.common.net.statusserver: [INFO] Releasing work item 134942 from qt-wk2-ews No work item. Sleeping until 2012-04-03 09:15:50 (2 mins). But the queue wasn't empty, there were 6 patches in it: #1 135336 #2 135337 #3 135338 #4 135339 #5 135340 #6 135341
Otherwise we shouldn't allow submitting security patches to EWS queues somehow. How was is possible? Is feeder queue bot in the security group? Or did somebody push the "Submit for EWS analysis" button accidentally?
Folks submit security bugs to the EWS all the time. We should just have the bots error out if they can't fetch the attachment.
We just need to catch the "Failed to fetch" case, and not have it go down the "we have nothing left to do" logic path.
(In reply to comment #3) > Folks submit security bugs to the EWS all the time. We should just have the bots error out if they can't fetch the attachment. Is there an EWS in the security group? It would be great if security patches aren't blocks the EWS bots because they can't fetch the patches.
> Is there an EWS in the security group? Nope, but sometimes folks add them explicitly to the CC list. > It would be great if security patches aren't blocks the EWS bots because they can't fetch the patches. Correct. We need to error out in that case rather than block the queue.
+info for fixing this bug: $ Tools/Scripts/webkit-patch apply-attachment 157161 [details] ----------------------------------------------------- Fetching: https://bugs.webkit.org/attachment.cgi?id=157161&action=edit Traceback (most recent call last): File "Tools/Scripts/webkit-patch", line 84, in <module> main() File "Tools/Scripts/webkit-patch", line 79, in main WebKitPatch(os.path.abspath(__file__)).main() File "/home/oszi/WebKit/Tools/Scripts/webkitpy/tool/multicommandtool.py", line 311, in main result = command.check_arguments_and_execute(options, args, self) File "/home/oszi/WebKit/Tools/Scripts/webkitpy/tool/multicommandtool.py", line 120, in check_arguments_and_execute return self.execute(options, args, tool) or 0 File "/home/oszi/WebKit/Tools/Scripts/webkitpy/tool/commands/download.py", line 159, in execute bugs_to_patches = self._collect_patches_by_bug(patches) File "/home/oszi/WebKit/Tools/Scripts/webkitpy/tool/commands/download.py", line 151, in _collect_patches_by_bug bugs_to_patches[patch.bug_id()] = bugs_to_patches.get(patch.bug_id(), []) + [patch] AttributeError: 'NoneType' object has no attribute 'bug_id' log from the EWS bot: ---------------------- Fetching: https://bugs.webkit.org/attachment.cgi?id=157161&action=edit No work item. Sleeping until 2012-08-25 07:37:22 (2 mins). Fetching: https://bugs.webkit.org/attachment.cgi?id=149115&action=edit No work item. Sleeping until 2012-08-25 07:39:24 (2 mins). Fetching: https://bugs.webkit.org/attachment.cgi?id=159810&action=edit No work item. Sleeping until 2012-08-25 07:41:25 (2 mins). Fetching: https://bugs.webkit.org/attachment.cgi?id=160209&action=edit No work item. Sleeping until 2012-08-25 07:43:27 (2 mins). Fetching: https://bugs.webkit.org/attachment.cgi?id=158975&action=edit No work item. Sleeping until 2012-08-25 07:45:29 (2 mins). Fetching: https://bugs.webkit.org/attachment.cgi?id=160174&action=edit No work item. Sleeping until 2012-08-25 07:47:31 (2 mins). Fetching: https://bugs.webkit.org/attachment.cgi?id=157161&action=edit No work item. Sleeping until 2012-08-25 07:49:32 (2 mins). Fetching: https://bugs.webkit.org/attachment.cgi?id=149115&action=edit No work item. Sleeping until 2012-08-25 07:51:34 (2 mins).
Now there are 25 patches in Qt-EWS queue, but it is sleeping 2 minutes after all patches ... It isn't so good. Fetching: https://bugs.webkit.org/attachment.cgi?id=160209&action=edit webkitpy.common.net.statusserver: [INFO] Releasing work item 160209 from qt-wk2-ews No work item. Sleeping until 2012-08-29 19:52:36 (2 mins). Fetching: https://bugs.webkit.org/attachment.cgi?id=149115&action=edit webkitpy.common.net.statusserver: [INFO] Releasing work item 149115 from qt-wk2-ews No work item. Sleeping until 2012-08-29 19:54:38 (2 mins). ... I propose decrease sleeping time to 10 seconds until proper fix. Any objections?
Created attachment 161268 [details] Patch workaround until proper fix
Comment on attachment 161268 [details] Patch This will increase the load on the appengine instance by a factor of 12.
(In reply to comment #10) > (From update of attachment 161268 [details]) > This will increase the load on the appengine instance by a factor of 12. Is the load of appengine instance more important to waiting 1-2 hours for all patch because of this crazy bug? Is there any volunteer to do a proper fix for this half year bug?
This bug should be trivial to fix. def _next_patch(self): patch_id = self._tool.status_server.next_work_item(self.name) if not patch_id: return None patch = self._tool.bugs.fetch_attachment(patch_id) if not patch: # FIXME: Using a fake patch because release_work_item has the wrong API. # We also don't really need to release the lock (although that's fine), # mostly we just need to remove this bogus patch from our queue. # If for some reason bugzilla is just down, then it will be re-fed later. patch = Attachment({'id': patch_id}, None) self._release_work_item(patch) return None return patch Should just be a loop. Which attempts to grab the pathc, if it can't, tries again,until next_work_item returns None. The only downside of that is that we'll end up blowing out the queue if bugzilla ever goes down. But I don't think we should optimize for that case. We could also add a check her to make sure bugzilla is reachable before attempting any of this. :)
Created attachment 164074 [details] patch
LGTM
Comment on attachment 164074 [details] patch Can we write a unittest?
Comment on attachment 164074 [details] patch This is a much better fix, thank you! Writing a unit test should be straightforward. See http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/tool/commands/queues_unittest.py#L154
Created attachment 164540 [details] patch
Comment on attachment 164540 [details] patch Great!
Comment on attachment 164540 [details] patch Clearing flags on attachment: 164540 Committed r128899: <http://trac.webkit.org/changeset/128899>
All reviewed patches have been landed. Closing bug.