Bug 83038 - EWS shouldn't sleep if there are new patches in its queue
Summary: EWS shouldn't sleep if there are new patches in its queue
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-04-03 09:10 PDT by Csaba Osztrogonác
Modified: 2012-09-18 08:30 PDT (History)
9 users (show)

See Also:


Attachments
Patch (2.38 KB, patch)
2012-08-29 11:04 PDT, Csaba Osztrogonác
abarth: review-
Details | Formatted Diff | Diff
patch (2.98 KB, patch)
2012-09-14 02:11 PDT, Szilard Ledan
no flags Details | Formatted Diff | Diff
patch (4.62 KB, patch)
2012-09-18 06:01 PDT, Szilard Ledan
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Csaba Osztrogonác 2012-04-03 09:10:05 PDT
I found a strange thing today. Qt EWS tries to process a security
patch (134942 ) again and again. Of course it can't, because the
EWS isn't the member of the security group.

But the problem is that after it can't process the attachment, it says
that queue is empty (but it isn't!!!) and it sleeps 2 minutes and push 
the security patch to the end of the queue:

Fetching: https://bugs.webkit.org/attachment.cgi?id=134942&action=edit
webkitpy.common.net.statusserver: [INFO] Releasing work item 134942 from qt-ews
No work item. Sleeping until 2012-04-03 09:06:02 (2 mins).

#1 	135335 	
#2 	135336 	
#3 	135337 	
#4 	135338 	
#5 	134942 

When it finished processing the other patches and the security patch comes
again it will sleep 2 minutes again and push it to the end of the queue ...
Comment 1 Csaba Osztrogonác 2012-04-03 09:17:04 PDT
Hmmmm ... It seems it is unrelated to security 
bugs, but related to releasing work items somehow:

Fetching: https://bugs.webkit.org/attachment.cgi?id=134942&action=edit
webkitpy.common.net.statusserver: [INFO] Releasing work item 134942 from qt-wk2-ews
No work item. Sleeping until 2012-04-03 09:15:50 (2 mins).

But the queue wasn't empty, there were 6 patches in it:
#1 	135336 	
#2 	135337 	
#3 	135338 	
#4 	135339 	
#5 	135340 	
#6 	135341
Comment 2 Csaba Osztrogonác 2012-04-03 09:26:32 PDT
Otherwise we shouldn't allow submitting security patches to EWS queues somehow.
How was is possible? Is feeder queue bot in the security group? Or did somebody push the "Submit for EWS analysis" button accidentally?
Comment 3 Adam Barth 2012-04-03 10:04:09 PDT
Folks submit security bugs to the EWS all the time.  We should just have the bots error out if they can't fetch the attachment.
Comment 4 Eric Seidel (no email) 2012-04-03 12:28:27 PDT
We just need to catch the "Failed to fetch" case, and not have it go down the "we have nothing left to do" logic path.
Comment 5 Csaba Osztrogonác 2012-04-07 12:50:58 PDT
(In reply to comment #3)
> Folks submit security bugs to the EWS all the time.  We should just have the bots error out if they can't fetch the attachment.

Is there an EWS in the security group?

It would be great if security patches aren't blocks the EWS bots because they can't fetch the patches.
Comment 6 Adam Barth 2012-04-07 16:33:35 PDT
> Is there an EWS in the security group?

Nope, but sometimes folks add them explicitly to the CC list.

> It would be great if security patches aren't blocks the EWS bots because they can't fetch the patches.

Correct.  We need to error out in that case rather than block the queue.
Comment 7 Csaba Osztrogonác 2012-08-25 07:52:10 PDT
+info for fixing this bug:

$ Tools/Scripts/webkit-patch apply-attachment 157161 [details]
-----------------------------------------------------
Fetching: https://bugs.webkit.org/attachment.cgi?id=157161&action=edit
Traceback (most recent call last):
  File "Tools/Scripts/webkit-patch", line 84, in <module>
    main()
  File "Tools/Scripts/webkit-patch", line 79, in main
    WebKitPatch(os.path.abspath(__file__)).main()
  File "/home/oszi/WebKit/Tools/Scripts/webkitpy/tool/multicommandtool.py", line 311, in main
    result = command.check_arguments_and_execute(options, args, self)
  File "/home/oszi/WebKit/Tools/Scripts/webkitpy/tool/multicommandtool.py", line 120, in check_arguments_and_execute
    return self.execute(options, args, tool) or 0
  File "/home/oszi/WebKit/Tools/Scripts/webkitpy/tool/commands/download.py", line 159, in execute
    bugs_to_patches = self._collect_patches_by_bug(patches)
  File "/home/oszi/WebKit/Tools/Scripts/webkitpy/tool/commands/download.py", line 151, in _collect_patches_by_bug
    bugs_to_patches[patch.bug_id()] = bugs_to_patches.get(patch.bug_id(), []) + [patch]
AttributeError: 'NoneType' object has no attribute 'bug_id'

log from the EWS bot:
----------------------
Fetching: https://bugs.webkit.org/attachment.cgi?id=157161&action=edit
No work item. Sleeping until 2012-08-25 07:37:22 (2 mins).
Fetching: https://bugs.webkit.org/attachment.cgi?id=149115&action=edit
No work item. Sleeping until 2012-08-25 07:39:24 (2 mins).
Fetching: https://bugs.webkit.org/attachment.cgi?id=159810&action=edit
No work item. Sleeping until 2012-08-25 07:41:25 (2 mins).
Fetching: https://bugs.webkit.org/attachment.cgi?id=160209&action=edit
No work item. Sleeping until 2012-08-25 07:43:27 (2 mins).
Fetching: https://bugs.webkit.org/attachment.cgi?id=158975&action=edit
No work item. Sleeping until 2012-08-25 07:45:29 (2 mins).
Fetching: https://bugs.webkit.org/attachment.cgi?id=160174&action=edit
No work item. Sleeping until 2012-08-25 07:47:31 (2 mins).
Fetching: https://bugs.webkit.org/attachment.cgi?id=157161&action=edit
No work item. Sleeping until 2012-08-25 07:49:32 (2 mins).
Fetching: https://bugs.webkit.org/attachment.cgi?id=149115&action=edit
No work item. Sleeping until 2012-08-25 07:51:34 (2 mins).
Comment 8 Csaba Osztrogonác 2012-08-29 10:55:13 PDT
Now there are 25 patches in Qt-EWS queue, but it is sleeping 2 minutes
after all patches ... It isn't so good.

Fetching: https://bugs.webkit.org/attachment.cgi?id=160209&action=edit
webkitpy.common.net.statusserver: [INFO] Releasing work item 160209 from qt-wk2-ews
No work item. Sleeping until 2012-08-29 19:52:36 (2 mins).
Fetching: https://bugs.webkit.org/attachment.cgi?id=149115&action=edit
webkitpy.common.net.statusserver: [INFO] Releasing work item 149115 from qt-wk2-ews
No work item. Sleeping until 2012-08-29 19:54:38 (2 mins).

...

I propose decrease sleeping time to 10 seconds until proper fix. Any objections?
Comment 9 Csaba Osztrogonác 2012-08-29 11:04:22 PDT
Created attachment 161268 [details]
Patch

workaround until proper fix
Comment 10 Adam Barth 2012-08-29 11:15:02 PDT
Comment on attachment 161268 [details]
Patch

This will increase the load on the appengine instance by a factor of 12.
Comment 11 Csaba Osztrogonác 2012-08-29 11:26:42 PDT
(In reply to comment #10)
> (From update of attachment 161268 [details])
> This will increase the load on the appengine instance by a factor of 12.

Is the load of appengine instance more important to waiting 
1-2 hours for all patch because of this crazy bug?

Is there any volunteer to do a proper fix for this half year bug?
Comment 12 Eric Seidel (no email) 2012-08-29 12:08:24 PDT
This bug should be trivial to fix.

    def _next_patch(self):
        patch_id = self._tool.status_server.next_work_item(self.name)
        if not patch_id:
            return None
        patch = self._tool.bugs.fetch_attachment(patch_id)
        if not patch:
            # FIXME: Using a fake patch because release_work_item has the wrong API.
            # We also don't really need to release the lock (although that's fine),
            # mostly we just need to remove this bogus patch from our queue.
            # If for some reason bugzilla is just down, then it will be re-fed later.
            patch = Attachment({'id': patch_id}, None)
            self._release_work_item(patch)
            return None
        return patch

Should just be a loop.

Which attempts to grab the pathc, if it can't, tries again,until next_work_item returns None.

The only downside of that is that we'll end up blowing out the queue if bugzilla ever goes down.  But I don't think we should optimize for that case.  We could also add a check her to make sure bugzilla is reachable before attempting any of this. :)
Comment 13 Szilard Ledan 2012-09-14 02:11:35 PDT
Created attachment 164074 [details]
patch
Comment 14 Peter Gal 2012-09-14 02:31:42 PDT
LGTM
Comment 15 Eric Seidel (no email) 2012-09-14 07:23:24 PDT
Comment on attachment 164074 [details]
patch

Can we write a unittest?
Comment 16 Adam Barth 2012-09-14 13:04:41 PDT
Comment on attachment 164074 [details]
patch

This is a much better fix, thank you!

Writing a unit test should be straightforward.  See http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/tool/commands/queues_unittest.py#L154
Comment 17 Szilard Ledan 2012-09-18 06:01:24 PDT
Created attachment 164540 [details]
patch
Comment 18 Eric Seidel (no email) 2012-09-18 08:24:25 PDT
Comment on attachment 164540 [details]
patch

Great!
Comment 19 WebKit Review Bot 2012-09-18 08:30:04 PDT
Comment on attachment 164540 [details]
patch

Clearing flags on attachment: 164540

Committed r128899: <http://trac.webkit.org/changeset/128899>
Comment 20 WebKit Review Bot 2012-09-18 08:30:08 PDT
All reviewed patches have been landed.  Closing bug.