Bug 107612

Summary: Add monitoring of patches and queues to the QueueStatusServer
Product: WebKit Reporter: Alan Cutter <alancutter>
Component: Tools / TestsAssignee: Alan Cutter <alancutter>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, eric, ossy, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 107775    
Bug Blocks:    
Attachments:
Description Flags
Patch
none
Patch
none
Patch none

Alan Cutter
Reported 2013-01-22 18:50:32 PST
Currently it's very difficult to see how long the average patch runs for in the EWS queues or how fast a particular bot is performing compared to others. The QueueStatusServer should be recording data that can easily present this type of information.
Attachments
Patch (28.53 KB, patch)
2013-01-22 19:24 PST, Alan Cutter
no flags
Patch (28.54 KB, patch)
2013-01-22 21:05 PST, Alan Cutter
no flags
Patch (28.52 KB, patch)
2013-01-22 23:15 PST, Alan Cutter
no flags
Alan Cutter
Comment 1 2013-01-22 19:24:56 PST
WebKit Review Bot
Comment 2 2013-01-22 19:49:19 PST
Attachment 184107 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Tools/ChangeLog', u'Tools/QueueStatusServe..." exit_code: 1 Tools/QueueStatusServer/model/queuelog.py:32: No name 'appengine' in module 'google' [pylint/E0611] [5] Tools/QueueStatusServer/model/queuelog.py:53: [QueueLog.get_current] Class 'QueueLog' has no 'get_or_insert' member [pylint/E1101] [5] Tools/QueueStatusServer/model/patchlog.py:31: No name 'appengine' in module 'google' [pylint/E0611] [5] Tools/QueueStatusServer/model/patchlog.py:48: [PatchLog.lookup] Class 'PatchLog' has no 'get_or_insert' member [pylint/E1101] [5] Tools/QueueStatusServer/handlers/updateworkitems.py:52: [UpdateWorkItems._update_work_items_from_request] Instance of 'UpdateWorkItems' has no 'request' member [pylint/E1101] [5] Tools/QueueStatusServer/handlers/updateworkitems.py:55: [UpdateWorkItems._update_work_items_from_request] Instance of 'UpdateWorkItems' has no 'response' member [pylint/E1101] [5] Tools/QueueStatusServer/handlers/updateworkitems.py:72: [UpdateWorkItems.post] Instance of 'UpdateWorkItems' has no 'response' member [pylint/E1101] [5] Total errors found: 7 in 15 files If any of these errors are false positives, please file a bug against check-webkit-style.
Adam Barth
Comment 3 2013-01-22 20:38:52 PST
Comment on attachment 184107 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=184107&action=review > Tools/QueueStatusServer/handlers/updateworkitems.py:58 > + work_items.date = datetime.now() Why not utcnow?
Adam Barth
Comment 4 2013-01-22 20:39:30 PST
First of all, this is a great idea. It's just going to take me a bit to remember this code well enough to review your patch. :)
Alan Cutter
Comment 5 2013-01-22 21:05:05 PST
Alan Cutter
Comment 6 2013-01-22 21:06:07 PST
(In reply to comment #3) > (From update of attachment 184107 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=184107&action=review > > > Tools/QueueStatusServer/handlers/updateworkitems.py:58 > > + work_items.date = datetime.now() > > Why not utcnow? Ah! That one slipped me. Fixed in second attachment.
WebKit Review Bot
Comment 7 2013-01-22 21:10:51 PST
Attachment 184126 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Tools/ChangeLog', u'Tools/QueueStatusServe..." exit_code: 1 Tools/QueueStatusServer/model/queuelog.py:32: No name 'appengine' in module 'google' [pylint/E0611] [5] Tools/QueueStatusServer/model/queuelog.py:53: [QueueLog.get_current] Class 'QueueLog' has no 'get_or_insert' member [pylint/E1101] [5] Tools/QueueStatusServer/model/patchlog.py:31: No name 'appengine' in module 'google' [pylint/E0611] [5] Tools/QueueStatusServer/model/patchlog.py:48: [PatchLog.lookup] Class 'PatchLog' has no 'get_or_insert' member [pylint/E1101] [5] Tools/QueueStatusServer/handlers/updateworkitems.py:52: [UpdateWorkItems._update_work_items_from_request] Instance of 'UpdateWorkItems' has no 'request' member [pylint/E1101] [5] Tools/QueueStatusServer/handlers/updateworkitems.py:55: [UpdateWorkItems._update_work_items_from_request] Instance of 'UpdateWorkItems' has no 'response' member [pylint/E1101] [5] Tools/QueueStatusServer/handlers/updateworkitems.py:72: [UpdateWorkItems.post] Instance of 'UpdateWorkItems' has no 'response' member [pylint/E1101] [5] Total errors found: 7 in 15 files If any of these errors are false positives, please file a bug against check-webkit-style.
Adam Barth
Comment 8 2013-01-22 22:00:46 PST
Comment on attachment 184126 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=184126&action=review This looks good. I don't remember this code very well, so this review isn't based on any deep knowledge of the stuff you're changing. ;) > Tools/QueueStatusServer/loggers/recordpatchevent.py:42 > + print patches_waiting Do we want to call print in production?
Eric Seidel (no email)
Comment 9 2013-01-22 22:04:51 PST
Comment on attachment 184126 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=184126&action=review > Tools/QueueStatusServer/handlers/updateworkitems.py:49 > + return return None is clearer IMO. >> Tools/QueueStatusServer/handlers/updateworkitems.py:55 >> + self.response.out.write("Failed to parse work items: %s" % items_string) > > [UpdateWorkItems._update_work_items_from_request] Instance of 'UpdateWorkItems' has no 'response' member [pylint/E1101] [5] We need to figure out why pylint is so stupid and fix it to not be. > Tools/QueueStatusServer/loggers/recordbotevent.py:35 > + def activity(cls, queue_name, bot_id): I might have given this a more "active" name. :) like record_activity. >> Tools/QueueStatusServer/loggers/recordpatchevent.py:42 >> + print patches_waiting > > Do we want to call print in production? That's how you print to the page in simple GAE, but that may not be what alan meant to do here. >> Tools/QueueStatusServer/model/patchlog.py:31 >> +from google.appengine.ext import db > > No name 'appengine' in module 'google' [pylint/E0611] [5] Yeah, clearly we need to fix sys.path on the bot or something. We probably just need to install the gae sdk on the bot, or make it auto-installed. > Tools/QueueStatusServer/model/queuelog.py:41 > + max_patches_waiting = db.IntegerProperty(default=0) I'm surprised 0 isn't the default. I guess None is?
Eric Seidel (no email)
Comment 10 2013-01-22 22:16:42 PST
I didn't mean to block this commit, btw.
Adam Barth
Comment 11 2013-01-22 22:30:59 PST
We probably should check with Alan about the print statement before landing the patch.
Alan Cutter
Comment 12 2013-01-22 23:15:43 PST
WebKit Review Bot
Comment 13 2013-01-22 23:17:38 PST
Attachment 184154 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Tools/ChangeLog', u'Tools/QueueStatusServe..." exit_code: 1 Tools/QueueStatusServer/model/queuelog.py:32: No name 'appengine' in module 'google' [pylint/E0611] [5] Tools/QueueStatusServer/model/queuelog.py:53: [QueueLog.get_current] Class 'QueueLog' has no 'get_or_insert' member [pylint/E1101] [5] Tools/QueueStatusServer/model/patchlog.py:31: No name 'appengine' in module 'google' [pylint/E0611] [5] Tools/QueueStatusServer/model/patchlog.py:48: [PatchLog.lookup] Class 'PatchLog' has no 'get_or_insert' member [pylint/E1101] [5] Tools/QueueStatusServer/handlers/updateworkitems.py:52: [UpdateWorkItems._update_work_items_from_request] Instance of 'UpdateWorkItems' has no 'request' member [pylint/E1101] [5] Tools/QueueStatusServer/handlers/updateworkitems.py:55: [UpdateWorkItems._update_work_items_from_request] Instance of 'UpdateWorkItems' has no 'response' member [pylint/E1101] [5] Tools/QueueStatusServer/handlers/updateworkitems.py:72: [UpdateWorkItems.post] Instance of 'UpdateWorkItems' has no 'response' member [pylint/E1101] [5] Total errors found: 7 in 15 files If any of these errors are false positives, please file a bug against check-webkit-style.
Alan Cutter
Comment 14 2013-01-22 23:24:39 PST
(In reply to comment #9) > (From update of attachment 184126 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=184126&action=review > > > Tools/QueueStatusServer/handlers/updateworkitems.py:49 > > + return > > return None is clearer IMO. > Added None to return. > > Tools/QueueStatusServer/loggers/recordbotevent.py:35 > > + def activity(cls, queue_name, bot_id): > > I might have given this a more "active" name. :) like record_activity. > Changed name to record_activity. > >> Tools/QueueStatusServer/loggers/recordpatchevent.py:42 > >> + print patches_waiting > > > > Do we want to call print in production? > > That's how you print to the page in simple GAE, but that may not be what alan meant to do here. > Indeed not, removed print. > >> Tools/QueueStatusServer/model/patchlog.py:31 > >> +from google.appengine.ext import db > > > > No name 'appengine' in module 'google' [pylint/E0611] [5] > > Yeah, clearly we need to fix sys.path on the bot or something. We probably just need to install the gae sdk on the bot, or make it auto-installed. > That is a good idea! We might be able to get some automated tests going for QSS. (: > > Tools/QueueStatusServer/model/queuelog.py:41 > > + max_patches_waiting = db.IntegerProperty(default=0) > > I'm surprised 0 isn't the default. I guess None is? None is the default for primitive values. The GAE docs aren't very clear on that detail.
WebKit Review Bot
Comment 15 2013-01-22 23:37:38 PST
Comment on attachment 184154 [details] Patch Clearing flags on attachment: 184154 Committed r140513: <http://trac.webkit.org/changeset/140513>
WebKit Review Bot
Comment 16 2013-01-22 23:37:42 PST
All reviewed patches have been landed. Closing bug.
Csaba Osztrogonác
Comment 17 2013-01-23 07:11:09 PST
(In reply to comment #15) > (From update of attachment 184154 [details]) > Clearing flags on attachment: 184154 > > Committed r140513: <http://trac.webkit.org/changeset/140513> It seems it broke all EWS bots somehow. For example Qt EWS bots are stucked in the following exception when they try to fetch http://queues.webkit.org/next-patch/qt-ews: Traceback (most recent call last): File "/storage/WebKit-qt-ews/Tools/Scripts/webkitpy/tool/bot/queueengine.py", line 96, in run work_item = self._delegate.next_work_item() File "/storage/WebKit-qt-ews/Tools/Scripts/webkitpy/tool/commands/queues.py", line 419, in next_work_item return self._next_patch() File "/storage/WebKit-qt-ews/Tools/Scripts/webkitpy/tool/commands/queues.py", line 216, in _next_patch patch_id = self._tool.status_server.next_work_item(self.name) File "/storage/WebKit-qt-ews/Tools/Scripts/webkitpy/common/net/statusserver.py", line 128, in next_work_item return self._fetch_url(next_patch_url) File "/storage/WebKit-qt-ews/Tools/Scripts/webkitpy/common/net/statusserver.py", line 162, in _fetch_url raise e HTTPError: HTTP Error 500: Internal Server Error I tried to fetch this URL from the browser and I got this exception: Traceback (most recent call last): File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/_webapp25.py", line 714, in __call__ handler.get(*groups) File "/base/data/home/apps/webkit-commit-queue/107612.364787856447592722/handlers/nextpatch.py", line 51, in get RecordPatchEvent.started(patch_id, queue_name) File "/base/data/home/apps/webkit-commit-queue/107612.364787856447592722/loggers/recordpatchevent.py", line 63, in started patch_log.calculate_wait_duration() File "/base/data/home/apps/webkit-commit-queue/107612.364787856447592722/model/patchlog.py", line 52, in calculate_wait_duration self.wait_duration = int(time_delta.total_seconds()) AttributeError: 'datetime.timedelta' object has no attribute 'total_seconds' Could you guys check it, please?
Csaba Osztrogonác
Comment 18 2013-01-23 09:23:14 PST
ping?
Adam Barth
Comment 19 2013-01-23 09:35:18 PST
> ping? Alan is probably asleep, but I think I've rolled back the server to the version before this change.
Csaba Osztrogonác
Comment 20 2013-01-23 09:39:55 PST
(In reply to comment #19) > > ping? > > Alan is probably asleep, but I think I've rolled back the server to the version before this change. Thanks, EWS bots started to work again.
Csaba Osztrogonác
Comment 21 2013-01-23 09:40:20 PST
Reopen to fix the bug.
Alan Cutter
Comment 22 2013-01-23 19:48:25 PST
(In reply to comment #21) > Reopen to fix the bug. Created bug 107775 to resolve problem.
Note You need to log in before you can comment on or make changes to this bug.