Bug 195120

Summary: [ews-app] Update primary keys for handling multiple Buildbot instances
Product: WebKit Reporter: Aakash Jain <aakash_jain>
Component: Tools / TestsAssignee: Aakash Jain <aakash_jain>
Status: RESOLVED FIXED    
Severity: Normal CC: aakash_jain, ap, commit-queue, dean_johnson, dewei_zhu, ews-watchlist, jbedard, lforschler, slewis, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: Other   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=194863
Attachments:
Description Flags
Proposed patch
none
Proposed patch
none
Proposed patch none

Description Aakash Jain 2019-02-27 14:04:53 PST
https://bugs.webkit.org/show_bug.cgi?id=194863 the instance table to keep tracking of multiple buildbot instance. This is also to handle the scenerio when the Buildbot machine has unexpected data loss event and the machine has to be formatted and setup again. In that case, build_id/builder_id/step_id will reset (start from 1 again). Since ews-app should be able to handle that scenario, these build_id/builder_id/step_id can't be used as primary keys in step/build tables. Therefore we should have different primary keys which are unique.
Comment 1 Aakash Jain 2019-02-27 14:07:02 PST
Created attachment 363130 [details]
Proposed patch
Comment 2 Aakash Jain 2019-02-27 14:09:22 PST
Tested on a local instance.

This will apply to ToT after https://bugs.webkit.org/show_bug.cgi?id=195047
Comment 3 Jonathan Bedard 2019-02-27 15:43:23 PST
Comment on attachment 363130 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363130&action=review

> Tools/BuildSlaveSupport/ews-app/ews/models/build.py:38
> +    uid = models.TextField(primary_key=True)

So, build_id's are unique per instance, but Build uid's combine build_id with builder_id, right? Why do we still have builder_id in this Model, then?
Comment 4 Aakash Jain 2019-02-27 15:48:13 PST
> So, build_id's are unique per instance
Yes
> but Build uid's combine build_id with builder_id, right? 
I assume you meant combine build_id with buildbot_instance_id. If so, yes. 

> Why do we still have builder_id in this Model, then?
To generate buildbot url's.
Comment 5 Aakash Jain 2019-02-28 12:18:29 PST
Created attachment 363248 [details]
Proposed patch

Included Buildbot(ews-build) changes to send hostname along-with the events data.
Comment 6 EWS Watchlist 2019-02-28 12:20:19 PST Comment hidden (obsolete)
Comment 7 Jonathan Bedard 2019-02-28 12:42:20 PST
(In reply to Aakash Jain from comment #4)
> > So, build_id's are unique per instance
> Yes
> > but Build uid's combine build_id with builder_id, right? 
> I assume you meant combine build_id with buildbot_instance_id. If so, yes. 

Actually, that is the source of my confusion, I was thinking builder_id and buildbot_instance_id were the same thing, they aren't.

> 
> > Why do we still have builder_id in this Model, then?
> To generate buildbot url's.
Comment 8 Dean Johnson 2019-02-28 14:18:41 PST
Comment on attachment 363248 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363248&action=review

Mostly looks good; slightly concerned about combining instance_id and build_or_step_id into another field.

> Tools/BuildSlaveSupport/ews-app/ews/models/build.py:94
> +            return Build.objects.get(uid=uid)

Is it necessary to use a try/except block here? I would expect `Build.objects.get` to return None if the build didn't exist (without throwing any exception).

> Tools/BuildSlaveSupport/ews-app/ews/models/buildbotinstance.py:55
> +        return '{}_{}'.format(instance_id, build_or_step_id)

What benefit do we get by combining instance_id with build_or_step_id? I'm concerned we'll have performance issues caused by trying to look up information by this uid given this makes it not possible to index on the instance_id directly.

Also, what happens if build_id and step_id collide for the same instance? Wouldn't that result in corruption or data loss?

> Tools/BuildSlaveSupport/ews-app/ews/models/step.py:84
> +        except Step.DoesNotExist:

Ditto to Build comment.

> Tools/BuildSlaveSupport/ews-build/events.py:76
> +    def __init__(self, hostname, type_prefix='', name='Events'):

It's a bit unclear to me if hostname is the worker, or the instance hostname. Could you rename this to instance_hostname or worker_hostname for clarity?
Comment 9 Aakash Jain 2019-03-01 11:24:51 PST
Comment on attachment 363248 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363248&action=review

>> Tools/BuildSlaveSupport/ews-app/ews/models/build.py:94
>> +            return Build.objects.get(uid=uid)
> 
> Is it necessary to use a try/except block here? I would expect `Build.objects.get` to return None if the build didn't exist (without throwing any exception).

Build.objects.get does throw an exception if the object doesn't exist.

https://docs.djangoproject.com/en/2.1/topics/db/queries/#retrieving-a-single-object-with-get
"If there are no results that match the query, get() will raise a DoesNotExist exception."

>> Tools/BuildSlaveSupport/ews-app/ews/models/buildbotinstance.py:55
>> +        return '{}_{}'.format(instance_id, build_or_step_id)
> 
> What benefit do we get by combining instance_id with build_or_step_id? I'm concerned we'll have performance issues caused by trying to look up information by this uid given this makes it not possible to index on the instance_id directly.
> 
> Also, what happens if build_id and step_id collide for the same instance? Wouldn't that result in corruption or data loss?

Django doesn't seems to support composite/multiple-column primary keys: https://code.djangoproject.com/ticket/373

If we get the same uid (same buildbot_instance_id and build_id) for a build, we further check if the patch_id, builder_id, number etc. are same as existing data or not. If not, it would fail the check here and would simply refuse the new data: https://trac.webkit.org/browser/webkit/trunk/Tools/BuildSlaveSupport/ews-app/ews/models/build.py#L70

Further, buildbot should indicate that failure in the UI and mark that build at failure/retry (that buildbot portion is not yet implemented).

>> Tools/BuildSlaveSupport/ews-build/events.py:76
>> +    def __init__(self, hostname, type_prefix='', name='Events'):
> 
> It's a bit unclear to me if hostname is the worker, or the instance hostname. Could you rename this to instance_hostname or worker_hostname for clarity?

This is the Buildbot master code. I can rename it to instance_hostname, however 'instance_hostname' might seem unusual, given that any buildbot instance doesn't know about any other instance. Maybe 'buildbot_hostname' is better.
Comment 10 Dean Johnson 2019-03-01 14:03:28 PST
(In reply to Aakash Jain from comment #9)
> Comment on attachment 363248 [details]
> Proposed patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=363248&action=review
> 
> >> Tools/BuildSlaveSupport/ews-app/ews/models/build.py:94
> >> +            return Build.objects.get(uid=uid)
> > 
> > Is it necessary to use a try/except block here? I would expect `Build.objects.get` to return None if the build didn't exist (without throwing any exception).
> 
> Build.objects.get does throw an exception if the object doesn't exist.
> 
> https://docs.djangoproject.com/en/2.1/topics/db/queries/#retrieving-a-single-
> object-with-get
> "If there are no results that match the query, get() will raise a
> DoesNotExist exception."
> 
Ah, okay. Thank you for explaining. I don't like that paradigm at all, but guess we're tied to it since we're using Django. :)
> >> Tools/BuildSlaveSupport/ews-app/ews/models/buildbotinstance.py:55
> >> +        return '{}_{}'.format(instance_id, build_or_step_id)
> > 
> > What benefit do we get by combining instance_id with build_or_step_id? I'm concerned we'll have performance issues caused by trying to look up information by this uid given this makes it not possible to index on the instance_id directly.
> > 
> > Also, what happens if build_id and step_id collide for the same instance? Wouldn't that result in corruption or data loss?
> 
> Django doesn't seems to support composite/multiple-column primary keys:
> https://code.djangoproject.com/ticket/373
> 
> If we get the same uid (same buildbot_instance_id and build_id) for a build,
> we further check if the patch_id, builder_id, number etc. are same as
> existing data or not. If not, it would fail the check here and would simply
> refuse the new data:
> https://trac.webkit.org/browser/webkit/trunk/Tools/BuildSlaveSupport/ews-app/
> ews/models/build.py#L70
Sorry, that wasn't quite my question. What if the second id (sourced from 'build_id' or 'step_id') collides since they're coming from two sources?

I also think it would be better to use a monotonically increasing ID with two separate columns for instanceid and build_or_step_id (pending the other issue is resolved) instead of this combination of two keys, so you can still index on instanceid.
> 
> Further, buildbot should indicate that failure in the UI and mark that build
> at failure/retry (that buildbot portion is not yet implemented).
> 
> >> Tools/BuildSlaveSupport/ews-build/events.py:76
> >> +    def __init__(self, hostname, type_prefix='', name='Events'):
> > 
> > It's a bit unclear to me if hostname is the worker, or the instance hostname. Could you rename this to instance_hostname or worker_hostname for clarity?
> 
> This is the Buildbot master code. I can rename it to instance_hostname,
> however 'instance_hostname' might seem unusual, given that any buildbot
> instance doesn't know about any other instance. Maybe 'buildbot_hostname' is
> better.
Oh, this is Buildbot EWS, not the OpenSource EWS Django-based service? If so, "master_hostname" would be a bit more clear to me.
Comment 11 Dean Johnson 2019-03-01 14:26:46 PST
Spoke with Aakash on the phone. Patch looks good; unofficial r+.
Comment 12 Aakash Jain 2019-03-01 14:39:56 PST
Created attachment 363376 [details]
Proposed patch

Renamed hostname to master_hostname.
Comment 13 EWS Watchlist 2019-03-01 14:42:00 PST Comment hidden (obsolete)
Comment 14 WebKit Commit Bot 2019-03-01 15:09:37 PST
Comment on attachment 363376 [details]
Proposed patch

Clearing flags on attachment: 363376

Committed r242291: <https://trac.webkit.org/changeset/242291>
Comment 15 WebKit Commit Bot 2019-03-01 15:09:39 PST
All reviewed patches have been landed.  Closing bug.
Comment 16 Radar WebKit Bug Importer 2019-03-01 15:11:04 PST
<rdar://problem/48524861>