Summary: | Add support for syncing repo with commit revision label | ||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Zhifei Fang <zhifei_fang> | ||||||||||||||||||||||||||||||||||||||
Component: | New Bugs | Assignee: | Zhifei Fang <zhifei_fang> | ||||||||||||||||||||||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||||||||||||||||||||||
Severity: | Normal | CC: | aakash_jain, darin, dewei_zhu, jbedard, rniwa, webkit-bug-importer, zhifei_fang | ||||||||||||||||||||||||||||||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||||||||||||||||||||||||||||||
Version: | WebKit Nightly Build | ||||||||||||||||||||||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||||||||||||||||||||||
Bug Depends on: | 222897 | ||||||||||||||||||||||||||||||||||||||||
Bug Blocks: | |||||||||||||||||||||||||||||||||||||||||
Attachments: |
|
Description
Zhifei Fang
2021-02-16 11:45:31 PST
Created attachment 420515 [details]
Patch
Comment on attachment 420515 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420515&action=review > Websites/perf.webkit.org/tools/sync-commits.py:263 > + IDENTIFIER_RE = re.compile(r'Canonical link: https://commits.webkit.org/([\d]+@[a-zA-Z\s\-_]+)\n') Use lowercase and prefix with _ to be consistent with other code in this file. > Websites/perf.webkit.org/tools/sync-commits.py:284 > + identifier = self.IDENTIFIER_RE.search(message).group(1) Please use named group: ?P<~> > Websites/perf.webkit.org/tools/sync-commits.py:290 > + res = { Please don't abbreviate result / response as "res". > Websites/perf.webkit.org/tools/sync-commits.py:298 > + if previous_identifier: > + res['previousCommit'] = previous_identifier Please just always put previousCommit as we do in GitRepository's version. Comment on attachment 420515 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420515&action=review >> Websites/perf.webkit.org/tools/sync-commits.py:298 >> + res['previousCommit'] = previous_identifier > > Please just always put previousCommit as we do in GitRepository's version. This doesn't work for the first commit, please see validate_commits, it will validate the revision if the key exist. Either I change the API or change it here (In reply to Zhifei Fang from comment #3) > Comment on attachment 420515 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=420515&action=review > > >> Websites/perf.webkit.org/tools/sync-commits.py:298 > >> + res['previousCommit'] = previous_identifier > > > > Please just always put previousCommit as we do in GitRepository's version. > > This doesn't work for the first commit, please see validate_commits, it will > validate the revision if the key exist. Either I change the API or change it > here That seems to suggest that GitRepository._revision_from_tokens is also broken? I suggest we fix the API to allow null being specified in the previousCommit (with a test). It's awkward to disallow null there. Comment on attachment 420515 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420515&action=review >>>> Websites/perf.webkit.org/tools/sync-commits.py:298 >>>> + res['previousCommit'] = previous_identifier >>> >>> Please just always put previousCommit as we do in GitRepository's version. >> >> This doesn't work for the first commit, please see validate_commits, it will validate the revision if the key exist. Either I change the API or change it here > > That seems to suggest that GitRepository._revision_from_tokens is also broken? I suggest we fix the API to allow null being specified in the previousCommit (with a test). It's awkward to disallow null there. Yes... OK, I will change the API then Created attachment 420545 [details]
Patch
Comment on attachment 420545 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420545&action=review > Websites/perf.webkit.org/public/include/commit-updater.php:158 > - require_format('Revision', $commit_info['revision'], '/^[A-Za-z0-9 \.]+$/'); > + require_format('Revision', $commit_info['revision'], '/^[@A-Za-z0-9 \.]+$/'); I don't think this is quite right. We don't want to allow arbitrary number of @s. We should be using a similar expression to the one in sync-commits.py > Websites/perf.webkit.org/public/include/commit-updater.php:161 > + if (array_key_exists('previousCommit', $commit_info) && $commit_info['previousCommit']) Please add a test. Also, doesn't this allow previousCommit of 0 or false to be treated like null? I think we should probably use is_null here or === NULL. > Websites/perf.webkit.org/tools/sync-commits.py:263 > + _IDENTIFIER_RE = re.compile(r'Canonical link: https://commits.webkit.org/(?P<identifier>[\d]+@[a-zA-Z\s\-_]+)\n') Use lowercase? Comment on attachment 420545 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420545&action=review >> Websites/perf.webkit.org/tools/sync-commits.py:263 >> + _IDENTIFIER_RE = re.compile(r'Canonical link: https://commits.webkit.org/(?P<identifier>[\d]+@[a-zA-Z\s\-_]+)\n') > > Use lowercase? If this is meant to be a constant, I think python PEP8 recommends it to be all upper case per https://www.python.org/dev/peps/pep-0008/#constants (In reply to dewei_zhu from comment #8) > Comment on attachment 420545 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=420545&action=review > > >> Websites/perf.webkit.org/tools/sync-commits.py:263 > >> + _IDENTIFIER_RE = re.compile(r'Canonical link: https://commits.webkit.org/(?P<identifier>[\d]+@[a-zA-Z\s\-_]+)\n') > > > > Use lowercase? > > If this is meant to be a constant, I think python PEP8 recommends it to be > all upper case per https://www.python.org/dev/peps/pep-0008/#constants Huh, I'm pretty sure we used not to do that but I guess we're doing that now. Okay. Comment on attachment 420545 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420545&action=review >>>> Websites/perf.webkit.org/tools/sync-commits.py:263 >>>> + _IDENTIFIER_RE = re.compile(r'Canonical link: https://commits.webkit.org/(?P<identifier>[\d]+@[a-zA-Z\s\-_]+)\n') >>> >>> Use lowercase? >> >> If this is meant to be a constant, I think python PEP8 recommends it to be all upper case per https://www.python.org/dev/peps/pep-0008/#constants > > Huh, I'm pretty sure we used not to do that but I guess we're doing that now. Okay. It should probably be IDENTIFIER_PATTERN in that case per PEP8 style. Created attachment 420762 [details]
Patch
I will add unit test in another patch, will also need firstly update the unit test to make it can work with up to date nodejs (In reply to Ryosuke Niwa from comment #7) > Comment on attachment 420545 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=420545&action=review > > > Websites/perf.webkit.org/public/include/commit-updater.php:158 > > - require_format('Revision', $commit_info['revision'], '/^[A-Za-z0-9 \.]+$/'); > > + require_format('Revision', $commit_info['revision'], '/^[@A-Za-z0-9 \.]+$/'); > > I don't think this is quite right. We don't want to allow arbitrary number > of @s. > We should be using a similar expression to the one in sync-commits.py > > > Websites/perf.webkit.org/public/include/commit-updater.php:161 > > + if (array_key_exists('previousCommit', $commit_info) && $commit_info['previousCommit']) > > Please add a test. Also, doesn't this allow previousCommit of 0 or false to > be treated like null? > I think we should probably use is_null here or === NULL. > > > Websites/perf.webkit.org/tools/sync-commits.py:263 > > + _IDENTIFIER_RE = re.compile(r'Canonical link: https://commits.webkit.org/(?P<identifier>[\d]+@[a-zA-Z\s\-_]+)\n') > > Use lowercase? Maybe we want to land the change that fixes unit tests to be compatible with newer version of nodejs first. I think it would be cleaner to keep the fix and the unit tests those are associated with the fix under the same patch. (In reply to dewei_zhu from comment #13) > Maybe we want to land the change that fixes unit tests to be compatible with > newer version of nodejs first. > > I think it would be cleaner to keep the fix and the unit tests those are > associated with the fix under the same patch. Yes, we should do that. Created attachment 423083 [details]
Patch
Comment on attachment 423083 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=423083&action=review r- because I don't think we want to land this as is. > Websites/perf.webkit.org/tools/sync-commits.py:46 > + return GitRepository(name=repository['name'], git_url=repository['url'], git_checkout=repository['gitCheckout'], git_branch=repository.get('branch'), fetch_revision_label=repository.get("fetchRevisionLabel", False)) This isn't right. We aren't really fetching anything. If anything this is really about whether repository has revision identifier or not. > Websites/perf.webkit.org/tools/sync-commits.py:194 > + REVISION_LABEL_RE = re.compile(r'Canonical link: https://commits.webkit.org/(?P<revision_label>\d+@\S+)\n') I don't think it's right that we're hard-coding into the script. This script is supposed to work with any repository. This should probably be specified in fetch_revision_label in the config file. > Websites/perf.webkit.org/tools/sync-commits.py:245 > + if self._fetch_revision_label: > + revision_label = self.REVISION_LABEL_RE.search(message).group('revision_label') > + result['revisionLabel'] = revision_label This is unnecessarily complex. Given our API is perfectly capable of accepting null, we should just do this: 'revisionIdentifier': ~ Created attachment 425478 [details]
Patch
Comment on attachment 425478 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=425478&action=review > Websites/perf.webkit.org/tools/sync-commits.py:198 > + def __init__(self, name, git_checkout, git_url, git_branch=None, revision_identifier=False): `revision_identifier` sounds like it an identifier, but it's actually a boolean. So how about `include_revision_identifier`? Also, do we expect changes on `load_repository` ? Stepping back a bit, do we really need this variable? Can we always include `revisionIdentifier` when it's available rather than using this flag? (In reply to dewei_zhu from comment #19) > Comment on attachment 425478 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=425478&action=review > > > Websites/perf.webkit.org/tools/sync-commits.py:198 > > + def __init__(self, name, git_checkout, git_url, git_branch=None, revision_identifier=False): > > `revision_identifier` sounds like it an identifier, but it's actually a > boolean. So how about `include_revision_identifier`? > Also, do we expect changes on `load_repository` ? > Stepping back a bit, do we really need this variable? Can we always include > `revisionIdentifier` when it's available rather than using this flag? I did this because this code will be shared with other git repo, it won't include this in the commit message, running regular expression against something that I know it won't exist will be a waste of time. I think I should rename this to get_revision_identifier_from_commit_message. Created attachment 425522 [details]
Patch
Created attachment 426029 [details]
Patch
Any further comments about this ? I have made this script raise an error when we cannot find the revision identifier, I know our API is smart enough to handle if we have revision identifier or not, but the real issue is if we expected there is a revision identifier but we never find it, there must be a issue in our infra, the commit with out revision identifier will be rewrite, this will gives us a new git hash. I think we should not report this bad commit. Created attachment 426032 [details]
Patch
Created attachment 426034 [details]
Patch
Also, add the sync option for the SVN as well, we currently still sync with SVN instead of the git repo Comment on attachment 426034 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=426034&action=review > Websites/perf.webkit.org/tools/sync-commits.py:21 > +# Canonical link: https://commits.webkit.org/https://commits.webkit.org/232477@main > +REVISION_IDENTIFIER_RE = re.compile(r'Canonical link: (https://commits.webkit.org/)+(?P<revision_identifier>\d+@[\w\.\-]+)\n') Surely, you meant `(https://commits.webkit.org/)?`, not (~)+. > Websites/perf.webkit.org/tools/sync-commits.py:131 > - def __init__(self, name, svn_url, should_trust_certificate, use_server_auth, account_name_script_path): > + def __init__(self, name, svn_url, should_trust_certificate, use_server_auth, account_name_script_path, get_revision_identifier_from_commit_msg): I'd call this report_revision_identifier_in_commit_msg instead. But it doesn't seem like this is ever set? > Websites/perf.webkit.org/tools/sync-commits.py:171 > + if revision_identifier_match: WebKit's preferred way of structuring code like this is to put unusual path as an early return. So we should have `if not revision_identifier_match` then raise and shouldn't have any else clause. > Websites/perf.webkit.org/tools/sync-commits.py:177 > + # one useless. If we detect this, we should not report this commit. This comment is very wordy. It's probably better this information in the exception itself. > Websites/perf.webkit.org/tools/sync-commits.py:178 > + raise ValueError('Expected commit message includes revision identifier, but cannot find it') Something like this: `Expected commit message includes revision identifier, but did not find - may have been caused by a history rewrite` > Websites/perf.webkit.org/tools/sync-commits.py:255 > + if revision_identifier_match: Ditto. > Websites/perf.webkit.org/tools/sync-commits.py:258 > + # This is due to we may have sync error for the commit, if we don't have the revision Ditto. (In reply to Ryosuke Niwa from comment #27) > Comment on attachment 426034 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=426034&action=review > > > Websites/perf.webkit.org/tools/sync-commits.py:21 > > +# Canonical link: https://commits.webkit.org/https://commits.webkit.org/232477@main > > +REVISION_IDENTIFIER_RE = re.compile(r'Canonical link: (https://commits.webkit.org/)+(?P<revision_identifier>\d+@[\w\.\-]+)\n') > > Surely, you meant `(https://commits.webkit.org/)?`, not (~)+. > > > Websites/perf.webkit.org/tools/sync-commits.py:131 > > - def __init__(self, name, svn_url, should_trust_certificate, use_server_auth, account_name_script_path): > > + def __init__(self, name, svn_url, should_trust_certificate, use_server_auth, account_name_script_path, get_revision_identifier_from_commit_msg): > > I'd call this report_revision_identifier_in_commit_msg instead. > But it doesn't seem like this is ever set? > > > Websites/perf.webkit.org/tools/sync-commits.py:171 > > + if revision_identifier_match: > > WebKit's preferred way of structuring code like this is to put unusual path > as an early return. > So we should have `if not revision_identifier_match` then raise and > shouldn't have any else clause. > > > Websites/perf.webkit.org/tools/sync-commits.py:177 > > + # one useless. If we detect this, we should not report this commit. > > This comment is very wordy. It's probably better this information in the > exception itself. > > > Websites/perf.webkit.org/tools/sync-commits.py:178 > > + raise ValueError('Expected commit message includes revision identifier, but cannot find it') > > Something like this: > `Expected commit message includes revision identifier, but did not find - > may have been caused by a history rewrite` OK, but this won't be caused by a history rewrite, we will need a history rewrite to fix this, therefore this commit is invalid after that rewrite. I will change this to something like: `Expected commit message includes revision identifier, but did not find - need a history rewrite to fix it` > > > Websites/perf.webkit.org/tools/sync-commits.py:255 > > + if revision_identifier_match: > > Ditto. > > > Websites/perf.webkit.org/tools/sync-commits.py:258 > > + # This is due to we may have sync error for the commit, if we don't have the revision > > Ditto. (In reply to Zhifei Fang from comment #28) > (In reply to Ryosuke Niwa from comment #27) > > Comment on attachment 426034 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=426034&action=review > > > > > Websites/perf.webkit.org/tools/sync-commits.py:21 > > > +# Canonical link: https://commits.webkit.org/https://commits.webkit.org/232477@main > > > +REVISION_IDENTIFIER_RE = re.compile(r'Canonical link: (https://commits.webkit.org/)+(?P<revision_identifier>\d+@[\w\.\-]+)\n') > > > > Surely, you meant `(https://commits.webkit.org/)?`, not (~)+. > > > > > Websites/perf.webkit.org/tools/sync-commits.py:131 > > > - def __init__(self, name, svn_url, should_trust_certificate, use_server_auth, account_name_script_path): > > > + def __init__(self, name, svn_url, should_trust_certificate, use_server_auth, account_name_script_path, get_revision_identifier_from_commit_msg): > > > > I'd call this report_revision_identifier_in_commit_msg instead. > > But it doesn't seem like this is ever set? > > > > > Websites/perf.webkit.org/tools/sync-commits.py:171 > > > + if revision_identifier_match: > > > > WebKit's preferred way of structuring code like this is to put unusual path > > as an early return. > > So we should have `if not revision_identifier_match` then raise and > > shouldn't have any else clause. > > > > > Websites/perf.webkit.org/tools/sync-commits.py:177 > > > + # one useless. If we detect this, we should not report this commit. > > > > This comment is very wordy. It's probably better this information in the > > exception itself. > > > > > Websites/perf.webkit.org/tools/sync-commits.py:178 > > > + raise ValueError('Expected commit message includes revision identifier, but cannot find it') > > > > Something like this: > > `Expected commit message includes revision identifier, but did not find - > > may have been caused by a history rewrite` > > OK, but this won't be caused by a history rewrite, we will need a history > rewrite to fix this, therefore this commit is invalid after that rewrite. > > I will change this to something like: > `Expected commit message includes revision identifier, but did not find - > need a history rewrite to fix it` I see, yeah, that's a way clearer. Created attachment 426074 [details]
Patch
After test I find that in the SVN repo we won't include the revision identifier, so we will need to report SVN revision and revision identifier from the git repo. After we switch everything to git only, just switch off the new option I added, it will sync with git hash again. Any more comments ? Comment on attachment 426074 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=426074&action=review Sorry for the delay. I got caught up with other things. > Websites/perf.webkit.org/ChangeLog:2 > +2021-04-14 Zhifei Fang <zhifei_fang@apple.com> > +Add support for syncing repo with commit revision label Something went awfully wrong with this change log entry. > Websites/perf.webkit.org/tools/sync-commits.py:39 > for repository in repositories: > - try: > repository.fetch_commits_and_submit(server_config, args.max_fetch_count, args.max_ancestor_fetch_count) Nit: wrong indentation. > Websites/perf.webkit.org/tools/sync-commits.py:47 > - return GitRepository(name=repository['name'], git_url=repository['url'], git_checkout=repository['gitCheckout'], git_branch=repository.get('branch')) > + return GitRepository(name=repository['name'], git_url=repository['url'], git_checkout=repository['gitCheckout'], git_branch=repository.get('branch'), report_revision_identifier_in_commit_msg=repository.get('reportRevisionIdentifier'), report_svn_revison=repository.get('reportSVNRevision')) This line is way too long! Please insert a hard line break somewhere. > Websites/perf.webkit.org/tools/sync-commits.py:212 > + # translate the last svn revision to git hash This kind of "what" comment is a code smell. We should be extracting this logic as a helper method instead. > Websites/perf.webkit.org/tools/sync-commits.py:216 > + last_fetched_git_hash = self._run_git_command(['svn', 'find-rev', 'r{}'.format(last_fetched)]).strip() > + if not last_fetched_git_hash: > + self._fetch_all_hashes() > + last_fetched_git_hash = self._run_git_command(['svn', 'find-rev', 'r{}'.format(last_fetched)]).strip() What is this retry logic about? When is this useful? > Websites/perf.webkit.org/tools/sync-commits.py:219 > + last_fetched = last_fetched_git_hash Why are we updating last_fetched? This is just wrong. last_fetched comes from the server, it's nothing to do with the local state of git clone or git svn state. This script is meant to be fault tolerant, meaning that if the script had failed / got killed mid way, it can continue syncing the commit info from where it left off. r- because of this. > Websites/perf.webkit.org/tools/sync-commits.py:235 > + def _get_svn_revision_from_message(self, message): Nit: our coding style guidelines mandates that we do not prefix a getter function with "get" unless it has an out argument. Here, it should read _svn_revision_from_commit_message instead https://webkit.org/code-style-guidelines/#names-out-argument > Websites/perf.webkit.org/tools/sync-commits.py:239 > + if svn_revision_match: > + return svn_revision_match.group('svn_revision') > + return None We prefer early exit over nested if. i.e. normal flow of logic should continue. So this should read: if not svn_revision_match: return None return svn_revision_match.group('svn_revision') or better yet: return svn_revision_match.group('svn_revision') if svn_revision_match else None > Websites/perf.webkit.org/tools/sync-commits.py:260 > + svn_revision = self._get_svn_revision_from_message(message) Why are we parsing subversion revision from git commit? Can't we just run git svn find-rev instead? It can convert git hash to subversion revision as well as subversion to git hash. > Websites/perf.webkit.org/tools/sync-commits.py:269 > + 'revision': current_hash if not svn_revision else svn_revision, > + 'revisionIdentifier': revision_identifier, > + 'previousCommit': previous_hash if not previous_revision else previous_revision, This isn't right. We can't switch between git hashes and revisions in a single repository. That would cause all sorts of havoc down the line. We should instead error and exit if _report_svn_revision is specified but we didn't find a subversion information. r- because of this. Created attachment 427603 [details]
Patch
Comment on attachment 427603 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427603&action=review > Websites/perf.webkit.org/ChangeLog:3 > +2021-04-14 Zhifei Fang <zhifei_fang@apple.com> > +Add support for syncing repo with commit revision label > + https://bugs.webkit.org/show_bug.cgi?id=221982 This is not formatted correctly. > Websites/perf.webkit.org/tools/sync-commits.py:21 > +REVISION_IDENTIFIER_RE = re.compile(r'Canonical link: (https://commits.webkit.org/)+(?P<revision_identifier>\d+@[\w\.\-]+)\n') I believe the period characters in "commits.webkig.org" are going to compile as single character wild cards, not periods, right? I am pretty sure that is not what we want. ah, good catch! (In reply to Darin Adler from comment #35) > Comment on attachment 427603 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=427603&action=review > > > Websites/perf.webkit.org/ChangeLog:3 > > +2021-04-14 Zhifei Fang <zhifei_fang@apple.com> > > +Add support for syncing repo with commit revision label > > + https://bugs.webkit.org/show_bug.cgi?id=221982 > > This is not formatted correctly. > > > Websites/perf.webkit.org/tools/sync-commits.py:21 > > +REVISION_IDENTIFIER_RE = re.compile(r'Canonical link: (https://commits.webkit.org/)+(?P<revision_identifier>\d+@[\w\.\-]+)\n') > > I believe the period characters in "commits.webkig.org" are going to compile > as single character wild cards, not periods, right? I am pretty sure that is > not what we want. Created attachment 427610 [details]
Patch
Comment on attachment 427610 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427610&action=review > Websites/perf.webkit.org/ChangeLog:9 > + Add new syncing option reportSVNRevision, which will let the script try to report SVN revision from a git-svn repo Nit: missing '.' by the end of this sentence. > Websites/perf.webkit.org/tools/sync-commits.py:38 > repository.fetch_commits_and_submit(server_config, args.max_fetch_count, args.max_ancestor_fetch_count) Do we need to change the indentation if we remove `try - except` block? Also, is the idea of removing those to stop the script if any error occurs? Comment on attachment 427610 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427610&action=review >> Websites/perf.webkit.org/tools/sync-commits.py:38 >> repository.fetch_commits_and_submit(server_config, args.max_fetch_count, args.max_ancestor_fetch_count) > > Do we need to change the indentation if we remove `try - except` block? > Also, is the idea of removing those to stop the script if any error occurs? I don't think we want to make this change. > Websites/perf.webkit.org/tools/sync-commits.py:46 > - return GitRepository(name=repository['name'], git_url=repository['url'], git_checkout=repository['gitCheckout'], git_branch=repository.get('branch')) > + return GitRepository(name=repository['name'], git_url=repository['url'], git_checkout=repository['gitCheckout'], git_branch=repository.get('branch'), report_revision_identifier_in_commit_msg=repository.get('reportRevisionIdentifier'), report_svn_revison=repository.get('reportSVNRevision')) Please insert a line break somewhere. This is awfully long. > Websites/perf.webkit.org/tools/sync-commits.py:211 > + # translate the last svn revision to git hash Rather than adding a comment like this, we should move the code inside this if block as a helper method: _fetch_git_hash_from_svn_revision > Websites/perf.webkit.org/tools/sync-commits.py:214 > + self._fetch_all_hashes() I don't think we want to tokenize the entire history just because we're missing a few subversion revisions. Just run git svn fetch instead. > Websites/perf.webkit.org/tools/sync-commits.py:217 > + raise ValueError('Cannot find git hash for last fetched svn revision') Cannot find *the* git hash for *the* last fetched svn revision. > Websites/perf.webkit.org/tools/sync-commits.py:234 > + def _git_hash_to_svn_revision(self, git_hash): To be consistent, call this _svn_revision_from_git_hash > Websites/perf.webkit.org/tools/sync-commits.py:237 > + def _svn_revision_to_git_hash(self, revision): And _git_hash_from_svn_revision > Websites/perf.webkit.org/tools/sync-commits.py:253 > + raise ValueError('Expected commit message includes revision identifier, but cannot find it, will need a history rewrite to fix it') Expected commit message *to include* a revision identifier but did not find one. Need a history rewrite to fix it. "cannot find" doesn't sound right because it's not like we failed to fetch the commit message. Also, we need to end the sentence after "find one". It's a run-on sentence as is. > Websites/perf.webkit.org/tools/sync-commits.py:257 > + svn_revision = None > + previous_revision = None I don't think there is a need to define a new set of local variables. Just rename the existing variables to current_revision & previous_revision since "revision" is the term we're using in the perf dashboard. > Websites/perf.webkit.org/tools/sync-commits.py:261 > + raise ValueError('Expected this is a git svn repo, but cannot find SVN revison for {}'.format(svn_revision)) I don't think there is any need to say we expected this to be a git svn repository. Just say "Cannot find svn revision for X. Also, we should be including current_hash, not svn_revision, which is what we couldn't figure out. > Websites/perf.webkit.org/tools/sync-commits.py:265 > + raise ValueError('Expected this is a git svn repo, but cannot find SVN revison for {}'.format(previous_hash)) Ditto. > Websites/perf.webkit.org/tools/sync-commits.py:269 > + 'revision': current_hash if not svn_revision else svn_revision, We don't want to rely on the value of revision we found to determine whether we use hash vs. revision. This can lead to bugs like submitting hash when we should be submitting a revision when _git_hash_to_svn_revision starts returning falsey values or some code change is made above such that we don't always set a value to svn_revision when self._report_svn_revision is set. Instead, check the value of self._report_svn_revision directly or merge these two variables as I suggested above. > Websites/perf.webkit.org/tools/sync-commits.py:271 > + 'previousCommit': previous_hash if not previous_revision else previous_revision, Ditto. Created attachment 427699 [details]
Patch
Comment on attachment 427699 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427699&action=review r-. Please add all my comments above and below. > Websites/perf.webkit.org/tools/sync-commits.py:-38 > - try: > repository.fetch_commits_and_submit(server_config, args.max_fetch_count, args.max_ancestor_fetch_count) > - except Exception as error: > - print "Failed to fetch and sync:", error Again, I don't think we want to make this change. > Websites/perf.webkit.org/tools/sync-commits.py:48 > + return GitRepository(name=repository['name'], git_url=repository['url'], git_checkout=repository['gitCheckout'], > + git_branch=repository.get('branch'), report_revision_identifier_in_commit_msg=repository.get('reportRevisionIdentifier'), > + report_svn_revison=repository.get('reportSVNRevision')) Nit: wrong indentation. Despite of what PEP8 says, we don't align arguments. We always indent by 4 spaces instead. > Websites/perf.webkit.org/tools/sync-commits.py:214 > + # translate the last svn revision to git hash > + last_fetched_git_hash = self._git_hash_from_svn_revision(last_fetched) Again, this should be extracted as a helper method instead of adding a comment. > Websites/perf.webkit.org/tools/sync-commits.py:216 > + self._fetch_remote() Again, we don't want to fetch all git commits just because we can't find some svn revisions. Comment on attachment 427699 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427699&action=review >> Websites/perf.webkit.org/tools/sync-commits.py:216 >> + self._fetch_remote() > > Again, we don't want to fetch all git commits just because we can't find some svn revisions. I'm not sure I understand the objection here. If we fail to find the git hash from the provided Svn revision, we clearly need to update our checkout. Comment on attachment 427699 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427699&action=review >> Websites/perf.webkit.org/tools/sync-commits.py:216 >> + self._fetch_remote() > > Again, we don't want to fetch all git commits just because we can't find some svn revisions. I don't understand here, if we cannot find it at the first time, this means we don't have it locally, we should pull the remote to get the commits. this happens when you have a git repo out of sync with database. In such case, if we don't fetch the remote, you can find nothing by using git svn Comment on attachment 427699 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427699&action=review >>>> Websites/perf.webkit.org/tools/sync-commits.py:216 >>>> + self._fetch_remote() >>> >>> Again, we don't want to fetch all git commits just because we can't find some svn revisions. >> >> I'm not sure I understand the objection here. If we fail to find the git hash from the provided Svn revision, we clearly need to update our checkout. > > I don't understand here, if we cannot find it at the first time, this means we don't have it locally, we should pull the remote to get the commits. this happens when you have a git repo out of sync with database. In such case, if we don't fetch the remote, you can find nothing by using git svn Oh, sorry, I misread the code. This new code is fine. Created attachment 427705 [details]
Patch
Comment on attachment 427699 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427699&action=review >> Websites/perf.webkit.org/tools/sync-commits.py:214 >> + last_fetched_git_hash = self._git_hash_from_svn_revision(last_fetched) > > Again, this should be extracted as a helper method instead of adding a comment. Not really understand what should be extracted as a helper method, Do you mean translate the svn revision to git hash ? That is already a function. Or do you mean the line from 214 to 220? If so I am also confused, I think you have suggested me that we should not make any functions if there is only one caller, it will have some potential callers in future. And the logic here seems have no other caller. Created attachment 427715 [details]
Patch
(In reply to Zhifei Fang from comment #46) > Comment on attachment 427699 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=427699&action=review > > >> Websites/perf.webkit.org/tools/sync-commits.py:214 > >> + last_fetched_git_hash = self._git_hash_from_svn_revision(last_fetched) > > > > Again, this should be extracted as a helper method instead of adding a comment. > > Not really understand what should be extracted as a helper method, Do you > mean translate the svn revision to git hash ? That is already a function. Or > do you mean the line from 214 to 220? If so I am also confused, I think you > have suggested me that we should not make any functions if there is only one > caller, it will have some potential callers in future. And the logic here > seems have no other caller. Lines 217 through 224. Either that should be extracted as a helper method or the comment should be removed. It's not okay to add a "what" comment like that. Comment on attachment 427715 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427715&action=review > Websites/perf.webkit.org/tools/sync-commits.py:217 > + # translate the last svn revision to git hash This comment should be either removed or the following code should be extracted as a helper method. r- because of this. Created attachment 427723 [details]
Patch
Removed. (In reply to Ryosuke Niwa from comment #49) > Comment on attachment 427715 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=427715&action=review > > > Websites/perf.webkit.org/tools/sync-commits.py:217 > > + # translate the last svn revision to git hash > > This comment should be either removed or the following code should be > extracted as a helper method. > r- because of this. Comment on attachment 427723 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427723&action=review > Websites/perf.webkit.org/tools/sync-commits.py:240 > + return self._run_git_command(['svn', 'find-rev', '{}'.format(git_hash)]).strip() What is the point of calling format on '{}'? Why isn't this simply git_hash? Created attachment 427726 [details]
Patch
Comment on attachment 427726 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427726&action=review > Websites/perf.webkit.org/tools/sync-commits.py:266 > + raise ValueError('Cannot find SVN revison for {}'.format(current_revision)) This should be current_hash Created attachment 427728 [details]
Patch
Comment on attachment 427728 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427728&action=review > Websites/perf.webkit.org/tools/sync-commits.py:258 > + raise ValueError('Expected commit message to include revision identifier, but cannot find it, will need a history rewrite to fix it') Ugh... sorry, I keep noticing new issues. Please see my earlier comment about revising this message. (In reply to Ryosuke Niwa from comment #56) Please see my previous comment about it: """ > > Websites/perf.webkit.org/tools/sync-commits.py:178 > > + raise ValueError('Expected commit message includes revision identifier, but cannot find it') > > Something like this: > `Expected commit message includes revision identifier, but did not find - > may have been caused by a history rewrite` OK, but this won't be caused by a history rewrite, we will need a history rewrite to fix this, therefore this commit is invalid after that rewrite. I will change this to something like: `Expected commit message includes revision identifier, but did not find - need a history rewrite to fix it` """ > Comment on attachment 427728 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=427728&action=review > > > Websites/perf.webkit.org/tools/sync-commits.py:258 > > + raise ValueError('Expected commit message to include revision identifier, but cannot find it, will need a history rewrite to fix it') > > Ugh... sorry, I keep noticing new issues. Please see my earlier comment > about revising this message. Committed r277220 (237491@main): <https://commits.webkit.org/237491@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 427728 [details]. *** Bug 222608 has been marked as a duplicate of this bug. *** |