RESOLVED FIXED Bug 221982
Add support for syncing repo with commit revision label
https://bugs.webkit.org/show_bug.cgi?id=221982
Summary Add support for syncing repo with commit revision label
Zhifei Fang
Reported 2021-02-16 11:45:31 PST
Add support for syncing repo with commit identifier
Attachments
Patch (5.35 KB, patch)
2021-02-16 11:54 PST, Zhifei Fang
no flags
Patch (5.43 KB, patch)
2021-02-16 14:57 PST, Zhifei Fang
no flags
Patch (7.15 KB, patch)
2021-02-17 17:02 PST, Zhifei Fang
no flags
Patch (3.61 KB, patch)
2021-03-12 14:31 PST, Zhifei Fang
no flags
Patch (2.98 KB, patch)
2021-04-07 22:36 PDT, Zhifei Fang
no flags
Patch (2.95 KB, patch)
2021-04-08 11:03 PDT, Zhifei Fang
no flags
Patch (3.58 KB, patch)
2021-04-14 12:17 PDT, Zhifei Fang
no flags
Patch (6.00 KB, patch)
2021-04-14 12:35 PDT, Zhifei Fang
no flags
Patch (6.02 KB, patch)
2021-04-14 12:36 PDT, Zhifei Fang
no flags
Patch (7.54 KB, patch)
2021-04-14 22:03 PDT, Zhifei Fang
no flags
Patch (7.67 KB, patch)
2021-05-03 14:31 PDT, Zhifei Fang
no flags
Patch (7.69 KB, patch)
2021-05-03 16:01 PDT, Zhifei Fang
no flags
Patch (7.87 KB, patch)
2021-05-04 14:20 PDT, Zhifei Fang
no flags
Patch (7.37 KB, patch)
2021-05-04 15:55 PDT, Zhifei Fang
no flags
Patch (7.35 KB, patch)
2021-05-04 17:46 PDT, Zhifei Fang
no flags
Patch (7.29 KB, patch)
2021-05-04 18:53 PDT, Zhifei Fang
no flags
Patch (7.28 KB, patch)
2021-05-04 19:00 PDT, Zhifei Fang
no flags
Patch (7.27 KB, patch)
2021-05-04 20:06 PDT, Zhifei Fang
no flags
Zhifei Fang
Comment 1 2021-02-16 11:54:18 PST
Ryosuke Niwa
Comment 2 2021-02-16 13:10:14 PST
Comment on attachment 420515 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420515&action=review > Websites/perf.webkit.org/tools/sync-commits.py:263 > + IDENTIFIER_RE = re.compile(r'Canonical link: https://commits.webkit.org/([\d]+@[a-zA-Z\s\-_]+)\n') Use lowercase and prefix with _ to be consistent with other code in this file. > Websites/perf.webkit.org/tools/sync-commits.py:284 > + identifier = self.IDENTIFIER_RE.search(message).group(1) Please use named group: ?P<~> > Websites/perf.webkit.org/tools/sync-commits.py:290 > + res = { Please don't abbreviate result / response as "res". > Websites/perf.webkit.org/tools/sync-commits.py:298 > + if previous_identifier: > + res['previousCommit'] = previous_identifier Please just always put previousCommit as we do in GitRepository's version.
Zhifei Fang
Comment 3 2021-02-16 13:12:55 PST
Comment on attachment 420515 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420515&action=review >> Websites/perf.webkit.org/tools/sync-commits.py:298 >> + res['previousCommit'] = previous_identifier > > Please just always put previousCommit as we do in GitRepository's version. This doesn't work for the first commit, please see validate_commits, it will validate the revision if the key exist. Either I change the API or change it here
Ryosuke Niwa
Comment 4 2021-02-16 13:50:40 PST
(In reply to Zhifei Fang from comment #3) > Comment on attachment 420515 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=420515&action=review > > >> Websites/perf.webkit.org/tools/sync-commits.py:298 > >> + res['previousCommit'] = previous_identifier > > > > Please just always put previousCommit as we do in GitRepository's version. > > This doesn't work for the first commit, please see validate_commits, it will > validate the revision if the key exist. Either I change the API or change it > here That seems to suggest that GitRepository._revision_from_tokens is also broken? I suggest we fix the API to allow null being specified in the previousCommit (with a test). It's awkward to disallow null there.
Zhifei Fang
Comment 5 2021-02-16 13:51:55 PST
Comment on attachment 420515 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420515&action=review >>>> Websites/perf.webkit.org/tools/sync-commits.py:298 >>>> + res['previousCommit'] = previous_identifier >>> >>> Please just always put previousCommit as we do in GitRepository's version. >> >> This doesn't work for the first commit, please see validate_commits, it will validate the revision if the key exist. Either I change the API or change it here > > That seems to suggest that GitRepository._revision_from_tokens is also broken? I suggest we fix the API to allow null being specified in the previousCommit (with a test). It's awkward to disallow null there. Yes... OK, I will change the API then
Zhifei Fang
Comment 6 2021-02-16 14:57:07 PST
Ryosuke Niwa
Comment 7 2021-02-16 15:18:05 PST
Comment on attachment 420545 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420545&action=review > Websites/perf.webkit.org/public/include/commit-updater.php:158 > - require_format('Revision', $commit_info['revision'], '/^[A-Za-z0-9 \.]+$/'); > + require_format('Revision', $commit_info['revision'], '/^[@A-Za-z0-9 \.]+$/'); I don't think this is quite right. We don't want to allow arbitrary number of @s. We should be using a similar expression to the one in sync-commits.py > Websites/perf.webkit.org/public/include/commit-updater.php:161 > + if (array_key_exists('previousCommit', $commit_info) && $commit_info['previousCommit']) Please add a test. Also, doesn't this allow previousCommit of 0 or false to be treated like null? I think we should probably use is_null here or === NULL. > Websites/perf.webkit.org/tools/sync-commits.py:263 > + _IDENTIFIER_RE = re.compile(r'Canonical link: https://commits.webkit.org/(?P<identifier>[\d]+@[a-zA-Z\s\-_]+)\n') Use lowercase?
dewei_zhu
Comment 8 2021-02-16 15:42:47 PST
Comment on attachment 420545 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420545&action=review >> Websites/perf.webkit.org/tools/sync-commits.py:263 >> + _IDENTIFIER_RE = re.compile(r'Canonical link: https://commits.webkit.org/(?P<identifier>[\d]+@[a-zA-Z\s\-_]+)\n') > > Use lowercase? If this is meant to be a constant, I think python PEP8 recommends it to be all upper case per https://www.python.org/dev/peps/pep-0008/#constants
Ryosuke Niwa
Comment 9 2021-02-16 15:51:26 PST
(In reply to dewei_zhu from comment #8) > Comment on attachment 420545 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=420545&action=review > > >> Websites/perf.webkit.org/tools/sync-commits.py:263 > >> + _IDENTIFIER_RE = re.compile(r'Canonical link: https://commits.webkit.org/(?P<identifier>[\d]+@[a-zA-Z\s\-_]+)\n') > > > > Use lowercase? > > If this is meant to be a constant, I think python PEP8 recommends it to be > all upper case per https://www.python.org/dev/peps/pep-0008/#constants Huh, I'm pretty sure we used not to do that but I guess we're doing that now. Okay.
Ryosuke Niwa
Comment 10 2021-02-16 15:51:56 PST
Comment on attachment 420545 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420545&action=review >>>> Websites/perf.webkit.org/tools/sync-commits.py:263 >>>> + _IDENTIFIER_RE = re.compile(r'Canonical link: https://commits.webkit.org/(?P<identifier>[\d]+@[a-zA-Z\s\-_]+)\n') >>> >>> Use lowercase? >> >> If this is meant to be a constant, I think python PEP8 recommends it to be all upper case per https://www.python.org/dev/peps/pep-0008/#constants > > Huh, I'm pretty sure we used not to do that but I guess we're doing that now. Okay. It should probably be IDENTIFIER_PATTERN in that case per PEP8 style.
Zhifei Fang
Comment 11 2021-02-17 17:02:16 PST
Zhifei Fang
Comment 12 2021-02-17 17:03:32 PST
I will add unit test in another patch, will also need firstly update the unit test to make it can work with up to date nodejs (In reply to Ryosuke Niwa from comment #7) > Comment on attachment 420545 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=420545&action=review > > > Websites/perf.webkit.org/public/include/commit-updater.php:158 > > - require_format('Revision', $commit_info['revision'], '/^[A-Za-z0-9 \.]+$/'); > > + require_format('Revision', $commit_info['revision'], '/^[@A-Za-z0-9 \.]+$/'); > > I don't think this is quite right. We don't want to allow arbitrary number > of @s. > We should be using a similar expression to the one in sync-commits.py > > > Websites/perf.webkit.org/public/include/commit-updater.php:161 > > + if (array_key_exists('previousCommit', $commit_info) && $commit_info['previousCommit']) > > Please add a test. Also, doesn't this allow previousCommit of 0 or false to > be treated like null? > I think we should probably use is_null here or === NULL. > > > Websites/perf.webkit.org/tools/sync-commits.py:263 > > + _IDENTIFIER_RE = re.compile(r'Canonical link: https://commits.webkit.org/(?P<identifier>[\d]+@[a-zA-Z\s\-_]+)\n') > > Use lowercase?
dewei_zhu
Comment 13 2021-02-17 18:21:03 PST
Maybe we want to land the change that fixes unit tests to be compatible with newer version of nodejs first. I think it would be cleaner to keep the fix and the unit tests those are associated with the fix under the same patch.
Ryosuke Niwa
Comment 14 2021-02-17 19:49:05 PST
(In reply to dewei_zhu from comment #13) > Maybe we want to land the change that fixes unit tests to be compatible with > newer version of nodejs first. > > I think it would be cleaner to keep the fix and the unit tests those are > associated with the fix under the same patch. Yes, we should do that.
Radar WebKit Bug Importer
Comment 15 2021-02-23 11:46:36 PST
Zhifei Fang
Comment 16 2021-03-12 14:31:04 PST
Ryosuke Niwa
Comment 17 2021-03-31 19:36:50 PDT
Comment on attachment 423083 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=423083&action=review r- because I don't think we want to land this as is. > Websites/perf.webkit.org/tools/sync-commits.py:46 > + return GitRepository(name=repository['name'], git_url=repository['url'], git_checkout=repository['gitCheckout'], git_branch=repository.get('branch'), fetch_revision_label=repository.get("fetchRevisionLabel", False)) This isn't right. We aren't really fetching anything. If anything this is really about whether repository has revision identifier or not. > Websites/perf.webkit.org/tools/sync-commits.py:194 > + REVISION_LABEL_RE = re.compile(r'Canonical link: https://commits.webkit.org/(?P<revision_label>\d+@\S+)\n') I don't think it's right that we're hard-coding into the script. This script is supposed to work with any repository. This should probably be specified in fetch_revision_label in the config file. > Websites/perf.webkit.org/tools/sync-commits.py:245 > + if self._fetch_revision_label: > + revision_label = self.REVISION_LABEL_RE.search(message).group('revision_label') > + result['revisionLabel'] = revision_label This is unnecessarily complex. Given our API is perfectly capable of accepting null, we should just do this: 'revisionIdentifier': ~
Zhifei Fang
Comment 18 2021-04-07 22:36:29 PDT
dewei_zhu
Comment 19 2021-04-08 00:25:30 PDT
Comment on attachment 425478 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=425478&action=review > Websites/perf.webkit.org/tools/sync-commits.py:198 > + def __init__(self, name, git_checkout, git_url, git_branch=None, revision_identifier=False): `revision_identifier` sounds like it an identifier, but it's actually a boolean. So how about `include_revision_identifier`? Also, do we expect changes on `load_repository` ? Stepping back a bit, do we really need this variable? Can we always include `revisionIdentifier` when it's available rather than using this flag?
Zhifei Fang
Comment 20 2021-04-08 11:00:45 PDT
(In reply to dewei_zhu from comment #19) > Comment on attachment 425478 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=425478&action=review > > > Websites/perf.webkit.org/tools/sync-commits.py:198 > > + def __init__(self, name, git_checkout, git_url, git_branch=None, revision_identifier=False): > > `revision_identifier` sounds like it an identifier, but it's actually a > boolean. So how about `include_revision_identifier`? > Also, do we expect changes on `load_repository` ? > Stepping back a bit, do we really need this variable? Can we always include > `revisionIdentifier` when it's available rather than using this flag? I did this because this code will be shared with other git repo, it won't include this in the commit message, running regular expression against something that I know it won't exist will be a waste of time. I think I should rename this to get_revision_identifier_from_commit_message.
Zhifei Fang
Comment 21 2021-04-08 11:03:41 PDT
Zhifei Fang
Comment 22 2021-04-14 12:17:18 PDT
Zhifei Fang
Comment 23 2021-04-14 12:29:10 PDT
Any further comments about this ? I have made this script raise an error when we cannot find the revision identifier, I know our API is smart enough to handle if we have revision identifier or not, but the real issue is if we expected there is a revision identifier but we never find it, there must be a issue in our infra, the commit with out revision identifier will be rewrite, this will gives us a new git hash. I think we should not report this bad commit.
Zhifei Fang
Comment 24 2021-04-14 12:35:15 PDT
Zhifei Fang
Comment 25 2021-04-14 12:36:35 PDT
Zhifei Fang
Comment 26 2021-04-14 12:38:00 PDT
Also, add the sync option for the SVN as well, we currently still sync with SVN instead of the git repo
Ryosuke Niwa
Comment 27 2021-04-14 15:22:31 PDT
Comment on attachment 426034 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=426034&action=review > Websites/perf.webkit.org/tools/sync-commits.py:21 > +# Canonical link: https://commits.webkit.org/https://commits.webkit.org/232477@main > +REVISION_IDENTIFIER_RE = re.compile(r'Canonical link: (https://commits.webkit.org/)+(?P<revision_identifier>\d+@[\w\.\-]+)\n') Surely, you meant `(https://commits.webkit.org/)?`, not (~)+. > Websites/perf.webkit.org/tools/sync-commits.py:131 > - def __init__(self, name, svn_url, should_trust_certificate, use_server_auth, account_name_script_path): > + def __init__(self, name, svn_url, should_trust_certificate, use_server_auth, account_name_script_path, get_revision_identifier_from_commit_msg): I'd call this report_revision_identifier_in_commit_msg instead. But it doesn't seem like this is ever set? > Websites/perf.webkit.org/tools/sync-commits.py:171 > + if revision_identifier_match: WebKit's preferred way of structuring code like this is to put unusual path as an early return. So we should have `if not revision_identifier_match` then raise and shouldn't have any else clause. > Websites/perf.webkit.org/tools/sync-commits.py:177 > + # one useless. If we detect this, we should not report this commit. This comment is very wordy. It's probably better this information in the exception itself. > Websites/perf.webkit.org/tools/sync-commits.py:178 > + raise ValueError('Expected commit message includes revision identifier, but cannot find it') Something like this: `Expected commit message includes revision identifier, but did not find - may have been caused by a history rewrite` > Websites/perf.webkit.org/tools/sync-commits.py:255 > + if revision_identifier_match: Ditto. > Websites/perf.webkit.org/tools/sync-commits.py:258 > + # This is due to we may have sync error for the commit, if we don't have the revision Ditto.
Zhifei Fang
Comment 28 2021-04-14 17:20:20 PDT
(In reply to Ryosuke Niwa from comment #27) > Comment on attachment 426034 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=426034&action=review > > > Websites/perf.webkit.org/tools/sync-commits.py:21 > > +# Canonical link: https://commits.webkit.org/https://commits.webkit.org/232477@main > > +REVISION_IDENTIFIER_RE = re.compile(r'Canonical link: (https://commits.webkit.org/)+(?P<revision_identifier>\d+@[\w\.\-]+)\n') > > Surely, you meant `(https://commits.webkit.org/)?`, not (~)+. > > > Websites/perf.webkit.org/tools/sync-commits.py:131 > > - def __init__(self, name, svn_url, should_trust_certificate, use_server_auth, account_name_script_path): > > + def __init__(self, name, svn_url, should_trust_certificate, use_server_auth, account_name_script_path, get_revision_identifier_from_commit_msg): > > I'd call this report_revision_identifier_in_commit_msg instead. > But it doesn't seem like this is ever set? > > > Websites/perf.webkit.org/tools/sync-commits.py:171 > > + if revision_identifier_match: > > WebKit's preferred way of structuring code like this is to put unusual path > as an early return. > So we should have `if not revision_identifier_match` then raise and > shouldn't have any else clause. > > > Websites/perf.webkit.org/tools/sync-commits.py:177 > > + # one useless. If we detect this, we should not report this commit. > > This comment is very wordy. It's probably better this information in the > exception itself. > > > Websites/perf.webkit.org/tools/sync-commits.py:178 > > + raise ValueError('Expected commit message includes revision identifier, but cannot find it') > > Something like this: > `Expected commit message includes revision identifier, but did not find - > may have been caused by a history rewrite` OK, but this won't be caused by a history rewrite, we will need a history rewrite to fix this, therefore this commit is invalid after that rewrite. I will change this to something like: `Expected commit message includes revision identifier, but did not find - need a history rewrite to fix it` > > > Websites/perf.webkit.org/tools/sync-commits.py:255 > > + if revision_identifier_match: > > Ditto. > > > Websites/perf.webkit.org/tools/sync-commits.py:258 > > + # This is due to we may have sync error for the commit, if we don't have the revision > > Ditto.
Ryosuke Niwa
Comment 29 2021-04-14 20:16:37 PDT
(In reply to Zhifei Fang from comment #28) > (In reply to Ryosuke Niwa from comment #27) > > Comment on attachment 426034 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=426034&action=review > > > > > Websites/perf.webkit.org/tools/sync-commits.py:21 > > > +# Canonical link: https://commits.webkit.org/https://commits.webkit.org/232477@main > > > +REVISION_IDENTIFIER_RE = re.compile(r'Canonical link: (https://commits.webkit.org/)+(?P<revision_identifier>\d+@[\w\.\-]+)\n') > > > > Surely, you meant `(https://commits.webkit.org/)?`, not (~)+. > > > > > Websites/perf.webkit.org/tools/sync-commits.py:131 > > > - def __init__(self, name, svn_url, should_trust_certificate, use_server_auth, account_name_script_path): > > > + def __init__(self, name, svn_url, should_trust_certificate, use_server_auth, account_name_script_path, get_revision_identifier_from_commit_msg): > > > > I'd call this report_revision_identifier_in_commit_msg instead. > > But it doesn't seem like this is ever set? > > > > > Websites/perf.webkit.org/tools/sync-commits.py:171 > > > + if revision_identifier_match: > > > > WebKit's preferred way of structuring code like this is to put unusual path > > as an early return. > > So we should have `if not revision_identifier_match` then raise and > > shouldn't have any else clause. > > > > > Websites/perf.webkit.org/tools/sync-commits.py:177 > > > + # one useless. If we detect this, we should not report this commit. > > > > This comment is very wordy. It's probably better this information in the > > exception itself. > > > > > Websites/perf.webkit.org/tools/sync-commits.py:178 > > > + raise ValueError('Expected commit message includes revision identifier, but cannot find it') > > > > Something like this: > > `Expected commit message includes revision identifier, but did not find - > > may have been caused by a history rewrite` > > OK, but this won't be caused by a history rewrite, we will need a history > rewrite to fix this, therefore this commit is invalid after that rewrite. > > I will change this to something like: > `Expected commit message includes revision identifier, but did not find - > need a history rewrite to fix it` I see, yeah, that's a way clearer.
Zhifei Fang
Comment 30 2021-04-14 22:03:43 PDT
Zhifei Fang
Comment 31 2021-04-14 22:05:32 PDT
After test I find that in the SVN repo we won't include the revision identifier, so we will need to report SVN revision and revision identifier from the git repo. After we switch everything to git only, just switch off the new option I added, it will sync with git hash again.
Zhifei Fang
Comment 32 2021-04-16 13:17:18 PDT
Any more comments ?
Ryosuke Niwa
Comment 33 2021-04-17 01:08:48 PDT
Comment on attachment 426074 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=426074&action=review Sorry for the delay. I got caught up with other things. > Websites/perf.webkit.org/ChangeLog:2 > +2021-04-14 Zhifei Fang <zhifei_fang@apple.com> > +Add support for syncing repo with commit revision label Something went awfully wrong with this change log entry. > Websites/perf.webkit.org/tools/sync-commits.py:39 > for repository in repositories: > - try: > repository.fetch_commits_and_submit(server_config, args.max_fetch_count, args.max_ancestor_fetch_count) Nit: wrong indentation. > Websites/perf.webkit.org/tools/sync-commits.py:47 > - return GitRepository(name=repository['name'], git_url=repository['url'], git_checkout=repository['gitCheckout'], git_branch=repository.get('branch')) > + return GitRepository(name=repository['name'], git_url=repository['url'], git_checkout=repository['gitCheckout'], git_branch=repository.get('branch'), report_revision_identifier_in_commit_msg=repository.get('reportRevisionIdentifier'), report_svn_revison=repository.get('reportSVNRevision')) This line is way too long! Please insert a hard line break somewhere. > Websites/perf.webkit.org/tools/sync-commits.py:212 > + # translate the last svn revision to git hash This kind of "what" comment is a code smell. We should be extracting this logic as a helper method instead. > Websites/perf.webkit.org/tools/sync-commits.py:216 > + last_fetched_git_hash = self._run_git_command(['svn', 'find-rev', 'r{}'.format(last_fetched)]).strip() > + if not last_fetched_git_hash: > + self._fetch_all_hashes() > + last_fetched_git_hash = self._run_git_command(['svn', 'find-rev', 'r{}'.format(last_fetched)]).strip() What is this retry logic about? When is this useful? > Websites/perf.webkit.org/tools/sync-commits.py:219 > + last_fetched = last_fetched_git_hash Why are we updating last_fetched? This is just wrong. last_fetched comes from the server, it's nothing to do with the local state of git clone or git svn state. This script is meant to be fault tolerant, meaning that if the script had failed / got killed mid way, it can continue syncing the commit info from where it left off. r- because of this. > Websites/perf.webkit.org/tools/sync-commits.py:235 > + def _get_svn_revision_from_message(self, message): Nit: our coding style guidelines mandates that we do not prefix a getter function with "get" unless it has an out argument. Here, it should read _svn_revision_from_commit_message instead https://webkit.org/code-style-guidelines/#names-out-argument > Websites/perf.webkit.org/tools/sync-commits.py:239 > + if svn_revision_match: > + return svn_revision_match.group('svn_revision') > + return None We prefer early exit over nested if. i.e. normal flow of logic should continue. So this should read: if not svn_revision_match: return None return svn_revision_match.group('svn_revision') or better yet: return svn_revision_match.group('svn_revision') if svn_revision_match else None > Websites/perf.webkit.org/tools/sync-commits.py:260 > + svn_revision = self._get_svn_revision_from_message(message) Why are we parsing subversion revision from git commit? Can't we just run git svn find-rev instead? It can convert git hash to subversion revision as well as subversion to git hash. > Websites/perf.webkit.org/tools/sync-commits.py:269 > + 'revision': current_hash if not svn_revision else svn_revision, > + 'revisionIdentifier': revision_identifier, > + 'previousCommit': previous_hash if not previous_revision else previous_revision, This isn't right. We can't switch between git hashes and revisions in a single repository. That would cause all sorts of havoc down the line. We should instead error and exit if _report_svn_revision is specified but we didn't find a subversion information. r- because of this.
Zhifei Fang
Comment 34 2021-05-03 14:31:07 PDT
Darin Adler
Comment 35 2021-05-03 14:41:05 PDT
Comment on attachment 427603 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427603&action=review > Websites/perf.webkit.org/ChangeLog:3 > +2021-04-14 Zhifei Fang <zhifei_fang@apple.com> > +Add support for syncing repo with commit revision label > + https://bugs.webkit.org/show_bug.cgi?id=221982 This is not formatted correctly. > Websites/perf.webkit.org/tools/sync-commits.py:21 > +REVISION_IDENTIFIER_RE = re.compile(r'Canonical link: (https://commits.webkit.org/)+(?P<revision_identifier>\d+@[\w\.\-]+)\n') I believe the period characters in "commits.webkig.org" are going to compile as single character wild cards, not periods, right? I am pretty sure that is not what we want.
Zhifei Fang
Comment 36 2021-05-03 15:41:11 PDT
ah, good catch! (In reply to Darin Adler from comment #35) > Comment on attachment 427603 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=427603&action=review > > > Websites/perf.webkit.org/ChangeLog:3 > > +2021-04-14 Zhifei Fang <zhifei_fang@apple.com> > > +Add support for syncing repo with commit revision label > > + https://bugs.webkit.org/show_bug.cgi?id=221982 > > This is not formatted correctly. > > > Websites/perf.webkit.org/tools/sync-commits.py:21 > > +REVISION_IDENTIFIER_RE = re.compile(r'Canonical link: (https://commits.webkit.org/)+(?P<revision_identifier>\d+@[\w\.\-]+)\n') > > I believe the period characters in "commits.webkig.org" are going to compile > as single character wild cards, not periods, right? I am pretty sure that is > not what we want.
Zhifei Fang
Comment 37 2021-05-03 16:01:00 PDT
dewei_zhu
Comment 38 2021-05-03 16:20:08 PDT
Comment on attachment 427610 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427610&action=review > Websites/perf.webkit.org/ChangeLog:9 > + Add new syncing option reportSVNRevision, which will let the script try to report SVN revision from a git-svn repo Nit: missing '.' by the end of this sentence. > Websites/perf.webkit.org/tools/sync-commits.py:38 > repository.fetch_commits_and_submit(server_config, args.max_fetch_count, args.max_ancestor_fetch_count) Do we need to change the indentation if we remove `try - except` block? Also, is the idea of removing those to stop the script if any error occurs?
Ryosuke Niwa
Comment 39 2021-05-03 17:27:49 PDT
Comment on attachment 427610 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427610&action=review >> Websites/perf.webkit.org/tools/sync-commits.py:38 >> repository.fetch_commits_and_submit(server_config, args.max_fetch_count, args.max_ancestor_fetch_count) > > Do we need to change the indentation if we remove `try - except` block? > Also, is the idea of removing those to stop the script if any error occurs? I don't think we want to make this change. > Websites/perf.webkit.org/tools/sync-commits.py:46 > - return GitRepository(name=repository['name'], git_url=repository['url'], git_checkout=repository['gitCheckout'], git_branch=repository.get('branch')) > + return GitRepository(name=repository['name'], git_url=repository['url'], git_checkout=repository['gitCheckout'], git_branch=repository.get('branch'), report_revision_identifier_in_commit_msg=repository.get('reportRevisionIdentifier'), report_svn_revison=repository.get('reportSVNRevision')) Please insert a line break somewhere. This is awfully long. > Websites/perf.webkit.org/tools/sync-commits.py:211 > + # translate the last svn revision to git hash Rather than adding a comment like this, we should move the code inside this if block as a helper method: _fetch_git_hash_from_svn_revision > Websites/perf.webkit.org/tools/sync-commits.py:214 > + self._fetch_all_hashes() I don't think we want to tokenize the entire history just because we're missing a few subversion revisions. Just run git svn fetch instead. > Websites/perf.webkit.org/tools/sync-commits.py:217 > + raise ValueError('Cannot find git hash for last fetched svn revision') Cannot find *the* git hash for *the* last fetched svn revision. > Websites/perf.webkit.org/tools/sync-commits.py:234 > + def _git_hash_to_svn_revision(self, git_hash): To be consistent, call this _svn_revision_from_git_hash > Websites/perf.webkit.org/tools/sync-commits.py:237 > + def _svn_revision_to_git_hash(self, revision): And _git_hash_from_svn_revision > Websites/perf.webkit.org/tools/sync-commits.py:253 > + raise ValueError('Expected commit message includes revision identifier, but cannot find it, will need a history rewrite to fix it') Expected commit message *to include* a revision identifier but did not find one. Need a history rewrite to fix it. "cannot find" doesn't sound right because it's not like we failed to fetch the commit message. Also, we need to end the sentence after "find one". It's a run-on sentence as is. > Websites/perf.webkit.org/tools/sync-commits.py:257 > + svn_revision = None > + previous_revision = None I don't think there is a need to define a new set of local variables. Just rename the existing variables to current_revision & previous_revision since "revision" is the term we're using in the perf dashboard. > Websites/perf.webkit.org/tools/sync-commits.py:261 > + raise ValueError('Expected this is a git svn repo, but cannot find SVN revison for {}'.format(svn_revision)) I don't think there is any need to say we expected this to be a git svn repository. Just say "Cannot find svn revision for X. Also, we should be including current_hash, not svn_revision, which is what we couldn't figure out. > Websites/perf.webkit.org/tools/sync-commits.py:265 > + raise ValueError('Expected this is a git svn repo, but cannot find SVN revison for {}'.format(previous_hash)) Ditto. > Websites/perf.webkit.org/tools/sync-commits.py:269 > + 'revision': current_hash if not svn_revision else svn_revision, We don't want to rely on the value of revision we found to determine whether we use hash vs. revision. This can lead to bugs like submitting hash when we should be submitting a revision when _git_hash_to_svn_revision starts returning falsey values or some code change is made above such that we don't always set a value to svn_revision when self._report_svn_revision is set. Instead, check the value of self._report_svn_revision directly or merge these two variables as I suggested above. > Websites/perf.webkit.org/tools/sync-commits.py:271 > + 'previousCommit': previous_hash if not previous_revision else previous_revision, Ditto.
Zhifei Fang
Comment 40 2021-05-04 14:20:09 PDT
Ryosuke Niwa
Comment 41 2021-05-04 15:19:25 PDT
Comment on attachment 427699 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427699&action=review r-. Please add all my comments above and below. > Websites/perf.webkit.org/tools/sync-commits.py:-38 > - try: > repository.fetch_commits_and_submit(server_config, args.max_fetch_count, args.max_ancestor_fetch_count) > - except Exception as error: > - print "Failed to fetch and sync:", error Again, I don't think we want to make this change. > Websites/perf.webkit.org/tools/sync-commits.py:48 > + return GitRepository(name=repository['name'], git_url=repository['url'], git_checkout=repository['gitCheckout'], > + git_branch=repository.get('branch'), report_revision_identifier_in_commit_msg=repository.get('reportRevisionIdentifier'), > + report_svn_revison=repository.get('reportSVNRevision')) Nit: wrong indentation. Despite of what PEP8 says, we don't align arguments. We always indent by 4 spaces instead. > Websites/perf.webkit.org/tools/sync-commits.py:214 > + # translate the last svn revision to git hash > + last_fetched_git_hash = self._git_hash_from_svn_revision(last_fetched) Again, this should be extracted as a helper method instead of adding a comment. > Websites/perf.webkit.org/tools/sync-commits.py:216 > + self._fetch_remote() Again, we don't want to fetch all git commits just because we can't find some svn revisions.
Jonathan Bedard
Comment 42 2021-05-04 15:50:16 PDT
Comment on attachment 427699 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427699&action=review >> Websites/perf.webkit.org/tools/sync-commits.py:216 >> + self._fetch_remote() > > Again, we don't want to fetch all git commits just because we can't find some svn revisions. I'm not sure I understand the objection here. If we fail to find the git hash from the provided Svn revision, we clearly need to update our checkout.
Zhifei Fang
Comment 43 2021-05-04 15:52:09 PDT
Comment on attachment 427699 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427699&action=review >> Websites/perf.webkit.org/tools/sync-commits.py:216 >> + self._fetch_remote() > > Again, we don't want to fetch all git commits just because we can't find some svn revisions. I don't understand here, if we cannot find it at the first time, this means we don't have it locally, we should pull the remote to get the commits. this happens when you have a git repo out of sync with database. In such case, if we don't fetch the remote, you can find nothing by using git svn
Ryosuke Niwa
Comment 44 2021-05-04 15:53:45 PDT
Comment on attachment 427699 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427699&action=review >>>> Websites/perf.webkit.org/tools/sync-commits.py:216 >>>> + self._fetch_remote() >>> >>> Again, we don't want to fetch all git commits just because we can't find some svn revisions. >> >> I'm not sure I understand the objection here. If we fail to find the git hash from the provided Svn revision, we clearly need to update our checkout. > > I don't understand here, if we cannot find it at the first time, this means we don't have it locally, we should pull the remote to get the commits. this happens when you have a git repo out of sync with database. In such case, if we don't fetch the remote, you can find nothing by using git svn Oh, sorry, I misread the code. This new code is fine.
Zhifei Fang
Comment 45 2021-05-04 15:55:13 PDT
Zhifei Fang
Comment 46 2021-05-04 17:45:15 PDT
Comment on attachment 427699 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427699&action=review >> Websites/perf.webkit.org/tools/sync-commits.py:214 >> + last_fetched_git_hash = self._git_hash_from_svn_revision(last_fetched) > > Again, this should be extracted as a helper method instead of adding a comment. Not really understand what should be extracted as a helper method, Do you mean translate the svn revision to git hash ? That is already a function. Or do you mean the line from 214 to 220? If so I am also confused, I think you have suggested me that we should not make any functions if there is only one caller, it will have some potential callers in future. And the logic here seems have no other caller.
Zhifei Fang
Comment 47 2021-05-04 17:46:27 PDT
Ryosuke Niwa
Comment 48 2021-05-04 18:52:19 PDT
(In reply to Zhifei Fang from comment #46) > Comment on attachment 427699 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=427699&action=review > > >> Websites/perf.webkit.org/tools/sync-commits.py:214 > >> + last_fetched_git_hash = self._git_hash_from_svn_revision(last_fetched) > > > > Again, this should be extracted as a helper method instead of adding a comment. > > Not really understand what should be extracted as a helper method, Do you > mean translate the svn revision to git hash ? That is already a function. Or > do you mean the line from 214 to 220? If so I am also confused, I think you > have suggested me that we should not make any functions if there is only one > caller, it will have some potential callers in future. And the logic here > seems have no other caller. Lines 217 through 224. Either that should be extracted as a helper method or the comment should be removed. It's not okay to add a "what" comment like that.
Ryosuke Niwa
Comment 49 2021-05-04 18:52:36 PDT
Comment on attachment 427715 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427715&action=review > Websites/perf.webkit.org/tools/sync-commits.py:217 > + # translate the last svn revision to git hash This comment should be either removed or the following code should be extracted as a helper method. r- because of this.
Zhifei Fang
Comment 50 2021-05-04 18:53:33 PDT
Zhifei Fang
Comment 51 2021-05-04 18:54:01 PDT
Removed. (In reply to Ryosuke Niwa from comment #49) > Comment on attachment 427715 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=427715&action=review > > > Websites/perf.webkit.org/tools/sync-commits.py:217 > > + # translate the last svn revision to git hash > > This comment should be either removed or the following code should be > extracted as a helper method. > r- because of this.
Ryosuke Niwa
Comment 52 2021-05-04 18:58:25 PDT
Comment on attachment 427723 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427723&action=review > Websites/perf.webkit.org/tools/sync-commits.py:240 > + return self._run_git_command(['svn', 'find-rev', '{}'.format(git_hash)]).strip() What is the point of calling format on '{}'? Why isn't this simply git_hash?
Zhifei Fang
Comment 53 2021-05-04 19:00:08 PDT
Ryosuke Niwa
Comment 54 2021-05-04 19:02:26 PDT
Comment on attachment 427726 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427726&action=review > Websites/perf.webkit.org/tools/sync-commits.py:266 > + raise ValueError('Cannot find SVN revison for {}'.format(current_revision)) This should be current_hash
Zhifei Fang
Comment 55 2021-05-04 20:06:15 PDT
Ryosuke Niwa
Comment 56 2021-05-04 23:13:06 PDT
Comment on attachment 427728 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=427728&action=review > Websites/perf.webkit.org/tools/sync-commits.py:258 > + raise ValueError('Expected commit message to include revision identifier, but cannot find it, will need a history rewrite to fix it') Ugh... sorry, I keep noticing new issues. Please see my earlier comment about revising this message.
Zhifei Fang
Comment 57 2021-05-07 16:24:02 PDT
(In reply to Ryosuke Niwa from comment #56) Please see my previous comment about it: """ > > Websites/perf.webkit.org/tools/sync-commits.py:178 > > + raise ValueError('Expected commit message includes revision identifier, but cannot find it') > > Something like this: > `Expected commit message includes revision identifier, but did not find - > may have been caused by a history rewrite` OK, but this won't be caused by a history rewrite, we will need a history rewrite to fix this, therefore this commit is invalid after that rewrite. I will change this to something like: `Expected commit message includes revision identifier, but did not find - need a history rewrite to fix it` """ > Comment on attachment 427728 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=427728&action=review > > > Websites/perf.webkit.org/tools/sync-commits.py:258 > > + raise ValueError('Expected commit message to include revision identifier, but cannot find it, will need a history rewrite to fix it') > > Ugh... sorry, I keep noticing new issues. Please see my earlier comment > about revising this message.
EWS
Comment 58 2021-05-07 18:08:16 PDT
Committed r277220 (237491@main): <https://commits.webkit.org/237491@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 427728 [details].
Zhifei Fang
Comment 59 2021-06-14 11:31:49 PDT
*** Bug 222608 has been marked as a duplicate of this bug. ***
Note You need to log in before you can comment on or make changes to this bug.