Bug 26283

Summary: WebKit needs scripts to auto-commit patches from the commit queue
Product: WebKit Reporter: Eric Seidel (no email) <eric>
Component: Tools / TestsAssignee: Eric Seidel (no email) <eric>
Status: RESOLVED FIXED    
Severity: Normal CC: aroben, ddkilzer, dglazkov, levin, mrowe, ojan, tony
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Mac   
OS: OS X 10.5   
Bug Depends on: 26524, 26299, 26300    
Bug Blocks:    
Attachments:
Description Flags
patch-url-from-bug: returns the URL of the r+'d patch on the bug
none
bugzilla-tool -- command line access to bugzilla (requires BeautifulSoup)
none
latest bugzilla-tool, can download patch, apply it, commit it, and update the bug (not ready for general usage yet)
none
bugzilla-tool -- has successfully landed patches, but is not ready for general use yet
none
bugzilla-tool, can now obsolete patches, and is divided into classes better
none
bugzilla-tool, now with better bug comments after commit
none
bugzilla-tool -- now able to post local diffs and commits as patches
none
bugzilla-tool -- now with fancy option parsing and excessive OOP ;)
none
Working bugzilla-tool for review (script posted this patch itself)
none
Updated patch, split into modules, now supports --force-clean
none
Updated patch, split into modules, now supports --force-clean
none
Updated to respect the current working directory instead of using ../..
none
Updated to respect the current working directory instead of using ../..
levin: review+
patch
mrowe: review+
patch mrowe: review+

Description Eric Seidel (no email) 2009-06-09 17:22:21 PDT
WebKit needs scripts to auto-commit patches from the commit queue

I plan to build some of these for myself and will post them here.  Starting with a script to get an approved patch URL out from a bug number.
Comment 1 Eric Seidel (no email) 2009-06-09 17:23:11 PDT
Created attachment 31114 [details]
patch-url-from-bug: returns the URL of the r+'d patch on the bug
Comment 2 Eric Seidel (no email) 2009-06-09 18:08:38 PDT
Steps I would like to eventually build a script to do:

Scan the commit queue, for each bug:
1.  curl down the patch
2.  apply it with svn-apply
3.  build the patch
4.  run the tests with the patch applied
5.  post to the bug the results of the run
6.  fix up the changelogs to include the reviewer name
7.  land the patch
8.  Update the bug with the landed revision number
9.  optionally close the bug or obsolete the patch
Comment 3 Eric Seidel (no email) 2009-06-10 15:21:56 PDT
Created attachment 31142 [details]
bugzilla-tool -- command line access to bugzilla (requires BeautifulSoup)

Commands:
apply-patch-from-bug
bugs-to-commit
patches-to-commit
reviewed-patches-on

Still needs lots of work.
Comment 4 Eric Seidel (no email) 2009-06-11 19:02:47 PDT
Created attachment 31189 [details]
latest bugzilla-tool, can download patch, apply it, commit it, and update the bug (not ready for general usage yet)
Comment 5 Eric Seidel (no email) 2009-06-12 02:21:48 PDT
I'm using bug 26301 for testing.
Comment 6 Eric Seidel (no email) 2009-06-12 15:43:25 PDT
Tada!  I have a working script!  see bug 26191
Comment 7 Eric Seidel (no email) 2009-06-12 17:28:51 PDT
Created attachment 31223 [details]
bugzilla-tool -- has successfully landed patches, but is not ready for general use yet
Comment 8 Eric Seidel (no email) 2009-06-15 18:36:31 PDT
Created attachment 31326 [details]
bugzilla-tool, can now obsolete patches, and is divided into classes better

Nearly ready for review.
Comment 9 Eric Seidel (no email) 2009-06-15 19:23:36 PDT
Another patch committed by the script: bug 25353.  Fixing the autogenerated comment to be nicer. :)
Comment 10 Eric Seidel (no email) 2009-06-15 19:28:20 PDT
Created attachment 31327 [details]
bugzilla-tool, now with better bug comments after commit
Comment 11 Eric Seidel (no email) 2009-06-17 18:24:50 PDT
Created attachment 31480 [details]
bugzilla-tool -- now able to post local diffs and commits as patches

No, I did not post this patch with the tool... yet. :)  It's not actually stored in my WebKit tree yet, rather in ~/bin/

I think the only thing left before it's really useful is option parsing for the individual commands.  I may clean it up and post it for landing as-is though.
Comment 12 Eric Seidel (no email) 2009-06-18 04:20:51 PDT
Created attachment 31491 [details]
bugzilla-tool -- now with fancy option parsing and excessive OOP ;)

This is nearly ready for prime-time.  I might add an SVN class before submitting this for review, we'll see.
Comment 13 Eric Seidel (no email) 2009-06-18 16:05:02 PDT
Created attachment 31516 [details]
Working bugzilla-tool for review (script posted this patch itself)


---
 2 files changed, 903 insertions(+), 0 deletions(-)
Comment 14 David Levin 2009-06-19 10:45:31 PDT
I tried using this tool.  My first comment is that it seemed to make things a lot easier.

However, after starting to use it, I had a few problems:

0. There seems to be a script error in the current version when applying patches, which I fixed by adding in "False" in line 577,         tool.scm().ensure_clean_working_directory(False)).

1. I couldn't put it in my directory because it wanted to clean everything (so I had to hack the way it found the webkit dir).

2. I couldn't apply multiple patches.
I did a git commit for the first patch and then tried to apply the second patch, but the tool reset everything back to trunk.  I commented out the stuff that did the clean and the git svn rebase.

3. I wish that I could apply patches with "--whitespace=fix" :) like I can when I use git apply.

4. It started emitting *.orig files when apply patches even when there didn't seem to be a conflict. This made me think something was going wrong.  (Of course this was when I tried to apply a second patch so maybe that was the issue.) 


Comment 15 Eric Seidel (no email) 2009-06-19 11:32:43 PDT
(In reply to comment #14)
> I tried using this tool.  My first comment is that it seemed to make things a
> lot easier.

Yay!

> However, after starting to use it, I had a few problems:

It's certainly Alpha quality at best!

> 0. There seems to be a script error in the current version when applying
> patches, which I fixed by adding in "False" in line 577,        
> tool.scm().ensure_clean_working_directory(False)).

I need a way to compile-time check python (someone mentioned pycheck?) to make it robust against these kind of refactors.

> 1. I couldn't put it in my directory because it wanted to clean everything (so
> I had to hack the way it found the webkit dir).

Yup.  It needs to be landed to be useful.  I think it also needs to be taught how to work from cwd(), that the SCM classes should take a path (which can be cwd()) and then they can walk backwards to find the root.  Then it makes it useful for when you have more than one webkit checkout.

> 2. I couldn't apply multiple patches.
> I did a git commit for the first patch and then tried to apply the second
> patch, but the tool reset everything back to trunk.  I commented out the stuff
> that did the clean and the git svn rebase.

There is not currently an option to have it apply them locally, or to have it *not* clean the working directory.  I only recently added command-level option parsing.  Now that that's there, it should be easy to add both options.

> 3. I wish that I could apply patches with "--whitespace=fix" :) like I can when
> I use git apply.

I didn't know about --whitespace=fix.  Sounds awesome!

> 4. It started emitting *.orig files when apply patches even when there didn't
> seem to be a conflict. This made me think something was going wrong.  (Of
> course this was when I tried to apply a second patch so maybe that was the
> issue.) 

That's svn-apply for you (which is the way both the Git and SVN classes apply patches atm.  They use it to support file moves, etc... even though it doesn't yet work right for git).  The script needs to be made smarter.  I will likely re-write that script on top of this SVN class eventually.  See:
https://bugs.webkit.org/show_bug.cgi?id=26299
https://bugs.webkit.org/show_bug.cgi?id=26300

Again, the script is alpha-quality at best.  Once I add the proper --force option to things that need to clean the directory at least it will be safe for others, if not yet fully useful.  I hope that once I get some version of this script in it will be easy to colaborate with other python hackers to make it much much better!  My hope is just that this is a good *start* to our next-gen of WebKit scripts which will now be smarter, because they'll be able to deal with bugzilla (which they never could before) as well as git + svn in a saner way.
Comment 16 Eric Seidel (no email) 2009-06-19 16:39:09 PDT
Created attachment 31573 [details]
Updated patch, split into modules, now supports --force-clean


---
 4 files changed, 985 insertions(+), 0 deletions(-)
Comment 17 Eric Seidel (no email) 2009-06-19 16:40:59 PDT
(In reply to comment #14)

I've addressed concerns 0 and 2 in this patch.  The rest will need more work. :)

Next most important is probably to make it cwd() sensitive.  Hopefully this current patch is "good enough" to start talking about being landed.
Comment 18 Eric Seidel (no email) 2009-06-19 16:45:02 PDT
Created attachment 31575 [details]
Updated patch, split into modules, now supports --force-clean


---
 4 files changed, 1038 insertions(+), 0 deletions(-)
Comment 19 Eric Seidel (no email) 2009-06-22 11:58:41 PDT
I suspect that Dave Levin, David Kilzer, Mark Rowe, Adam Roben, or Dimitri Glazkov are the only ones who have both the python knowledge and scripting interest to review this.  Do one of you have cycles to do so today?  I'd like to iterate further with this script.
Comment 20 David Levin 2009-06-22 12:59:24 PDT
I'll look.  If someone else beats me to it or provides additional comments, that would be great too.
Comment 21 Eric Seidel (no email) 2009-06-22 14:39:32 PDT
Created attachment 31676 [details]
Updated to respect the current working directory instead of using ../..


---
 4 files changed, 1086 insertions(+), 0 deletions(-)
Comment 22 David Levin 2009-06-22 17:17:01 PDT
Comment on attachment 31575 [details]
Updated patch, split into modules, now supports --force-clean

Most of these are pretty minor but there are enough comments that it would be worth seeing after it is fixed up so r- for now.



I wish the tool would print out its --help content if I just type its name w/o any args but maybe I'm atypical?


> 91d483a5e9f4b33db4e931e6f99f31bb86066844
> diff --git a/WebKitTools/ChangeLog b/WebKitTools/ChangeLog
> index 14937a6..5885bc0 100644
> --- a/WebKitTools/ChangeLog
> +++ b/WebKitTools/ChangeLog
> @@ -1,3 +1,58 @@
> +2009-06-18  Eric Seidel  <eric@webkit.org>
> +
> +        Reviewed by NOBODY (OOPS!).
> +
> +        WebKit needs a script to interact with bugzilla and automate
> +        parts of the patch posting and commit processes.
> +        https://bugs.webkit.org/show_bug.cgi?id=26283
> +
> +        This is really a first-draft tool.
> +        It's to the point where it's useful to more people than just me now though.
> +        Git support works.  SVN support is written, but untested.
> +
> +        This tool requires BeautifulSoup and mechanize python modules to run:
> +        sudo easy_install BeautifulSoup
> +        sudo easy_install mechanize

I wish that this was part of the help in the script.  Also not everyone has easy_install installed either.

> +
> +        More important than the tool itself are the Bugzilla, Git and SVN class abstractions
> +        which I hope will allow easy writing of future tools.
> +
> +        I intend to break Git, SVN and SCM out into an scmtools.py module after landing.

It looks like you did this already.


> +        Likewise, Bugzilla into a bugzilla.py module.  I felt it would be easier to review in one file.

Already done as well!


> diff --git a/WebKitTools/Scripts/bugzilla-tool b/WebKitTools/Scripts/bugzilla-tool
> new file mode 100755
> index 0000000..f28fa0a
> --- /dev/null
> +++ b/WebKitTools/Scripts/bugzilla-tool
> @@ -0,0 +1,446 @@
> +#!/usr/bin/python
> +
> +# Copyright (C) 2009 Google, Inc.  All rights reserved.
> +#
> +# Redistribution and use in source and binary forms, with or without
> +# modification, are permitted provided that the following conditions
> +# are met:
> +# 1. Redistributions of source code must retain the above copyright
> +#    notice, this list of conditions and the following disclaimer.
> +# 2. Redistributions in binary form must reproduce the above copyright
> +#    notice, this list of conditions and the following disclaimer in the
> +#    documentation and/or other materials provided with the distribution.
> +#

Add clause #3
 *     * Neither the name of Google Inc. nor the names of its
 * contributors may be used to endorse or promote products derived from
 * this software without specific prior written permission.


> +# THIS SOFTWARE IS PROVIDED BY APPLE COMPUTER, INC. ``AS IS'' AND ANY
Consider changing to:
 # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 

> +#
> +# A tool for automating dealing with bugzilla, posting patches, commiting patches, etc.

s/commiting/committing/

> +
> +import sys
> +import re
> +import os
> +import subprocess

Nice to alphabetize these.

> +
> +from optparse import OptionParser, IndentedHelpFormatter, SUPPRESS_USAGE, make_option
> +
> +sys.path.append("modules") # Import WebKit-specific modules

Modified the sys.path is usually something done as a last resort. "from modules.bugzilla ..." would be better.  You'll need to add an empty __init__.py file in WebKitTools/Scripts to do this.



> +from bugzilla import Bugzilla
> +from scm import detect_scm_system, ScriptError
> +
> +def chdir_webkit():
> +    # We could instead ask the SCM system for its root directory (since we need that for making patches as well)
End with "."

> +    script_directory = os.path.abspath(sys.path[0])
> +    webkit_directory = os.path.abspath(os.path.join(script_directory, "../.."))
> +    print webkit_directory
Is this some debug info that was left behind?  If not, should it say something before doing the print (e.g. "Working in %s")

> +    os.chdir(webkit_directory)
> +
> +
> +def log(string):
> +    print >> sys.stderr, string
> +
> +def error(string):
> +    log(string)
> +    exit(1)
> +
> +
Extra blank line?

> +# These could be put in some sort of changelogs.py
> +def latest_changelog_entry(changelog_path):
> +    entry_lines = []
> +    changelog = open(changelog_path)
> +    try:
> +        log("Parsing ChangeLog: " + changelog_path)
> +        # The first line should be a date line
Add period at end of comment (throughout).

Why not validate that the first line is a date line by using the regex?
 
> +        entry_lines.append(changelog.readline())
> +        
> +        # e.g. 2009-06-03  Eric Seidel  <eric@webkit.org>
> +        changelog_date_line_regexp = '^([\d\-]+)  (.+)  <(.+)>$'

How about something more strict about the date format and slightly looser about the spacing:
        changelog_date_line_regexp = ('^(\d{1,4}\-){2}\d{1-4}' # Consume the date.
                                      + '\S+(.+)\S+' # Consume the name.
                                      + '<([^<>]+)>$') # And finally the email address.


> +        for line in changelog:
> +            # if we've hit the next entry, return
> +            if re.match(changelog_date_line_regexp, line):

Consider compiling the regex http://docs.python.org/library/re.html#module-contents

> +     finally:
> +             changelog.close()

changelog.close() is indented too far.

> +
> +def modified_changelogs(scm):
> +    changelog_paths = []
> +    paths = scm.changed_files()
> +    for path in paths:
> +        if re.search('ChangeLog', path):

What about:
   path.endswith("changeLog')
?


> +            changelog_paths.append(path)
> +    return changelog_paths

In WebKitTools/Scripts/commit-log-editor, it has a specific ordering for the changelogs in the commit message.  It isn't followed here.

> +
> +def commit_message_for_this_commit(scm):
> +    changelog_paths = modified_changelogs(scm)
> +    if len(changelog_paths) == 0:

Consider
   if not len(changelog_paths):

> +        error("Found no modified ChangeLogs, can't create a commit message.")
> +
> +    changelog_messages = []
> +    for path in changelog_paths:
> +        changelog_entry = latest_changelog_entry(path)
> +        if not changelog_entry:
> +            error("Failed to parse ChangeLog: " + os.path.abspath(path))
> +        changelog_messages.append(changelog_entry)

This code does do the titles "WebCore;", "LayoutTests:", etc. that are done in WebKitTools/Scripts/commit-log-editor.

> +    return ''.join(changelog_messages)
> +
> +
> +class Command:
> +    def __init__(self, help_text, argument_names="", options = []):
No spaces around = in this case.

> +        self.help_text = help_text
> +        self.argument_names = argument_names
> +        self.options = options
> +        self.option_parser = OptionParser(usage=SUPPRESS_USAGE, add_help_option=False, option_list=self.options)
> +    
> +    def name_with_arguments(self, command_name):
> +        usage_string = command_name
> +        if (len(self.options) > 0):
No need for ().

> +            usage_string += " [options]"
> +        if (self.argument_names != ""):
Prefer
   if self.argument_names:

> +            usage_string += " " + self.argument_names
> +        return usage_string
> +
> +    def parse_args(self, args):
> +        return self.option_parser.parse_args(args)
> +
> +    def execute(self, options, args, tool):
> +        raise NotImplementedError, "subclasses must implement"
> +
> +
> +class BugsInCommitQueue (Command):
No space before ( even in inheritance (throughout).

> +    def __init__(self):
> +        Command.__init__(self, 'Bugs in the commit queue')
> +
> +    def execute(self, options, args, tool):
> +        bug_ids = tool.bugs.fetch_bug_ids_from_commit_queue()
> +        for bug_id in bug_ids:
> +            print tool.bugs.bug_url_for_bug_id(bug_id)
In general it is safer to print like this:

 print "%s" % tool.bugs.bug_url_for_bug_id(bug_id)

which will avoid treating anything in the string as an escape sequence.

> +
> +
> +class PatchesInCommitQueue (Command):
> +    def __init__(self):
> +        Command.__init__(self, 'Patches attached to bugs in the commit queue')
> +
> +    def execute(self, options, args, tool):
> +        patches = tool.bugs.fetch_patches_from_commit_queue()
> +        log("Patches in commit queue:")
> +        for patch in patches:
> +            print patch['url']

print "%s" % patch['url']
(throughout)

> +
> +
> +class ReviewedPatchesOnBug (Command):
> +    def __init__(self):
> +        Command.__init__(self, 'r+\'d patches on a bug', 'BUGID')
> +
> +    def execute(self, options, args, tool):
> +        bug_id = args[0]
> +        patches_to_land = tool.bugs.fetch_reviewed_patches_from_bug(bug_id)
> +        for patch in patches_to_land:
> +            print patch['url']
> +
> +
> +class ApplyPatchesFromBug (Command):
> +    def __init__(self):
> +        options = [
> +            make_option("--no-update", action="store_false", dest="update", default=True, help="Don't update the working directory before applying patches"),
> +            make_option("--force-clean", action="store_true", dest="force_clean", default=False, help="Clean working directory before applying patches (removes local changes and commits)"),
> +            make_option("--no-clean", action="store_false", dest="clean", default=True, help="Don't check if the working directory is clean before applying patches"),
> +        ]
> +        Command.__init__(self, 'Applies all patches on a bug to the local working directory without committing.', 'BUGID', options=options)
> +
> +    def execute(self, options, args, tool):
> +        bug_id = args[0]
> +        patches = tool.bugs.fetch_reviewed_patches_from_bug(bug_id)
> +        chdir_webkit()
> +        if options.clean:
> +            tool.scm().ensure_clean_working_directory(options.force_clean, allow_local_commits = True)
No space around = in this case.  (Personally, I'd rather always do spaces around = but alas such is not python style.)


> +        if options.update:
> +            tool.scm().update_webkit()
> +        
> +        # Should we error out if tool.scm does not support local commits?
> +        for patch in patches:
> +            tool.scm().apply_patch(patch)

Do all the patches get applied as one bunch?  I typically apply each one as a separate "git commit".
No sure how do-able this though.  (I guess I'd like to encourage one patch per bug and then there is no issue :).)


> +    def run_and_throw_if_fail(self, script_name):
> +        build_webkit_process = subprocess.Popen(script_name, shell=True)
> +        return_code = build_webkit_process.wait()
> +        if return_code != 0:

Use
        if return_code:

> +            raise ScriptError(script_name + " failed with code " + return_code)



> diff --git a/WebKitTools/Scripts/modules/bugzilla.py b/WebKitTools/Scripts/modules/bugzilla.py


> +def read_config(key):
> +    # Need a way to read from svn too
> +    config_process = subprocess.Popen("git config --get bugzilla." + key, stdout=subprocess.PIPE, shell=True)
> +    value = config_process.communicate()[0]
> +    return_code = config_process.wait()
> +
> +    if return_code != 0:

Use
    if return_code:

> +        return None
> +
> +class Bugzilla:
> +    def __init__(self, dryrun = False):
No space around =.


> +        self.br = Browser()
s/self.br/self.browser/


> +
> +    # This could eventually be a text file
> +    reviewer_usernames_to_full_names = {
> +        "abarth" : "Adam Barth",
> +        "adele" : "Adele Peterson",
> +        "ariya.hidayat" : "Ariya Hidayat",
> +        "darin" : "Darin Adler",
> +        "dglazkov" : "Dimitri Glazkov",
> +        "eric" : "Eric Seidel",
> +        "ddkilzer" : "David Kilzer",
> +        "fishd" : "Darin Fisher",
> +        "gns" : "Gustavo Noronha",
> +        "hyatt" : "David Hyatt",
> +        "jmalonzo" : "Jan Alonzo",
> +        "levin" : "David Levin",
> +        "mitz" : "Dan Bernstein",
> +        "mjs" : "Maciej Stachoviak",
> +        "mrowe" : "Mark Rowe",
> +        "oliver" : "Oliver Hunt",
> +        "staikos" : "George Staikos",
> +        "treat" : "Adam Treat",
> +        "timothy" : "Timothy Hatcher",
> +        "xan.lopez" : "Xan Lopez",
> +        "zecke" : "Holger Freyther",
> +    }
This is almost sorted but ddkilzer and treat are out of order.
Also, it looks like these folks are missing aroben, smfr, bradee-oh? (probably more...)


> +    def attachment_url_for_id(self, attachment_id, action = "view"):
No spaces around =.

> +        attachment_base_url = self.bug_server + "attachment.cgi?id="
> +        return "%s%s&action=%s" % (attachment_base_url, attachment_id, action)

If attachment_id is just an id, then doesn't the url need id= in it?
(Or perhaps the var name should be attachment_param?)



> +    def fetch_attachments_from_bug(self, bug_id):
...
> +            attachment['obsolete'] = (attachment_row.has_key('class') and attachment_row['class'] == "bz_obsolete")

No need for () here.

> +
> +            if (str(review_status).find("review+") != -1):
No need for ().


> +
> +    def fetch_bug_ids_from_commit_queue(self):
> +        # unassigned_only = "&emailassigned_to1=1&emailtype1=substring&email1=unassigned"
Did you mean to keep this line?



> +    def fetch_patches_from_commit_queue(self):
> +        patches_to_land = []
> +        for bug_id in self.fetch_bug_ids_from_commit_queue():
> +            patches = self.fetch_reviewed_patches_from_bug(bug_id)
> +            patches_to_land.extend(patches)
This also works
   patches_to_land += patches

> +        return patches_to_land
> +
> +    def authenticate(self, username = None, password = None):
No spaces around ='s.

> +        if (self.authenticated):
> +            return
> +        
> +        if not username:
> +            username = read_config("username")
> +            if not username:
> +                username = raw_input("Bugzilla login: ")
> +        if not password:
> +            password = read_config("password")
> +            if not password:
> +                password = getpass.getpass("Bugzilla password for %s: " % (username,))
> +
> +        log("Logging in as %s..." % (username,))
> +        if self.dryrun:
Should it set 
       self.authenticated = True
here?

> +            return

> +
> +    def add_patch_to_bug(self, bug_id, patch_file_object, description, comment_text = None, mark_for_review = False):
No spaces around ='s.

> +        
> +        log("Adding patch \"%s\" to bug %s" % (description, bug_id))

You could do

        log('Adding patch "%s" to bug %s' % (description, bug_id))

to avoid the escpaing.



> +    def obsolete_attachment(self, attachment_id, comment_text = None):
No space around =.

> +
> +        log("Obsoleting attachment: %s" % (attachment_id,))

Remove the tuple creation. ("...",) -> "..."


> +        # Also clear any review flag (to remove it from review/commit queues)
> +        self.br.find_control(type='select', nr=0).value = ("X",)

It seems odd that this needs a tuple.  I'm being too lazy to look it up but just wanted to point this out for your double check.



> +    def post_comment_to_bug(self, bug_id, comment_text):
> +        self.authenticate()
> +
> +        log("Adding comment to bug %s" % (bug_id,))

Use
        log("Adding comment to bug %s" % bug_id)


> +    def close_bug_as_fixed(self, bug_id, comment_text = None):
No space around =.

> +        self.authenticate()
> +
> +        log("Closing bug %s as fixed" % (bug_id,))

Use:
        log("Closing bug %s as fixed" % bug_id)




> diff --git a/WebKitTools/Scripts/modules/scm.py b/WebKitTools/Scripts/modules/scm.py
> new file mode 100644
> index 0000000..63afd45
> --- /dev/null
> +++ b/WebKitTools/Scripts/modules/scm.py
>

> +class SCM:
> +    def __init__(self, dryrun = False):
No space around =.


> +
> +    def ensure_clean_working_directory(self, force, allow_local_commits = False):

No space around =.


> +        if not force and not self.working_directory_is_clean():
> +            log("Working directory has modifications, pass --force-clean or --no-clean to continue.")
> +            os.system(self.status_command())
> +            exit(1)
> +        
> +        log("Cleaning the working directory")
> +        self.clean_working_directory(discard_local_commits = not allow_local_commits)

No space around =.


    
> +    def ensure_no_local_commits(self, force):
...
> +        commits = self.local_commits()
> +        if len(commits) == 0:

if len(commits) :



> +
> +    def apply_patch(self, patch):
> +        # It's possible that the patch was not made from the root directory
Nit: directory"."

> +        # we should detect and handle that case.
Nit: "W"e

> +        return_code = os.system("curl " + patch['url'] + " | svn-apply --reviewer \"" + patch['reviewer'] + "\"")

How about?
        return_code = os.system('curl %s | svn-apply --reviewer "%s"' % (patch['url'], patch['reviewer']))

> +        if (return_code != 0):

Use
        if return_code:

> +            raise ScriptError("Patch " + patch['url'] + " failed to download and apply.")
> +


> +    # It's slightly hacky to share this code, since there are implicit assumptions about the regexp format
Less so with named groups, so many the comment can go away if you do that.
> +    def run_status_and_extract_filenames(self, status_command, status_regexp):
> +        file_names = []
> +        for line in os.popen(status_command).readlines():
> +            match = re.search(status_regexp, line)
> +            if not match:
> +                continue
> +            # status = match.group(1)
> +            file_name = match.group(2)

If you go with the named group suggestion, this becomes
            file_name = match.group('filename')

> +            file_names.append(file_name)
> +        return file_names


> +class SVN (SCM):
> +    def __init__(self, dryrun = False):
No spaces around =.


> +
> +    def working_directory_is_clean(self):
> +        diff_process = subprocess.Popen("svn diff", stdout=subprocess.PIPE, shell=True)
> +        diff_output = diff_process.communicate()[0]
> +        if dif_process.wait() != 0:
Two issues (a typo dif vs diff) and no need for "!= 0"), so consider

        if diff_process.wait():


> +
> +    def changed_files(self):
> +        status_regexp = "^([ACDMR]).{6} (.+)$" if self.svn_version() > "1.6" else "^([ACDMR]).{5} (.+)$"

There is something about all of the "" in that line that makes it hard for me to parse.  Personally, I'd
prefer the standard c++ look in this case.

if self.svn_version() > "1.6":
    status_regexp = "^([ACDMR]).{6} (.+)$"
else
    status_regexp = "^([ACDMR]).{5} (.+)$"

Also consider used named groups (see my comment in the git class).



> +    def commit_with_message(self, message):
> +        commit_process = subprocess.Popen('svn commit -F -', stdin=subprocess.PIPE, shell=True)
> +        commit_process.communicate(message)
Should this do
           commit_process.wait()
?


> +
> +# All git-specific logic should go here.
> +class Git (SCM):
> +    def __init__(self, dryrun = False):
No space around =.

> +    
> +    def working_directory_is_clean(self):
> +        diff_process = subprocess.Popen("git diff-index head", stdout=subprocess.PIPE, shell=True)
> +        diff_output = diff_process.communicate()[0]
> +        if dif_process.wait() != 0:

Two issues (a typo dif vs diff) and no need for "!= 0"), so consider

        if diff_process.wait():

> +
> +    def changed_files(self):
> +        status_command = 'git diff -r --name-status -C -C -M'
Extra -C

> +        status_regexp = '^([ADM])\t(.+)$'
Consider using named groups, for example:

  '^(?P<status>[ADM])\t(?P<filename>.+)$'

See http://docs.python.org/library/re.html#regular-expression-syntax

> +        return self.run_status_and_extract_filenames(status_command, status_regexp)
> +    


> +    # Git-specific methods:
> +    
> +    def commit_locally_with_message(self, message):
> +        commit_process = subprocess.Popen('git commit -a -F -', stdin=subprocess.PIPE, shell=True)
In many other places you used "" for strings.  Python also allows single quotes, but it is nice to be consistent (unless you want to switch to put a " in the middle.)


> +        commit_process.communicate(message)
It seems like it should do commit_process.wait() here.
 

> +    def push_local_commits_to_server(self):
...
> +        return out
> +    
> +
extra blank line.

> +    def commit_ids_from_range_arguments(self, args, cherry_pick = False):
No spaces around =.

> +        
> +        # If not cherry-picking and only passed one revision, assume "^revision head" aka "revision..head"

I can read "If we're not cherry-picking..." more easily.


> +        if len(revisions) < 2:
> +            revisions[0] = "^" + revisions[0]
> +            revisions.append("head")
> +        
> +        rev_list_args = ['git', 'rev-list']
> +        rev_list_args.extend(revisions)

fwiw, 

   rev_list_arg += revisions

will work too.  Your choice.
Comment 23 Mark Rowe (bdash) 2009-06-22 17:20:34 PDT
> I wish that this was part of the help in the script.  Also not everyone has
> easy_install installed either.

What're the licenses like on these modules?  If they're sufficiently agreeable then the best approach may be to land known good versions in the repository so that the script works out of the box.
Comment 24 Eric Seidel (no email) 2009-06-22 17:28:06 PDT
(In reply to comment #23)
> > I wish that this was part of the help in the script.  Also not everyone has
> > easy_install installed either.
> 
> What're the licenses like on these modules?  If they're sufficiently agreeable
> then the best approach may be to land known good versions in the repository so
> that the script works out of the box.

mechanize is tiny, and BSD.  Easy to include.
BeautifulSoup is quite large.  It's under the python license though, so also should be OK to include.

Lets do that in a separate change.  For now, I've added code to detect when they're missing and print a nice help message. :)
Comment 25 Mark Rowe (bdash) 2009-06-22 17:31:33 PDT
> Lets do that in a separate change.  For now, I've added code to detect when
> they're missing and print a nice help message. :)

For sure.
Comment 26 Eric Seidel (no email) 2009-06-22 17:56:46 PDT
(In reply to comment #22)
> (From update of attachment 31575 [details] [review])
> I wish the tool would print out its --help content if I just type its name w/o
> any args but maybe I'm atypical?

Agreed.  I'll see what I can do.  OptionParse.error() does not have this behavior by default. :(

> I wish that this was part of the help in the script.  Also not everyone has
> easy_install installed either.

Done.

> Add clause #3
>  *     * Neither the name of Google Inc. nor the names of its
>  * contributors may be used to endorse or promote products derived from
>  * this software without specific prior written permission.
Fixed.

> s/commiting/committing/
Fixed.

> Nice to alphabetize these.
Fixed.

> Modified the sys.path is usually something done as a last resort. "from
> modules.bugzilla ..." would be better.  You'll need to add an empty __init__.py
> file in WebKitTools/Scripts to do this.
Fixed.

> Why not validate that the first line is a date line by using the regex?
Fixed.

> How about something more strict about the date format and slightly looser about
> the spacing:
>         changelog_date_line_regexp = ('^(\d{1,4}\-){2}\d{1-4}' # Consume the
> date.
>                                       + '\S+(.+)\S+' # Consume the name.
>                                       + '<([^<>]+)>$') # And finally the email
Fixed.

> Consider compiling the regex
> http://docs.python.org/library/re.html#module-contents
Fixed.

> changelog.close() is indented too far.
Fixed.

> In WebKitTools/Scripts/commit-log-editor, it has a specific ordering for the
> changelogs in the commit message.  It isn't followed here.

Added a FIXME.  I think that's best done in a later patch.  Perhaps when we unify this script with commit-log-editor. :)

> Consider
>    if not len(changelog_paths):
Fixed.

> > +class Command:
> > +    def __init__(self, help_text, argument_names="", options = []):
> No spaces around = in this case.
Fixed.  TextMate likes to add these for some reason.

> > +        if (len(self.options) > 0):
> No need for ().
Fixed.

> Prefer
>    if self.argument_names:
Fixed.

> No space before ( even in inheritance (throughout).
Fixed.

> In general it is safer to print like this:
>  print "%s" % tool.bugs.bug_url_for_bug_id(bug_id)
Fixed.

> > +            tool.scm().ensure_clean_working_directory(options.force_clean, allow_local_commits = True)
> No space around = in this case.  (Personally, I'd rather always do spaces
> around = but alas such is not python style.)
Fixed.  I agree.  I dislike this part of Python's style.

> Do all the patches get applied as one bunch?  I typically apply each one as a
> separate "git commit".
> No sure how do-able this though.  (I guess I'd like to encourage one patch per
> bug and then there is no issue :).)

Currently apply-patches only applies the patches to your tree.  It does not have a --commit option yet.
I plan to add one.  I've added a FIXME for now.

> > +        self.br = Browser()
> s/self.br/self.browser/
Fixed. :)  pylint beat you to it!  (I had copied from mechanize's examples.)


> This is almost sorted but ddkilzer and treat are out of order.
> Also, it looks like these folks are missing aroben, smfr, bradee-oh? (probably
> more...)

Yup.  The list of reviewers is incomplete.  I should push it out into its own file.  I'll do that when I post a new one.
That list maps from bugzilla usernames to names.  I have no easy way to generate that list.  I guess I could search for all r+'d patches ever...

> > +        attachment_base_url = self.bug_server + "attachment.cgi?id="
> > +        return "%s%s&action=%s" % (attachment_base_url, attachment_id, action)
> 
> If attachment_id is just an id, then doesn't the url need id= in it?
> (Or perhaps the var name should be attachment_param?)

I'm confused.  Are you suggesting that I should distribute the id= differently between the strings?

> > +    def fetch_attachments_from_bug(self, bug_id):
> ...
> > +            attachment['obsolete'] = (attachment_row.has_key('class') and attachment_row['class'] == "bz_obsolete")
> 
> No need for () here.

Yeah, but I find it reads clearer...

> > +
> > +            if (str(review_status).find("review+") != -1):
> No need for ().
Fixed.

> > +
> > +    def fetch_bug_ids_from_commit_queue(self):
> > +        # unassigned_only = "&emailassigned_to1=1&emailtype1=substring&email1=unassigned"
> Did you mean to keep this line?

Yes.  I plan to use it.  I'll add a FIXME

> This also works
>    patches_to_land += patches
Nifty!

> Should it set 
>        self.authenticated = True
> here?
Fixed.

> You could do
>         log('Adding patch "%s" to bug %s' % (description, bug_id))
> to avoid the escpaing.
Fixed.

> > +        log("Obsoleting attachment: %s" % (attachment_id,))
> 
> Remove the tuple creation. ("...",) -> "..."

I hit errors w/o them before... I swear.  Fixed.

> > +        # Also clear any review flag (to remove it from review/commit queues)
> > +        self.br.find_control(type='select', nr=0).value = ("X",)
> 
> It seems odd that this needs a tuple.  I'm being too lazy to look it up but
> just wanted to point this out for your double check.

Yes.  ClientForm (part of mechanize) is strange in its need for tuples.

> How about?
>         return_code = os.system('curl %s | svn-apply --reviewer "%s"' %
> (patch['url'], patch['reviewer']))

Fixed.

> > +    # It's slightly hacky to share this code, since there are implicit assumptions about the regexp format
> Less so with named groups, so many the comment can go away if you do that.

Excellent suggestion!  Fixed.

> > +    def commit_with_message(self, message):
> > +        commit_process = subprocess.Popen('svn commit -F -', stdin=subprocess.PIPE, shell=True)
> > +        commit_process.communicate(message)
> Should this do
>            commit_process.wait()

communicate() waits for the process termination, so I don't need wait() unless I need the return code.  At least that's my understanding of these subprocess calls.

> > +        status_command = 'git diff -r --name-status -C -C -M'
> Extra -C
Fixed.


Thanks for all the fantastic comments!  New patch coming shortly!
Comment 27 Eric Seidel (no email) 2009-06-22 17:58:49 PDT
Created attachment 31696 [details]
Updated to respect the current working directory instead of using ../..


---
 5 files changed, 1130 insertions(+), 0 deletions(-)
Comment 28 Eric Seidel (no email) 2009-06-22 17:59:19 PDT
Addressed Dave Levin's comments (even if I didn't update the patch description).
Comment 29 David Levin 2009-06-22 18:53:12 PDT
Comment on attachment 31696 [details]
Updated to respect the current working directory instead of using ../..

Only a few trivial comments below.  Feel free to change and submit.



> diff --git a/WebKitTools/Scripts/bugzilla-tool b/WebKitTools/Scripts/bugzilla-tool

> +def latest_changelog_entry(changelog_path):
...
> +        changelog_date_line_regexp = re.compile('^(\d{4}-\d{2}-\d{2})' # Consume the date.
> +                                      + '\s+(.+)\s+' # Consume the name.
> +                                      + '<([^<>]+)>$') # And finally the email address.

You can remove this one since you already have it defined a few lines above this.



> + def commit_message_for_this_commit(scm):
..
> +    # FIXME: We should sort and label the ChangeLog messages like commit-log-editor does

Add "."



> + class PostDiffAsPatchToBug(Command):
...
> +     def execute(self, options, args, tool):
...
> +         description = options.description if options.description else "patch"

You could also do this if you find it more readable (I do):

  description = options.description || "patch"


> +            log("Are you sure you want to attach %d patches to bug %s?" % (len(commit_ids), bug_id))
> +            # Should could --patches-limit option.

Fix comment "Should could"?



> +    def format_epilog(self, epilog):
> +        if epilog:
> +            return "\n" + epilog + "\n"
> +        else:

Remove the else in this case.  (Following standard WebKit style).


> +    @staticmethod
> +    def split_args(args):
> +        # Assume the first argument which doesn't start with '-' is the command name.
> +        command_index = 0
> +        for arg in args:
> +            if arg[0] != '-':
> +                break
> +            command_index += 1
> +
> +        global_args = args[:command_index]
> +        if command_index >= len(args):
> +            return (global_args, None, [])
> +
> +        command = args[command_index]
> +        command_args = args[command_index + 1:]
> +        return (global_args, command, command_args)

I really like the "else" statement in python for "while" and "for" statements.

Totally optional but consider

        command_index = 0
        for arg in args:
            if arg[0] != '-':
                break
            command_index += 1
        else:
            return (args[:], None, [])

        global_args = args[:command_index]
        command = args[command_index]
        command_args = args[command_index + 1:]
        return (global_args, command, command_args)

btw, above I did the "args[:]" above just to get a new copy of the array (as opposed to return a reference to args) to be consistent with the other code path in the function.
 


> diff --git a/WebKitTools/Scripts/modules/bugzilla.py b/WebKitTools/Scripts/modules/bugzilla.py


> +# HACK: This should not depend on git for config storage

HACK or FIXME?


> +    reviewer_usernames_to_full_names = {

Would be nice to sort: ddkilzer and treat are out of place.


> diff --git a/WebKitTools/Scripts/modules/scm.py b/WebKitTools/Scripts/modules/scm.py

> +class SCM:
> +    # Subclasses must indicate if they support local commits
> +    # but the SCM baseclass will only call local_commits methods when this is true

Add "."



>  I'm confused.  Are you suggesting that I should distribute the id= differently
>  between the strings?

My mistake: I missed the "id=" when I was reading the code.  I was dealing with a really small font at that moment.
Comment 30 Eric Seidel (no email) 2009-06-23 01:36:47 PDT
Fixed your comments!  Thanks for the great reviews!
Comment 31 Eric Seidel (no email) 2009-06-23 01:42:57 PDT
The script landed this as r44979, but it errored out and failed to update the bug. :(
Comment 32 Eric Seidel (no email) 2009-06-23 01:44:32 PDT
Created attachment 31710 [details]
patch
Comment 33 Mark Rowe (bdash) 2009-06-23 01:45:38 PDT
Comment on attachment 31710 [details]
patch

Very presumptuous, Mr Seidel.
Comment 34 Eric Seidel (no email) 2009-06-23 01:52:57 PDT
Created attachment 31712 [details]
patch
Comment 35 Mark Rowe (bdash) 2009-06-23 01:57:56 PDT
Comment on attachment 31712 [details]
patch

I'm a little sad that you've been landing untested code.
Comment 36 Eric Seidel (no email) 2009-06-23 01:59:39 PDT
(In reply to comment #35)
> (From update of attachment 31712 [details] [review])
> I'm a little sad that you've been landing untested code.

Agreed.  This is the "actually push it to the svn server" code-path.  Which was tested at one point.  But broke during refactoring. :(  Such is the problem of scripting languages... they don't refactor nicely.

I need to write unit-tests for all this.  I could also set up a local svn server for some of this testing.
Comment 37 Eric Seidel (no email) 2009-06-23 01:59:55 PDT
Committing to http://svn.webkit.org/repository/webkit/trunk ...
	M	WebKitTools/ChangeLog
	M	WebKitTools/Scripts/modules/scm.py
Comment 38 Eric Seidel (no email) 2009-06-23 02:13:48 PDT
(In reply to comment #37)
> Committing to http://svn.webkit.org/repository/webkit/trunk ...
>         M       WebKitTools/ChangeLog
>         M       WebKitTools/Scripts/modules/scm.py

For those watching at home... the script error'd out during this commit too (I forgot that the land-and-update-bug option does not know how to fix the ChangeLogs to have the right reviewer yet...). :(  Still lots of pieces to fix.