Bug 72090: RESOLVED FIXED
Make ChangeLogEntry's reviewer parsing algorithm support last 4 WebCore change logs
https://bugs.webkit.org/show_bug.cgi?id=72090
Ryosuke Niwa
Reported 2011-11-10 22:34:19 PST
Significantly improve ChangeLogEntry's reviewer parsing algorithm. This version can successfully parse the following change logs in Source/WebCore/: ChangeLog, ChangeLog-2011-10-19, ChangeLog-2011-06-04, ChangeLog-2011-02-16, and ChangeLog-2010-12-06.
Attachments
Patch (12.88 KB, patch)
2011-11-10 22:39 PST, Ryosuke Niwa
no flags
fixed the test (12.88 KB, patch)
2011-11-10 22:59 PST, Ryosuke Niwa
eric: review+
Ryosuke Niwa
Comment 1 2011-11-10 22:39:03 PST
Ryosuke Niwa
Comment 2 2011-11-10 22:44:38 PST
Comment on attachment 114628 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=114628&action=review

> Tools/Scripts/webkitpy/common/checkout/changelog.py:88
> + reviewed_by_regexp = r'^\s*((\w+\s+)+and\s+)?(Review|Rubber(\s*|-)stamp)(s|ed)?\s+([a-z]+\s+)*?by\s+(?P<reviewer>.*?)[\.,]?\s*$'
> +
> + reviewed_byless_regexp = r'^\s*((Review|Rubber(\s*|-)stamp)(s|ed)?|RS)(\s+|\s*=\s*)(?P<reviewer>([A-Z]\w+\s*)+)[\.,]?\s*$'
> +
> + # e.g. "landed by", email address, and phrases like "given a glance-over by" and "looked over by"
> + contributor_name_noise_regexp = re.compile(r'(\s+(landed|committed|)\s+by.+)|\..+'
> + + r'|([(<]\s*[\w_\-\.]+@[\w_\-\.]+[>)])|((?<=and)\s+([a-z\-]+\s+)+by)', re.IGNORECASE)

I wish I didn't have to write such ugly regular expressions :( I'm more than happy to explain what they do
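To illustrate what the two reviewer patterns quoted from the patch accept, here is a minimal sketch. The two pattern strings are copied from the quoted patch; the helper name `find_reviewer_text`, the sample lines, and the choice of `re.MULTILINE` as the only flag are assumptions made for this illustration and are not taken from changelog.py.

```python
import re

# Patterns copied verbatim from the quoted patch.
reviewed_by_regexp = r'^\s*((\w+\s+)+and\s+)?(Review|Rubber(\s*|-)stamp)(s|ed)?\s+([a-z]+\s+)*?by\s+(?P<reviewer>.*?)[\.,]?\s*$'
reviewed_byless_regexp = r'^\s*((Review|Rubber(\s*|-)stamp)(s|ed)?|RS)(\s+|\s*=\s*)(?P<reviewer>([A-Z]\w+\s*)+)[\.,]?\s*$'


def find_reviewer_text(text):
    """Hypothetical helper: try the "by" pattern first, then the "by"-less one.

    Assumption: re.MULTILINE is the only flag; the real code may use others.
    """
    for pattern in (reviewed_by_regexp, reviewed_byless_regexp):
        match = re.search(pattern, text, re.MULTILINE)
        if match:
            return match.group('reviewer')
    return None


# Made-up sample lines, not taken from any real ChangeLog.
samples = [
    "        Reviewed by Eric Seidel.",
    "Rubber-stamped by Tony Chang",
    "Reviewed Ryosuke Niwa.",
    "RS=Eric Seidel",
    "Fix the build.",
]

for line in samples:
    print(repr(line), '->', find_reviewer_text(line))
```

Trying the stricter "by" pattern first means the looser fallback, which only accepts capitalized words as the reviewer, is consulted only for lines that omit "by".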
Ryosuke Niwa
Comment 3 2011-11-10 22:51:51 PST
Comment on attachment 114628 [details]
Patch

Some webkitpy tests are failing...
Ryosuke Niwa
Comment 4 2011-11-10 22:59:01 PST
Created attachment 114631 [details] fixed the test
Eric Seidel (no email)
Comment 5 2011-11-11 09:43:18 PST
Comment on attachment 114631 [details]
fixed the test

LGTM. You should also be aware of bug 26533.
Tony Chang
Comment 6 2011-11-11 10:09:40 PST
Comment on attachment 114628 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=114628&action=review

>> Tools/Scripts/webkitpy/common/checkout/changelog.py:88
>> + + r'|([(<]\s*[\w_\-\.]+@[\w_\-\.]+[>)])|((?<=and)\s+([a-z\-]+\s+)+by)', re.IGNORECASE)
>
> I wish I didn't have to write such ugly regular expressions :( I'm more than happy to explain what they do

When I have long regular expressions, I like to use re.VERBOSE and add comments to each part. There's a small example in the verbose section: http://docs.python.org/library/re.html

Or here's an example in Perl: http://www.perl.com/pub/2004/01/16/regexps.html
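As a rough sketch of the re.VERBOSE suggestion, the `reviewed_by_regexp` pattern from the patch could be split into commented parts as shown here. The name `reviewed_by_verbose`, the per-part comments, the sample lines, and the `re.MULTILINE` flag are assumptions for illustration; this is not necessarily the form that was landed.

```python
import re

# Original one-line pattern, copied from the quoted patch.
reviewed_by_regexp = r'^\s*((\w+\s+)+and\s+)?(Review|Rubber(\s*|-)stamp)(s|ed)?\s+([a-z]+\s+)*?by\s+(?P<reviewer>.*?)[\.,]?\s*$'

# Same pattern under re.VERBOSE: whitespace in the pattern is ignored
# and "#" starts a comment that runs to the end of the line.
reviewed_by_verbose = re.compile(r'''
    ^\s*                          # start of line, optional indentation
    ((\w+\s+)+and\s+)?            # optional leading words ending in "and"
    (Review|Rubber(\s*|-)stamp)   # "Review", "Rubber stamp", "Rubber-stamp", "Rubberstamp"
    (s|ed)?                       # optional "s" or "ed" suffix
    \s+
    ([a-z]+\s+)*?                 # optional lowercase words before "by"
    by\s+
    (?P<reviewer>.*?)             # reviewer text, matched as short as possible
    [\.,]?\s*$                    # optional trailing period or comma, end of line
''', re.VERBOSE | re.MULTILINE)

# Made-up sample lines used to check that both forms behave the same.
samples = [
    "        Reviewed by Eric Seidel.",
    "Reviewed and landed by Tony Chang.",
    "Rubber stamped by Ryosuke Niwa",
    "Fix the build.",
]

for line in samples:
    plain = re.search(reviewed_by_regexp, line, re.MULTILINE)
    verbose = reviewed_by_verbose.search(line)
    # Either both fail to match, or both capture the same reviewer text.
    assert (plain is None) == (verbose is None)
    if plain is not None:
        assert plain.group('reviewer') == verbose.group('reviewer')
```

Because re.VERBOSE ignores unescaped whitespace, any literal space in a pattern would need to be written as `\s`, `\ `, or `[ ]`; the pattern above already uses `\s` throughout, so it converts without changes.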
Ryosuke Niwa
Comment 7 2011-11-11 11:08:27 PST
(In reply to comment #6)
> When I have long regular expressions, I like to use re.VERBOSE and add comments to each part. There's a small example in the verbose section:
> http://docs.python.org/library/re.html

That's a good idea! Will do before landing it.
Ryosuke Niwa
Comment 8 2011-11-11 11:25:47 PST
(In reply to comment #5)
> (From update of attachment 114631 [details])
> LGTM. You should also be aware of bug 26533.

Yup, in fact, this is marked as a blocker of bug 26533.
Ryosuke Niwa
Comment 9 2011-11-13 23:23:38 PST