WebKit Bugzilla
New
Browse
Search+
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
110356
Parse author names with commas in ChangeLogs
https://bugs.webkit.org/show_bug.cgi?id=110356
Summary
Parse author names with commas in ChangeLogs
Tony Chang
Reported
2013-02-20 10:46:27 PST
Parse author names with commas in ChangeLogs
Attachments
Patch
(3.52 KB, patch)
2013-02-20 10:48 PST
,
Tony Chang
no flags
Details
Formatted Diff
Diff
Patch
(4.68 KB, patch)
2013-02-20 12:15 PST
,
Tony Chang
no flags
Details
Formatted Diff
Diff
Patch for landing
(4.90 KB, patch)
2013-02-20 13:44 PST
,
Tony Chang
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Tony Chang
Comment 1
2013-02-20 10:48:57 PST
Created
attachment 189339
[details]
Patch
Tony Chang
Comment 2
2013-02-20 10:49:25 PST
This is kind of gross, alternate suggestions welcome.
Tony Chang
Comment 3
2013-02-20 10:50:06 PST
You can see the failure on
https://bugs.webkit.org/show_bug.cgi?id=109303
.
Dirk Pranke
Comment 4
2013-02-20 11:52:16 PST
Comment on
attachment 189339
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=189339&action=review
> Tools/ChangeLog:8 > + PaweÅ's name has a comma in it, which was confusing the ChangeLog parser.
interesting that this is showing up as \u212b (angstrom sign) rather than \u0142 (latin small l with stroke) in the bugzilla diff. What is it in the file? Perhaps there's a separate bug for some character encoding conversion here?
> Tools/Scripts/webkitpy/common/checkout/changelog.py:150 > + return names
It looks like this function is used in two places, in the author text and in the reviewer text. If I'm correctly understanding this patch, this will work for the author_text case (where we want name_and_email=True), but I think if Pawel put his full name as one of the reviewers, we'd probably still be stuck? Also, it seems like maybe we should just split this into two different functions, and maybe define the regex as a constant or something. Combining them into one function makes it hard to follow (especially lines 148-149).
Tony Chang
Comment 5
2013-02-20 12:15:33 PST
Created
attachment 189354
[details]
Patch
Tony Chang
Comment 6
2013-02-20 12:16:15 PST
(In reply to
comment #4
)
> (From update of
attachment 189339
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=189339&action=review
> > > Tools/ChangeLog:8 > > + PaweÅ's name has a comma in it, which was confusing the ChangeLog parser. > > interesting that this is showing up as \u212b (angstrom sign) rather than \u0142 (latin small l with stroke) in the bugzilla diff. What is it in the file? Perhaps there's a separate bug for some character encoding conversion here?
This is a bug in PrettyPatch.rb. The ruby script isn't unicode aware.
> > Tools/Scripts/webkitpy/common/checkout/changelog.py:150 > > + return names > > It looks like this function is used in two places, in the author text and in the reviewer text. If I'm correctly understanding this patch, this will work for the author_text case (where we want name_and_email=True), but I think if Pawel put his full name as one of the reviewers, we'd probably still be stuck?
That's true, it will fail for reviewers. I'm OK with punting this problem until Pawel becomes a reviewer.
> Also, it seems like maybe we should just split this into two different functions, and maybe define the regex as a constant or something. Combining them into one function makes it hard to follow (especially lines 148-149).
Done.
Tony Chang
Comment 7
2013-02-20 13:44:31 PST
Created
attachment 189366
[details]
Patch for landing
Dirk Pranke
Comment 8
2013-02-20 13:45:06 PST
(In reply to
comment #6
)
> (In reply to
comment #4
) > > (From update of
attachment 189339
[details]
[details]) > > View in context:
https://bugs.webkit.org/attachment.cgi?id=189339&action=review
> > > > > Tools/ChangeLog:8 > > > + PaweÅ's name has a comma in it, which was confusing the ChangeLog parser. > > > > interesting that this is showing up as \u212b (angstrom sign) rather than \u0142 (latin small l with stroke) in the bugzilla diff. What is it in the file? Perhaps there's a separate bug for some character encoding conversion here? > > This is a bug in PrettyPatch.rb. The ruby script isn't unicode aware. >
Is there a bug filed for that?
> > > Tools/Scripts/webkitpy/common/checkout/changelog.py:150 > > > + return names > > > > It looks like this function is used in two places, in the author text and in the reviewer text. If I'm correctly understanding this patch, this will work for the author_text case (where we want name_and_email=True), but I think if Pawel put his full name as one of the reviewers, we'd probably still be stuck? > > That's true, it will fail for reviewers. I'm OK with punting this problem until Pawel becomes a reviewer. > > > Also, it seems like maybe we should just split this into two different functions, and maybe define the regex as a constant or something. Combining them into one function makes it hard to follow (especially lines 148-149). > > Done.
As we discussed over IM, I think there are three problems here: there's the obvious one (we can't handle phajdan.jr's name), and two less obvious ones: 1) We don't expect the parser to be able to handle all possible combinations of stuff, and when it does fail to parse, the CQ chokes on this rather than failing semi-robustly somehow. We should file a bug to fix this. 2) The existing approach uses very complicated (IMO) regexps and a couple of ad-hoc workarounds to try and do the best it can. The code is unmaintainable (to me) and this patch makes it slightly worse. The only way I personally would fix this bug is to throw out the code and rewrite it to use a proper parser, but that's a lot more work. I don't want to require it now, so I've r+'ed this patch (so that we can handle phajdan.jr in the meantime), but we should file a separate bug to rework this code to make it more maintainable and/or correct. does that capture everything?
WebKit Review Bot
Comment 9
2013-02-20 14:05:32 PST
Comment on
attachment 189366
[details]
Patch for landing Clearing flags on attachment: 189366 Committed
r143504
: <
http://trac.webkit.org/changeset/143504
>
WebKit Review Bot
Comment 10
2013-02-20 14:05:37 PST
All reviewed patches have been landed. Closing bug.
Tony Chang
Comment 11
2013-02-20 14:13:31 PST
https://bugs.webkit.org/show_bug.cgi?id=110380
filed for (1).
Carlos Garcia Campos
Comment 12
2013-04-10 00:36:29 PDT
***
Bug 84582
has been marked as a duplicate of this bug. ***
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug