RESOLVED FIXED77886
KeywordLookupGenerator.py script fails in some cases
https://bugs.webkit.org/show_bug.cgi?id=77886
Summary KeywordLookupGenerator.py script fails in some cases
Ashod Nakashian
Reported 2012-02-06 11:08:39 PST
KeywordLookupGenerator.py splits the input file by newline characters ('\n') using the split() function. The code also trims each line of whitespaces after splitting. However, carriage-return ('\r') isn't necessarily a whitespace character, and, if it exists in the input file, it remains dangling at the end of the the lines. This is not an issue for most lines, except for the first and last, where they are compared verbatim with the expected values. On my build system (Win7/CygWin/Python2.6) the last line is '@end\r' which fails to compare with the expected '@end' and thereby the complete input file fails to parse and the script exists with an exception. This in turn breaks the build. The generic fix is to also assume the input may contain both carriage-returns and split by both newline and carriage-return characters (empty lines are removed anyway).
Attachments
Patch (1.26 KB, patch)
2012-02-06 12:27 PST, Ashod Nakashian
no flags
Patch (1.24 KB, patch)
2012-02-06 20:46 PST, Ashod Nakashian
no flags
Patch (4.09 KB, patch)
2012-02-11 05:09 PST, Ashod Nakashian
no flags
Patch (3.86 KB, patch)
2012-02-12 05:43 PST, Ashod Nakashian
no flags
Ashod Nakashian
Comment 1 2012-02-06 12:27:14 PST
Benjamin Poulain
Comment 2 2012-02-06 13:08:18 PST
Comment on attachment 125688 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=125688&action=review I am not sure why the input would use the Windows line ending. Is that converted automatically by git/svn? > Source/JavaScriptCore/ChangeLog:7 > + Reviewed by NOBODY (OOPS!). > + Missing explanation in the ChangeLog. > Source/JavaScriptCore/ChangeLog:8 > + * JavaScriptCore.vcproj/JavaScriptCore/JavaScriptCore.def: ???
Eric Seidel (no email)
Comment 3 2012-02-06 13:09:32 PST
Comment on attachment 125688 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=125688&action=review This whole split cascade is a bit odd. Is this file not unittested? Normally we test all changes in WebKit, including python changes. > Source/JavaScriptCore/ChangeLog:8 > + * JavaScriptCore.vcproj/JavaScriptCore/JavaScriptCore.def: Looks like you didn't actually change this file.
Ashod Nakashian
Comment 4 2012-02-06 20:46:38 PST
Ashod Nakashian
Comment 5 2012-02-06 21:48:13 PST
(In reply to comment #2) > (From update of attachment 125688 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=125688&action=review > > I am not sure why the input would use the Windows line ending. Is that converted automatically by git/svn? I wouldn't know, but I went back half a dozen or so commits of the input file and found no changes to the last line. This suggests the file did contain cr and it wasn't causing problems. Considering that the split also trims the strings of whitespace, and considering that in the past I have successfully run this script on WinXP, I'm suspecting it has something to do with the default whitespace list of python in the given version/build/system. There might be some other explanation, but I find this the most reasonable at this point. At any rate, I think non-print chars like this are bound to creep-in considering that we work on multiple platforms. My suggested fix is to take cr into account and treat them as splitting character as well. There are other solutions too, but probably not as generic. > > > Source/JavaScriptCore/ChangeLog:7 > > + Reviewed by NOBODY (OOPS!). > > + > > Missing explanation in the ChangeLog. Fixed. > > > Source/JavaScriptCore/ChangeLog:8 > > + * JavaScriptCore.vcproj/JavaScriptCore/JavaScriptCore.def: > > ??? Removed. No idea why the patch script thought this modified.
Ashod Nakashian
Comment 6 2012-02-06 21:51:46 PST
(In reply to comment #3) > (From update of attachment 125688 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=125688&action=review > > This whole split cascade is a bit odd. Is this file not unittested? Normally we test all changes in WebKit, including python changes. Agreed on the odd part, but considering that line breaks aren't predictable, we either have to remove carriage-returns (before or after splitting) or we have to be more generic and break at CR as well as LF. After all, we *can* have lines with CR breaks only. > > > Source/JavaScriptCore/ChangeLog:8 > > + * JavaScriptCore.vcproj/JavaScriptCore/JavaScriptCore.def: > > Looks like you didn't actually change this file. Fixed.
Darin Adler
Comment 7 2012-02-07 12:26:50 PST
(In reply to comment #6) > line breaks aren't predictable Maybe true for the programming world at large, but need not be true for the WebKit project. We can make whatever rule we like about line breaks. There is no good reason for us to support CR or CRLF line breaks in project tools; we should consistently set properties so our source files all use LF line breaks except for the small number of files that may be subject to specific platform requirements on Windows that might require something different. > After all, we *can* have lines with CR breaks only. Not a goal for the project.
Ashod Nakashian
Comment 8 2012-02-07 20:23:10 PST
(In reply to comment #7) > (In reply to comment #6) > > line breaks aren't predictable > > Maybe true for the programming world at large, but need not be true for the WebKit project. We can make whatever rule we like about line breaks. There is no good reason for us to support CR or CRLF line breaks in project tools; we should consistently set properties so our source files all use LF line breaks except for the small number of files that may be subject to specific platform requirements on Windows that might require something different. > > > After all, we *can* have lines with CR breaks only. > > Not a goal for the project. Fair enough. However, I'd like to avoid accidental mistakes as much as possible. I see 3 possible solutions: 1) Force LF only via tests and convert the input file. 2) Force LF only on the input file via a commit-time hook (git would make that a breeze, alas). 3) Accept my patch which by only adding one line supports the current input file as-is. I don't know whether or not we have tests for build scripts and specifically for KeywordLookupGenerator.py. Any tips are welcome. I don't know how to add SVN commit-time hooks on the client. I've submitted a working patch as the third alternative. Of course we can always manually fix the input file, until it fails again on someone... or pick on of the above (or another suggestion?) Let me know what's the preferred action in a similar case for WebKit.
Benjamin Poulain
Comment 9 2012-02-07 20:37:10 PST
> Fair enough. However, I'd like to avoid accidental mistakes as much as possible. I see 3 possible solutions: > > 1) Force LF only via tests and convert the input file. Yep, you can do that with check-webkit-style. Optionally you can add support for converting automatically the input file through webkit-patch. > Of course we can always manually fix the input file, until it fails again on someone... or pick on of the above (or another suggestion?) Given no port nor bot require support for this, I think Darin has a fair point about not supporting CRLF here.
Ashod Nakashian
Comment 10 2012-02-07 21:35:53 PST
(In reply to comment #9) > > Fair enough. However, I'd like to avoid accidental mistakes as much as possible. I see 3 possible solutions: > > > > 1) Force LF only via tests and convert the input file. > > Yep, you can do that with check-webkit-style. Optionally you can add support for converting automatically the input file through webkit-patch. Great. I'll make the appropriate changes in at least check-webkit-style and possibly auto-convert in webkit-patch and resubmit. > > > Of course we can always manually fix the input file, until it fails again on someone... or pick on of the above (or another suggestion?) > > Given no port nor bot require support for this, I think Darin has a fair point about not supporting CRLF here. Sure.
Ashod Nakashian
Comment 11 2012-02-11 05:09:16 PST
Benjamin Poulain
Comment 12 2012-02-11 14:05:22 PST
Comment on attachment 126629 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=126629&action=review Some nitpicking: > Source/JavaScriptCore/ChangeLog:12 > + Keywords.table has been converted to LF-only. > + > + * parser/Keywords.table: The line "* parser/Keywords.table:" is made for the comments about that file. You should have: * parser/Keywords.table: Converted to LF-only. > Source/JavaScriptCore/parser/Keywords.table:38 > -# reserved for future use > +# reserved for future use. Our rule is "Make comments look like sentences by starting with a capital letter and ending with a period (punctation)." If you fix the period, you should fix the capital letter :) > Source/JavaScriptCore/parser/Keywords.table:49 > -# reserved for future use in strict code > +# reserved for future use in strict code. ditto > Tools/ChangeLog:10 > + Keywords.table has been converted to LF-only. This should not be part of this particular ChangeLog. > Tools/Scripts/webkitpy/style/checker.py:243 > + # This is only critical for the first and last lines > + # which are compared verbatim. This is describing an implementation detail in an other script, it is a bad idea to have that in a comment. You can drop those lines. > Tools/Scripts/webkitpy/style/checker_unittest.py:273 > + path = "Source/JavaScriptCore/parser/Keywords.table" > + assertCheck(path, "whitespace/carriage_return") No need for the variable "path", follow the style of this function passing strings as literal.
Ashod Nakashian
Comment 13 2012-02-12 05:14:14 PST
(In reply to comment #12) > (From update of attachment 126629 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=126629&action=review > > Some nitpicking: > Thanks for all the help. All of your notes make sense. Since some of them are project-specific, without knowing what WebKit specifically prefers, I err on the safer side of being as conforming as much as possible (hence not fixing the case of the first letter and creating a superfluous path variable.)
Ashod Nakashian
Comment 14 2012-02-12 05:43:10 PST
Benjamin Poulain
Comment 15 2012-02-12 18:47:43 PST
Comment on attachment 126675 [details] Patch Looks good. Thanks for the patch.
WebKit Review Bot
Comment 16 2012-02-12 21:00:40 PST
Comment on attachment 126675 [details] Patch Clearing flags on attachment: 126675 Committed r107527: <http://trac.webkit.org/changeset/107527>
WebKit Review Bot
Comment 17 2012-02-12 21:00:46 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.