Bug 91212

Summary: ruby1.9 fails in PrettyPatch.rb with invalid byte sequence in UTF-8
Product: WebKit Reporter: Simon Pena <spenap>
Component: Tools / TestsAssignee: Simon Pena <spenap>
Status: RESOLVED FIXED    
Severity: Normal CC: cmarcelo, morrita, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch none

Description Simon Pena 2012-07-13 03:58:17 PDT
After upgrading the setup in one of the GTK 64bits bots, we found that if ruby1.9 is used instead of ruby1.8, the following error happens when running test-webkitpy:

/home/spenap/Projects/WebKit/Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:108:in `gsub': invalid byte sequence in UTF-8 (ArgumentError)
	from /home/spenap/Projects/WebKit/Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:108:in `normalize_line_ending'
	from /home/spenap/Projects/WebKit/Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:19:in `prettify'
	from /home/spenap/Projects/WebKit/Websites/bugs.webkit.org/PrettyPatch/prettify.rb:26:in `<main>'

This is caused by the new encoding engine that Ruby 1.9 uses (see http://blog.grayproductions.net/articles/ruby_19s_string)

I took a look at http://smyck.net/2011/05/13/files-with-mixed-and-invalid-encodings-in-ruby/ to work around it, but iconv will be deprecated in the future, and String#encode is recommended instead. However, by looking at http://blog.segment7.net/2010/12/17/from-iconv-iconv-to-string-encode, the new alternative is not as straight-forward as using iconv.

We're currently keeping ruby in version 1.8, which works as expected.
Comment 1 Simon Pena 2012-07-13 04:03:55 PDT
CCing Caio, who added the normalize_line_ending method originally.
Comment 2 Simon Pena 2012-07-13 04:16:13 PDT
Created attachment 152212 [details]
Patch
Comment 3 Simon Pena 2012-07-13 04:17:48 PDT
(In reply to comment #2)
> Created an attachment (id=152212) [details]
> Patch

I've attached a patch using Iconv, although, as mentioned in comment #0, it should be better to use String#encode.
Comment 4 Hajime Morrita 2012-07-18 19:25:09 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > Created an attachment (id=152212) [details] [details]
> > Patch
> 
> I've attached a patch using Iconv, although, as mentioned in comment #0, it should be better to use String#encode.
Let's do what you think the best :-)
Comment 5 Simon Pena 2012-07-19 01:38:04 PDT
Comment on attachment 152212 [details]
Patch

OK, I'll try to come up with a proper fix later.
Comment 6 Simon Pena 2012-10-02 00:20:17 PDT
Created attachment 166628 [details]
Patch
Comment 7 Simon Pena 2012-10-02 00:23:30 PDT
This new version of the patch uses the "proper" implementation when Ruby's version is >= 1.9 (using the string encode method) and keeps the current one for 1.8.
Comment 8 Hajime Morrita 2012-10-02 22:04:04 PDT
Comment on attachment 166628 [details]
Patch

Wow, I didn't know that encode() support :fallback option.
Comment 9 WebKit Review Bot 2012-10-03 04:53:45 PDT
Comment on attachment 166628 [details]
Patch

Clearing flags on attachment: 166628

Committed r130276: <http://trac.webkit.org/changeset/130276>
Comment 10 WebKit Review Bot 2012-10-03 04:53:48 PDT
All reviewed patches have been landed.  Closing bug.