RESOLVED INVALID 100384
Webkit 1.10.1 shows "question mark" (?) instead utf8 char
https://bugs.webkit.org/show_bug.cgi?id=100384
Dâniel Fraga
Reported 2012-10-25 08:45:29 PDT
Created attachment 170661 [details] Wrong chars (utf8 problem?)
After YouTube changed the format of its comment-notification e-mails, I can't see any characters with accents (for example: é, á, í). I use the Fancy plugin for Claws Mail, which uses WebKit to render HTML e-mail. The Fancy plugin author said it was a WebKit bug. This only happens with e-mails from YouTube comments. Here is the e-mail content:
<html lang="pt"> <head> <title> Comentário postado sobre "Ração MAIS CARA para cães e gatos" </title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> </head> <body> <table width="620" cellspacing="0" cellpadding="0" border="0" align="center"><tr><td bgcolor="#F0F0F0"> <table width="578" cellspacing="0" cellpadding="0" border="0" align="center"> <tr> <td height="16"></td> </tr> <tr> <td> <img src="http://s.ytimg.com/yt/img/email/digest/email_header.png"> </td> </tr> <tr> <td height="16"></td> </tr> <tr> <td align="left" bgcolor="#FFFFFF"> <div style="border-style:solid; border-width:1px; border-color:#CCCCCC;"> <table width="578" cellspacing="0" cellpadding="0" border="0" align="center"> <tr> <td height="22" colspan="3"></td> </tr> <tr> <td width="40"></td> <td width="498"> <div style=" font-family:arial,Arial,sans-serif; "> <table cellspacing="0" cellpadding="0" border="0"> <tr> <td bgcolor="#FFFFFF" align="left" width="50"> <img src="https://lh4.googleusercontent.com/-0raL6fsqLd8/AAAAAAAAAAI/AAAAAAAAAAA/-kee1yVLSUM/s28-c-k/photo.jpg" height="50" width="50"> </td> <td width="16"></td> <td> <div style=" font-family:arial,Arial,sans-serif; font-size:18px; color:#333333; line-height:24px; " height:"59" dir="ltr"> <a href="http://www.youtube.com/user/iaralice?feature=em-comment_received" style="text-decoration:none; color:#1C62B9;">Iara Alice Raymundo</a> fez um comentário sobre <a href="http://www.youtube.com/watch?v=nSikPg7XXgk&lc=d7sgBrAnVdmfpXDPp640qKEngjBPWwpuXfEPY5omtUs&lch=email&feature=em-comment_received" style="text-decoration:none; 
color:#1C62B9;" dir="ltr">Ração MAIS CARA para cães e gatos</a> </div> </td> </tr> </table> <table cellspacing="0" cellpadding="0" border="0"> <tr> <td width="498"> <div style="font-family:arial,Arial,sans-serif; font-size:13px; color:#333333; line-height:16px;" dir="ltr"> <div style=" font-family:arial,Arial,sans-serif; font-size:11px; color:#999999; line-height:14px; "> Para responder a este comentário, <a href="http://www.youtube.com/watch?v=nSikPg7XXgk&lcor=1&lc=d7sgBrAnVdmfpXDPp640qKEngjBPWwpuXfEPY5omtUs&lch=email&feature=em
Attachments
Wrong chars (utf8 problem?) (65.56 KB, image/png)
2012-10-25 08:45 PDT, Dâniel Fraga
no flags
Sample youtube e-mail (10.58 KB, text/plain)
2012-10-30 17:45 PDT, Dâniel Fraga
no flags
Midori output (69.22 KB, image/png)
2012-10-30 18:57 PDT, Dâniel Fraga
no flags
Sample youtube notification (8.28 KB, text/plain)
2012-11-03 15:55 PDT, Dâniel Fraga
no flags
Dâniel Fraga
Comment 1 2012-10-25 08:45:59 PDT
This also happens with 1.8.3 and 1.8.1 versions of webkit gtk.
Martin Robinson
Comment 2 2012-10-30 11:37:27 PDT
I saved the email you included in your comment and cannot reproduce this issue with GtkLauncher or Epiphany. What version of WebKitGTK+ are you using?
Martin Robinson
Comment 3 2012-10-30 11:37:45 PDT
(In reply to comment #2)
> I saved the email you included in your comment and cannot reproduce this issue with GtkLauncher or Epiphany. What version of WebKitGTK+ are you using?

Ah, sorry. I see that you already included that information.
Martin Robinson
Comment 4 2012-10-30 11:44:08 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > I saved the email you included in your comment and cannot reproduce this issue with GtkLauncher or Epiphany. What version of WebKitGTK+ are you using?
>
> Ah, sorry. I see that you already included that information.

Fancy seems to be trying to override the default encoding of the email:

g_object_set(viewer->settings, "default-encoding", charset, NULL);

from fancy_viewer.c line 141. It doesn't seem like that should cause an issue, but I wonder what encoding is being used here in your case. If I knew, I could test locally.

Right before that line is a line like this:

debug_print("using %s charset\n", charset);

Maybe you can try to get that output.
Dâniel Fraga
Comment 5 2012-10-30 12:50:49 PDT
> Fancy seems to be trying to override the default encoding of the email:
>
> g_object_set(viewer->settings, "default-encoding", charset, NULL);
>
> from fancy_viewer.c line 141. It doesn't seem like that should cause an issue, but I wonder in your case what encoding is being used here. If I knew, I could test locally.
>
> Right before that line is a line like this:
>
> debug_print("using %s charset\n", charset);
>
> Maybe you can try to get that output.

Hi Martin! Thanks for the reply. Fancy returns the following:

fancy_viewer.c:141:using windows-1252 charset

So maybe this is the problem, right? So it's Fancy's fault?
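One way to reason about the windows-1252 default is to look at what UTF-8 text looks like when it is decoded with that charset. The sketch below is only an illustration in Python, not code from Fancy or WebKit; the helper name `decode_as` and the sample word are assumptions chosen for the example. It suggests that a wrong 8-bit default would usually produce two-character mojibake rather than question marks.

```python
# Illustration only: decode UTF-8 bytes with the wrong 8-bit charset.
# `decode_as` is a hypothetical helper, not part of Fancy or WebKit.
text = "Comentário"
utf8_bytes = text.encode("utf-8")  # the accented letter becomes two bytes: C3 A1


def decode_as(data: bytes, charset: str) -> str:
    """Decode bytes with the given charset; undecodable bytes become U+FFFD."""
    return data.decode(charset, errors="replace")


if __name__ == "__main__":
    print(decode_as(utf8_bytes, "utf-8"))         # Comentário
    print(decode_as(utf8_bytes, "windows-1252"))  # ComentÃ¡rio
```

If the screenshot shows a single "?" per accented letter rather than pairs like "Ã¡", that hints at a different mismatch than UTF-8 content read as windows-1252.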
Martin Robinson
Comment 6 2012-10-30 16:59:33 PDT
I can't seem to reproduce the problem just by setting the default-encoding alone. Are you sure that the data that Fancy is interpreting has the content-type meta tag? Is Claws or Fancy stripping it or not inserting it in some cases? Can you try opening this file in Epiphany or Midori on your computer?
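The question about the meta tag matters because a default encoding is normally only a fallback: a charset declared in the markup takes precedence. The following Python sketch models that precedence in a deliberately simplified way. It is an assumption-laden illustration, not WebKit's actual encoding-detection algorithm, and `pick_encoding` and `META_CHARSET` are hypothetical names.

```python
import re

# Simplified model, NOT WebKit's real algorithm: a charset declared in a
# <meta> tag wins; the embedder's default encoding is only a fallback.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def pick_encoding(html: bytes, default_encoding: str) -> str:
    """Return the declared charset if one is found near the start, else the default."""
    match = META_CHARSET.search(html[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    return default_encoding


if __name__ == "__main__":
    with_meta = b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    without_meta = b"<html><body>hi</body></html>"
    print(pick_encoding(with_meta, "windows-1252"))
    print(pick_encoding(without_meta, "windows-1252"))
```

Under this model, the windows-1252 default would only take effect if the meta tag were missing or unreadable by the time the markup reaches the engine.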
Dâniel Fraga
Comment 7 2012-10-30 17:45:12 PDT
(In reply to comment #6)
> I can't seem to reproduce the problem just by setting the default-encoding alone. Are you sure that the data that Fancy is interpreting has the content-type meta tag? Is Claws or Fancy stripping it or not inserting it in some cases? Can you try opening this file in Epiphany or Midori on your computer?

Hi Martin, I'm waiting for the Fancy developer's answer. I don't use Epiphany, Midori, or GNOME. Do I need GNOME to install Epiphany? That would be overkill. But I'm attaching the complete e-mail from YouTube. If you can't reproduce it there, then it's surely a Fancy bug. The attachment is youtube-message-utf8.txt
Dâniel Fraga
Comment 8 2012-10-30 17:45:43 PDT
Created attachment 171559 [details] Sample youtube e-mail
Martin Robinson
Comment 9 2012-10-30 18:12:27 PDT
(In reply to comment #8)
> Created an attachment (id=171559) [details]
> Sample youtube e-mail

Epiphany has some GNOME dependencies, but Midori has very few.
Dâniel Fraga
Comment 10 2012-10-30 18:57:54 PDT
Created attachment 171565 [details] Midori output
OK, I installed Midori and here's the result: same problem. So it's a problem with WebKit, right?
Martin Robinson
Comment 11 2012-10-31 15:09:38 PDT
(In reply to comment #10)
> Created an attachment (id=171565) [details]
> Midori output
>
> Ok, I installed Midori and here's the result: same problem.
>
> So it's a problem with webkit right?

Perhaps it's something to do with your environment settings?
Dâniel Fraga
Comment 12 2012-10-31 15:19:03 PDT
(In reply to comment #11)
> Perhaps it's something to do with your environment settings?

Hmm, maybe, but what settings? I have, for example:

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

***

Is there any other environment variable I should check?
Dâniel Fraga
Comment 13 2012-11-03 00:55:50 PDT
(In reply to comment #11)
> Perhaps it's something to do with your environment settings?

Martin, are you the maintainer of WebKitGTK+? Is there any other place I can ask about this? This seems very difficult to solve :(
Dâniel Fraga
Comment 14 2012-11-03 15:55:14 PDT
Created attachment 172232 [details] Sample youtube notification
Hi, I think I found the problem (please see the attachment):

<meta http-equiv=3D"Content-Type" content=3D"text/html; charset=3Dutf-8=
">

The YouTube notification sends the e-mail with a broken line, so WebKitGTK+ gets confused by this tag. If I change it to:

<meta http-equiv=3D"Content-Type" content=3D"text/html; charset=3Dutf-8=">

it works. So is it a bug?
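The `=3D` sequences in the attachment suggest the message body uses the quoted-printable transfer encoding, where `=3D` stands for a literal `=` and an `=` at the very end of a line is a soft line break that a decoder removes. Assuming that is the case here, the Python sketch below uses the standard-library `quopri` module to show what the tag looks like after decoding; the variable names are illustrative only.

```python
import quopri

# Raw line pair as it appears in the attachment, assuming quoted-printable
# transfer encoding: "=3D" encodes "=", and "=" before a newline is a soft break.
raw = b'<meta http-equiv=3D"Content-Type" content=3D"text/html; charset=3Dutf-8=\n">'

decoded = quopri.decodestring(raw)

if __name__ == "__main__":
    print(decoded.decode("ascii"))
    # <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
```

Once decoded, the soft line break disappears and the tag is well formed, so under this assumption the line break in the raw source would not by itself corrupt the HTML.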
Martin Robinson
Comment 15 2012-11-05 08:51:17 PST
(In reply to comment #14)
> Created an attachment (id=172232) [details]
> Sample youtube notification
>
> Hi, I think I found the problem (please see the attachment):
>
> <meta http-equiv=3D"Content-Type" content=3D"text/html; charset=3Dutf-8=
> ">
>
> The youtube notification sends the e-mail with a broken line, so webkit gtk
> gets confused about this tag. If I change it to:
>
> <meta http-equiv=3D"Content-Type" content=3D"text/html; charset=3Dutf-8=">
>
> it works. So is it a bug?

Great find. I'm not sure whether this should be interpreted correctly with the newline. You'd probably need to check the HTML5 spec to see what should happen in this case. Most likely, though, Claws is adding the newline and corrupting the HTML.
Martin Robinson
Comment 16 2012-12-11 02:26:08 PST
I find it very suspicious that the line length is around 80 characters. I'm going to close this bug since it looks like a bug in Claws.
Dâniel Fraga
Comment 17 2012-12-11 08:08:48 PST
(In reply to comment #16)
> I find it very suspicious that the line length is around 80 characters. I'm going to close this bug since it looks like a but in claws.

But it happens in Midori too :( The Claws developers wrote that it is a bug in YouTube... I'll try again to contact YouTube (so difficult). Thank you!
Martin Robinson
Comment 18 2012-12-11 08:37:07 PST
Did you save the output of the email and then load it in Midori? Are you sure that claws isn't line-wrapping the content before you load it in Midori?
Dâniel Fraga
Comment 19 2012-12-11 08:48:31 PST
(In reply to comment #18)
> Did you save the output of the email and then load it in Midori? Are you sure that claws isn't line-wrapping the content before you load it in Midori?

Hi Martin. Yes, I'm sure. You can confirm it here:

http://www.thewildbeast.co.uk/claws-mail/bugzilla/show_bug.cgi?id=2768

The developer wrote that it is a YouTube bug... maybe? Here's what he wrote:

**********************************
If I change the head line to:

<meta http-equiv=3D"Content-Type" content=3D"text/html; charset=3Diso-8859-1=
">

It renders perfectly, as the HTML is in fact ISO, not UTF-8. Sorry, but I'm afraid this is a Youtube generator bug.
*********************************

So I'm afraid it is a YouTube bug. Anyway, thanks, let's hope YouTube can fix it.
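The diagnosis quoted above (content that is really ISO-8859-1 but declared as UTF-8) fits the reported symptom. As a rough illustration in Python, and assuming the body bytes are ISO-8859-1 as the Claws developer states, decoding them as UTF-8 turns each accented letter into the replacement character U+FFFD, which renderers commonly draw as a question mark. The helper `render` is a hypothetical stand-in for what a renderer does with a declared charset.

```python
# Assumption from the thread: the body bytes are ISO-8859-1, but the
# meta tag declares UTF-8. `render` is a hypothetical stand-in for a renderer.
latin1_bytes = "Comentário".encode("iso-8859-1")  # the accented letter is the single byte E1


def render(data: bytes, declared_charset: str) -> str:
    """Decode with the declared charset, substituting U+FFFD for invalid bytes."""
    return data.decode(declared_charset, errors="replace")


if __name__ == "__main__":
    print(render(latin1_bytes, "utf-8"))       # accented letter replaced by U+FFFD
    print(render(latin1_bytes, "iso-8859-1"))  # Comentário
```

A lone 0xE1 byte is not valid UTF-8 because it announces a multi-byte sequence that never follows, which is why the declared charset and the actual bytes have to agree.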