RESOLVED FIXED 12165
REGRESSION: text encoding problem at jn.sapo.pt
https://bugs.webkit.org/show_bug.cgi?id=12165
José Luís Andrade
Reported 2007-01-08 10:39:14 PST
Safari and Firefox display this site <http://jn.sapo.pt> correctly, but WebKit does not. It appears to have a problem reading the text encoding.
Attachments
proposed fix (5.20 KB, patch)
2007-01-13 04:32 PST, Alexey Proskuryakov
darin: review+
Alexey Proskuryakov
Comment 1 2007-01-08 12:19:16 PST
Confirmed as a regression with r18673. That appears to be caused by some garbage before the beginning of the HTML document:
----------------------------------------------
<!-- temp --><script language="JavaScript" type="text/JavaScript">
document.write ('<SCR' + 'IPT SRC="http://ads.sapo.pt/js.ng/site=lusomundo&chan=jn&adsize=1x1&type=richmedia&TileID='+TileID+'"></SCR' + 'IPT>');
</script>
<!-- /temp --><!--HEADER-->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
----------------------------------------------
José Luís Andrade
Comment 2 2007-01-08 13:25:33 PST
The validator.w3.org says about <http://sapo.pt>: Sorry! This document can not be checked. Sorry, I am unable to validate this document because on line 636, 660 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication.
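The validator message above reports bytes that are not valid UTF-8 on specific lines. A minimal sketch of how such lines could be located locally, assuming the page has already been saved as raw bytes (the function name `invalid_utf8_lines` is hypothetical, and this is not how the validator itself is implemented):

```python
def invalid_utf8_lines(data: bytes) -> list:
    """Return the 1-based numbers of lines that do not decode as UTF-8."""
    bad = []
    # Splitting on b"\n" is safe: byte 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad
```

Calling it on the saved page bytes gives a list of line numbers that can be compared against the ones the validator reports.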
José Luís Andrade
Comment 3 2007-01-08 13:32:13 PST
Correction: The validator.w3.org says about <http://jn.sapo.pt/> ... and not "The validator.w3.org says about <http://sapo.pt> ..."
Alexey Proskuryakov
Comment 4 2007-01-13 04:32:26 PST
Created attachment 12414: proposed fix
Invalid HTML has lots of ways to fool our charset meta detector. I'm wondering why we aren't getting a lot more such reports, though.
Darin Adler
Comment 5 2007-01-13 07:11:58 PST
Comment on attachment 12414: proposed fix
I guess this is OK, but I'm worried that it's a little risky to ignore tags in scripts when we don't know enough about script syntax to properly handle comments inside the script and know when the script ends. But ... r=me
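To illustrate the idea discussed in the comments above (ignoring tags that appear inside scripts while looking for the charset meta tag), here is a rough sketch. It is not the committed patch and not WebKit's detector: the function name `sniff_meta_charset`, the 1024-byte `limit`, and the regular expression are all assumptions made for this example.

```python
import re

# Hypothetical pattern for the charset value inside a <meta> tag.
_META_CHARSET = re.compile(
    rb"charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def sniff_meta_charset(data: bytes, limit: int = 1024):
    """Return the lowercased charset named in a <meta> tag, or None.

    Comments and <script> bodies are skipped wholesale, so tag-like text
    built up inside a script (for example via document.write) is ignored.
    """
    head = data[:limit]
    lower = head.lower()
    pos = 0
    while True:
        lt = lower.find(b"<", pos)
        if lt == -1:
            return None
        if lower.startswith(b"<!--", lt):
            # Skip the whole comment.
            end = lower.find(b"-->", lt + 4)
            if end == -1:
                return None
            pos = end + 3
            continue
        if lower.startswith(b"<script", lt):
            # Skip to the literal closing tag without parsing the script.
            end = lower.find(b"</script", lt + 7)
            if end == -1:
                return None
            pos = end + 8
            continue
        gt = lower.find(b">", lt + 1)
        if gt == -1:
            return None
        if lower.startswith(b"<meta", lt):
            match = _META_CHARSET.search(head[lt:gt + 1])
            if match:
                return match.group(1).decode("ascii").lower()
        pos = gt + 1
```

The sketch shares the weakness raised in the review: it does not understand script syntax, so a script that contains the literal text `</script` inside a string or comment would end the skipped region too early.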
Alexey Proskuryakov
Comment 6 2007-01-13 08:52:12 PST
Committed revision 18833.