Safari and Firefox display this site <http://jn.sapo.pt> correctly, but WebKit does not. It appears to have a problem detecting the text encoding.
Confirmed as a regression with r18673. It appears to be caused by some garbage before the beginning of the HTML document:
----------------------------------------------
<!-- temp --><script language="JavaScript" type="text/JavaScript">
document.write ('<SCR' + 'IPT SRC="http://ads.sapo.pt/js.ng/site=lusomundo&chan=jn&adsize=1x1&type=richmedia&TileID='+TileID+'"></SCR' + 'IPT>');
</script>
<!-- /temp --><!--HEADER-->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
----------------------------------------------
validator.w3.org says this about <http://sapo.pt>: "Sorry! This document can not be checked. Sorry, I am unable to validate this document because on line 636, 660 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication."
Correction: the validator message above is about <http://jn.sapo.pt/>, not <http://sapo.pt>.
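For anyone who wants to check the validator's complaint against a saved copy of the page, here is a minimal Python sketch that lists the lines containing bytes that do not decode as UTF-8. The function name `invalid_utf8_lines` is made up for illustration, and the sample bytes are not taken from the actual page.

```python
def invalid_utf8_lines(data: bytes) -> list[int]:
    """Return the 1-based numbers of lines that are not valid UTF-8."""
    bad = []
    # Splitting on b"\n" is safe: byte 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad


# Illustrative input: line 2 holds ISO-8859-1 bytes for "ç" and "ã".
sample = b"ok\nCora\xe7\xe3o\nfine\n"
print(invalid_utf8_lines(sample))
```

A page that declares `charset=utf-8` but was actually saved in a legacy encoding would typically be flagged on every line that has accented characters.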
Created attachment 12414: proposed fix
Invalid HTML has lots of ways to fool our charset meta detector. I wonder why we aren't getting many more reports like this one, though.
Comment on attachment 12414: proposed fix
I guess this is OK, but I'm worried that it's a little risky to ignore tags in scripts when we don't know enough about script syntax to properly handle comments inside the script and know when the script ends. But ... r=me
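To make the trade-off concrete, here is a rough Python sketch of a meta charset prescan that skips HTML comments and everything between `<script` and the next `</script`. It is not the code in attachment 12414 and not WebKit's decoder; the name `sniff_meta_charset` and the 1024-byte limit are assumptions made for illustration.

```python
import re

# Hypothetical patterns; a real detector handles many more cases.
_SCRIPT_START = re.compile(rb"<script\b", re.IGNORECASE)
_SCRIPT_END = re.compile(rb"</script\b[^>]*>", re.IGNORECASE)
_META_CHARSET = re.compile(
    rb"""<meta\b[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def sniff_meta_charset(data: bytes, limit: int = 1024):
    """Return the charset named by a <meta> tag in the first `limit`
    bytes of `data`, or None if none is found."""
    head = data[:limit]
    pos = 0
    while True:
        lt = head.find(b"<", pos)
        if lt == -1:
            return None
        if head.startswith(b"<!--", lt):
            # Skip the whole HTML comment.
            end = head.find(b"-->", lt + 4)
            if end == -1:
                return None
            pos = end + 3
            continue
        if _SCRIPT_START.match(head, lt):
            # Skip to the first "</script". No script syntax is parsed,
            # so a "</script>" inside a string or a script comment
            # would end the skip too early.
            end_tag = _SCRIPT_END.search(head, lt)
            if end_tag is None:
                return None
            pos = end_tag.end()
            continue
        meta = _META_CHARSET.match(head, lt)
        if meta:
            return meta.group(1).decode("ascii").lower()
        pos = lt + 1


page = (
    b'<!-- temp --><script type="text/JavaScript">\n'
    b"document.write ('<SCR' + 'IPT SRC=\"x.js\"></SCR' + 'IPT>');\n"
    b"</script>\n<!-- /temp --><!--HEADER-->\n"
    b"<html>\n<head>\n"
    b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
)
print(sniff_meta_charset(page))
```

The comment in the script branch marks the risk raised in the review: without parsing script syntax, the scan can only guess where a script ends.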
Committed revision 18833.