Bug 34063 - fail to parse application/xhtml+xml files with encoding="iso-8859-1" and libxml2 >= 2.7.4
Status: RESOLVED DUPLICATE of bug 30508
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML
Version: 528+ (Nightly build)
Hardware: PC All
Importance: P2 Major
Assignee: Nobody
URL: http://www.vinc17.net/test/webkit-lat...
Keywords:
Depends on:
Blocks:
 
Reported: 2010-01-24 18:57 PST by Vincent Lefevre
Modified: 2010-08-30 08:09 PDT

Attachments
testcase (299 bytes, application/xhtml+xml)
2010-01-24 18:57 PST, Vincent Lefevre
Description Vincent Lefevre 2010-01-24 18:57:53 PST
Created attachment 47303
testcase

WebKit-based applications (midori, liferea, GtkLauncher) fail to parse XHTML files that declare encoding="iso-8859-1".

With the above URL (file also added as an attachment), under Linux (Debian) with libwebkit 1.1.19, I get:
  This page contains the following errors:
  error on line 2 at column 2: StartTag: invalid element name

and with a similar page (which validates with xmllint), under Mac OS X Tiger with Liferea and webkit-gtk 1.1.10, I get:
  This page contains the following errors:
  error on line 2 at column 2: Char 0x0 out of allowed range

(though the page contains no such character).

There's no such problem with encoding="utf-8", e.g.
  http://www.vinc17.net/test/webkit-utf8.html

Note that these simplified examples contain only ASCII characters.
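
To make the shape of the test case concrete, here is a minimal sketch of such a document checked with Python's bundled XML parser (expat, not libxml2). The document below is a hypothetical stand-in, since the exact bytes of the attachment are not reproduced in this report; the names XHTML_LATIN1 and parse_root_tag are made up for illustration. It only shows that a pure-ASCII file declaring encoding="iso-8859-1" is well-formed XML, so a parse error on line 2 points at the consumer rather than the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the attached testcase: pure ASCII content,
# but declared as iso-8859-1 in the XML declaration.
XHTML_LATIN1 = (
    '<?xml version="1.0" encoding="iso-8859-1"?>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">\n'
    '<head><title>Test</title></head>\n'
    '<body><p>Test</p></body>\n'
    '</html>\n'
).encode("iso-8859-1")


def parse_root_tag(data: bytes) -> str:
    """Parse the bytes as XML and return the namespaced root element tag."""
    return ET.fromstring(data).tag


# The document contains only ASCII bytes, as in the report.
print(XHTML_LATIN1.isascii())
# The declared encoding is honoured and the root element is found on line 2.
print(parse_root_tag(XHTML_LATIN1))
```

Swapping the declaration to encoding="utf-8" yields the same bytes for the body, which matches the observation that only the declared encoding differs between the failing and the working page.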

Also, I couldn't try with the latest nightly build (23 Jan) on my Mac OS X machine because it crashes immediately.
Comment 1 Vincent Lefevre 2010-01-24 19:03:05 PST
The bug occurs only when the file is served as application/xhtml+xml, not when it is served as text/html. That's bad because WebKit claims to support application/xhtml+xml.
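
For anyone who wants to compare the two MIME types locally, the following is a rough sketch of a test server that returns the same bytes under either Content-Type. Everything here is an assumption for illustration: the paths /xhtml and /html, the PAGE contents, and the helper names are hypothetical and are not how the original test URL is served.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

# Hypothetical page body: ASCII-only XHTML declared as iso-8859-1.
PAGE = (
    b'<?xml version="1.0" encoding="iso-8859-1"?>\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml">\n'
    b'<head><title>Test</title></head>\n'
    b'<body><p>Test</p></body>\n'
    b'</html>\n'
)

# Hypothetical paths: the same bytes, served under two different MIME types.
MIME_BY_PATH = {
    "/xhtml": "application/xhtml+xml",
    "/html": "text/html",
}


def content_type_for(path: str) -> Optional[str]:
    """Return the Content-Type to send for a request path, or None if unknown."""
    return MIME_BY_PATH.get(path)


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        mime = content_type_for(self.path)
        if mime is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", mime)
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        # Keep the console quiet; remove this override to see request logs.
        pass


def make_server(port: int = 0) -> HTTPServer:
    """Create a server bound to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), Handler)
```

Calling make_server(8000).serve_forever() and loading http://127.0.0.1:8000/xhtml and http://127.0.0.1:8000/html in the affected browser should exercise the XML parser and the HTML parser respectively on identical content.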
Comment 2 Alexey Proskuryakov 2010-01-25 17:08:33 PST
I cannot reproduce with Safari on Mac OS X.
Comment 3 Vincent Lefevre 2010-01-26 07:31:02 PST
I couldn't reproduce it with Safari either, but my machine runs Mac OS X Tiger, so that Safari is quite old. Now I wonder whether this is specific to GTK (though I don't see what GTK has to do with the encoding or MIME type declaration).

Also I think that there were no such problems in the past (several months ago), but the bug still occurs with old Debian packages of midori and libwebkit-1.0-1.
Comment 4 Alexey Proskuryakov 2010-01-26 07:38:36 PST
Could be related to bug 30508.
Comment 5 Vincent Lefevre 2010-01-26 12:19:35 PST
(In reply to comment #4)
> Could be related to bug 30508.

Yes, the bug occurs with the libxml2 2.7.4.dfsg-1 Debian package, but not with 2.7.3.dfsg-2.1.
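
The version boundary can be written down as a small check. This is only a sketch under assumptions: it encodes the two data points above plus the ">= 2.7.4" from the bug title, it ignores Debian epochs, and the function names are made up. Whether a given system actually shows the problem also depends on whether its WebKit includes the fix for bug 30508.

```python
import re

# First libxml2 upstream version with which the failure was observed.
FIRST_AFFECTED = (2, 7, 4)


def upstream_version(debian_version: str) -> tuple:
    """Extract the leading dotted numeric part, e.g. '2.7.4.dfsg-1' -> (2, 7, 4).

    Assumption: no epoch prefix (such as '1:') is present.
    """
    match = re.match(r"\d+(?:\.\d+)*", debian_version)
    if match is None:
        raise ValueError(f"no numeric version in {debian_version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def may_be_affected(debian_version: str) -> bool:
    """True if the libxml2 package version is at or above the first affected one."""
    return upstream_version(debian_version) >= FIRST_AFFECTED


print(may_be_affected("2.7.4.dfsg-1"))
print(may_be_affected("2.7.3.dfsg-2.1"))
```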
Comment 6 Vincent Lefevre 2010-08-30 08:09:19 PDT
The patch fixing bug 30508 also fixes the problem I've reported. So, this is really a duplicate of bug 30508.

*** This bug has been marked as a duplicate of bug 30508 ***