<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>73519</bug_id>
          
          <creation_ts>2011-11-30 21:19:18 -0800</creation_ts>
          <short_desc>[Qt] QtWebKit does not apply correct encoding on some pages with CJK characters</short_desc>
          <delta_ts>2014-02-03 03:19:19 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>WebKit Qt</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>INVALID</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>Qt</keywords>
          <priority>P3</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>0</everconfirmed>
          <reporter name="Dawit A.">adawit</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>ap</cc>
    
    <cc>moriramar</cc>
    
    <cc>sfcheng</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>512423</commentid>
    <comment_count>0</comment_count>
    <who name="Dawit A.">adawit</who>
    <bug_when>2011-11-30 21:19:18 -0800</bug_when>
    <thetext>The following bug was reported downstream against the kwebkitpart, but was validated to be an upstream issue using QtTestBrowser:

https://bugs.kde.org/show_bug.cgi?id=287690

          Summary: KWebkitPart does not apply correct locale encoding
                        settings on some pages with CJK characters.
          Product: kwebkitpart
          Version: unspecified
         Platform: Gentoo Packages
       OS/Version: Linux
           Status: UNCONFIRMED
         Severity: normal
         Priority: NOR
        Component: general
       AssignedTo: webkit-devel@kde.org
       ReportedBy: moriramar@gmail.com


Version:           unspecified (using KDE 4.7.2)
OS:                Linux

When I open some pages with both simplified Chinese characters and traditional Chinese characters, some characters are not displayed correctly. Pages
containing both Chinese characters and Japanese characters might cause this problem as well.

Personal guess:
These pages might be encoded in zh_CN.GBK or zh_CN.GB18030 (which cover more characters), while KWebkitPart might apply zh_CN.GB2312 (which is
generally considered a subset of GBK).
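
The guess above can be checked outside the browser. The following is a minimal sketch, assuming Python's built-in gb2312, gbk and gb18030 codecs behave like the decoders involved here; it is not taken from KWebkitPart or QtWebKit, and the names decode_report and TRADITIONAL_CHAR are made up for illustration. It encodes one Traditional Chinese character as GBK and then tries to decode the bytes with each codec:

```python
# Illustrative sketch only; not code from KWebKitPart or QtWebKit.
# Assumption: Python's built-in codecs stand in for the browser's decoders.

# U+9AD4, the traditional form of "body". GB2312 only has the simplified form.
TRADITIONAL_CHAR = "\u9ad4"


def decode_report(raw, codecs_to_try=("gb2312", "gbk", "gb18030")):
    """Map each codec name to the decoded text, or None if strict decoding fails."""
    report = {}
    for name in codecs_to_try:
        try:
            report[name] = raw.decode(name)
        except UnicodeDecodeError:
            report[name] = None
    return report


# Bytes as a GBK-encoded page would carry them.
raw = TRADITIONAL_CHAR.encode("gbk")

# Strict decoding: which codecs accept the bytes at all?
report = decode_report(raw)

# Lossy decoding: what a viewer limited to GB2312 would end up showing.
lossy = raw.decode("gb2312", errors="replace")

print(report)
print(repr(lossy))
```

If the guess is right, the gb2312 entry is None while gbk and gb18030 return the original character, and the lossy result contains U+FFFD, the replacement character that many fonts draw as a black box with a white question mark.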

Reproducible: Always

Steps to Reproduce:
1. Install a font covering CJK characters. Bitstream Cyberbit, WenQuanYi Zen Hei, WenQuanYi Microhei or Droid is OK.
2. Make sure zh_CN.GBK, zh_CN.GB2312, zh_CN.GB18030, zh_CN.UTF-8 locales are available on the system.
3. Open Konqueror 4.7.2 and enable Webkit mode.
4. Go to http://www.acfun.tv/v/ac265957/ , which might be a little slow.

Actual Results:
In the top bold title line of the page content, a black box with a white question mark appears. In the next line, there are two black boxes separated by a &quot;W&quot; character, followed by an &quot;o&quot; character. Selecting any of the GB* encodings under &quot;View &gt;&gt; Encoding &gt;&gt; Simplified Chinese&quot; does not solve the problem. Opening this kind of page can also crash Konqueror.

Expected Results:
The black boxes and the stray &quot;W&quot; and &quot;o&quot; characters should not appear in these two lines. For reference, KHTML displays this page correctly when the encoding is set to &quot;Simplified Chinese &gt;&gt; GBK&quot; or &quot;Simplified Chinese &gt;&gt; GB18030&quot;.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>512489</commentid>
    <comment_count>1</comment_count>
    <who name="">moriramar</who>
    <bug_when>2011-11-30 22:28:38 -0800</bug_when>
    <thetext>*** Bug 73447 has been marked as a duplicate of this bug. ***</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>531915</commentid>
    <comment_count>2</comment_count>
      <attachid>121609</attachid>
    <who name="Stephen">sfcheng</who>
    <bug_when>2012-01-08 18:52:49 -0800</bug_when>
    <thetext>Created attachment 121609
test page</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>531916</commentid>
    <comment_count>3</comment_count>
    <who name="Stephen">sfcheng</who>
    <bug_when>2012-01-08 18:56:29 -0800</bug_when>
    <thetext>I can reproduce this bug as well. The attachment above is a test page that contains both Simplified Chinese and Traditional Chinese characters. The Traditional Chinese characters show up as junk inside QtWebkit. The same page is displayed correctly inside IE and Webkit.

(In reply to comment #2)
&gt; Created an attachment (id=121609) [details]
&gt; test page</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>975351</commentid>
    <comment_count>4</comment_count>
    <who name="Jocelyn Turcotte">jturcotte</who>
    <bug_when>2014-02-03 03:19:19 -0800</bug_when>
    <thetext>=== Bulk closing of Qt bugs ===

If you believe that this bug report is still relevant for a non-Qt port of webkit.org, please re-open it and remove [Qt] from the summary.

If you believe that this is still an important QtWebKit bug, please file a new report at https://bugreports.qt-project.org and add a link to this issue. See http://qt-project.org/wiki/ReportingBugsInQt for additional guidelines.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0"
          >
            <attachid>121609</attachid>
            <date>2012-01-08 18:52:49 -0800</date>
            <delta_ts>2012-01-08 18:52:49 -0800</delta_ts>
            <desc>test page</desc>
            <filename>gb2312_testpage.htm</filename>
            <type>text/html</type>
            <size>576</size>
            <attacher name="Stephen">sfcheng</attacher>
            
              <data encoding="base64">PCFET0NUWVBFIEhUTUwgUFVCTElDICItLy9XM0MvL0RURCBIVE1MIDQuMCBUcmFuc2l0aW9uYWwv
L0VOIj4NCjxIVE1MPjxIRUFEPjxUSVRMRT7DwNP9s/bB+dbWu/nS8rvsus+67yAtIM60w/u/1bzk
KG1pdGJicy5jb20pIC0guqPN4ruqyMu12tK7w8W7pzwvVElUTEU+DQo8TUVUQSBjb250ZW50PSJ0
ZXh0L2h0bWw7IGNoYXJzZXQ9Z2IyMzEyIiBodHRwLWVxdWl2PUNvbnRlbnQtVHlwZT4NCjwvSEVB
RD4NCjxCT0RZPg0KPHA+DQrDwLn6tu3A1bjUv8bRp7zS0PuyvKOszbi5/bvsus82uPa647rTuu+1
xLyr1OfG2sXfzKWjrMrXtM6zybmmxeDT/bP2M+tiobi7+dLyu+y6z6G5uu/X06Os09DN+86q0r3R
p9HQvr+0+MC01ti08827xsaho9HQvr+xqLjmvavT2tfu0MLSu8bayKjN/sn6w/y/xtGnxtq/r6G2
z7iw+6G3v6+1x6Gjsru5/dPQ16i80rWj0MTR0L6/u/LOqtF90XXIy7+qwrejrNLg09C2r87vyKjS
5tfp1q/F+sbA08zI58Xa0XWhuL/G0ae51s7vobmjrNbK0sm/xtGnvefErsrTtq/O78io0uajrLWj
0MS7+dLyuMTU7LrzufvE0cHPo6y/ycTcwe7KtdHptq/O78rcv+ChozwvcD4NCjwvQk9EWT48L0hU
TUw+IA0K
</data>

          </attachment>
      

    </bug>

</bugzilla>