WebKit Bugzilla
RESOLVED FIXED
10094
REGRESSION: Japanese characters improperly rendering in TOT
https://bugs.webkit.org/show_bug.cgi?id=10094
Dan Wood
Reported
2006-07-24 17:16:13 PDT
To reproduce, open the attached reduced file, which contains some Japanese characters. In Safari 419.3, you see different characters, which correspond to the HTML source. In TOT (nightly build), you see the first character repeated several times.
Notes: This seems to be a problem whether the file is UTF-8 or UTF-16 encoded. These Japanese words came from file names, so, if I understand the issue, the characters may be "decomposed" rather than "precomposed" Unicode. <http://developer.apple.com/qa/qa2001/qa1235.html>
Attachments
HTML file showing some Japanese characters (642 bytes, text/html), 2006-07-24 17:17 PDT, Dan Wood, no flags
decomposed vs. precomposed characters (701 bytes, text/html), 2006-07-24 17:34 PDT, Dan Wood, no flags
patch (42.94 KB, patch), 2006-07-24 23:50 PDT, Graham Dennis, no flags
patch (fixed) (42.58 KB, patch), 2006-07-25 00:01 PDT, Graham Dennis, darin: review+
Dan Wood
Comment 1
2006-07-24 17:17:44 PDT
Created attachment 9664: HTML file showing some Japanese characters
Dan Wood
Comment 2
2006-07-24 17:34:41 PDT
Created attachment 9665: decomposed vs. precomposed characters
The rendering problem seems to be in decomposed Japanese characters, not precomposed ones. The attachment shows the same string in both precomposed and decomposed forms. In the released Safari, these render IDENTICALLY. In TOT, they definitely do not.
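For readers unfamiliar with the two forms, here is a minimal sketch using Python's standard unicodedata module. The sample string and the helper name code_points are illustrative assumptions, not taken from the attachment.

```python
import unicodedata


def code_points(text):
    """List the code points of text as U+XXXX strings (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Sample text chosen for illustration: KATAKANA LETTER PA, KATAKANA LETTER N.
precomposed = "\u30D1\u30F3"

# NFD splits PA into the base letter HA plus a combining semi-voiced mark.
decomposed = unicodedata.normalize("NFD", precomposed)

precomposed_points = code_points(precomposed)  # ['U+30D1', 'U+30F3']
decomposed_points = code_points(decomposed)    # ['U+30CF', 'U+309A', 'U+30F3']

# Composing the decomposed form again yields the original string.
round_trip_ok = unicodedata.normalize("NFC", decomposed) == precomposed
```

Both strings should look the same when rendered correctly; only the underlying code point sequences differ.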
Graham Dennis
Comment 3
2006-07-24 23:50:46 PDT
Created attachment 9667: patch
This bug only occurs when the first character of a text run is a decomposed Japanese hiragana or katakana character with voicing marks. The bug is caused because WidthIterator::advance does not update the m_currentCharacter variable while iterating, and then the call to WidthIterator::normalizeVoicingMarks normalises the character starting at m_currentCharacter instead of starting at currentCharacter. As a result, if the first character of a text run requires normalising, all subsequent characters in the run will be displayed as being identical to the first character (as m_currentCharacter will be 0).
This patch fixes the bug by turning the currentCharacter variable into an argument to normalizeVoicingMarks. A testcase is included in the patch; however, it must be run as a pixel test to actually check that the bug has been fixed.
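The stale-index behaviour described in this comment can be modelled with a short Python sketch. This is not the WebKit code: the function names, the buggy flag, and the simplified loop are assumptions made for illustration. Only the idea comes from the comment, namely that normalisation reads from an index that stays at 0 instead of the loop's current index.

```python
import unicodedata

# Combining voiced and semi-voiced sound marks.
VOICING_MARKS = {"\u3099", "\u309A"}


def normalize_voicing_marks(text, index):
    """Return the precomposed character if text[index] is followed by a
    combining voicing mark, otherwise None."""
    if index + 1 < len(text) and text[index + 1] in VOICING_MARKS:
        composed = unicodedata.normalize("NFC", text[index:index + 2])
        if len(composed) == 1:
            return composed
    return None


def render(text, buggy):
    """Simulate which characters end up displayed for a text run."""
    out = []
    stale_index = 0  # stands in for m_currentCharacter, never updated
    i = 0            # stands in for the local currentCharacter
    while i < len(text):
        composed = None
        if 0x3041 <= ord(text[i]) <= 0x30FE:  # hiragana/katakana range
            # Buggy variant looks at the start of the run every time;
            # fixed variant receives the current index as an argument.
            composed = normalize_voicing_marks(text, stale_index if buggy else i)
        if composed is not None:
            out.append(composed)
            i += 2  # base character plus combining mark
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


# Decomposed form of HIRAGANA GA, GI, GU.
run = unicodedata.normalize("NFD", "\u304C\u304E\u3050")
fixed_output = render(run, buggy=False)
buggy_output = render(run, buggy=True)
```

In this model the fixed variant returns the three distinct characters, while the buggy variant returns the first character three times, which matches the symptom in the report.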
Graham Dennis
Comment 4
2006-07-25 00:01:18 PDT
Created attachment 9668: patch (fixed)
I accidentally left some remnants of an earlier patch in the previous patch file. I've fixed that in this version.
Alexey Proskuryakov
Comment 5
2006-07-25 01:39:44 PDT
Regression->P1. However, please note that any process generating HTML/XML/other Web content SHOULD normalize the text to NFC <http://www.w3.org/TR/charmod-norm/#C300>.
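On the content-producer side, that recommendation could look roughly like the sketch below. The function name to_nfc is an assumption for illustration, and unicodedata.is_normalized requires Python 3.8 or later.

```python
import unicodedata


def to_nfc(text):
    """Return text in NFC, skipping the work if it is already normalized."""
    if unicodedata.is_normalized("NFC", text):
        return text
    return unicodedata.normalize("NFC", text)


# HIRAGANA KA followed by the combining voiced sound mark, as might come
# from a decomposed source such as a file name.
from_file_name = "\u304B\u3099"
for_web_page = to_nfc(from_file_name)
```

Normalizing at generation time would sidestep the rendering difference, though the browser regression still needs fixing for content that arrives decomposed.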
Alexey Proskuryakov
Comment 6
2006-07-25 02:38:38 PDT
Why is this special case for Hiragana&Katakana needed at all? The code appeared in r8701 without a test case:
------------------------------------------------------------------------
r8701 | rjw | 2005-02-25 23:54:19 +0300 (Fri, 25 Feb 2005) | 9 lines

        Fixed <rdar://problem/4000962> 8A375: Help Viewer displays voiced sound and semi-voiced characters strangely (characters don't seem to be composed)

        Added special case for voiced marks.

        Reviewed by John.

        * WebCoreSupport.subproj/WebTextRenderer.m:
        (widthForNextCharacter):
------------------------------------------------------------------------
Darin Adler
Comment 7
2006-07-25 08:30:00 PDT
Comment on attachment 9668: patch (fixed)
Looks good. r=me
Alexey Proskuryakov
Comment 8
2006-07-25 12:24:27 PDT
(In reply to comment #6)
> Why is this special case for Hiragana&Katakana needed at all?
Answering my own question: yes it is, because without it the voicing marks are drawn incorrectly. I still don't understand why voicing marks need to be handled differently from Latin accents or any other combining characters, but that's of course not related to this bug.
Alexey Proskuryakov
Comment 9
2006-07-27 11:52:12 PDT
Committed revision 15651.