Bug 5563 - Combining diacritical marks on non-joining characters
Summary: Combining diacritical marks on non-joining characters
Status: RESOLVED WORKSFORME
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering
Version: 523.x (Safari 3)
Hardware: Mac OS X 10.5
Importance: P2 Normal
Assignee: Nobody
URL: http://david.latapie.name/blog/1512-18-6
Keywords:
Depends on:
Blocks:
 
Reported: 2005-10-30 09:58 PST by David Latapie
Modified: 2012-08-17 00:15 PDT

See Also:


Attachments
Comparison of expected and actual results (33.80 KB, image/png)
2005-10-31 06:09 PST, David Latapie
Reduced test case (487 bytes, text/html)
2005-10-31 12:54 PST, Alexey Proskuryakov

Description David Latapie 2005-10-30 09:58:49 PST
Copied from bugreport.apple.com to make it publicly available

<http://blog.empyree.org/?2005/09/26/1878--bug-report-standalone-tildes-support>

The example below won't work correctly

-----------------
[??pw?] and not [??puw?g]

(you may have to read the source code with Camino to get the real text, as there are a lot of rendering 
problems)

If I use Camino, things occur as expected (the same goes if I enter the text in BBEdit, by the way), with 
the notable exception of ?, but that may come from my approximate knowledge of the IPA.

On the other hand, if I use Safari or Opera, tildes are not offset but are instead put on top of the 
characters (normal behaviour for languages such as Spanish, but not for IPA). Even weirder, if I copy 
this “corrupted” text into BBEdit, normal behaviour occurs (with the exception, once again, of the i; see 
above). As a consequence, I think this is a display problem.

Escaping the characters did not help ([??pw?] and not [??puw?g]), and I don't want to hack it either.
Comment 1 Alexey Proskuryakov 2005-10-30 23:44:50 PST
Please attach screenshots of expected and actual renderings - from the description, I am not sure that I'm 
getting the same results as you.
Comment 2 David Latapie 2005-10-31 06:09:14 PST
Created attachment 4540
Comparison of expected and actual results

This is a composite of two screenshots. The expected behaviour was obtained
with BBEdit, by manually entering the characters (even copy-and-paste
concatenates the second tilde).

The actual result is a screenshot from Opera (the only browser that lets me zoom
in enough). Notice that the result is the same in Camino (the Wikipedia page has
been changed since last time, which may be the reason) or Safari.
Comment 3 Alexey Proskuryakov 2005-10-31 12:54:09 PST
Created attachment 4546
Reduced test case
Comment 4 Alexey Proskuryakov 2005-10-31 12:57:29 PST
First, let me say that you really want &#732; for the tilde, and not &#771;. The latter is a combining tilde - 
it is supposed to be drawn on top of the preceding character. To examine properties of Unicode 
characters, I highly recommend UnicodeChecker <http://earthlingsoft.net>.
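
As an alternative way to inspect the two characters mentioned above, here is a minimal sketch using Python's standard `unicodedata` module. It is not part of the original report, and the helper name `describe` is made up for illustration. A non-zero combining class marks a character that attaches to the preceding base character.

```python
import unicodedata

def describe(code_point):
    """Return (name, canonical combining class, general category) for a code point."""
    ch = chr(code_point)
    return (unicodedata.name(ch), unicodedata.combining(ch), unicodedata.category(ch))

# 732 is the decimal value in &#732; and 771 the one in &#771;
for cp in (732, 771):
    name, combining_class, category = describe(cp)
    print(f"&#{cp}; = U+{cp:04X} {name}: "
          f"combining class {combining_class}, category {category}")
```

&#732; reports combining class 0 (a spacing character), while &#771; reports 230 (a mark placed above the preceding character).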

I have tried to come up with an explanation of Safari's behavior (where the tilde gets combined with the 
next character), but cannot explain it completely so far. Apparently, it has something to do with several 
text runs being created (using Verdana and Lucida Grande), and with the fact that some of the characters 
are themselves non-joining.

I'm confirming this as a bug, although it may end up as INVALID, being a bug in system software. 
However, TextEdit doesn't behave exactly like Safari here...
Comment 5 Alexey Proskuryakov 2005-10-31 13:22:21 PST
(In reply to comment #3)
> Reduced test case

I am not really sure what the expected results are here. All the lines should probably render the same 
characters, and I _think_ that Geneva and Times lines are currently correct (tilde is combined with 'i', but 
not with other characters).
Comment 6 David Latapie 2005-10-31 14:12:39 PST
(In reply to comment #4)
> First, let me say that you really want &#732; for the tilde, and not &#771;. The latter is a combining tilde

Thank you for the information. Following your reply, I decided to look in Apple's Character Palette. It 
turns out there are six different tildes. The one I was looking for seems to be neither the “normal” nor the 
combining one but a mix of both: placed at the top and standalone. This is the small tilde (CB9C in 
Unicode).

Anyway, it seems I serendipitously found a bug.
Comment 7 Alexey Proskuryakov 2005-10-31 21:16:01 PST
(In reply to comment #6)
> This is the small tilde (CB9C in Unicode).

Yes, CB9C is the UTF-8 representation of &#732; :-)
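
The relationship between the code point and its UTF-8 bytes can be checked with a short Python sketch. It is not part of the original thread, and the helper name `utf8_hex` is hypothetical.

```python
def utf8_hex(code_point):
    """Encode a single code point as UTF-8 and return the bytes as uppercase hex."""
    return chr(code_point).encode("utf-8").hex().upper()

# &#732; is decimal 732, i.e. code point U+02DC (SMALL TILDE)
print(f"U+{732:04X} -> UTF-8 bytes {utf8_hex(732)}")
```

So the code point is U+02DC, and CB 9C is the two-byte sequence that represents it in UTF-8.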
Comment 8 Nicholas Shanks 2007-06-22 12:01:12 PDT
I still see this. For example, on a Wikipedia page I was reading a few moments ago [1], I saw this in Georgia Bold:

Ha’i’ą́há   (set page to UTF-8 if you can't read it)

The ą and ´ appear separated (i.e. do not combine), but it works fine if pasted into TextEdit.
Bumping version to 522+ and 9A466 accordingly.

I tried to fix this two years ago, but couldn't figure out exactly what was causing the bug, as the code looked to be correct.

[1] http://en.wikipedia.org/wiki/Chiricahua
Comment 9 Alexey Proskuryakov 2009-12-14 16:37:42 PST
<rdar://problem/4273656>
Comment 10 Alexey Proskuryakov 2012-08-17 00:15:26 PDT
This all looks correct to me in Safari 6 on Lion now.