Bug 3435 - Parentheses are backwards in Hebrew text (no bidi mirroring?)
Summary: Parentheses are backwards in Hebrew text (no bidi mirroring?)
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering (show other bugs)
Version: 412
Hardware: Macintosh OS X 10.4
: P2 Normal
Assignee: Dave Hyatt
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-06-11 06:01 PDT by Ari
Modified: 2005-08-19 14:07 PDT (History)
1 user (show)

See Also:


Attachments
testcase (128 bytes, text/html)
2005-06-11 06:03 PDT, Ari
no flags Details
Proposed patch (4.53 KB, patch)
2005-07-03 12:12 PDT, mitz
mjs: review+
Details | Formatted Diff | Diff
layout test (446 bytes, text/html)
2005-07-12 10:26 PDT, Geoffrey Garen
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ari 2005-06-11 06:01:23 PDT
Parentheses in Hebrew text seem to be displayed in the wrong direction. In the attachment, the 
parentheses should open toward the word (i.e. be around the word), but instead open away from the word.

I think right and left parentheses are among the characters that must be replaced with "mirror" characters 
according to the Unicode spec. From a very brief perusal of the code, it seems that WebKit doesn't 
implement this.

See:
http://www.unicode.org/reports/tr9/
http://www.unicode.org/Public/UNIDATA/BidiMirroring.txt
Comment 1 Ari 2005-06-11 06:03:46 PDT
Created attachment 2247 [details]
testcase
Comment 2 mitz 2005-06-11 07:37:10 PDT
(In reply to comment #1)

I can't seem to reproduce this. (I think mirroring is mostly handled by ATSUI, although you can see that 
WebKit/WebCoreSupport.subproj/WebTextRenderer.m:2041 does take it into account in width 
calculations). There are mirroring-related issues in the rendering of "visually-ordered Hebrew", where 
mirroring happens although it shouldn't.
Comment 3 Ari 2005-06-11 08:34:46 PDT
(In reply to comment #2)
> (In reply to comment #1)
> 
> I can't seem to reproduce this. (I think mirroring is mostly handled by ATSUI, although you can see 
that 
> WebKit/WebCoreSupport.subproj/WebTextRenderer.m:2041 does take it into account in width 
> calculations). There are mirroring-related issues in the rendering of "visually-ordered Hebrew", where 
> mirroring happens although it shouldn't.

I've done a little more investigating, and it seems that WebTextRenderer.m has two code paths for text 
rendering, CG and ATSUI, and it uses ATSUI to render Hebrew. WebTextRenderer.m:2041 is on the CG 
code path, and so it never gets executed for Hebrew.

The ATSUI code path sets the text direction with kATSULineDirectionTag in 
_createATSUTextLayoutForRun, 
but its unclear whether ATSUI is supposed to do mirroring automatically. If it is, maybe it's an ATSUI 
bug.

What OS version are you using? I'm on 10.4.1

I've actually managed to fix this by putting in a hack that calls u_charMirror() in the ATSUI code path.
Comment 4 mitz 2005-06-11 09:14:58 PDT
(In reply to comment #3)

> What OS version are you using? I'm on 10.4.1

10.4.1, Safari 2.0 (412), Standard Font: Times 16, Default Encoding: Western (ISO Latin 1), and I get the 
same results with the CVS version. I remember mirroring being an issue in early releases, but I also 
remember that it has been fixed some time ago.

This may provide an important clue:
<http://forums.ort.org.il/scripts/showsm.asp?which_forum=307&mess=2242612>
A user reporting that he's seeing reverse parentheses and that his work around is disabling Arial in Font 
Book. Some Hebrew-speaking Mac users install a version of Arial that comes from Microsoft (and 
includes Hebrew), so maybe that's the culprit.
Comment 5 Nicholas Shanks 2005-06-11 10:53:11 PDT
Arabic also has problems. Where 12345 represents arabic characters as encoded in the bytestream, I see:

<p dir="rtl">(12345)</p>

come out as:

                                                                      )(54321

This is not related to Arial Unicode MS as the font used is Geeza Pro.
Comment 6 mitz 2005-06-11 14:02:47 PDT
(In reply to comment #3)

Looks like an ATSUI issue with certain fonts. Here's how I confirmed this:
Built ATSUIFontPanelDemo from /Developer/Examples/ATSUI/Font Panel Support/, run it, changed the 
text to Hebrew text in parentheses, then tried various fonts. Lucida Grande, New Peninim MT and Arial 
Hebrew do the mirroring fine, while that Arial from Microsoft (Version 2.98; Unique name: Monotype:Arial 
Regular:Version 2.98 (Microsoft)) gives reversed parentheses.

You should probably file a bug against ATSUI on Radar.
Comment 7 mitz 2005-06-11 14:16:59 PDT
(In reply to comment #5)
> Arabic also has problems. Where 12345 represents arabic characters as encoded in the bytestream, I 
see:
> 
> <p dir="rtl">(12345)</p>
> 
> come out as:
> 
>                                                                       )(54321
> 
> This is not related to Arial Unicode MS as the font used is Geeza Pro.

I can reproduce this with the latest shipping Safari but not with the latest CVS version (which includes a 
fix for paragraphs beginning with neutrals). What do you get when the paragraph doesn't begin with a 
neutral, e.g.
<p dir="rtl">12 (345) 67</p>
?
Comment 8 mitz 2005-07-02 09:37:14 PDT
Analysis:

ATSUI is supposed to handle mirroring, but it does so only if the font's glyph properties table ('prop') 
contains the mirrored glyph information. Thus mirroring doesn't happen with fonts that don't have that 
information in their 'prop' (or don't have a 'prop' at all), and when ATSUI uses a fallback font, but the 
original font doesn't have mirroring information.

There seems to be no way (short of looking at the 'prop' table) of finding out what ATSUI is going to do 
with mirrored characters. However, there is a way to force ATSUI to never mirror, and then 
WebTextRenderer could always mirror (like it does for CG). This can be done by reversing the RTL text 
run and prepending an LRE prior to passing to ATSUI, and not to setting the ATSUI line direction to RTL.

Another option is to drop ATSUI altogether for Hebrew (I don't know about Arabic) and use CG. This 
would solve other issues with Hebrew (such as poor choice of fallback fonts and no support for text 
justification, letter spacing and word spacing).
Comment 9 mitz 2005-07-03 12:12:47 PDT
Created attachment 2774 [details]
Proposed patch

This patch tries to guess whether ATSUI will perform mirroring or not, based on
the existence of a 'prop' table for the font, and applies mirroring itself it
ATSUI will not.
Comment 10 Maciej Stachowiak 2005-07-07 21:37:25 PDT
Comment on attachment 2774 [details]
Proposed patch

r=me

Make sure to add the test case to layout tests when landing.
Comment 11 Geoffrey Garen 2005-07-12 10:26:51 PDT
Created attachment 2927 [details]
layout test

The old test lacked a font declaration, so it wasn't guaranteed to fail.
Comment 12 Adele Peterson 2005-08-19 14:07:30 PDT
using <rdar://problem/4215911> to track this.