Bug 89843

Summary: REGRESSION (r95391): Arabic shaping is incorrect if ZWNJ exist
Product: WebKit Reporter: Kenichi Ishibashi <bashi>
Component: TextAssignee: Kenichi Ishibashi <bashi>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, behdad, mitz, webkit.review.bot
Priority: P1 Keywords: InRadar, Regression
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
URL: http://crbug.com/133720
Attachments:
Description Flags
test case
none
Patch
none
screenshot of WebKit nightly and Firefox13 none

Description Kenichi Ishibashi 2012-06-24 18:38:37 PDT
Details at http://crbug.com/133720.

WebKit ToT mac port doesn't render Arabic script correctly when there is ZWNJ in the text. Test case is attached.

The stable version of Safari can render it correctly. The difference comes from http://trac.webkit.org/changeset/95391. The change fixed a bug related with combining character sequence and handled ZWJ and ZWNJ as a part of combining character sequence. The cause is that fontDataForCombiningCharacterSequence() is failing when the sequence contains ZWNJ and the font doesn't contain glyph for ZWNJ. As the result, the word is divided into two complex text runs and they are rendered in isolated form.

To fix the problem, I suggest:
- Treat ZWJ and ZWNJ as base characters. I don't think ZWJ and ZWNJ compose a combining character sequence because they just tell whether to form cursive connection or not (they aren't combined with other character).
- If we see a ZWJ, don't separate a complex text run at that point so that CoreText can handle it.

I'll post a patch soon.
Comment 1 Kenichi Ishibashi 2012-06-24 18:39:07 PDT
Created attachment 149220 [details]
test case
Comment 2 Kenichi Ishibashi 2012-06-24 18:44:52 PDT
Created attachment 149221 [details]
Patch
Comment 3 Kenichi Ishibashi 2012-06-24 18:47:47 PDT
Hi mitz,

Could you take a look?
Comment 4 Kenichi Ishibashi 2012-06-26 21:38:32 PDT
Ping mitz?
Comment 5 Alexey Proskuryakov 2012-06-27 01:18:35 PDT
Does this affect any real life web sites?
Comment 6 Kenichi Ishibashi 2012-06-27 01:32:08 PDT
(In reply to comment #5)
> Does this affect any real life web sites?

I think so. http://www.google.com/intl/fa/goodtoknow/ is an example. I'll attach a screenshot of WebKit nightly and Firefox 13.
Comment 7 Kenichi Ishibashi 2012-06-27 01:32:38 PDT
Created attachment 149704 [details]
screenshot of WebKit nightly and Firefox13
Comment 8 Alexey Proskuryakov 2012-06-27 01:37:57 PDT
<rdar://problem/11757694>
Comment 9 Behdad Esfahbod 2012-06-27 05:06:06 PDT
Given that Geeza Pro doesn't have a ZWNJ glyph, and it's default Arabic font on OS X, pretty much any Persian website that doesn't request Arial or Tahoma is broken, as well as any using Google's Droid Arabic Naskh as a webfont.   I'm sure there are more fonts without those glyphs out there...
Comment 10 WebKit Review Bot 2012-07-01 16:08:22 PDT
Comment on attachment 149221 [details]
Patch

Clearing flags on attachment: 149221

Committed r121643: <http://trac.webkit.org/changeset/121643>
Comment 11 WebKit Review Bot 2012-07-01 16:08:27 PDT
All reviewed patches have been landed.  Closing bug.