Bug 178625

Summary: REGRESSION(r222090): [HarfBuzz] Arabic shaping is broken except for first word in line
Product: WebKit Reporter: Khaled Hosny <dr.khaled.hosny>
Component: WebKitGTKAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: anass.1430, bugs-noreply, cgarcia, mcatanzaro
Priority: P3 Keywords: Gtk
Version: Other   
Hardware: PC   
OS: Linux   
See Also: https://bugzilla.redhat.com/show_bug.cgi?id=1506590
Bug Depends on: 178788    
Bug Blocks:    
Attachments:
Description Flags
Patch
none
Patch
mcatanzaro: review+
Test file for brackets handling none

Description Khaled Hosny 2017-10-21 13:03:24 PDT
In certain websites (e.g. https://fonts.google.com/?subset=arabic, http://www.linuxac.org/forum/threads/73644), Arabic shaping is broken except in the first word in each line; the Arabic letters are disjointed. It does not seem to affect all Arabic text, though.

This seems to be a regression as I only noticed it a few days ago.
Comment 1 Michael Catanzaro 2017-10-21 16:49:06 PDT
With what version of WebKitGTK+? Trunk, 2.18.0, 2.18.1?
Comment 2 Khaled Hosny 2017-10-22 04:17:19 PDT
(In reply to Michael Catanzaro from comment #1)
> With what version of WebKitGTK+? Trunk, 2.18.0, 2.18.1?

2.18.1
Comment 3 Carlos Garcia Campos 2017-10-24 08:55:59 PDT
Created attachment 324676 [details]
Patch

This patch fixes the issue without breaking material design icons. I haven't run the tests yet, though.
Comment 4 Carlos Garcia Campos 2017-10-24 09:36:39 PDT
Comment on attachment 324676 [details]
Patch

This broke several tests, I'll investigate it.
Comment 5 Carlos Garcia Campos 2017-10-26 03:15:46 PDT
Created attachment 324979 [details]
Patch
Comment 6 Michael Catanzaro 2017-10-26 04:51:54 PDT
*** Bug 178856 has been marked as a duplicate of this bug. ***
Comment 7 Michael Catanzaro 2017-10-26 04:54:53 PDT
Comment on attachment 324979 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=324979&action=review

> Source/WebCore/platform/graphics/harfbuzz/ComplexTextControllerHarfBuzz.cpp:153
> +        // FIXME: cover all other cases mentioned in the spec (ie. brackets or quotation marks).
> +        // https://bugs.webkit.org/show_bug.cgi?id=177003.

This is probably important.
Comment 8 Carlos Garcia Campos 2017-10-26 05:01:39 PDT
(In reply to Michael Catanzaro from comment #7)
> Comment on attachment 324979 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=324979&action=review
> 
> > Source/WebCore/platform/graphics/harfbuzz/ComplexTextControllerHarfBuzz.cpp:153
> > +        // FIXME: cover all other cases mentioned in the spec (ie. brackets or quotation marks).
> > +        // https://bugs.webkit.org/show_bug.cgi?id=177003.
> 
> This is probably important.

Not that much, this fixes edge cases like for example 

text [text with different script] more text

same applies for << >> and { }. It can happen that because of inherit or common, we end up using a different font for the open and close brackets. I would need a test case to fix this, though.
Comment 9 Carlos Garcia Campos 2017-10-26 05:03:39 PDT
Committed r224015: <https://trac.webkit.org/changeset/224015>
Comment 10 Anass Ahmed 2017-10-26 05:15:13 PDT
This is the first time I notice that the first word is properly rendered, but the remaining text is garbled. :)

Waiting for the fix to reach Koji on Fedora to be able to test.
Comment 11 Khaled Hosny 2017-10-26 06:08:57 PDT
Created attachment 324999 [details]
Test file for brackets handling

(In reply to Carlos Garcia Campos from comment #8)
> I would need a test case to fix this, though.

That is easy, in the attached HTML file the period should be rendered the same in both lines (you need the font from http://www.amirifont.org/), but currently the second line is different because the closing bracket takes the script of the Latin text before it and subsequently the period is rendered with the Latin script instead of Arabic.

It should be noted that both Firefox and Chrome do not seem to handle this, so it seems not to be a priority (LibreOffice does, but I wrote that code and it isn’t a web browser).
Comment 12 Carlos Garcia Campos 2017-10-26 06:19:29 PDT
(In reply to Khaled Hosny from comment #11)
> Created attachment 324999 [details]
> Test file for brackets handling
> 
> (In reply to Carlos Garcia Campos from comment #8)
> > I would need a test case to fix this, though.
> 
> That is easy, in the attached HTML file the period should be rendered the
> same in both lines (you need the font from http://www.amirifont.org/), but
> currently the second line is different because the closing bracket takes the
> script of the Latin text before it and subsequently the period is rendered
> with the Latin script instead of Arabic.
> 
> It should be noted that both Firefox and Chrome do not seem to handle this,
> so it seems not to be a priority (LibreOffice does, but I wrote that code
> and it isn’t a web browser).

Thanks! Could you move the test case and your explanation to bug #177003, please? This bug is closed now, so I'm afraid I will forget about this.