WebKitGTK is unable to render some Hebrew characters, displaying square boxes instead.
Here's the link to the test case:
Here's what it's supposed to look like:
And here's how it appears on WebKitGTK:
The problem seems to be that some of those characters are decomposed (i.e., combinations of two or more characters). The W3C validator does indeed emit a warning saying "Text run is not in Unicode Normalization Form C", but NFC doesn't seem to be a requirement for HTML, so it should be possible to display the text correctly.
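For anyone who wants to check whether a given string is in NFC, here is a minimal Python sketch using the standard `unicodedata` module. The sample strings are illustrative assumptions, not taken from the attached test case, and the helper names `is_nfc` and `describe` are made up for this example. Note that text can fail the NFC check either because a precomposed form exists (the Latin sample) or because the combining marks are not in canonical order (the Hebrew sample).

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


def describe(text: str) -> list:
    """List each code point with its Unicode name, to make sequences visible."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '?')}" for ch in text]


# Assumed sample 1: 'e' followed by a combining acute accent (decomposed).
decomposed_latin = "e\u0301"

# Assumed sample 2: Hebrew bet + dagesh + patah. The two points are stored
# in an order that differs from the canonical one, so this is not NFC either.
hebrew_unordered = "\u05D1\u05BC\u05B7"

if __name__ == "__main__":
    for sample in (decomposed_latin, hebrew_unordered):
        nfc = unicodedata.normalize("NFC", sample)
        print("input:", describe(sample), "is NFC:", is_nfc(sample))
        print("  NFC:", describe(nfc))
```

Both forms are canonically equivalent, so a renderer would be expected to display them identically.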
I reproduced this problem with WebKitGTK 2.22.6, and master seems to be affected as well.
Other tests (using Debian stretch packages):
- Chromium 71.0.3578.80 also fails.
- Firefox-ESR 60.5.0 works fine.
- Chrome 71.0.3578.98 works fine.
Other apps (Gedit, gnome-terminal, ...) seem to work fine as well.
Created attachment 361673 [details]
Here's a simpler version of the test case.
Created attachment 361674 [details]
Test case (NFC normalization)
Here's the same file as before but normalized using Unicode NFC.
This one can be displayed correctly.
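The tool used to normalize the attachment isn't stated here. One possible way to produce an NFC copy of a file is sketched below, assuming the file is UTF-8 encoded; the function name `normalize_file_to_nfc` is hypothetical.

```python
import unicodedata
from pathlib import Path


def normalize_file_to_nfc(src, dst, encoding="utf-8"):
    """Write an NFC-normalized copy of src to dst.

    Returns True if normalization changed the text, False if the input
    was already in NFC. Bytes are read and written directly so that line
    endings are preserved as-is.
    """
    original = Path(src).read_bytes().decode(encoding)
    normalized = unicodedata.normalize("NFC", original)
    Path(dst).write_bytes(normalized.encode(encoding))
    return normalized != original
```

Calling it as `normalize_file_to_nfc("test.html", "test-nfc.html")` (file names are placeholders) would leave the original untouched and write the normalized copy alongside it.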
Note this affects many languages, not just Hebrew.
(In reply to Michael Catanzaro from comment #3)
> Note this affects many languages, not just Hebrew.
Yes, most certainly, but this is the test case we have.
Created attachment 361789 [details]
(In reply to Carlos Garcia Campos from comment #5)
> Created attachment 361789 [details]
I tried it in the 2.22.x branch and can confirm that it fixes this issue.
Comment on attachment 361789 [details]
It's frustrating that so many ICU APIs have to be called twice in a row to be used safely, instead of just allocating the buffer for us. Oh well.
Committed r241402: <https://trac.webkit.org/changeset/241402>
*** Bug 184448 has been marked as a duplicate of this bug. ***