WebKit Bugzilla
NEW
Bug 110230
[harfbuzz] Always pass correct text direction to HarfBuzz
https://bugs.webkit.org/show_bug.cgi?id=110230
Behdad Esfahbod
Reported
2013-02-19 09:01:22 PST
The code in ./Source/WebCore/platform/graphics/harfbuzz/HarfBuzzShaper.cpp currently doesn't pass the text direction to HarfBuzz when WebKit is measuring text. I'm not sure whether this is a WebKit limitation or just the HarfBuzz layer, but we should fix it. FWIW, we should *always* pass the correct direction, script, and language (if known) to HarfBuzz. This is a follow-up to bug 110145.
Glenn Adams
Comment 1
2013-02-19 09:48:58 PST
(In reply to comment #0)
> FWIW, we should *always* pass the correct direction, script, and language (if known) to harfbuzz.
How are you defining "correct"? Do you have a counterexample showing the passing of an "incorrect" direction, script, or language?
Since script is not specified by the author and the HTML5 et al. specs do not formally define an algorithm for mapping some sequence of text to script, how are you defining "correct" in this regard?
How are you dealing with cases where some font formats define multiple values for "script" tags based on different versions of the font technology? For example, see [1] (script tag 'dev2' with post-2005 specifications) versus [2][3] (script tag 'deva' with pre-2005 implementations):
[1] http://www.microsoft.com/typography/OpenTypeDev/devanagari/intro.htm
[2] http://lb1.www.ms.akadns.net/typography/otfntdev/devanot/
[3] http://lb1.www.ms.akadns.net/typography/otfntdev/devanot/appen.htm
Behdad Esfahbod
Comment 2
2013-02-19 14:19:30 PST
(In reply to comment #1)
> (In reply to comment #0)
> > FWIW, we should *always* pass the correct direction, script, and language (if known) to harfbuzz.
>
> How are you defining "correct"?
Correct is whatever direction the Unicode Bidirectional Algorithm (UBA) resolves for the piece of text. The UBA is run before shaping happens.
> Do you have a counter example showing the passing of an "incorrect" direction, script, or language?
Yes. Normally, Arabic runs right-to-left. But you can force it to go left-to-right using special Unicode characters (e.g. LRO, U+202D) or the <bdo> tag. When Arabic runs left-to-right, it "shapes" to different glyphs than when it goes right-to-left, because the shaping depends on what actually comes to the left and right of each character. If you measure the text without telling HarfBuzz it's left-to-right, it will assume that it's right-to-left, because that's the default direction for Arabic. And you get wrong results.
Try selecting the text in this page:
data:text/html;charset=utf-8,<html><body style="font-size: 700px"><bdo dir=ltr>%D8%B3%D9%84%D9%85</body>
The desired behavior is that it should behave the same as this:
data:text/html;charset=utf-8,<html><body style="font-size: 700px">%D9%85%D9%84%D8%B3</body>
The second test has the Arabic characters reversed, and running right-to-left. The first one has them forced left-to-right.
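To make the two test cases concrete, here is a minimal sketch of why the direction changes the glyph forms. It is a toy model, not HarfBuzz or WebKit code: the function name `joining_forms` is hypothetical, and it assumes every character is a dual-joining Arabic letter (true for seen, lam, and meem used in the test pages, but not for Arabic in general).

```python
# The three letters from the data: URLs above (percent-decoded).
SEEN, LAM, MEEM = "\u0633", "\u0644", "\u0645"

def joining_forms(text, direction):
    """Return (char, form) pairs in visual left-to-right order.

    Toy model: assumes every character is a dual-joining Arabic letter.
    A real shaper consults the Unicode joining data instead.
    """
    last = len(text) - 1
    shaped = []
    for i, ch in enumerate(text):
        has_prev, has_next = i > 0, i < last
        if direction == "rtl":
            # Logical predecessor sits on the right, successor on the left.
            right, left = has_prev, has_next
        else:
            # Forced left-to-right: the neighbours swap sides.
            right, left = has_next, has_prev
        if right and left:
            form = "medial"
        elif left:
            form = "initial"   # connects only to the letter on its left
        elif right:
            form = "final"     # connects only to the letter on its right
        else:
            form = "isolated"
        shaped.append((ch, form))
    # RTL text is laid out with the first logical character on the right.
    return shaped[::-1] if direction == "rtl" else shaped

forced_ltr = joining_forms(SEEN + LAM + MEEM, "ltr")
reversed_rtl = joining_forms(MEEM + LAM + SEEN, "rtl")
assumed_rtl = joining_forms(SEEN + LAM + MEEM, "rtl")
print(forced_ltr == reversed_rtl)  # the two test pages should look alike
print(forced_ltr == assumed_rtl)   # measuring as RTL picks other forms
```

In this model the forced left-to-right text and the reversed right-to-left text come out identical, while the same text measured under the default right-to-left assumption gets different forms for its first and last letters, which is the mismatch described above.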
> Since script is not specified by the author and the HTML5 et al specs do not formally define an algorithm for mapping some sequence of text to script, then how are you defining "correct" in this regard?
Right. Unicode defines Script per character. All text rendering implementations have heuristics to assign script to characters of type Script=Common and Script=Inherited. They take their property from surrounding characters. For example, a U+002E FULL STOP character assumes the Script=Arabic property when used in Arabic text.
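One possible form of such a heuristic can be sketched as follows. This is an illustration only, not the algorithm used by WebKit or HarfBuzz. The function name `resolve_scripts` is hypothetical, and the input is assumed to be the per-character Script property values already looked up from the Unicode data (Python's standard library does not expose that property).

```python
UNRESOLVED = {"Common", "Inherited"}

def resolve_scripts(scripts):
    """Assign a concrete script to Common/Inherited entries.

    Each unresolved entry takes the script of the nearest preceding
    resolved character; leading unresolved entries take the script of
    the first resolved character that follows. If nothing in the run
    has a concrete script, the entries are left unchanged.
    """
    resolved = list(scripts)

    # Forward pass: inherit from the preceding character.
    current = None
    for i, script in enumerate(resolved):
        if script in UNRESOLVED:
            if current is not None:
                resolved[i] = current
        else:
            current = script

    # Backward pass: fix up any unresolved entries at the start.
    following = None
    for i in range(len(resolved) - 1, -1, -1):
        if resolved[i] in UNRESOLVED:
            if following is not None:
                resolved[i] = following
        else:
            following = resolved[i]
    return resolved

# Two Arabic letters followed by U+002E FULL STOP (Script=Common).
print(resolve_scripts(["Arabic", "Arabic", "Common"]))
```

A run made up entirely of Common or Inherited characters stays unresolved in this sketch, so the caller has to supply the script from surrounding context.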
> How are you dealing with cases where some font formats define multiple values for "script" tags based on different versions of the font technology? For example, see [1] (script tag 'dev2' with post-2005 specifications) versus [2][3] (script tag 'deva' with pre-2005 implementations):
HarfBuzz knows about those. You can ignore it. What we're interested in is the Unicode script assigned to the piece of text. This, again, can be guessed by HarfBuzz, except for the case where the whole piece of text has Script=Common or Script=Inherited. This can result in inferior shaping, but is not as serious as letting HarfBuzz guess the text direction, which has much more severe implications.
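The gap between a guessed and a resolved direction can be sketched as below. The Python does not call HarfBuzz. Both function names are hypothetical, the script table is a deliberately tiny subset, and `resolved_direction` handles only the LRO/RLO/PDF override characters and strong character types, which is far less than the full Unicode Bidirectional Algorithm.

```python
import unicodedata

# Illustrative subset of scripts whose default direction is right-to-left.
RTL_SCRIPTS = {"Arabic", "Hebrew"}

def guessed_direction(script):
    """Direction a shaper could infer from the script alone."""
    return "rtl" if script in RTL_SCRIPTS else "ltr"

def resolved_direction(text):
    """Direction of the first visible character, honouring overrides.

    Simplified: only LRO/RLO/PDF and strong bidi classes are handled.
    Returns None if the text has no character with a direction.
    """
    overrides = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "LRO":
            overrides.append("ltr")
        elif cls == "RLO":
            overrides.append("rtl")
        elif cls == "PDF":
            if overrides:
                overrides.pop()
        elif overrides:
            return overrides[-1]
        elif cls in ("R", "AL"):
            return "rtl"
        elif cls == "L":
            return "ltr"
    return None

arabic = "\u0633\u0644\u0645"
forced = "\u202d" + arabic + "\u202c"  # LRO ... PDF
print(guessed_direction("Arabic"), resolved_direction(arabic), resolved_direction(forced))
```

For the overridden text the script-based guess and the resolved direction disagree, which is why the resolved direction has to be passed along explicitly rather than left to a guess.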
Glenn Adams
Comment 3
2013-02-19 14:45:37 PST
(In reply to comment #2)
> (In reply to comment #1)
> This can result in inferior shaping, but is not as serious as letting HarfBuzz guess text direction, which has much more severe implications.
I already understand the processing. I was just trying to get to the bottom of this bug, which I now understand as a failure to pass the UBA-determined directionality (and other lang/script info) to HB. Are you working on or planning on working on this bug?
Behdad Esfahbod
Comment 4
2013-02-19 14:48:53 PST
(In reply to comment #3)
> (In reply to comment #2)
> > (In reply to comment #1)
> > This can result in inferior shaping, but is not as serious as letting HarfBuzz guess text direction, which has much more severe implications.
>
> I already understand the processing. I was just trying to get to the bottom of this bug, which I understand now as a failure to pass the UBA determined directionality (and other lang/script info) to HB.
Yes. It only happens when measuring text though. Rendering is fine. That happened as a result of this issue:
https://code.google.com/p/chromium/issues/detail?id=158969
> Are you working on or planning on working on this bug?
I'm studying the code to fix that, yes. Since bashi is away, I don't think anyone else will be looking into it if I don't.
Glenn Adams
Comment 5
2013-02-19 14:58:58 PST
(In reply to comment #4)
> (In reply to comment #3)
> > Are you working on or planning on working on this bug?
>
> I'm studying the code to fix that, yes. Since bashi is away, I don't think anyone else will be looking into it if I don't.
If you don't have time, I could fix it. Sounds straightforward.
Behdad Esfahbod
Comment 6
2013-02-25 22:00:19 PST
(In reply to comment #5)
> (In reply to comment #4)
> > (In reply to comment #3)
> > > Are you working on or planning on working on this bug?
> >
> > I'm studying the code to fix that, yes. Since bashi is away, I don't think anyone else will be looking into it if I don't.
>
> If you don't have time, I could fix it. Sounds straightforward.
That's very nice of you to offer. I don't have a Chrome build, and won't have one for a while (in the middle of relocation without a beefy machine). So if you can help, that's really appreciated. I did build webkitgtk today though, and the results in webkitgtk are different from Chrome. It looks like the webkitgtk build doesn't respect <bdo> at all. At any rate, yes, let's try to nail this down. If you can take a look and see what you can find, that's appreciated. I'll also take a look at it with my webkitgtk build.
Glenn Adams
Comment 7
2013-02-25 22:04:19 PST
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #4)
> > > (In reply to comment #3)
> > > > Are you working on or planning on working on this bug?
> > >
> > > I'm studying the code to fix that, yes. Since bashi is away, I don't think anyone else will be looking into it if I don't.
> >
> > If you don't have time, I could fix it. Sounds straightforward.
>
> That's very nice of you to offer. I don't have a Chrome build, and won't have for a while (in the middle of relocation without a beefy machine). So if you can help, that's really appreciated.
>
> I did build webkitgtk today though, and the results in webkitgtk are different from Chrome. It looks like webkitgtk build doesn't respect <bdo> at all.
>
> At any rate, yes, lets try to nail this down. If you can take a look and see what you can find, that's appreciated. I'll also take a look at it with my webkitgtk build.
I'll try to look at this sometime in the next few days.
Dominik Röttsches (drott)
Comment 8
2013-03-26 00:58:47 PDT
(In reply to comment #7)
> i'll try to look at this something in the next few days
Glenn, if you don't find the time, pls let me know. I would like to try in that case.
Glenn Adams
Comment 9
2013-03-26 07:44:28 PDT
(In reply to comment #8)
> (In reply to comment #7)
> > i'll try to look at this something in the next few days
>
> Glenn, if you don't find the time, pls let me know. I would like to try in that case.
Go ahead, Dominik. I'm occupied with some other bugs at present.
Ahmad Saleem
Comment 10
2023-03-31 19:02:07 PDT
@ap - Do any of the ports use HarfBuzz? I can find this directory in the WebKit source, but not 'HarfBuzzShaper.cpp':
https://github.com/WebKit/WebKit/tree/main/Source/WebCore/platform/graphics/harfbuzz
Alexey Proskuryakov
Comment 11
2023-03-31 21:17:32 PDT
I would have said no, but seeing recent changes in this directory from Apple contributors makes me feel uncertain. Myles will know for certain.
Michael Catanzaro
Comment 12
2023-04-01 05:50:42 PDT
I think all ports use HarfBuzz except the Apple ports. But this bug report is 10 years old, so it's no surprise that HarfBuzzShaper.cpp does not exist anymore. I have no clue whether this bug is still a problem or not.
Adrian Perez
Comment 13
2023-04-01 06:04:41 PDT
(In reply to Michael Catanzaro from comment #12)
> I think all ports use Harfbuzz except Apple ports. But this bug report is 10 years old, so it's no surprise that HarfBuzzShaper.cpp does not exist anymore.
The file was removed in bug #167956. The patch moved some code around as well, but from a quick glance at it I am not able to decide if the issue is still there or not... I am far from knowing anything about HarfBuzz, but maybe that serves as a starting point?