Bug 99614 - [MathML] Render stretched operators on all platforms
Status: NEW
: WebKit
MathML
: 528+ (Nightly build)
: Unspecified Unspecified
: P2 Normal
Assigned To:
: 47780 72828 122297
: 84019 99623
Reported: 2012-10-17 10:44 PST by
Modified: 2014-02-17 12:58 PST (History)


Attachments
Example page with Cambria Math font, extender glyphs for large operators (1.45 KB, text/html)
2012-10-21 14:38 PST, Dave Barton




Description From 2012-10-17 10:44:11 PST
Currently, stretched parentheses and various stretched brackets are rendered using "extension glyphs", e.g. U+239B to U+23AE, using the Mac's Symbol font. We could use the STIXSizeOneSym font for this on all systems, ideally by including it with WebKit, or perhaps linking to it as a downloadable web font. There are licensing issues with this, plus occasional gaps between glyphs that will need to be debugged. Alternatively, we could just draw entire stretched glyphs ourselves using Bezier cubics. I don't know if this would produce noticeably worse output because of anti-aliasing, or if that is important. Simply using a transform matrix to vertically stretch a normal parenthesis or bracket character is not desirable because it thickens the horizontal parts of the character more than the vertical parts.

So before diving into licensing issues or webfonts or debugging, the first question is whether the newer approach using Bezier cubics is acceptable. Does anyone have opinions or knowledge on this?

This is the most critical current MathML issue. Until it is resolved, MathML in WebKit is basically unusable by web page authors, since the rendering on e.g. Windows is unacceptably bad compared to MathJax or LaTeX images.
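
To illustrate the objection to plain transform-matrix stretching, here is a minimal arithmetic sketch. It is not WebKit code; the function name and the stroke values are made up for illustration. It only shows that a vertical-only scale multiplies the thickness of horizontal strokes while leaving vertical strokes unchanged.

```python
def scaled_stroke_thickness(horizontal_stroke, vertical_stroke, sx, sy):
    """Return (horizontal, vertical) stroke thicknesses after scaling by (sx, sy).

    A horizontal stroke's thickness is measured along y, so it scales with sy.
    A vertical stroke's thickness is measured along x, so it scales with sx.
    """
    return horizontal_stroke * sy, vertical_stroke * sx


# Hypothetical glyph whose strokes are all 1.0 unit thick, stretched 3x vertically.
h, v = scaled_stroke_thickness(1.0, 1.0, sx=1.0, sy=3.0)
print(h, v)  # the horizontal strokes are now three times as thick as the vertical ones
```

Building the operator from top, extender, and bottom glyphs avoids this, because no individual glyph is scaled.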
------- Comment #1 From 2012-10-17 11:46:02 PST -------
How do MathJax, LaTeX, or other implementations handle this question? What about Firefox?
------- Comment #2 From 2012-10-17 12:22:12 PST -------
I've asked the MathJax project leader (Peter Krautzberger). They may have experience with both approaches, since they have an option to render via SVG. I believe LaTeX and Firefox and MathPlayer all use the very old approach of glyph parts, even with custom pre-Unicode fonts.
------- Comment #3 From 2012-10-17 12:26:01 PST -------
They also tend to have tables of hard-coded glyph metrics, I think.

I'm not trying to bias our answer, just point out that those implementations are very old. I don't think dynamic Bezier cubics, high screen resolution, and automatic anti-aliasing were available when those implementations (other than MathJax) were written. In TeX's case, there probably weren't even TrueType or PostScript fonts, just bitmapped fonts.
------- Comment #4 From 2012-10-18 10:35:24 PST -------
(In reply to comment #0)

> So before diving into licensing issues or webfonts or debugging, the first question is whether the newer approach using Bezier cubics is acceptable. Does anyone have opinions or knowledge on this?

I have two concerns.
1) I worry that the lack of font smoothing (subpixel antialiasing) on the stretched operators will be noticeable, especially since they will appear right next to properly smoothed glyphs, including sometimes non-stretched versions of the same operators. It’s hard to judge how serious an issue this is without seeing an example or a simulation of what this would look like.
2) I think that it’s bad to ignore the font property on operators. It will limit authors’ ability to use different fonts in MathML.
------- Comment #5 From 2012-10-18 12:37:31 PST -------
1) Good idea, I'll try to create a mock-up so we can look at it.

2) The font property would still apply to unstretched operators. It doesn't really apply much to stretched operators in either scheme, since there are a very limited number of fonts with glyphs for the extension characters. I guess this could change eventually.
------- Comment #6 From 2012-10-19 10:46:26 PST -------
Hi,

[Trying to jump right in there.]

Eric Seidel asked how MathJax deals with stretchy characters. By default, MathJax will use locally installed STIX or MathJax fonts. If those aren't available, MathJax will download the MathJax webfonts dynamically -- which include stretchy characters. But as Dave Barton pointed out, MathJax has an SVG output mode. That mode uses outlines derived from the MathJax fonts. Some people dislike the look of the SVG output; this is probably due to differences between SVG and font antialiasing (you can take a look at [4]).

Dave Barton suggested shipping the STIX fonts. I saw that the STIX license (SIL OFL) came up on the webkit-dev list a while ago [2] (with no conclusion). As you probably know, the OFL is considered free by the FSF but is flexible enough for commercial needs (e.g., OS X ships with the STIX fonts); see also [1]. An alternative/addition could be to ship the MathJax fonts, which are licensed under Apache. If that isn't working we could certainly discuss options.

Mozilla suggests a couple of options for installing suitable fonts at [3] -- but Fred Wang will know more.

Peter.


[1] https://en.wikipedia.org/wiki/SIL_Open_Font_License
[2] http://www.mail-archive.com/webkit-dev@lists.webkit.org/msg12064.html
[3] https://developer.mozilla.org/en-US/docs/Mozilla_MathML_Project/Fonts
[4] http://www.mathjax.org/demos/mathml-samples/
------- Comment #7 From 2012-10-21 13:58:03 PST -------
I discussed all this with Fred Wang. He pointed out that according to the MathML 3 spec, a lot of other characters like arrows are stretchy by default, plus the user can give a stretchy="true" attribute to any operator. So it seems we need to implement scaling by a transform matrix as a fallback in any case, even if it's not optimal visually.

I also tracked down the vertical gaps between extenders problem, in bug 99921. After fixing that and simplifying the code, I am more inclined to use STIXSizeOneSym or whatever font the user prefers for extender glyphs. As mitz points out, this gives the user the most control (via CSS), as well as the best anti-aliased results.

On the other hand, I have to say that Peter's MathJax SVG output looks fine to me. (Thanks, Peter!)

Leaving aside licensing for the moment, Fred also made several technical points about accessing fonts for extender and other special mathematical characters. Even if we want to use downloadable webfonts, apparently MathJax at least has had problems with timing issues and detecting when the font has been downloaded (so that layout can be triggered after that). Also you'd really like the user to be able to use the same mathematical font(s) outside WebKit, e.g. in a word processor. On OS X Lion or later, STIX is included. On Windows Vista or later, or with Microsoft Office 2007, Cambria Math is included. However, this font has issues - see my next comment.
------- Comment #8 From 2012-10-21 14:38:41 PST -------
Created an attachment (id=169812) [details]
Example page with Cambria Math font, extender glyphs for large operators

It's instructive to view this with a variety of browsers, especially on Windows. Cambria Math, like STIX, includes some tall glyphs, so its default line height is very large. Only Internet Explorer knows enough to compensate for this. However, recent WebKit builds seem ok using { -webkit-line-box-contain: glyphs }, which MathML uses. There are also issues with finding extender and other characters at their Unicode code points, since OpenType fonts sometimes use glyph ids and private internal tables. (Fred Wang knows more about this than I do.) Also, italic and bold are reportedly not font variants, but rather use Unicode code points > 2^16, though they seem to work in the attached page, even in Safari on Windows (which previously did not support Unicode characters > 2^16). See https://bugzilla.mozilla.org/show_bug.cgi?id=372351 and related pages for (old?) Firefox issues with Cambria Math, for instance.

In summary, this page is being handled better in WebKit than it used to be, though there are still issues. I will investigate further.
------- Comment #9 From 2012-10-21 15:44:40 PST -------
(In reply to comment #7)
> I discussed all this with Fred Wang. He pointed out that according to the MathML 3 spec, a lot of other characters like arrows are stretchy by default, plus the user can give a stretchy="true" attribute to any operator. So it seems we need to implement scaling by a transform matrix as a fallback in any case, even if it's not optimal visually.


Scaling may be the best option in a browser context (although it gives the TeX user in me the shivers); however, "need" may be putting it too strongly here.
The renderer is allowed to ignore the stretchy attribute on characters it can't (or chooses not to) stretch. The base (La)TeX math support, for example, never applies scaling to operators (although users are known to apply scaling anyway). LaTeX only stretches arrows (and other symbols) where the stretching can be accomplished by inserting multiple (negatively kerned) - or = to extend the arrow.

That said, whatever it says in the spec, user expectation may be that the browser stretches any symbol when this is requested, especially as that's what Firefox now does.

The MathML spec says:

> In practice, typical renderers will only be able to stretch a small set of characters, and quite possibly will only be able to generate a discrete set of character sizes.
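
As a rough sketch of what a "discrete set of character sizes" means for a renderer: given a few pre-drawn size variants of an operator, it can only pick the closest fit rather than hit the target height exactly. The function name and the heights below are hypothetical and are not taken from any real font.

```python
def pick_size_variant(variant_heights, target_height):
    """Return the index of the smallest variant that is at least target_height.

    variant_heights is assumed to be sorted in ascending order. If no variant
    is tall enough, the largest one is returned.
    """
    for index, height in enumerate(variant_heights):
        if height >= target_height:
            return index
    return len(variant_heights) - 1


# Hypothetical heights (in em) for five size variants of one parenthesis.
heights = [1.0, 1.5, 2.1, 2.7, 3.3]
print(pick_size_variant(heights, 2.0))  # index of the 2.1em variant
print(pick_size_variant(heights, 5.0))  # nothing is tall enough, so the largest
```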
------- Comment #10 From 2012-10-26 02:58:11 PST -------
I just came back to Paris. Dave and I discussed the options during my visit to California, and basically WebKit could just use the existing tables (operator dictionary and stretchy constructions) that have been written for Gecko, as well as the scaling fallback. Ideally, support for the OpenType MATH table would be added in the future. I think Davide gave his point of view on this in some mail exchange, and he agrees that the Bezier method is not necessarily better. To answer a question from Sudarshan in another mail exchange: yes, Gecko uses hash tables to map a given Unicode character to its operator dictionary entry and to its stretchy constructions. I think we should have a meeting on IRC (Mozilla has a #mathml channel, but I believe we can find a channel for WebKit devs too) with anyone interested in this issue. We can discuss all of this more concretely, pointing to WebKit/Mozilla source code, font data, etc. That would be more efficient than mail exchanges and Bugzilla discussions, I guess.
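
The table-plus-fallback idea above can be sketched roughly as follows. This is not Gecko's or WebKit's actual data structure; the class, the table contents, and the function name are hypothetical. A Python dict stands in for the hash table, and any character without a known construction falls back to scaling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StretchyConstruction:
    """Pieces used to build a tall operator (hypothetical structure)."""
    top: str
    extender: str
    bottom: str
    middle: Optional[str] = None


# Hypothetical table: base character -> Unicode pieces used to build it.
CONSTRUCTIONS = {
    "(": StretchyConstruction(top="\u239B", extender="\u239C", bottom="\u239D"),
    ")": StretchyConstruction(top="\u239E", extender="\u239F", bottom="\u23A0"),
    "[": StretchyConstruction(top="\u23A1", extender="\u23A2", bottom="\u23A3"),
    "{": StretchyConstruction(top="\u23A7", extender="\u23AA", bottom="\u23A9",
                              middle="\u23A8"),
}


def stretch_strategy(character):
    """Return ("assemble", construction) if known, else ("scale", None)."""
    construction = CONSTRUCTIONS.get(character)
    if construction is not None:
        return ("assemble", construction)
    return ("scale", None)


print(stretch_strategy("(")[0])       # assemble
print(stretch_strategy("\u2192")[0])  # scale (rightwards arrow has no entry here)
```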
------- Comment #11 From 2012-10-29 18:27:08 PST -------
Good idea, I would be happy to chat on any #mathml channel.

In the meantime, I have investigated the Cambria Math font some more, since it is the one mathematical font always present on recent Windows machines. I think it has an evolving/improving history.

If you view the page I attached to this bug in Windows Vista, the Cambria Math glyphs are only used for the extender characters when that font-family is specifically named in CSS. Otherwise, you can get missing glyphs (boxes). By Windows 7, this seems to have been corrected. (Since mathml.css does name Cambria Math explicitly, we seem to be ok even on Vista.)

If you view http://www.fileformat.info/info/unicode/font/cambria_math/grid.htm (beware - that site is intermittent) on a recent Windows machine (so Cambria Math is installed) in Chrome Canary, you'll see that all the glyphs render, but 6 of our extender glyphs are very tall (U+239B, U+239D, U+239E, U+23A0, U+23A8, U+23AC) and 2 are pretty tall (U+2320, U+2321).

Overall, Cambria Math has a very large font ascent & descent, like STIXGeneral, because it contains a lot of symbols, and a few are very tall, as I stated in comment 8.

Cambria Math also contains OpenType Math tables, as Fred Wang told me, to get to extra non-Unicode glyphs. However, I think these glyphs now include things like entire parentheses and brackets at a few larger sizes, like the STIXSizeTwoSym through STIXSizeFiveSym fonts. For extender glyphs that we can use to build stretched parentheses and brackets at any size, we do have Unicode code points, e.g. U+239B to U+23AE.
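
For reference, the code points in question can be listed with Python's standard unicodedata module, and the number of extender glyphs needed for a given height is simple arithmetic. The extender_count helper and its sample heights are hypothetical; in WebKit the glyph heights would come from the font's metrics, and this sketch ignores overlap between pieces.

```python
import math
import unicodedata

# Print the Unicode names of the bracket pieces in U+239B..U+23AE.
for code_point in range(0x239B, 0x23AF):
    print(f"U+{code_point:04X} {unicodedata.name(chr(code_point))}")


def extender_count(target_height, top_height, bottom_height, extender_height):
    """Return how many extender glyphs fill the gap between top and bottom."""
    remaining = target_height - top_height - bottom_height
    if remaining <= 0:
        return 0  # top and bottom pieces alone already reach the target
    return math.ceil(remaining / extender_height)


# Hypothetical heights: 2.0 for each hook, 1.5 per extender, 10.0 target.
print(extender_count(10.0, 2.0, 2.0, 1.5))
```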

So I think the bottom line is that WebKit as it stands today creates readable MathML on all recent Windows machines, including the standard stretched operators, without requiring users to download e.g. the STIX fonts. If any Windows users want to verify that, e.g. using Chrome Canary, that would be groovy.

In the future, it's possible that more fonts with OpenType Math tables will be created, and maybe we could improve our layout & rendering by using that information. This may have a higher priority for desktop publishing software than for web browsers, though. We can at least easily use new fonts, since we call platformBoundsForGlyph to get individual glyph metrics, instead of hard coding them in our own private tables.

Some good reading on Cambria Math and OpenType Math fonts:
http://typophile.com/node/81098 - especially the later posts
http://latex-alive.tumblr.com/post/9792202222

Finally, note that Cambria Math has grown in other ways as well. Version 5.0 reportedly had 4722 glyphs, whereas version 6.80, shipped with Windows 8, has 7320 glyphs. I think support for bold and italic characters has improved as well, as I also mentioned in comment 8.

David Carlisle, should we ask Murray Sargent to comment on all this?
------- Comment #12 From 2012-10-30 02:59:28 PST -------
(In reply to comment #11)

> In the future, it's possible that more fonts with OpenType Math tables will be created, and maybe we could improve our layout & rendering by using that information. 



Note there is already a version of STIX with these tables (which are needed for the Unicode-aware TeX flavours XeTeX and LuaTeX, as well as for Word); see

https://github.com/khaledhosny/xits-math

Actually, I think the official STIX 1.1 release has fonts with these tables too; I must check...



> Finally, note that Cambria Math has grown in other ways as well. Version 5.0 reportedly had 4722 glyphs, whereas version 6.80, shipped with Windows 8, has 7320 glyphs. I think support for bold and italic characters has improved as well, as I also mentioned in comment 8.
> 
> David Carlisle, should we ask Murray Sargent to comment on all this?

Yes, I can ask him if you like. I know at some point we discussed Microsoft releasing "official" documentation of the OpenType math font tables and Cambria Math glyph coverage.
------- Comment #13 From 2012-10-30 10:24:09 PST -------
OK, so essentially what you are saying is that WebKit should be able to render the Unicode constructions (or maybe only the few constructions currently implemented) when the Unicode fonts are installed. In Mozilla, the Unicode constructions are defined by this table:

http://dxr.mozilla.org/mozilla-central/layout/mathml/mathfontUnicode.properties

and the CSS font-family rule to specify the fonts to use for Unicode constructions (or to pick Unicode characters used in mathematics in general) is here:

http://dxr.mozilla.org/mozilla-central/layout/mathml/mathml.css#l12

See also the note here:

https://developer.mozilla.org/en-US/docs/Mozilla_MathML_Project/Fonts#Other_fonts

I guess this bug is more general than just the limited support provided by Unicode constructions. When you are done with the security bugs I guess we can meet on irc.mozilla.org #mathml or on the #webkit channel.
------- Comment #14 From 2014-02-14 08:13:10 PST -------
On bug 122297, I've attached an experimental patch that shows how to read the OpenType MATH table to determine the default linethickness of fraction bars. It will be under an OPENTYPE_MATH flag to start with, since the libraries to read OpenType tables only seem available on the GTK/EFL ports at the moment. I'm now considering how to read the GlyphAssembly table for vertical stretching. 

Currently, I think everything is done in RenderMathMLOperator::paintCharacter where the UChar character is used to get metrics (via glyphDataForCharacter) and to paint it (via drawText). 

It seems that there is a glyphDataForIndex function in GlyphPage and that most platforms have a Font::drawGlyphs function to draw glyphs. So perhaps we can switch from Unicode code point to glyph index. For GTK/EFL we would read the index directly from the MATH table and for the other ports we would have to use hardcoded tables of glyph indexes and constructions (e.g. for the STIX fonts).
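
A rough sketch of what a glyph-index-based construction could look like, loosely modelled on the glyph part records of the OpenType MATH table (glyph id, full advance, extender flag). It is a simplification: the real table also carries start and end connector lengths and a minimum connector overlap, whereas this sketch assumes one fixed overlap between adjacent parts. The glyph ids, advances, and function name are hypothetical, and no WebKit API is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlyphPart:
    glyph_id: int       # glyph index in the font, not a Unicode code point
    full_advance: int   # advance along the stretch axis, in font design units
    is_extender: bool   # True if this part may be repeated


def assemble(parts, target_advance, overlap, max_repeats=100):
    """Repeat extender parts until the assembly reaches target_advance.

    Returns (glyph_ids, total_advance). Adjacent parts are assumed to overlap
    by a fixed amount, which is a simplification.
    """
    has_extender = any(part.is_extender for part in parts)
    for repeats in range(max_repeats + 1):
        glyphs = []
        for part in parts:
            glyphs.extend([part] * (repeats if part.is_extender else 1))
        total = sum(g.full_advance for g in glyphs) - overlap * max(len(glyphs) - 1, 0)
        if total >= target_advance or not has_extender:
            break
    return [g.glyph_id for g in glyphs], total


# Hypothetical parts listed bottom to top: hook, repeatable extender, hook.
parts = [
    GlyphPart(glyph_id=101, full_advance=1000, is_extender=False),
    GlyphPart(glyph_id=102, full_advance=500, is_extender=True),
    GlyphPart(glyph_id=103, full_advance=1000, is_extender=False),
]
print(assemble(parts, target_advance=3000, overlap=100))
```

For ports that cannot read the MATH table, the same GlyphPart lists could in principle come from a hardcoded per-font table instead.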