Bug 44208 - Multi-character and some Unicode <mi>s italic in WebKit but not Firefox
Summary: Multi-character and some Unicode <mi>s italic in WebKit but not Firefox
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: MathML (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Frédéric Wang (:fredw)
URL: http://golem.ph.utexas.edu/category/2...
Keywords:
Depends on:
Blocks: 129097 mathml-in-mathjax 99623 115789 124838
  Show dependency treegraph
 
Reported: 2010-08-18 15:32 PDT by Randall Farmer
Modified: 2014-02-20 05:01 PST (History)
19 users (show)

See Also:


Attachments
testcase (227 bytes, text/html)
2013-12-19 08:42 PST, Frédéric Wang (:fredw)
no flags Details
Patch V1 (30.06 KB, patch)
2014-02-06 05:42 PST, Frédéric Wang (:fredw)
no flags Details | Formatted Diff | Diff
Patch V1 (30.17 KB, patch)
2014-02-06 05:46 PST, Frédéric Wang (:fredw)
no flags Details | Formatted Diff | Diff
Patch V2 (30.67 KB, patch)
2014-02-06 07:00 PST, Frédéric Wang (:fredw)
no flags Details | Formatted Diff | Diff
Patch V3 (30.58 KB, patch)
2014-02-06 10:40 PST, Frédéric Wang (:fredw)
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Randall Farmer 2010-08-18 15:32:51 PDT
Firefox's behavior seems to be that if a <mi> contains two or more characters, it's rendered in normal rather than italic style and there's space around it, and if it's a single character, it's italicized and no space.

So, for example, <mi>unitless</mi><mi>parameterization</mi> in the first table in the linked page appears as two non-italic words in Firefox and one italic word in WebKit.

(I'm sorry in advance if some of these bugs are unhelpful or invalid according to the standard; I'm just going through a page that heavily uses MathML and documenting what looks off.  And I absolutely love that WebKit is adding MathML support.)
Comment 1 Randall Farmer 2010-08-18 15:48:32 PDT
Firefox also seems to space and not slant some symbols, like blackboard bold letters.  The first math on the linked page is: "q(s): ℝ→ℝ."  In Firefox, the blackbold bold R's are not slanted; in WebKit, they are.

Firefox may have a whitelist of single characters that it slants in <mi>; for example, it slanted <mi>Π</mi> but not <mi>ü</mi> when I put that text in Firebug.  I suppose the source, or some document or other, likely has the precise answer.
Comment 2 François Sausset 2010-09-02 05:55:11 PDT
About italicized <mi> with more than one character: we were aware of that problem and I'm currently working on it.

For slanted ℝ, thanks to report it.
I'll have a look at the MathML3 specification, but it seems natural that "special" characters shouldn't be italicized.

Every help (testing, bug reports, etc) is welcomed!
Thanks!
Comment 3 François Sausset 2010-09-02 06:14:58 PDT
Here is the MathML3 recommendation answer about special characters:
http://www.w3.org/TR/MathML3/chapter7.html#chars.BMP-SMP
Comment 4 Frédéric Wang (:fredw) 2013-05-04 22:51:36 PDT
Yes, invariant characters should not be italicized:
https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols

See also bug 108778.
Comment 5 Frédéric Wang (:fredw) 2013-12-19 08:42:57 PST
Created attachment 219653 [details]
testcase
Comment 6 Frédéric Wang (:fredw) 2014-02-06 05:42:58 PST
Created attachment 223326 [details]
Patch V1

This patch fixes the most important situation (testcase and referenced URL) with multi-char vs sing-char mi.

There are some known issues (cf tests):

- things like <mi mathvariant="normal">x</mi> seems to produce the wrong spacing (or width?)
- some dynamically created content are not italicized

Since the WebKit's spacing is currently poor and since there are issues with dynamic updates, I think we can ignore these issues for now. Actually this is part of bug 124838 which prepares for the operator dictionary and further improvements ; so it might be worth taking this small piece and get feedback on it now rather than working on the big patch which seems to scare potential reviewers.

Note that instead of CSS style this should rather do some code point remapping inside the text rendering code. However, it's ok for now: Gecko has used that approximation for a long time (see https://bugzilla.mozilla.org/show_bug.cgi?id=114365 and https://bugzilla.mozilla.org/show_bug.cgi?id=930504) and MathJax does not implement mi/mathvariant properly.
Comment 7 Frédéric Wang (:fredw) 2014-02-06 05:46:29 PST
Created attachment 223327 [details]
Patch V1

Sorry, wrong patch.
Comment 8 Frédéric Wang (:fredw) 2014-02-06 07:00:11 PST
Created attachment 223331 [details]
Patch V2

It looks like the changes to CMakelist.txt were lost when I extracted the code for <mi> from attachment 222977 [details]...
Comment 9 chris fleizach 2014-02-06 09:06:16 PST
Comment on attachment 223331 [details]
Patch V2

View in context: https://bugs.webkit.org/attachment.cgi?id=223331&action=review

> Source/WebCore/ChangeLog:8
> +        This test prevents multi-char <mi> to be drawn in italic and prepare

can you explain what further improvements it prepares for

> Source/WebCore/rendering/mathml/RenderMathMLToken.cpp:2
> + * Copyright (C) 2013 Frédéric Wang (fred.wang@free.fr). All rights reserved.

2014

> Source/WebCore/rendering/mathml/RenderMathMLToken.cpp:29
> +#include "RenderMathMLToken.h"

i think this line goes right below config.h
it shouldn't need to be guarded by MATHML, since the header is also guarded by MATHML

> Source/WebCore/rendering/mathml/RenderMathMLToken.cpp:65
> +    m_text = element().textContent().stripWhiteSpace().simplifyWhiteSpace().impl();

Do you think we need to store m_text, or can a method always just return the right value, by doing

::text() {
return element().textContent().stripWhiteSpace().simplifyWhiteSpace().impl();;
}

> Source/WebCore/rendering/mathml/RenderMathMLToken.cpp:75
> +    if (tokenElement.hasTagName(MathMLNames::miTag)) {

You can probably use an early return here instead.

You might also want to assert that it has this tagName. It looks like this element is only created with a miTag

> Source/WebCore/rendering/mathml/RenderMathMLToken.h:2
> + * Copyright (C) 2010 Frédéric Wang (fred.wang@free.fr). All rights reserved.

2014

> Source/WebCore/rendering/mathml/RenderMathMLToken.h:45
> +    virtual bool isChildAllowed(const RenderObject&, const RenderStyle&) const override  { return true; };

extra space after "override"

> Source/WebCore/rendering/mathml/RenderMathMLToken.h:51
> +    String m_text;

Seems like m_text should be private. I don't see it referenced outside the .cpp file
Comment 10 Frédéric Wang (:fredw) 2014-02-06 09:13:29 PST
@Chris: this prepares for the cleanup of <mo> & <mfenced> frames and implementation of the operator dictionary (see bug 124838 and bug 99620 and all the bugs they block) and especially I try to do a clean thing with anonymous styles (although I'm not sure that's the case). I have only extracted the <mi> part, so that's why the class is called RenderMathMLToken and test <mi> and probably why I store m_text (because it was more convenient for <mo> IIRC).
Comment 11 Frédéric Wang (:fredw) 2014-02-06 10:40:28 PST
Created attachment 223346 [details]
Patch V3
Comment 12 WebKit Commit Bot 2014-02-06 11:55:28 PST
Comment on attachment 223346 [details]
Patch V3

Clearing flags on attachment: 223346

Committed r163553: <http://trac.webkit.org/changeset/163553>
Comment 13 WebKit Commit Bot 2014-02-06 11:55:34 PST
All reviewed patches have been landed.  Closing bug.