Bug 118053 - Parsing of MathML length
Summary: Parsing of MathML length
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: MathML (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Frédéric Wang (:fredw)
URL:
Keywords: WebExposed
Depends on:
Blocks: 115610 117305
  Show dependency treegraph
 
Reported: 2013-06-26 03:10 PDT by Frédéric Wang (:fredw)
Modified: 2014-03-10 12:26 PDT (History)
8 users (show)

See Also:


Attachments
Patch V1 (16.81 KB, patch)
2013-06-26 05:17 PDT, Frédéric Wang (:fredw)
cfleizach: review-
Details | Formatted Diff | Diff
Patch V2 (14.72 KB, patch)
2013-06-27 04:36 PDT, Frédéric Wang (:fredw)
no flags Details | Formatted Diff | Diff
Patch Final Version (14.23 KB, patch)
2013-06-27 13:25 PDT, Frédéric Wang (:fredw)
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Frédéric Wang (:fredw) 2013-06-26 03:10:43 PDT
See http://www.w3.org/TR/MathML/chapter2.html#type.length
Comment 1 Frédéric Wang (:fredw) 2013-06-26 05:17:48 PDT
Created attachment 205475 [details]
Patch V1

OK, I finally had time to come back to WebKit development today. I extracted the parsing changes from my mspace patch and added more tests for mfrac@linethickness.
Comment 2 chris fleizach 2013-06-26 12:48:01 PDT
Comment on attachment 205475 [details]
Patch V1

View in context: https://bugs.webkit.org/attachment.cgi?id=205475&action=review

thanks!

> LayoutTests/ChangeLog:12
> +        * mathml/presentation/mfrac-linethickness1-expected-mismatch.html: Added.

none of the tests have expected results

> LayoutTests/mathml/presentation/mfrac-linethickness1-expected-mismatch.html:8
> +  <div style="position: absolute; top: 0; left: 0; width: 500px; height: 150px; background: blue;"></div>

what is this testing? there is nothing about math in this test

> LayoutTests/mathml/presentation/mfrac-linethickness1.html:11
> +      <mfrac linethickness="200px">

these thickness tests should be in one line thickness test. i don't think we need separate files for each one

> Source/WebCore/rendering/mathml/RenderMathMLBlock.cpp:243
> +    // see if the negative sign is there

full sentence required for comment

> Source/WebCore/rendering/mathml/RenderMathMLBlock.cpp:256
> +            return false; // two dots encountered

comments should go on the same line. you should put this comment above the if line and make it a full sentence

> Source/WebCore/rendering/mathml/RenderMathMLBlock.cpp:257
> +        if (c == '.') {

one line if statements shouldn't use brackets

> Source/WebCore/rendering/mathml/RenderMathMLBlock.cpp:261
> +            // some authors leave blanks before the unit, but that shouldn't

comments should start with capitalized letters like full sentences

> Source/WebCore/rendering/mathml/RenderMathMLBlock.cpp:322
> +bool RenderMathMLBlock::ParseNamedSpace(const String& string, float& lengthValue, bool allowNegative)

length should probably be the return value, and there should be a parameter to confirm if it succeeded, like toIntStrict()

> Source/WebCore/rendering/mathml/RenderMathMLBlock.cpp:324
> +    float length = 0.;

don't append . for floating point literals
see "Floating point literals"
http://www.webkit.org/coding/coding-style.html

> Source/WebCore/rendering/mathml/RenderMathMLBlock.h:73
> +    bool ParseNamedSpace(const String&, float&, bool allowNegative = true);

i think both of these can be functions that are outside the class block. They don't really belong as instance methods of the class
also the names should start with lowercase

it looks like ParseNamedSpace is only used inside ParseLength… that method should be static inside the class file

> Source/WebCore/rendering/mathml/RenderMathMLFraction.cpp:82
> +        ParseLength(thickness, m_lineThickness, false);

you might want to check if the conversion succeeded before setting the value
Comment 3 Frédéric Wang (:fredw) 2013-06-26 13:04:34 PDT
(In reply to comment #2)
> (From update of attachment 205475 [details])
> none of the tests have expected results
>
> what is this testing? there is nothing about math in this test
> 
> these thickness tests should be in one line thickness test. i don't think we need separate files for each one

Thanks Chris. I don't know if you've seen my comments on bug 115610, but it seems that we have a misunderstanding about what these tests are. You seem to expect some other particular unit test format but note that I have used reftests:

http://trac.webkit.org/wiki/Writing%20Reftests

So basically the two pages

   mathml/presentation/mfrac-linethickness1.html
   mathml/presentation/mfrac-linethickness1-expected-mismatch.html
  
form one != reftests that is the test passes iff they don't render the same while the pair

   mathml/presentation/mfrac-linethickness2.html
   mathml/presentation/mfrac-linethickness2-expected-mismatch.html

forms one == reftest that is the test passes iff the two pages render exactly the same. Similarly for -3 -4 and -5.

Also note that by design it is clear that we can not group == and != reftests in the same test. It's perhaps less obvious, but we can not group != reftests in the same test either. However, I can group the == reftests in the same test if you want.
Comment 4 Frédéric Wang (:fredw) 2013-06-26 13:38:07 PDT
(In reply to comment #2)
> it looks like ParseNamedSpace is only used inside ParseLength… that method should be static inside the class file

As indicated in bug 115610 comment 5, I've made two separated functions because of mpadded attribute parsing which require calling the two functions plus doing some other stuff with pseudo-units. However, this is not implemented in WebKit yet, so I can make ParseNamedSpace static for the moment.

> 
> > Source/WebCore/rendering/mathml/RenderMathMLFraction.cpp:82
> > +        ParseLength(thickness, m_lineThickness, false);
> 
> you might want to check if the conversion succeeded before setting the value

Not sure what you mean here... m_lineThickness = gLineMedium sets m_lineThickness to the default value. If ParseLength fails, then the attribute is ignored and the default value is used. If ParseLength succeeds, then m_lineThickness is set to the parsed value (perhaps a multiple of the specified default for % and unitless). So the boolean success result is not really used here, but I think it may be useful in other situations.

I'd like to have clarification on what you expect exactly for the tests and about this last point before submitting a new version.
Comment 5 chris fleizach 2013-06-26 13:59:59 PDT
(In reply to comment #4)
> (In reply to comment #2)
> > it looks like ParseNamedSpace is only used inside ParseLength… that method should be static inside the class file
> 
> As indicated in bug 115610 comment 5, I've made two separated functions because of mpadded attribute parsing which require calling the two functions plus doing some other stuff with pseudo-units. However, this is not implemented in WebKit yet, so I can make ParseNamedSpace static for the moment.
> 
> > 
> > > Source/WebCore/rendering/mathml/RenderMathMLFraction.cpp:82
> > > +        ParseLength(thickness, m_lineThickness, false);
> > 
> > you might want to check if the conversion succeeded before setting the value
> 
> Not sure what you mean here... m_lineThickness = gLineMedium sets m_lineThickness to the default value. If ParseLength fails, then the attribute is ignored and the default value is used. If ParseLength succeeds, then m_lineThickness is set to the parsed value (perhaps a multiple of the specified default for % and unitless). So the boolean success result is not really used here, but I think it may be useful in other situations.
> 
> I'd like to have clarification on what you expect exactly for the tests and about this last point before submitting a new version.

(In reply to comment #3)
> (In reply to comment #2)
> > (From update of attachment 205475 [details] [details])
> > none of the tests have expected results
> >
> > what is this testing? there is nothing about math in this test
> > 
> > these thickness tests should be in one line thickness test. i don't think we need separate files for each one
> 
> Thanks Chris. I don't know if you've seen my comments on bug 115610, but it seems that we have a misunderstanding about what these tests are. You seem to expect some other particular unit test format but note that I have used reftests:
> 
> http://trac.webkit.org/wiki/Writing%20Reftests
> 
> So basically the two pages
> 
>    mathml/presentation/mfrac-linethickness1.html
>    mathml/presentation/mfrac-linethickness1-expected-mismatch.html
> 
> form one != reftests that is the test passes iff they don't render the same while the pair
> 
>    mathml/presentation/mfrac-linethickness2.html
>    mathml/presentation/mfrac-linethickness2-expected-mismatch.html
> 
> forms one == reftest that is the test passes iff the two pages render exactly the same. Similarly for -3 -4 and -5.
> 
> Also note that by design it is clear that we can not group == and != reftests in the same test. It's perhaps less obvious, but we can not group != reftests in the same test either. However, I can group the == reftests in the same test if you want.

I see. 

I don't know if this is the best way to test these features. For example, the veryverythinspace would pass as long as it's different from the default thickness. 

But if the parsing logic changed and this value became different somehow, it could still pass as long as it was different.

I feel like having expected results for these will result in more strict tests
Comment 6 chris fleizach 2013-06-26 14:04:51 PDT
(In reply to comment #4)
> (In reply to comment #2)
> > it looks like ParseNamedSpace is only used inside ParseLength… that method should be static inside the class file
> 
> As indicated in bug 115610 comment 5, I've made two separated functions because of mpadded attribute parsing which require calling the two functions plus doing some other stuff with pseudo-units. However, this is not implemented in WebKit yet, so I can make ParseNamedSpace static for the moment.

That's fine then to leave as is

> 
> > 
> > > Source/WebCore/rendering/mathml/RenderMathMLFraction.cpp:82
> > > +        ParseLength(thickness, m_lineThickness, false);
> > 
> > you might want to check if the conversion succeeded before setting the value
> 
> Not sure what you mean here... m_lineThickness = gLineMedium sets m_lineThickness to the default value. If ParseLength fails, then the attribute is ignored and the default value is used. If ParseLength succeeds, then m_lineThickness is set to the parsed value (perhaps a multiple of the specified default for % and unitless). So the boolean success result is not really used here, but I think it may be useful in other situations.

Yes I see that. the question is whether the function should return the result  or return success.

I think it could go either way, but I see that other methods return the result instead of success.

As to not checking the result value -- it obviously works, but it's possible someone in the future might set the value to zero on a failure and this might fail, so if the method returns a result you should probably check it for completeness

> 
> I'd like to have clarification on what you expect exactly for the tests and about this last point before submitting a new version.
Comment 7 Frédéric Wang (:fredw) 2013-06-26 14:23:14 PDT
> I see. 
> 
> I don't know if this is the best way to test these features. For example, the veryverythinspace would pass as long as it's different from the default thickness. 
> 
> But if the parsing logic changed and this value became different somehow, it could still pass as long as it was different.
> 
> I feel like having expected results for these will result in more strict tests

Here veryverythinspace is a == test so that should not happen, although what you say is true with != tess. What can happen is if the linethickness parsing is if completely broken (e.g. always use the default value) then == tests will always pass. However, the idea is to combine several != and == reftests and to rely on other features that are expected to be stable (e.g. CSS or HTML). This generally gives a good test coverage and using image reference is not needed. The tests for mspace would be important here as they test against CSS/HTML.

Strict tests are sometimes painful, especially for MathML where the exact pixel rendering is not defined as opposed to e.g. SVG or CSS. I've followed the WebKit MathML developments in the last few years and basically each time some tweaking is done on the rendering (e.g. modify the spacing in formulas), it breaks all the tests that have an image reference. In that situation, reftests are much more robust. The Mozilla and MathJax projects really only use these formats for layout tests.
Comment 8 Frédéric Wang (:fredw) 2013-06-26 14:29:21 PDT
> Yes I see that. the question is whether the function should return the result  or return success.
> 
> I think it could go either way, but I see that other methods return the result instead of success.
> 

The point is that the default value is passed as input (as it is used for % and unitless) and replaced by the computed value. If you use it as a return value you need one additional parameter. For consistency, I've kept the same signature for ParseNamedSpace, although it does not use the default value.

> As to not checking the result value -- it obviously works, but it's possible someone in the future might set the value to zero on a failure and this might fail, so if the method returns a result you should probably check it for completeness

So that would mean doing a noop by setting the variable to the default value again. Well, I can do that if you really want but the point was to save some code since the default value is used by % and unitless anyway. Or I can also just remove the boolean success. In Mozilla it is used to display parsing errors in the console, but I'm not sure if I'll need it in WebKit at the moment.
Comment 9 chris fleizach 2013-06-26 14:34:52 PDT
(In reply to comment #8)
> > Yes I see that. the question is whether the function should return the result  or return success.
> > 
> > I think it could go either way, but I see that other methods return the result instead of success.
> > 
> 
> The point is that the default value is passed as input (as it is used for % and unitless) and replaced by the computed value. If you use it as a return value you need one additional parameter. For consistency, I've kept the same signature for ParseNamedSpace, although it does not use the default value.

Yes i understand. I think we're trying to achieve consistency with other code in WebKit in this instance in regards to result in pass-by-value or result in result.

> 
> > As to not checking the result value -- it obviously works, but it's possible someone in the future might set the value to zero on a failure and this might fail, so if the method returns a result you should probably check it for completeness
> 
> So that would mean doing a noop by setting the variable to the default value again. Well, I can do that if you really want but the point was to save some code since the default value is used by % and unitless anyway. Or I can also just remove the boolean success. In Mozilla it is used to display parsing errors in the console, but I'm not sure if I'll need it in WebKit at the moment.

It's probably sufficient to note in the comment that if parsing fails then value will not be altered
Comment 10 chris fleizach 2013-06-26 14:35:28 PDT
(In reply to comment #7)
> > I see. 
> > 
> > I don't know if this is the best way to test these features. For example, the veryverythinspace would pass as long as it's different from the default thickness. 
> > 
> > But if the parsing logic changed and this value became different somehow, it could still pass as long as it was different.
> > 
> > I feel like having expected results for these will result in more strict tests
> 
> Here veryverythinspace is a == test so that should not happen, although what you say is true with != tess. What can happen is if the linethickness parsing is if completely broken (e.g. always use the default value) then == tests will always pass. However, the idea is to combine several != and == reftests and to rely on other features that are expected to be stable (e.g. CSS or HTML). This generally gives a good test coverage and using image reference is not needed. The tests for mspace would be important here as they test against CSS/HTML.
> 
> Strict tests are sometimes painful, especially for MathML where the exact pixel rendering is not defined as opposed to e.g. SVG or CSS. I've followed the WebKit MathML developments in the last few years and basically each time some tweaking is done on the rendering (e.g. modify the spacing in formulas), it breaks all the tests that have an image reference. In that situation, reftests are much more robust. The Mozilla and MathJax projects really only use these formats for layout tests.

Fair enough.
Comment 11 Frédéric Wang (:fredw) 2013-06-26 23:13:10 PDT
I can do whatever to preserve consistency with other code in WebKit but my point is that this MathML parsing is a bit special because of how the default value is handled. So basically do we want to do

float v;
float v2 = ParseLength(s, default value, b);
if (b) {
  v = v2;
  [...]
} [ else {
  ...
}]

or just

float v = default value;
[if] ParseLength(s, v, b) [... else ...]

where the brackets are optional parts when we really do something with the boolean success/failure. I personally prefer the concise form, especially if we are to use the function everywhere in the MathML code. I just did a grep on Gecko, and it's true that it is not used that much but that's still about 20 times (and more will probably come with implementation of e.g. elementary math):

nsMathMLFrame.cpp:nsMathMLFrame::ParseNumericValue(const nsString&   aString,
nsMathMLFrame.cpp:  if (!nsMathMLElement::ParseNumericValue(aString, cssValue, aFlags,
nsMathMLmfracFrame.cpp:      ParseNumericValue(aThicknessAttribute, &lineThickness,
nsMathMLmmultiscriptsFrame.cpp:    ParseNumericValue(value, &mSubScriptShift, 0, PresContext(),
nsMathMLmmultiscriptsFrame.cpp:    ParseNumericValue(value, &mSupScriptShift, 0, PresContext(),
nsMathMLmoFrame.cpp:    if (nsMathMLElement::ParseNumericValue(value, cssValue, 0,
nsMathMLmoFrame.cpp:    if (nsMathMLElement::ParseNumericValue(value, cssValue, 0,
nsMathMLmoFrame.cpp:    if (nsMathMLElement::ParseNumericValue(value, cssValue,
nsMathMLmoFrame.cpp:    if (nsMathMLElement::ParseNumericValue(value, cssValue,
nsMathMLmpaddedFrame.cpp:    if (nsMathMLElement::ParseNumericValue(number, aCSSValue, 
nsMathMLmspaceFrame.cpp:    ParseNumericValue(value, &mWidth,
nsMathMLmspaceFrame.cpp:    ParseNumericValue(value, &mHeight, 0,
nsMathMLmspaceFrame.cpp:    ParseNumericValue(value, &mDepth, 0,
nsMathMLmsubFrame.cpp:    ParseNumericValue(value, &subScriptShift, 0, PresContext(), mStyleContext);
nsMathMLmsubsupFrame.cpp:    ParseNumericValue(value, &subScriptShift, 0, PresContext(), mStyleContext);
nsMathMLmsubsupFrame.cpp:    ParseNumericValue(value, &supScriptShift, 0, PresContext(), mStyleContext);
nsMathMLmsupFrame.cpp:    ParseNumericValue(value, &supScriptShift, 0, PresContext(), mStyleContext);
nsMathMLElement.cpp:      ParseNumericValue(value->GetStringValue(), *scriptMinSize,
nsMathMLElement.cpp:      if (!ParseNumericValue(str, *fontSize, PARSE_SUPPRESS_WARNINGS |
nsMathMLElement.cpp:        ParseNumericValue(value->GetStringValue(), *width, 0,
Comment 12 Frédéric Wang (:fredw) 2013-06-26 23:18:41 PDT
Actually it is even

bool b;
float v;
float v2 = ParseLength(s, default value, b);
if (b) {
  v = v2;
  [...]
} [ else {
  ...
}]

or just

float v = default value;
[if] ParseLength(s, v) [... else ...]
Comment 13 chris fleizach 2013-06-26 23:24:21 PDT
(In reply to comment #12)
> Actually it is even
> 
> bool b;
> float v;
> float v2 = ParseLength(s, default value, b);
> if (b) {
>   v = v2;
>   [...]
> } [ else {
>   ...
> }]
> 
> or just
> 
> float v = default value;
> [if] ParseLength(s, v) [... else ...]

    Right, this isn't that bad. 
I'm ok with having it return success. If you think it will cut down on code bloat.

This method name could probably be more descriptive.

parseMathMLUnitLength() or something to that effect
Comment 14 Frédéric Wang (:fredw) 2013-06-27 04:36:29 PDT
Created attachment 205585 [details]
Patch V2
Comment 15 chris fleizach 2013-06-27 12:24:39 PDT
Comment on attachment 205585 [details]
Patch V2

View in context: https://bugs.webkit.org/attachment.cgi?id=205585&action=review

otherwise this looks good

> Source/WebCore/rendering/mathml/RenderMathMLBlock.h:131
> +//

this comment can likely be removed. you can probably name the parameters and their uses will be self evident. certainly style and allowNegative need no introduction

> Source/WebCore/rendering/mathml/RenderMathMLBlock.h:146
> +bool parseNamedSpace(const String&, float&, const RenderStyle*, bool allowNegative = true);

parseNamedSpace should probably be named similar to parseMathMLLength (parseMathMLNamedSpaceUnit or something)
Comment 16 Frédéric Wang (:fredw) 2013-06-27 13:25:11 PDT
Created attachment 205626 [details]
Patch Final Version
Comment 17 WebKit Commit Bot 2013-06-27 17:15:30 PDT
Comment on attachment 205626 [details]
Patch Final Version

Clearing flags on attachment: 205626

Committed r152140: <http://trac.webkit.org/changeset/152140>
Comment 18 WebKit Commit Bot 2013-06-27 17:15:33 PDT
All reviewed patches have been landed.  Closing bug.
Comment 19 Benjamin Poulain 2013-06-30 23:37:27 PDT
Comment on attachment 205626 [details]
Patch Final Version

View in context: https://bugs.webkit.org/attachment.cgi?id=205626&action=review

I am unfamiliar with the MathML codebase, but this code is really weird.

The parsing is non-standard (for something that looks a lot like our CSS parsing). But the weirdest thing is having parsing done by a _render object_!
Why is this made like that?

> Source/WebCore/rendering/mathml/RenderMathMLBlock.cpp:233
> +    int stringLength = s.length();

This should be unsigned.
Comment 20 Frédéric Wang (:fredw) 2013-07-18 06:41:24 PDT
(In reply to comment #19)
> 
> I am unfamiliar with the MathML codebase, but this code is really weird.
> 
> The parsing is non-standard (for something that looks a lot like our CSS parsing). But the weirdest thing is having parsing done by a _render object_!
> Why is this made like that?

I'm not sure what you mean by non-standard, this is how it is defined in the MathML3 spec and how it is implemented in Gecko and MathJax. A special parsing function is needed anyway for MathML particularities: the interpretation of % as a multiple of the default value, the unitless notation and namedspace values.

I wanted to reuse the existing code for the things that are in common, but the MathML code in WebKit used float and it was not clear to me how to handle that. After bug 118601, the length are now LayoutUnits so I guess that would be possible to directly specify the unit rather than doing some computations.

I don't think that having the parsing functions on the renderer object is fundamental, they can be moved to e.g. WebCore/mathml/ I guess. However, some of the MathML attributes are not mapped to CSS properties so I think the parsing functions must be called by the renderer objects anyway.

As a comparison, Gecko uses specific parsing function for MathML, store them on the content/ side, call them from the layout/ side (or from the content/ side for those mapped to CSS properties) and returns the result as a C++ object representing CSS units.
Comment 21 Frédéric Wang (:fredw) 2013-07-19 03:10:23 PDT
So I looked into this more carefully and LayoutUnit is actually essentially int so that's not what I thought (value + unit). I think the MathML parsing functions could be moved to CSSPrimitiveValue.cpp so that they can call conversionToCanonicalUnitsScaleFactor.
Comment 22 Benjamin Poulain 2013-07-19 03:18:04 PDT
(In reply to comment #20)
> (In reply to comment #19)
> > 
> > I am unfamiliar with the MathML codebase, but this code is really weird.
> > 
> > The parsing is non-standard (for something that looks a lot like our CSS parsing). But the weirdest thing is having parsing done by a _render object_!
> > Why is this made like that?
> 
> I'm not sure what you mean by non-standard, this is how it is defined in the MathML3 spec and how it is implemented in Gecko and MathJax. A special parsing function is needed anyway for MathML particularities: the interpretation of % as a multiple of the default value, the unitless notation and namedspace values.
> 
> I wanted to reuse the existing code for the things that are in common, but the MathML code in WebKit used float and it was not clear to me how to handle that. After bug 118601, the length are now LayoutUnits so I guess that would be possible to directly specify the unit rather than doing some computations.

Oh, I did not mean against the W3C standards.

Non-standard in the sense that it does not use WebKit's standard primitive - nor the architecture WebKit uses for parsers.

> I don't think that having the parsing functions on the renderer object is fundamental, they can be moved to e.g. WebCore/mathml/ I guess. However, some of the MathML attributes are not mapped to CSS properties so I think the parsing functions must be called by the renderer objects anyway.

That is fine. The render object can know about its Node, and it could query the line thickness on it.

I'd appreciate if you do follow ups to clean this change.
Comment 23 Benjamin Poulain 2013-07-19 03:18:47 PDT
(In reply to comment #21)
> So I looked into this more carefully and LayoutUnit is actually essentially int so that's not what I thought (value + unit). I think the MathML parsing functions could be moved to CSSPrimitiveValue.cpp so that they can call conversionToCanonicalUnitsScaleFactor.

That sounds even better if that can be made in a clean way.
Comment 24 Frédéric Wang (:fredw) 2014-03-10 12:26:03 PDT
Mass change: add WebExposed keyword to help MDN documentation.