# Bug 78617

Summary: MathML internals - embellished operators, getBase() accessor functions
Product: WebKit Reporter: Dave Barton <dbarton>
Component: MathML Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED
Severity: Normal CC: darin, eric, fred.wang, mitz, webkit.review.bot
Priority: P2
Version: 528+ (Nightly build)
Hardware: Unspecified
OS: Unspecified
Attachments:
Description Flags
Patch none
Patch none

Dave Barton 2012-02-14 10:20:52 PST
MathML internals - embellished operators, getBase() accessor functions

Dave Barton 2012-02-14 10:31:17 PST
Created Patch

Eric Seidel 2012-02-14 11:12:40 PST
Comment on Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=126997&action=review

> Source/WebCore/rendering/mathml/RenderMathMLBlock.h:54
> +    virtual const RenderMathMLOperator* unembellishedOperator() const { return 0; }

I don't think a const pointer will help you much here, since you won't be able to retain it. For RefCounted objects (like Node) we normally don't bother with const much.

> Source/WebCore/rendering/mathml/RenderMathMLSubSup.cpp:57
> +RenderBoxModelObject* RenderMathMLSubSup::getBase() const

Normally we don't use "get" in function names, since base() here would convey the same meaning. It's unclear to me what "base" is here?

Also, it looks like you used it right, but just to be clear, RenderBoxModelObject is the base class for all CSS-box-model renderers. This is in contrast to SVG-model renderers (which, although they use CSS, do not participate in the CSS box model), which use RenderSVGModelObject. Sometimes folks want RenderObject (to mean any kind of renderer) and sometimes they want RenderBoxModelObject (to mean only CSS-box-model renderers).

Frédéric Wang (:fredw) 2012-02-14 11:14:42 PST
I don't know the WebKit code, but here are some comments.

Be sure to consider all the elements that can be embellished operators:
http://www.w3.org/TR/MathML/chapter3.html#id.3.2.5.7.3

Some reftests for embellished op (I think WebKit supports reftests?):
http://devel.mathjax.org/testing/testsuite/MathMLToDisplay/Topics/EmbellishedOp/

In Mozilla, instead of using functions, we store pointers on the MathML frames (pointing to the core frame). Actually, we store a structure with such a pointer and other info relevant to operator stretching.
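For comparison with the stored-pointer design just described, the function-based approach in the patch under review can be pictured roughly as below. This is only a sketch under stated assumptions: `MathBox`, `OperatorBox`, `IdentifierBox` and `ScriptedBox` are hypothetical stand-ins, not WebKit's actual renderer classes, and the accessor names follow the review feedback (no "get", no const return type).

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical, simplified stand-ins for renderer classes; not WebKit's real API.
class OperatorBox;

class MathBox {
public:
    virtual ~MathBox() = default;
    // Default: this box is not an operator, embellished or otherwise.
    virtual OperatorBox* unembellishedOperator() { return nullptr; }
};

class OperatorBox : public MathBox {
public:
    // A bare operator is its own core.
    OperatorBox* unembellishedOperator() override { return this; }
};

// Stand-in for a non-operator token such as an identifier.
class IdentifierBox : public MathBox {};

// Stand-in for a sub/superscript box: the first child is the base.
class ScriptedBox : public MathBox {
public:
    ScriptedBox(std::unique_ptr<MathBox> base, std::unique_ptr<MathBox> script)
        : m_base(std::move(base)), m_script(std::move(script))
    {
        assert(m_base && m_script);
    }

    MathBox* base() const { return m_base.get(); }

    // The box is an embellished operator exactly when its base is one;
    // the script never contributes.
    OperatorBox* unembellishedOperator() override
    {
        return m_base->unembellishedOperator();
    }

private:
    std::unique_ptr<MathBox> m_base;
    std::unique_ptr<MathBox> m_script;
};
```

Each call here walks down through the bases again, whereas the stored-pointer design described above keeps the result on the frame.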
I don't know why that was done that way, but the stretchy data may depend on several elements/attributes in the MathML tree, so it may be expensive to recompute them if they are used several times. I don't know what is best for WebKit.

Eric Seidel 2012-02-14 12:26:52 PST
Translation: "frame" in Mozilla is basically "renderer" in WebKit.

Dave Barton 2012-02-14 19:25:00 PST
Comment on Patch
Thanks for all the expert comments! My wife seems to think it's Valentine's Day so I'll say more tomorrow. :-) I'll submit a revised patch after that.

WebKit Review Bot 2012-02-15 01:21:07 PST
did not pass style-queue:

Failed to run "['Tools/Scripts/update-webkit']" exit_code: 9
Updating OpenSource
From git://git.webkit.org/WebKit
 + 2ce77d7...a9730c4 master -> origin/master (forced update)
First, rewinding head to replay your work on top of it...
Applying: [Mac][Win][WK2] Switch to RFC 6455 protocol for WebSockets
Using index info to reconstruct a base tree...
:1578: trailing whitespace.
:1647: trailing whitespace.
:1657: trailing whitespace.
:1672: trailing whitespace.
return 0;
:1674: trailing whitespace.
warning: squelched 7 whitespace errors
warning: 12 lines add whitespace errors.
Falling back to patching base and 3-way merge...
warning: too many files (created: 168753 deleted: 3), skipping inexact rename detection
Auto-merging LayoutTests/ChangeLog
CONFLICT (content): Merge conflict in LayoutTests/ChangeLog
Auto-merging LayoutTests/platform/wk2/Skipped
Auto-merging Source/WebCore/ChangeLog
Auto-merging Source/WebCore/css/CSSCalculationValue.cpp
Auto-merging Source/WebCore/css/CSSCalculationValue.h
Auto-merging Source/WebCore/css/CSSParser.cpp
Auto-merging Source/WebKit/mac/ChangeLog
CONFLICT (content): Merge conflict in Source/WebKit/mac/ChangeLog
Auto-merging Source/WebKit2/ChangeLog
CONFLICT (content): Merge conflict in Source/WebKit2/ChangeLog
Auto-merging Tools/ChangeLog
CONFLICT (content): Merge conflict in Tools/ChangeLog
Failed to merge in the changes.
Patch failed at 0001 [Mac][Win][WK2] Switch to RFC 6455 protocol for WebSockets
When you have resolved this problem run "git rebase --continue".
If you would prefer to skip this patch, instead run "git rebase --skip".
To restore the original branch and stop rebasing run "git rebase --abort".
rebase refs/remotes/origin/master: command returned error: 1
Died at Tools/Scripts/update-webkit line 164.

If any of these errors are false positives, please file a bug against check-webkit-style.

Frédéric Wang (:fredw) 2012-02-15 07:57:24 PST
FYI, here is how it is implemented in Mozilla. I hope that can help you to implement it in WebKit.

First we have a data structure to describe embellished operators:
http://mxr.mozilla.org/mozilla-central/source/layout/mathml/nsIMathMLFrame.h#239

This embellish data is initialized for each frame (renderer) from bottom to top using TransmitAutomaticData. For example, it is initialized for the <mo>, based on various info (attributes, operator dictionary etc). An <msup> element takes the EmbellishData from its base (first child):
http://mxr.mozilla.org/mozilla-central/source/layout/mathml/nsMathMLmsupFrame.cpp#65
and similarly for the other cases described in the MathML REC:
http://www.w3.org/TR/MathML/chapter3.html#id.3.2.5.7.3

The non-trivial case is for mrow-like elements (mrow, mstyle, mphantom...). To handle it, we first store on the frame a boolean property indicating whether the element is "space-like":
http://www.w3.org/TR/MathML/chapter3.html#id.3.2.7.4
which is also initialized from bottom to top. So for example mtext and mspace are initialized to be space-like per the MathML REC.
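The bottom-to-top initialization described so far can be sketched as a post-order walk. This is a hedged illustration only: `Frame`, `EmbellishInfo`, `Kind` and `transmitBottomUp` are hypothetical names, the `stretchy` flag is a placeholder for the real operator-stretching info, and none of it reflects Mozilla's actual data layout. Only the operator, superscript and space-like cases mentioned above are covered.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical names throughout; this is not Mozilla's actual data layout.
struct Frame;

struct EmbellishInfo {
    Frame* core = nullptr;  // operator at the core, or nullptr if not embellished
    bool stretchy = false;  // placeholder for the operator-stretching info
};

enum class Kind { Operator, Identifier, Text, Space, Superscript };

struct Frame {
    Kind kind;
    std::vector<std::unique_ptr<Frame>> children;
    EmbellishInfo embellish;  // cached on the frame
    bool spaceLike = false;   // cached on the frame
    explicit Frame(Kind k) : kind(k) {}
};

// Post-order walk: children are initialized before their parent reads them.
void transmitBottomUp(Frame& frame)
{
    for (auto& child : frame.children)
        transmitBottomUp(*child);

    switch (frame.kind) {
    case Kind::Operator:
        frame.embellish.core = &frame;
        // Assumption: a real implementation would consult attributes and
        // the operator dictionary here instead of hard-coding a value.
        frame.embellish.stretchy = true;
        break;
    case Kind::Superscript:
        // A script element takes the data of its base (first child).
        assert(!frame.children.empty());
        frame.embellish = frame.children.front()->embellish;
        break;
    case Kind::Text:
    case Kind::Space:
        frame.spaceLike = true;
        break;
    case Kind::Identifier:
        break;
    }
}
```

Because the result is cached on each frame, later queries are a field read, at the cost of rerunning the walk whenever the tree changes.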
Finally, for mrow-like elements the EmbellishData and the space-like property are initialized at the same time using TransmitAutomaticDataForMrowLikeElement:
http://mxr.mozilla.org/mozilla-central/source/layout/mathml/nsMathMLContainerFrame.cpp#1482

Dave Barton 2012-02-15 10:27:55 PST
I'll remove the "const" and "get", and add comments to the code summarizing or referring to our discussions here.

"base" here means the element with its subscript and/or superscript, or underscript and/or overscript, omitted. In legal MathML, the result will still be MathML, hence a RenderMathMLBlock or RenderInline, both of which derive from RenderBoxModelObject. We do need to be able to work with the base, e.g. compute its offsetHeight.

I'm not an SVG expert, but you raise a basic design question for MathML in WebKit. Currently RenderMathMLBlock, which is used by most MathML elements, derives from RenderBlock. We get a lot of horizontal and vertical formatting for free that way, hit-testing, probably a lot of other things(?). Also MathML elements are probably most like inline-block elements, which are implemented by RenderBlock. However, one might argue it'd be cleaner to not derive from RenderBlock, and reimplement layout() and other functionality for some base RenderMathML class. This effort is probably beyond the time I have available though. Even more fundamental is the question of whether MathML objects obey the CSS box model, or whether they should. The MathML spec suggests that one ought to be able to style MathML with CSS in environments that support CSS, presumably including using the box model.

For now, I am indeed using a simplified definition of "embellished operator". I think it's best to simplify the current RenderMathML* classes before we add things to them.

Also, I don't agree with the larger definition, FWIW. Is there a reference explaining more fully the rationale for always treating an <mrow> of one argument, and without attributes, as a special case?
I've written a MathML generator, jqMath, and it was actually simpler to not do this. Also, checking specially for space-like elements adds complications and an unnecessary inconsistency, in my opinion. This requirement pre-dates CSS. Just as many presentational attributes in HTML were deprecated and replaced by CSS, I think many in MathML should be also, and the definition of "embellished operator" should be simplified as well. But having said all that, if the spec isn't changed, then I agree that WebKit MathML should eventually implement a mechanism similar to Mozilla's.

Thanks for the pointers to the tests. We definitely should include these tests, and/or tests from the w3c mathml committee, at some point.

This careful review from you experts is invaluable.

Dave Barton 2012-02-15 13:18:18 PST
Created Patch

Dave Barton 2012-02-17 10:38:38 PST
I asked Neil Soiffer, arguably the world's #1 authority on these matters (MathML committee major author for 15 years, Mathematica front end & IE MathPlayer implementer, authority for Frédéric Wang and others on w3c's mathml mailing list), about the rules for mrow and embellished operators in MathML and got this response:

It's hard to remember back why all of the things that are in MathML are in there. Some have good solid technical reasons and some were political compromises -- I hate mfenced, which duplicates other things and isn't as powerful and causes problems for things like CSS because drawn items are buried in attributes. Anyway...

mrows with one arg were added because authoring tool editors felt that it was more natural to always have mrows around subexpressions, such as a numerator of a fraction. Hand authoring, which was still popular (and remains popular with HTML developers to this day) pushed towards not always requiring the mrows. That's also the reason for the implicit mrows in some places such as msqrt. In the implementations I've done, I've typically "normalized" them away so the rendering code never sees it.
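One way to picture "normalizing away" single-child mrows is an explicit tree rewrite, sketched below. The `Node` type and the `normalize` function are hypothetical and invented for illustration; they are not taken from any of the implementations mentioned in this thread.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal tree node; not taken from any real implementation.
struct Node {
    std::string tag;
    std::vector<std::unique_ptr<Node>> children;
    explicit Node(std::string t) : tag(std::move(t)) {}
};

// Replace every <mrow> that has exactly one child by that child.
// Children are normalized first, so nested single-child mrows collapse too.
std::unique_ptr<Node> normalize(std::unique_ptr<Node> node)
{
    assert(node);
    for (auto& child : node->children)
        child = normalize(std::move(child));

    if (node->tag == "mrow" && node->children.size() == 1)
        return std::move(node->children.front());
    return node;
}
```

After this pass, code that lays out a fraction or a script sees the argument directly, with no wrapper to special-case.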
This normalization is either explicit (tree is changed) or virtual via a "getChild" method that checks if the child is an mrow with a single child, and if so, calls itself recursively.

The embellished operators rules are a bit complicated, but grew out of use cases. I think (not 100% positive -- I'd need to check the code), that like mrow, embellished operators were virtualized in my code in the sense that there is an "isOperator" call that checks to see if something is a script (etc), and if so, calls itself on the base of the script. In that way, they turned out not to be much of an exception to the layout logic -- any value you ask for of an operator could be asked of an embellished operator without needing to know about details of the embellished operator. I will admit that in my linebreaking code, I repeatedly forgot to call the higher level function and that led to many bugs, all of which were trivial to fix but still bugs nonetheless.

I do remember there being a lot of discussion about the embellished operator/mrow/space-like children rule, but it was too long ago to remember the rationale. If needed, I could go through the list archives...''

So it sounds like Neil's probably gone the virtual function route for tracking embellished operators, instead of the Mozilla method of adding extra data. The extra data has the drawback of needing to be updated whenever the DOM changes. I'm not sure it really saves much time in practice.

I find all this fascinating. The whole <mrow> of one argument must be equivalent in all ways to the argument itself seems crazy to me. This would be like saying that of one argument must behave like the argument alone in all ways, e.g. CSS child selector rules. This would be nuts, IMHO. To justify it by saying that the numerator argument to <mfrac> should always be an <mrow> in some editor is like saying everything inside a