Bug 160241 - Parse the operator text on the MathMLOperatorElement class
Summary: Parse the operator text on the MathMLOperatorElement class
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: MathML (show other bugs)
Version: WebKit Nightly Build
Hardware: All All
: P2 Normal
Assignee: Frédéric Wang (:fredw)
URL:
Keywords:
Depends on: 160034
Blocks: 156537 160190 160239 160245
  Show dependency treegraph
 
Reported: 2016-07-27 05:46 PDT by Frédéric Wang (:fredw)
Modified: 2016-07-29 00:23 PDT (History)
8 users (show)

See Also:


Attachments
Patch (15.77 KB, patch)
2016-07-27 06:04 PDT, Frédéric Wang (:fredw)
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Frédéric Wang (:fredw) 2016-07-27 05:46:33 PDT
Another step towards bug 156537. The MathML Operator UChar is necessary to find its entries in the operator dictionary and determine properties.

The case of anonymous mfenced operators still need to be handled specially. The idea will be to move them in a separate RenderMathMLFencedOperator class deriving from RenderMathMLOperator in a follow-up bug.
Comment 1 Frédéric Wang (:fredw) 2016-07-27 06:04:18 PDT
Created attachment 284688 [details]
Patch
Comment 2 Javier Fernandez 2016-07-28 00:52:57 PDT
Comment on attachment 284688 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=284688&action=review

> Source/WebCore/rendering/mathml/RenderMathMLOperator.h:60
> +    virtual UChar textContent() const;

Why virtual ? As far as I could see, this function is not override by any subclass.
Comment 3 Frédéric Wang (:fredw) 2016-07-28 01:01:19 PDT
(In reply to comment #2)
> Comment on attachment 284688 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=284688&action=review
> 
> > Source/WebCore/rendering/mathml/RenderMathMLOperator.h:60
> > +    virtual UChar textContent() const;
> 
> Why virtual ? As far as I could see, this function is not override by any
> subclass.

Yes, this is a mistake. I was already thinking about the next patch where it is overridden (bug 160245) but did not wanted to do it here to make the patch too big.
Comment 4 Javier Fernandez 2016-07-28 01:01:39 PDT
Comment on attachment 284688 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=284688&action=review

> Source/WebCore/rendering/mathml/RenderMathMLOperator.h:76
> +    bool isInvisibleOperator() { return 0x2061 <= textContent() && textContent() <= 0x2064; }

Perhaps not the time to think about performance, but I assume this change would make isInvisibleOperator considerably slower; instead of directly accessing an attribute, it will have to check for anonymity, accessing the associated html element and finally getting its attribute value (assumed not dirty). I make this observation just in case we may be interested on alternative implementation for this function, caching some values and making assumptions regarding anonymity.
Comment 5 Frédéric Wang (:fredw) 2016-07-28 01:13:26 PDT
(In reply to comment #4)
> Comment on attachment 284688 [details]
> Perhaps not the time to think about performance, but I assume this change
> would make isInvisibleOperator considerably slower; instead of directly
> accessing an attribute, it will have to check for anonymity, accessing the
> associated html element and finally getting its attribute value (assumed not
> dirty). I make this observation just in case we may be interested on
> alternative implementation for this function, caching some values and making
> assumptions regarding anonymity.

I believe for this and other functions in RenderMathMLOperator calling textContent() the actual parsing will only be done once during the layout (if the content changes then the whole layout should be redone anyway). The idea here and later for other operator attributes will be to cache the data on the element class rather than having them on the render class. But yes, there may be a small performance drop do to more function calls. Maybe we should use local variables to reduce them, but I did not want to change too much code in this patch.
Comment 6 Frédéric Wang (:fredw) 2016-07-28 02:00:57 PDT
(In reply to comment #4)
> 
> Perhaps not the time to think about performance, but I assume this change
> would make isInvisibleOperator considerably slower; instead of directly
> accessing an attribute, it will have to check for anonymity, accessing the
> associated html element and finally getting its attribute value (assumed not
> dirty). I make this observation just in case we may be interested on
> alternative implementation for this function, caching some values and making
> assumptions regarding anonymity.

I just realized why you meant with this anonymity. As I said in comment 3, the plan is to move the code specific to anonymous fenced operator in a derived class so we will remove all the isAnonymous() check at the end.
Comment 7 Darin Adler 2016-07-28 16:16:06 PDT
Comment on attachment 284688 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=284688&action=review

> Source/WebCore/mathml/MathMLOperatorElement.cpp:57
> +    // We collapse the whitespace and replace the hyphens by minus signs.
> +    AtomicString textContent = string.stripWhiteSpace().simplifyWhiteSpace().replace(hyphenMinus, minusSign).impl();
> +
> +    // We verify whether the operator text can be represented by a single UChar.
> +    // FIXME: This does not handle surrogate pairs (https://bugs.webkit.org/show_bug.cgi?id=122296).
> +    // FIXME: This does not handle <mo> operators with multiple characters (https://bugs.webkit.org/show_bug.cgi?id=124828).
> +    return textContent.length() == 1 ? textContent[0] : 0;

This is a really inefficient way to extract a character from a string. Skipping initial whitespace does not require calling stripWhiteSpace. Calling simplifyWhitespace has no effect here. Turning a hyphen into a minus sign would be more efficient after a single character is extracted. Handling surrogate pairs correctly is almost trivial if you use a StringView instead of a String, using the codePoints() function, but the return typeof this function would have to be UChar32, not Char.

I guess, though, that this code was just moved here, so none of those issues are new.

> Source/WebCore/mathml/MathMLOperatorElement.h:49
> +    struct OperatorText {
> +        UChar value;
> +        bool dirty { true };
> +    };
> +    OperatorText m_operatorText;

It would probably be cleaner to use Optional<UChar> here. No value means it needs to be computed. But also, should be UChar32, not Char.
Comment 8 Frédéric Wang (:fredw) 2016-07-29 00:21:31 PDT
(In reply to comment #7)
> Comment on attachment 284688 [details]
> I guess, though, that this code was just moved here, so none of those issues
> are new.

That's true. The goal in my current set of patches is to move code from rendering to DOM classes for now, so I'll just add a FIXME comment and postpone this for a latter bug entry. Now that code is in its own parsing function so it should not interfere too much with the rest of my patches.
Comment 9 Frédéric Wang (:fredw) 2016-07-29 00:23:18 PDT
Committed r203896: <http://trac.webkit.org/changeset/203896>