Bug 160462

Summary: Share and improve extraction of character for operator and token elements
Product: WebKit Reporter: Frédéric Wang (:fredw) <fred.wang>
Component: MathMLAssignee: Frédéric Wang (:fredw) <fred.wang>
Status: RESOLVED FIXED    
Severity: Normal CC: bfulgham, commit-queue, darin, dbarton, esprehn+autocc, glenn, kondapallykalyan
Priority: P2    
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
Bug Depends on: 160172    
Bug Blocks: 122296, 153984    
Attachments:
Description Flags
Patch
darin: review+
Patch none

Description Frédéric Wang (:fredw) 2016-08-02 13:16:02 PDT
See bug 160241 comment 7

MathMLOperatorElement.cpp:    AtomicString textContent = string.stripWhiteSpace().simplifyWhiteSpace().replace(hyphenMinus, minusSign).impl();
RenderMathMLToken.cpp:    AtomicString textContent = element().textContent().stripWhiteSpace().simplifyWhiteSpace();
Comment 1 Frédéric Wang (:fredw) 2016-08-03 03:59:02 PDT
Created attachment 285217 [details]
Patch
Comment 2 Darin Adler 2016-08-05 23:27:03 PDT
Comment on attachment 285217 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=285217&action=review

> Source/WebCore/mathml/MathMLElement.cpp:401
> -static inline StringView skipLeadingAndTrailingWhitespace(const StringView& stringView)
> +StringView MathMLElement::skipLeadingAndTrailingWhitespace(const StringView& stringView)

I don’t think this is the right thing to do. We should move this into the shared HTMLParserIdioms header instead. Please come back and do that eventually.

Should be named strip, not skip.

> Source/WebCore/mathml/MathMLOperatorElement.cpp:57
> +            UChar character = static_cast<UChar>(codePoint.value());

This should compile without the typecast.

> Source/WebCore/mathml/MathMLTextElement.cpp:100
> +    auto codePoints = skipLeadingAndTrailingWhitespace(string).codePoints();

Why the need to strip whitespace? Does the specification really call for this?

> Source/WebCore/mathml/MathMLTextElement.cpp:104
> +    Optional<UChar32> character = *iterator;

Should just use UChar32 or auto here. I think the code below will compile without this already being an optional.
Comment 3 Frédéric Wang (:fredw) 2016-08-21 03:18:48 PDT
(In reply to comment #2)
> I don’t think this is the right thing to do. We should move this into the
> shared HTMLParserIdioms header instead. Please come back and do that
> eventually.

Yes, that was my plan. Making this depends on bug 160172.


> > Source/WebCore/mathml/MathMLTextElement.cpp:100
> > +    auto codePoints = skipLeadingAndTrailingWhitespace(string).codePoints();
> 
> Why the need to strip whitespace? Does the specification really call for
> this?

Yes, unfortunately. Again see https://lists.w3.org/Archives/Public/www-math/2016Aug/0000.html

Note that at the moment we have inconsistent behavior: we trim whitespace for <mi> x </mi> (this case) but not for <mi> sin </mi> (bug 125628). As I said on the Math WG mailing list I think this requirement is useless ; nobody adds such spaces in MathML formulas used in practice. I would suggest to just depart from the spec here and ignore this requirement.
Comment 4 Frédéric Wang (:fredw) 2016-08-23 05:29:02 PDT
(In reply to comment #2)
> > Source/WebCore/mathml/MathMLTextElement.cpp:104
> > +    Optional<UChar32> character = *iterator;
> 
> Should just use UChar32 or auto here. I think the code below will compile
> without this already being an optional.

On Linux, I gcc complains with

Source/WebCore/mathml/MathMLTokenElement.cpp:99:41: error: operands to ?: have different types 'UChar32 {aka int}' and 'const WTF::NulloptTag'
Comment 5 Frédéric Wang (:fredw) 2016-08-23 05:31:06 PDT
Created attachment 286704 [details]
Patch
Comment 6 Frédéric Wang (:fredw) 2016-08-23 06:38:15 PDT
Committed r204830: <http://trac.webkit.org/changeset/204830>