Bug 161045

Summary: Improve parsing of the menclose notation attribute value
Product: WebKit Reporter: Frédéric Wang (:fredw) <fred.wang>
Component: MathMLAssignee: Frédéric Wang (:fredw) <fred.wang>
Status: RESOLVED FIXED    
Severity: Normal CC: alex, bfulgham, commit-queue, darin, dbarton, rego
Priority: P2    
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
Bug Depends on: 160461    
Bug Blocks:    
Attachments:
Description Flags
Patch darin: review+

Description Frédéric Wang (:fredw) 2016-08-22 09:11:10 PDT
Follow-up of bug 160461. We should also:

* parse the list of notations without allocating a vector.
* accept any whitespace as separators.
Comment 1 Frédéric Wang (:fredw) 2016-08-23 03:02:02 PDT
Created attachment 286694 [details]
Patch
Comment 2 Darin Adler 2016-08-27 18:03:47 PDT
Comment on attachment 286694 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286694&action=review

This is a fantastic change. The only thing I regret is that we should really have a helper function for this since this kind of thing is a common idiom.

> Source/WebCore/mathml/MathMLMencloseElement.cpp:109
> +    String value = attributeWithoutSynchronization(notationAttr);

This causes unnecessary reference count churn. The return value is actually "const AtomicString&", a reference to the actual string stored in the object. A better type to use here is StringView. Putting the const AtomicString& into one of those will give us a pointer into the characters stored in the attribute string without any copying or reference count churn at all.

> Source/WebCore/mathml/MathMLMencloseElement.cpp:120
> +        addNotationFlags(value.substring(start, end - start));

If value is a StringView, then value.substring also will be, saving us a lot here: 1) no copying of the characters of each of these substrings, 2) no allocating a memory block to hold the characters and the rest of the StringImpl, 3) no reference counting overhead to create and then decide to destroy the local string.
Comment 3 Frédéric Wang (:fredw) 2016-08-28 01:36:12 PDT
(In reply to comment #2)
> This is a fantastic change. The only thing I regret is that we should really
> have a helper function for this since this kind of thing is a common idiom.
> 

Yes. I see that we need some flexibility, for example some menclose notations like "box" maps to 4 bits (top, left, right, bottom) and other tabular attributes are really converted into a list (bug 160075). So I guess this would mean passing "addNotationFlags" (and maybe "isHTMLSpace") as some functions.

> > Source/WebCore/mathml/MathMLMencloseElement.cpp:109
> > +    String value = attributeWithoutSynchronization(notationAttr);
> 
> This causes unnecessary reference count churn. The return value is actually
> "const AtomicString&", a reference to the actual string stored in the
> object. A better type to use here is StringView. Putting the const
> AtomicString& into one of those will give us a pointer into the characters
> stored in the attribute string without any copying or reference count churn
> at all.
> 

OK, I think this is because I tried StringView value = ... instead of StringView value(...). The latter will work better.
Comment 4 Darin Adler 2016-08-28 01:43:09 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > This is a fantastic change. The only thing I regret is that we should really
> > have a helper function for this since this kind of thing is a common idiom.
> 
> Yes. I see that we need some flexibility, for example some menclose
> notations like "box" maps to 4 bits (top, left, right, bottom) and other
> tabular attributes are really converted into a list (bug 160075). So I guess
> this would mean passing "addNotationFlags" (and maybe "isHTMLSpace") as some
> functions.

The way I’d want to do this is to build a class that lets you iterate through a StringView in this fashion with a for loop. The way C++ for loops are written, we can make functions that return objects that use iterators like the way we do it in StringView::codeUnits(), although the function does not need to be a member of the StringView class. If we package it that way then we can write the main body of this as a simple for loop; the iterators need know nothing about addNotationFlags.

Could do one with isHTMLSpace built in, or one that takes a function as an argument.

> OK, I think this is because I tried StringView value = ... instead of
> StringView value(...).

Strange. I would have expected the former to work too, since the relevant constructor of StringView is not marked "explicit". I’d like to know what kind of error we get when we try to use that form.
Comment 5 Frédéric Wang (:fredw) 2016-08-28 01:48:10 PDT
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > This is a fantastic change. The only thing I regret is that we should really
> > > have a helper function for this since this kind of thing is a common idiom.
> > 
> > Yes. I see that we need some flexibility, for example some menclose
> > notations like "box" maps to 4 bits (top, left, right, bottom) and other
> > tabular attributes are really converted into a list (bug 160075). So I guess
> > this would mean passing "addNotationFlags" (and maybe "isHTMLSpace") as some
> > functions.
> 
> The way I’d want to do this is to build a class that lets you iterate
> through a StringView in this fashion with a for loop. The way C++ for loops
> are written, we can make functions that return objects that use iterators
> like the way we do it in StringView::codeUnits(), although the function does
> not need to be a member of the StringView class. If we package it that way
> then we can write the main body of this as a simple for loop; the iterators
> need know nothing about addNotationFlags.

Yes, that sounds better. I guess this should be a separate bug entry, though.

> Could do one with isHTMLSpace built in, or one that takes a function as an
> argument.

Using isHTMLSpace built in will be fine, I guess.

> Strange. I would have expected the former to work too, since the relevant
> constructor of StringView is not marked "explicit". I’d like to know what
> kind of error we get when we try to use that form.

../../Source/WebCore/mathml/MathMLMencloseElement.cpp:109:55: error: conversion from ‘const WTF::AtomicString’ to non-scalar type ‘WTF::StringView’ requested
     StringView value = attributeWithoutSynchronization(notationAttr);
Comment 6 Darin Adler 2016-08-28 02:19:48 PDT
(In reply to comment #5)
> I guess this should be a separate bug entry, though.

Yes, agreed.

> ../../Source/WebCore/mathml/MathMLMencloseElement.cpp:109:55: error:
> conversion from ‘const WTF::AtomicString’ to non-scalar type
> ‘WTF::StringView’ requested
>      StringView value = attributeWithoutSynchronization(notationAttr);

I see the problem. I had forgotten that AtomicString does not derive from String. The workaround for now would be to add a call to string():

    StringView value = attributeWithoutSynchronization(notationAttr).string();

A better fix for the long term will be to add another constructor to StringView:

    StringView(const AtomicString&);

It can simply call the AtomicString one:

    inline StringView::StringView(const AtomicString& string)
        : StringView(string.string())
    {
    }

Maybe later someone working on the string classes can streamline this even more in some way within the classes, but this change should be enough to make things great for people using the StringView class.
Comment 7 Darin Adler 2016-08-28 02:20:13 PDT
I said the workaround “for now”, but I meant the workaround if you don’t want to touch the classes inside WTF in this patch.
Comment 8 Frédéric Wang (:fredw) 2016-08-28 02:34:58 PDT
(In reply to comment #7)
> I said the workaround “for now”, but I meant the workaround if you don’t
> want to touch the classes inside WTF in this patch.

Thanks, I'll land with the workaround for now and will open follow-up bugs for the two non-MathML stuff tomorrow.
Comment 9 Frédéric Wang (:fredw) 2016-08-28 02:37:26 PDT
Committed r205101: <http://trac.webkit.org/changeset/205101>