Bug 52983

Summary: Eliminate m_tagHistory pointer from CSSSelector
Product: WebKit Reporter: Antti Koivisto <koivisto>
Component: CSSAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: darin, hyatt, mitz, simon.fraser, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Attachments:
Description Flags
patch for review
darin: review+
better CSSParserSelector, more OwnPtrs darin: review+

Description Antti Koivisto 2011-01-23 14:42:16 PST
Eliminate m_tagHistory pointer from CSSSelector. Tag history can be maintained in an array (in CSSSelectorList) instead. This saves memory (selectors are a major memory user on some pages) and help style selector performance by improving locality.
Comment 1 Antti Koivisto 2011-01-23 15:05:38 PST
Created attachment 79876 [details]
patch for review
Comment 2 Darin Adler 2011-01-23 16:20:27 PST
Comment on attachment 79876 [details]
patch for review

View in context: https://bugs.webkit.org/attachment.cgi?id=79876&action=review

For some reason the EWS bot can’t apply this patch. You may want to find out why, because it can be useful to know if any builds are broken.

Basic technique seems good; it’s unfortunate the details get a bit messy. More smart pointer use may help a little even though we’ll need to move things in and out of the smart pointers with adopt and leak.

I think that CSSParserSelector would be better if it had functions for all the things that the grammar wants to do to a selector. There’s no reason the grammar needs to reach through it by calling selector() each time. A few inline functions for the operations done by the grammar wouldn’t be all that hard to write, and generally speaking we want to keep the grammar file as simple as possible and put as much of the complexity in code outside that file.

I’m going to say review+ because after reading all my comments none seems mandatory, but I think there is room for improvement here, and I’d really like to see EWS results.

> Source/WebCore/css/CSSGrammar.y:1218
> +            CSSParserSelector* parserSelector = p->sinkFloatingSelector($4);
> +            CSSSelector* simpleSelector = parserSelector->releaseSelector();
> +            $$->selector()->setSimpleSelector(simpleSelector);
> +            delete parserSelector;

The sinkFloatingSelector function should return a PassOwnPtr<CSSParserSelector>. Similarly, the releaseSelector function should return a PassOwnPtr<CSSSelector> and the setSimpleSelctor function should take a PassOwnPtr. Then we would not need any local variables here. It would just read:

    $$->selector()->setSimpleSelector(p->sinkFloatingSelector($4)->releaseSelector());

And all the ownership transfer would be taken care of.

> Source/WebCore/css/CSSParser.cpp:5706
> -CSSSelector* CSSParser::sinkFloatingSelector(CSSSelector* selector)
> +CSSParserSelector* CSSParser::sinkFloatingSelector(CSSParserSelector* selector)

This should return a PassOwnPtr as I mentioned above.

> Source/WebCore/css/CSSParserValues.cpp:80
> +    , m_tagHistory(0)

I find the name “tag history” confusing. But it’s not like it’s a new term, so not your problem.

> Source/WebCore/css/CSSParserValues.h:98
> +    CSSSelector* releaseSelector() { return m_selector.leakPtr(); }

This should return a PassOwnPtr.

> Source/WebCore/css/CSSParserValues.h:100
> +    void setTagHistory(CSSParserSelector* selector) { m_tagHistory = selector; }

This should take a PassOwnPtr.

> Source/WebCore/css/CSSParserValues.h:104
> +    CSSParserSelector* m_tagHistory;

This should be an OwnPtr too. You can call leakPtr on it in the destructor.

> Source/WebCore/css/CSSSelector.h:256
> +        // AtomicString is really just an AtomicStringImpl* so the cast below is safe.
> +        const AtomicString& value() const { return *reinterpret_cast<const AtomicString*>(m_hasRareData ? &m_data.m_rareData->m_value : &m_data.m_value); }

I wonder if there’s a way to avoid this. I believe we’ve had trouble with this kind of code in the past, with high optimization settings in the compiler. Maybe we can have value() just return an AtomicStringImpl*?

> Source/WebCore/css/CSSSelector.h:272
> +        void setValue(const AtomicString& value)
> +        { 
> +            if (m_hasRareData) {
> +                m_data.m_rareData->m_value = value.impl();
> +                m_data.m_rareData->m_value->ref();
> +                return;
> +            }
> +            // Need to do ref counting manually in the union.
> +            m_data.m_value = value.impl();
> +            m_data.m_value->ref();
> +        }

When a function gets this long I prefer to put the definition outside the class. Maybe we should take a pass through this class and move the function bodies out of the class. I find it makes the class way easier to read.

> Source/WebCore/css/CSSSelector.h:329
> +            RareData(AtomicStringImpl* value)
> +                : m_value(value)
> +                , m_a(0)
>                  , m_b(0)
> -                , m_tagHistory(tagHistory)
>                  , m_attribute(anyQName())
>                  , m_argument(nullAtom)
>              {
>              }
> +            ~RareData()
> +            {
> +                if (m_value)
> +                    m_value->deref();
> +            }

This seems hard to use correctly. The constructor doesn’t ref but the destructor does deref. I have no doubt the code currently handles this correctly, but I suspect it will be easy to screw this up later.

One way to do this is to make the constructor argument be a PassRefPtr<AtomicStringImpl> and then call leakPtr on it. This makes it much harder to use it incorrectly.

> Source/WebCore/css/CSSSelector.h:334
> +            AtomicStringImpl* m_value;

Why not use AtomicString or RefPtr<AtomicStringImpl> here? I don’t think a raw pointer is best. It’s easy to transfer ownership from a raw pointer, by calling adoptRef on it. Even though you need to use a raw pointer in the union, I think it’s better to not use a raw pointer here as well.

> Source/WebCore/css/CSSSelector.h:353
> -            DataUnion() : m_tagHistory(0) { }
> -            CSSSelector* m_tagHistory;
> +            DataUnion() : m_value(0) { }
>              RareData* m_rareData;
> +            AtomicStringImpl* m_value;

The old union used to initialize the first value. Now you’re initializing the second, which is a little stranger. Obviously the order doesn’t matter, but it kind of attracts the attention a tiny bit.

> Source/WebCore/css/CSSSelectorList.cpp:50
> +    const size_t vectorSize = selectorVector.size();
> +    size_t arraySize = 0;

I don’t think it’s all that helpful to distinguish the passed-in vector from the adopted one by calling the latter an array. I know it’s using malloc directly, but that seems an implementation detail. I think the internal one could be called something like “flattened” instead. Or “complete”. Or we could call it the “list” since it’s the actual list.

> Source/WebCore/css/CSSSelectorList.cpp:60
> +        delete selectorVector[0];

This makes it clear that selectorVector should be a Vector<OwnPtr<CSSParserSelector> >.

Doing the deletion by hand is trickier than it needs to be.

> Source/WebCore/css/CSSSelectorList.cpp:64
> +    m_selectorArray = reinterpret_cast<CSSSelector*>(fastMalloc(sizeof(CSSSelector) * arraySize));

This should be static_cast, not reinterpret_cast.

> Source/WebCore/css/CSSSelectorList.cpp:70
> +            memcpy(&m_selectorArray[arrayIndex], selector, sizeof(CSSSelector));

It’s dangerous to move a CSSSelector this way unless the class knows about it; some classes have pointers into themselves or other such things. If we really need this move operation, we should make a function member of the CSSSelector class to do the work. That makes it more likely that someone changing CSSSelector around won’t invalidate this operation. We can just name the function move or moveIntoUninitializedMemory or something like that and make it either a member function or static member function.

> Source/WebCore/css/CSSSelectorList.h:49
> +    static CSSSelector* next(CSSSelector* current) {
> +        while (!current->isLastInTagHistory())
> +            current++;
> +        return current->isLastInSelectorList() ? 0 : current + 1;
> +    }

Same comment about keeping long functions out of the class. I also think that the various places where we do the “+ 1” trick to find adjacent elements need “why” comments.

> Source/WebCore/css/CSSSelectorList.h:50
> +    bool hasOneSelector() const { return m_selectorArray ? !next(m_selectorArray) : false; }

I prefer && over ? : for cases like this, even thought the old code used ? :
Comment 3 Antti Koivisto 2011-01-24 20:05:54 PST
Created attachment 80009 [details]
better CSSParserSelector, more OwnPtrs

Implemented most of Darin's comments and updated to current trunk. Nasty casting trick in CSSSelector::value() is still there as trying to switch the interface to return AtomicStringImpl* resulted in a ballooning refactoring.

Requesting re-review due to significant changes.
Comment 4 WebKit Review Bot 2011-01-24 20:07:26 PST
Attachment 80009 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1

Source/WebCore/css/CSSSelectorList.cpp:68:  Local variables should never be PassOwnPtr (see http://webkit.org/coding/RefPtr.html).  [readability/pass_ptr] [5]
Source/WebCore/css/CSSSelector.cpp:701:  Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side.  [whitespace/operators] [4]
Source/WebCore/css/CSSSelector.h:257:  Extra space before last semicolon. If this should be an empty statement, use { } instead.  [whitespace/semicolon] [5]
Source/WebCore/css/CSSSelector.h:266:  The parameter name "value" adds no information, so it should be removed.  [readability/parameter_name] [5]
Source/WebCore/css/CSSParserValues.h:25:  Alphabetical sorting problem.  [build/include_order] [4]
Total errors found: 5 in 17 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 5 Antti Koivisto 2011-01-24 20:10:48 PST
> > Source/WebCore/css/CSSSelectorList.cpp:70
> > +            memcpy(&m_selectorArray[arrayIndex], selector, sizeof(CSSSelector));
> 
> It’s dangerous to move a CSSSelector this way unless the class knows about it; some classes have pointers into themselves or other such things. If we really need this move operation, we should make a function member of the CSSSelector class to do the work. That makes it more likely that someone changing CSSSelector around won’t invalidate this operation. We can just name the function move or moveIntoUninitializedMemory or something like that and make it either a member function or static member function.

I left this as is for now as CSSSelectorList and CSSSelector are already very tightly coupled.
Comment 6 Darin Adler 2011-01-25 08:46:33 PST
(In reply to comment #5)
> > > Source/WebCore/css/CSSSelectorList.cpp:70
> > > +            memcpy(&m_selectorArray[arrayIndex], selector, sizeof(CSSSelector));
> > 
> > It’s dangerous to move a CSSSelector this way unless the class knows about it; some classes have pointers into themselves or other such things. If we really need this move operation, we should make a function member of the CSSSelector class to do the work. That makes it more likely that someone changing CSSSelector around won’t invalidate this operation. We can just name the function move or moveIntoUninitializedMemory or something like that and make it either a member function or static member function.
> 
> I left this as is for now as CSSSelectorList and CSSSelector are already very tightly coupled.

I’m not convinced. It would be straightforward to do what I asked.

Anyway, I’ll review again.
Comment 7 Darin Adler 2011-01-25 08:57:47 PST
Comment on attachment 80009 [details]
better CSSParserSelector, more OwnPtrs

View in context: https://bugs.webkit.org/attachment.cgi?id=80009&action=review

Patch looks great. I’m going to say review+ even though I have suggestions for further improvement.

> Source/WebCore/css/CSSParser.cpp:5940
> +    CSSParserSelector* elementNameSelector = new CSSParserSelector;
> +    elementNameSelector->setTag(tag);
>      specifiers->setTagHistory(elementNameSelector);

The local variable here should be an OwnPtr<CSSParserSelector> and the call to setTagHistory should have a release in it. That way, the adoptPtr will be right next to the new, which is how we want it. We want an adopt right next to new as much as possible. We could even make CSSParserSelector::create(), which returns a PassOwnPtr.

> Source/WebCore/css/CSSParserValues.cpp:87
> +    Vector<CSSParserSelector*, 16> toDelete;

I’d use OwnPtr here.

> Source/WebCore/css/CSSParserValues.cpp:88
> +    CSSParserSelector* selector = m_tagHistory.leakPtr();

And OwnPtr here, and release instead of leakPtr.

> Source/WebCore/css/CSSParserValues.cpp:91
> +        toDelete.append(selector);
> +        CSSParserSelector* next = selector->m_tagHistory.leakPtr();

And get the value of next into an OwnPtr, using release instead of leakPtr. And use another release on the append line.

> Source/WebCore/css/CSSParserValues.cpp:94
> +        selector = next;

And there would be a release here.

> Source/WebCore/css/CSSParserValues.cpp:96
> +    deleteAllValues(toDelete);

And then no deleteAllValues here.

>> Source/WebCore/css/CSSParserValues.h:25
>> +#include "CSSSelector.h"
> 
> Alphabetical sorting problem.  [build/include_order] [4]

The style bot is right. The includes should go in the reverse order.

>> Source/WebCore/css/CSSSelector.h:257
>> +        const QualifiedName& tag() const { return m_tag; } ;
> 
> Extra space before last semicolon. If this should be an empty statement, use { } instead.  [whitespace/semicolon] [5]

We should remove that last semicolon on the line. No semicolon after a function body.

>> Source/WebCore/css/CSSSelector.h:266
>> +        void setValue(const AtomicString& value);
> 
> The parameter name "value" adds no information, so it should be removed.  [readability/parameter_name] [5]

I agree with the bot.

> Source/WebCore/css/CSSSelector.h:269
>          void setAttribute(const QualifiedName& value);
>          void setArgument(const AtomicString& value);
> -        void setSimpleSelector(CSSSelector* value);
> +        void setSimpleSelector(PassOwnPtr<CSSSelector> value);

No need for these “value” argument names either.

> Source/WebCore/css/CSSSelector.h:325
> +            AtomicStringImpl* m_value; // Plain pointer to keep things uniform with the union.

I still think that RefPtr would make things a little safer, but with the comment this is at least not mysterious.

>> Source/WebCore/css/CSSSelectorList.cpp:68
>> +            PassOwnPtr<CSSSelector> selector = current->releaseSelector();
> 
> Local variables should never be PassOwnPtr (see http://webkit.org/coding/RefPtr.html).  [readability/pass_ptr] [5]

This should be OwnPtr as the stylebot says.

> Source/WebCore/css/CSSSelectorList.cpp:73
> +            memcpy(&m_selectorArray[arrayIndex], selector.get(), sizeof(CSSSelector));
> +            current = current->tagHistory();
> +            // We want to free the memory (which was allocated with fastNew), but we
> +            // don't want the destructor to run since it will affect the copy we've just made.;
> +            fastDeleteSkippingDestructor(selector.leakPtr());

I’d really like to see a single operation that combines the memcpy with the fastDeleteSkippingDestructor, and have it be a member of the CSSSelector class. When the idiom is tricky like this I want it to be as tightly constrained as possible to make it hard to misuse.

> Source/WebCore/css/CSSSelectorList.cpp:82
> +    m_selectorArray[arrayIndex - 1].setLastInSelectorList();

Can the passed-in vector be empty? I see code above for size 1, but what about size 0?

> Source/WebCore/css/CSSSelectorList.h:45
> +    static CSSSelector* next(CSSSelector* current);

I don’t think the argument name “current” adds anything here.

> Source/WebCore/css/CSSSelectorList.h:64
> +inline CSSSelector* CSSSelectorList::next(CSSSelector* current)
> +{
> +    while (!current->isLastInTagHistory())
> +        current++;
> +    return current->isLastInSelectorList() ? 0 : current + 1;
> +}

I think this may need a comment. Skipping the tag history runs is a slightly confusing concept.
Comment 8 Antti Koivisto 2011-01-25 09:23:24 PST
> > Source/WebCore/css/CSSSelectorList.cpp:82
> > +    m_selectorArray[arrayIndex - 1].setLastInSelectorList();
> 
> Can the passed-in vector be empty? I see code above for size 1, but what about size 0?

These cover the case:

ASSERT(flattenedSize);
...
ASSERT(flattenedSize == arrayIndex);
Array[arrayIndex - 1].setLastInSelectorList();
Comment 9 Darin Adler 2011-01-25 09:29:27 PST
(In reply to comment #8)
> These cover the case:
> 
> ASSERT(flattenedSize);
> ...
> ASSERT(flattenedSize == arrayIndex);
> Array[arrayIndex - 1].setLastInSelectorList();

But what guarantees that assertion will be true?
Comment 10 Antti Koivisto 2011-01-25 11:50:24 PST
(In reply to comment #9)
> But what guarantees that assertion will be true?

CSS grammar should guarantee it. I'll throw a return after the assert to be sure.
Comment 11 Darin Adler 2011-01-25 11:59:20 PST
(In reply to comment #10)
> (In reply to comment #9)
> > But what guarantees that assertion will be true?
> 
> CSS grammar should guarantee it.

Good answer.

> I'll throw a return after the assert to be sure.

I don’t think you need that.
Comment 12 Antti Koivisto 2011-01-25 15:32:15 PST
http://trac.webkit.org/changeset/76648