Bug 147979

Summary: Refactor HTMLCollection to be as fast as CachedLiveNodeList
Product: WebKit Reporter: Chris Dumez <cdumez>
Component: DOMAssignee: Chris Dumez <cdumez>
Status: RESOLVED FIXED    
Severity: Normal CC: barraclough, commit-queue, kling, koivisto, rniwa
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 148058    
Bug Blocks: 110611, 147980    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch none

Description Chris Dumez 2015-08-13 09:22:23 PDT
Refactor HTMLCollection to be as fast as CachedLiveNodeList. This is in preparation of having getElementsByTagName*() / getElementsByClassName() return an HTMLCollection instead of a NodeList, as per the specification. Chrome and Firefox already match the specification in this case.

HTMLCollection currently has a lot of branching which makes iterating over it slower than over a CachedLiveNodeList.
Comment 1 Chris Dumez 2015-08-13 22:04:11 PDT
Created attachment 258979 [details]
Patch
Comment 2 Ryosuke Niwa 2015-08-14 01:21:57 PDT
Comment on attachment 258979 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=258979&action=review

I've just done the first pass through but I'm going to read through it again tomorrow.

> Source/WebCore/ChangeLog:9
> +        preparation of having getElementsByTagName*() / getElementsByClassName()

What's up with * between Name and ()?

> Source/WebCore/dom/ContainerNode.cpp:906
> -    return ensureCachedHTMLCollection(NodeChildren);
> +    return ensureRareData().ensureNodeLists().addCachedCollection<GenericCachedHTMLCollection<CollectionTypeTraits<NodeChildren>::traversalType>>(*this, NodeChildren);

We should make children() faster at some point.

> Source/WebCore/html/CachedHTMLCollection.h:89
> +template <typename HTMLCollectionClass, CollectionTraversalType traversalType>
> +auto CachedHTMLCollection<HTMLCollectionClass, traversalType>::collectionBegin() const -> IteratorType
> +{
> +    return CollectionTraversal<traversalType>::begin(static_cast<const HTMLCollectionClass&>(*this), rootNode());
> +}

Why don't we put these one-liners into the class declaration?
We can avoid duplicating the template boilerplate everywhere.

> Source/WebCore/html/CachedHTMLCollection.h:156
> +static inline bool nameShouldBeVisibleInDocumentAll(HTMLElement& element)

I'd call this IsNameAttrNamedProperty instead (named property is the WebIDL term for named getters).
Comment 3 Chris Dumez 2015-08-14 09:13:09 PDT
Comment on attachment 258979 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=258979&action=review

>> Source/WebCore/ChangeLog:9
>> +        preparation of having getElementsByTagName*() / getElementsByClassName()
> 
> What's up with * between Name and ()?

wild card so it matches getElementsByTagNameNS() as well as  getElementsByTagName().

>> Source/WebCore/html/CachedHTMLCollection.h:89
>> +}
> 
> Why don't we put these one-liners into the class declaration?
> We can avoid duplicating the template boilerplate everywhere.

Hmm, we can. The templates do add a lot of boilerplate.

>> Source/WebCore/html/CachedHTMLCollection.h:156
>> +static inline bool nameShouldBeVisibleInDocumentAll(HTMLElement& element)
> 
> I'd call this IsNameAttrNamedProperty instead (named property is the WebIDL term for named getters).

Note that I merely moved this function in this, it isn't new. Also note that this function is only used for document.all but your name proposal seems very generic.
We basically want something like "shouldHTMLAllCollectionNamedGetterReturnElementByName(element)" I think.
Comment 4 Chris Dumez 2015-08-14 09:40:49 PDT
Created attachment 259004 [details]
Patch
Comment 5 Chris Dumez 2015-08-14 09:43:29 PDT
Created attachment 259005 [details]
Patch
Comment 6 Antti Koivisto 2015-08-14 10:23:24 PDT
Comment on attachment 258979 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=258979&action=review

> Source/WebCore/html/CachedHTMLCollection.h:56
> +    // For CollectionIndexCache; do not use elsewhere.
> +    IteratorType collectionBegin() const;
> +    IteratorType collectionLast() const;
> +    IteratorType collectionEnd() const;
> +    void collectionTraverseForward(IteratorType&, unsigned count, unsigned& traversedCount) const;
> +    void collectionTraverseBackward(IteratorType&, unsigned count) const;
> +    bool collectionCanTraverseBackward() const;
> +    void willValidateIndexCache() const;

Not really this patch but it would be nice to rename willValidateIndexCache to something that follows the pattern (collectionWillValidateIndexCache or similar).

> Source/WebCore/html/CollectionType.h:70
> +enum class CollectionTraversalType { Descendants, ChildrenOnly, CustomForwardOnly };
> +template<CollectionType collectionType>
> +struct CollectionTypeTraits {
> +    static const CollectionTraversalType traversalType = CollectionTraversalType::Descendants;
> +};
> +
> +template<>
> +struct CollectionTypeTraits<NodeChildren> {
> +    static const CollectionTraversalType traversalType = CollectionTraversalType::ChildrenOnly;
> +};
> +
> +template<>
> +struct CollectionTypeTraits<TRCells> {
> +    static const CollectionTraversalType traversalType = CollectionTraversalType::ChildrenOnly;
> +};

Wonder if C++14 constexprs could be used to reduce this sort boilerplate (when that is ok'd).

> Source/WebCore/html/GenericCachedHTMLCollection.cpp:40
> +    switch (Base::type()) {

type() should be sufficient.

> Source/WebCore/html/GenericCachedHTMLCollection.cpp:82
> +    case DocAll:
> +    case DocumentNamedItems:
> +    case FormControls:
> +    case SelectOptions:
> +    case TableRows:
> +    case WindowNamedItems:
> +        break;
> +    }
> +    ASSERT_NOT_REACHED();

It would be good to add a comment that these have non-generic handling for performance reasons and some other types may need it too.

> Source/WebCore/html/HTMLFormControlsCollection.h:46
> +    using Base = CachedHTMLCollection<HTMLFormControlsCollection, CollectionTypeTraits<FormControls>::traversalType>;

Why are these Base typedefs needed? They don't seem to save much (any?) characters and I can't see how they are necessary for functionality. Additional type makes code harder to understand.

If they are needed can it be defined in the CachedHTMLCollection template instead for all subclasses?

> Source/WebCore/html/HTMLNameCollection.h:51
> +    : CachedHTMLCollection<HTMLCollectionClass, traversalType>(document, type)

You could use Base here since defined it.
Comment 7 Antti Koivisto 2015-08-14 10:23:47 PDT
(the comments are for the original patch)
Comment 8 Chris Dumez 2015-08-14 10:28:12 PDT
Comment on attachment 258979 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=258979&action=review

>> Source/WebCore/html/GenericCachedHTMLCollection.cpp:40
>> +    switch (Base::type()) {
> 
> type() should be sufficient.

No, this would not build because the parent class is templated. I need to use Base::type() (or I think this->type() would work as well).
Comment 9 Chris Dumez 2015-08-14 12:20:32 PDT
Created attachment 259021 [details]
Patch
Comment 10 Chris Dumez 2015-08-14 12:22:24 PDT
Comment on attachment 258979 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=258979&action=review

>>> Source/WebCore/html/CachedHTMLCollection.h:89
>>> +}
>> 
>> Why don't we put these one-liners into the class declaration?
>> We can avoid duplicating the template boilerplate everywhere.
> 
> Hmm, we can. The templates do add a lot of boilerplate.

Done in the latest iteration.

>>> Source/WebCore/html/GenericCachedHTMLCollection.cpp:40
>>> +    switch (Base::type()) {
>> 
>> type() should be sufficient.
> 
> No, this would not build because the parent class is templated. I need to use Base::type() (or I think this->type() would work as well).

I use this->type() in the latest iteration.

>> Source/WebCore/html/HTMLFormControlsCollection.h:46
>> +    using Base = CachedHTMLCollection<HTMLFormControlsCollection, CollectionTypeTraits<FormControls>::traversalType>;
> 
> Why are these Base typedefs needed? They don't seem to save much (any?) characters and I can't see how they are necessary for functionality. Additional type makes code harder to understand.
> 
> If they are needed can it be defined in the CachedHTMLCollection template instead for all subclasses?

I dropped them in the latest iteration.
Comment 11 Chris Dumez 2015-08-14 12:26:56 PDT
Created attachment 259023 [details]
Patch
Comment 12 Chris Dumez 2015-08-14 12:29:23 PDT
Created attachment 259026 [details]
Patch
Comment 13 Chris Dumez 2015-08-14 12:30:29 PDT
Comment on attachment 258979 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=258979&action=review

Made a build fix for EFL.

>> Source/WebCore/html/GenericCachedHTMLCollection.cpp:82
>> +    ASSERT_NOT_REACHED();
> 
> It would be good to add a comment that these have non-generic handling for performance reasons and some other types may need it too.

Added comment in latest iteration.
Comment 14 Ryosuke Niwa 2015-08-14 17:41:08 PDT
Comment on attachment 259026 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=259026&action=review

Sorry for the delay. It took me a while to comprehend the entirely of the patch.

> Source/WebCore/dom/LiveNodeList.h:103
> +    using Traversal = CollectionTraversal<CollectionTraversalType::Descendants>;

I find Traversal:: shorthands to be confusing because it looks overly generic.
If anything, we might wanna simply define DescendantsCollectionTraversal instead inside CollectionTraversal.h

> Source/WebCore/html/CachedHTMLCollection.h:38
> +    using Traversal = CollectionTraversal<traversalType>;

Ditto about this shorthand. I think it's fine to repeat CollectionTraversal<traversalType> everywhere instead.

> Source/WebCore/html/CachedHTMLCollection.h:39
> +    using IteratorType = typename Traversal::IteratorType;

"IteratorType" seems overly generic. How about CollectionTraversalIterator?

> Source/WebCore/html/CachedHTMLCollection.h:86
> +    document().registerCollection(const_cast<CachedHTMLCollection<HTMLCollectionClass, traversalType>&>(*this));

It seems like this one-liner can also be defined inside the declaration.

> Source/WebCore/html/CachedHTMLCollection.h:102
> +    // We should call the elementMatches() method directly on the subclass instead for performance.

"We should call" sounds as if this is a FIXME to address.
Not don't we just say "We call" instead?

> Source/WebCore/html/CollectionTraversal.h:96
> +template <typename CollectionClass>
> +inline ElementDescendantIterator CollectionTraversal<CollectionTraversalType::Descendants>::begin(const CollectionClass& collection, ContainerNode& rootNode)

I'd prefer putting these function definitions right beneath each class declaration so that decorations and definitions are paired up.
Otherwise, it's hard to figure out which definitions correspond with which declaration.

> Source/WebCore/html/CollectionTraversal.h:129
> +    auto end = descendants.end();

It's odd to check against end() when we're traversing backwards.
Perhaps this deserves a comment?
Alternatively, we can get begin() and exit the loop once we've hit that.

> Source/WebCore/html/CollectionTraversal.h:141
> +    auto end = children.end();

Ditto.

> Source/WebCore/html/CollectionTraversal.h:161
> +    auto end = collection.collectionEnd();

It's unclear to me why this wouldn't traverse the DOM tree given its definition: Traversal::end(rootNode())
We may want to add a comment or revise the code.

> Source/WebCore/html/CollectionTraversal.h:175
> +    auto end = collection.collectionEnd();

Ditto.

> Source/WebCore/html/HTMLCollection.cpp:137
>      // The pathological case. We need to walk the entire subtree.
>      updateNamedElementCache();

It's a bit strange to keep this fallback in HTMLCollection::namedItem.
Why don't we rename to something like CachedHTMLCollection::namedItemSlowCase and make HTMLCollection::namedItem pure virtual?
Comment 15 Chris Dumez 2015-08-14 21:37:01 PDT
Created attachment 259082 [details]
Patch
Comment 16 WebKit Commit Bot 2015-08-14 22:45:12 PDT
Comment on attachment 259082 [details]
Patch

Clearing flags on attachment: 259082

Committed r188508: <http://trac.webkit.org/changeset/188508>
Comment 17 WebKit Commit Bot 2015-08-14 22:45:17 PDT
All reviewed patches have been landed.  Closing bug.
Comment 18 WebKit Commit Bot 2015-08-15 11:05:17 PDT
Re-opened since this is blocked by bug 148058
Comment 19 Chris Dumez 2015-08-15 20:09:17 PDT
(In reply to comment #18)
> Re-opened since this is blocked by bug 148058

Based on the traces:
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   com.apple.JavaScriptCore      	0x0000000116ac6f17 WTFCrashWithSecurityImplication + 39
1   com.apple.WebCore             	0x0000000118698865 WebCore::dispatchChildRemovalEvents(WebCore::Node&) + 117 (ContainerNode.cpp:818)
2   com.apple.WebCore             	0x0000000118698783 WebCore::ContainerNode::willRemoveChild(WebCore::Node&) + 147 (ContainerNode.cpp:486)
3   com.apple.WebCore             	0x0000000118698576 WebCore::ContainerNode::removeChild(WebCore::Node*, int&) + 422 (ContainerNode.cpp:555)
4   com.apple.WebCore             	0x0000000119a97d18 WebCore::Node::removeChild(WebCore::Node*, int&) + 88 (Node.cpp:466)
5   com.apple.WebCore             	0x000000011958107e WebCore::JSNode::removeChild(JSC::ExecState*) + 78 (JSNodeCustom.cpp:139)
6   com.apple.WebCore             	0x000000011957d64f WebCore::jsNodePrototypeFunctionRemoveChild(JSC::ExecState*) + 383 (JSNode.cpp:671)
7   ???                           	0x00003d3825a01028 0 + 67311358709800
8   com.apple.JavaScriptCore      	0x000000011684b826 llint_entry + 25866
9   com.apple.JavaScriptCore      	0x000000011684b826 llint_entry + 25866
10  com.apple.JavaScriptCore      	0x000000011684b826 llint_entry + 25866
11  com.apple.JavaScriptCore      	0x000000011684b826 llint_entry + 25866
12  com.apple.JavaScriptCore      	0x00000001168450d9 vmEntryToJavaScript + 361
13  com.apple.JavaScriptCore      	0x000000011669915c JSC::JITCode::execute(JSC::VM*, JSC::ProtoCallFrame*) + 252
14  com.apple.JavaScriptCore      	0x000000011667df77 JSC::Interpreter::executeCall(JSC::ExecState*, JSC::JSObject*, JSC::CallType, JSC::CallData const&, JSC::JSValue, JSC::ArgList const&) + 1255
15  com.apple.JavaScriptCore      	0x00000001160abd4e JSC::call(JSC::ExecState*, JSC::JSValue, JSC::CallType, JSC::CallData const&, JSC::JSValue, JSC::ArgList const&) + 190
16  com.apple.JavaScriptCore      	0x00000001160abdb3 JSC::call(JSC::ExecState*, JSC::JSValue, JSC::CallType, JSC::CallData const&, JSC::JSValue, JSC::ArgList const&, WTF::NakedPtr<JSC::Exception>&) + 83
17  com.apple.WebCore             	0x000000011925df0b WebCore::JSMainThreadExecState::call(JSC::ExecState*, JSC::JSValue, JSC::CallType, JSC::CallData const&, JSC::JSValue, JSC::ArgList const&, WTF::NakedPtr<JSC::Exception>&) + 107 (JSMainThreadExecState.h:56)
18  com.apple.WebCore             	0x00000001193dfe12 WebCore::JSEventListener::handleEvent(WebCore::ScriptExecutionContext*, WebCore::Event*) + 1218 (JSEventListener.cpp:130)
19  com.apple.WebCore             	0x0000000118b8b9b4 WebCore::EventTarget::fireEventListeners(WebCore::Event*, WebCore::EventTargetData*, WTF::Vector<WebCore::RegisteredEventListener, 1ul, WTF::CrashOnOverflow, 16ul>&) + 1444 (EventTarget.cpp:256)
20  com.apple.WebCore             	0x0000000118b8b1ee WebCore::EventTarget::fireEventListeners(WebCore::Event*) + 334 (EventTarget.cpp:208)
21  com.apple.WebCore             	0x0000000118ab353b WebCore::DOMWindow::dispatchEvent(WTF::PassRefPtr<WebCore::Event>, WTF::PassRefPtr<WebCore::EventTarget>) + 539 (DOMWindow.cpp:1900)
22  com.apple.WebCore             	0x0000000118abadd5 WebCore::DOMWindow::dispatchLoadEvent() + 293 (DOMWindow.cpp:1858)
23  com.apple.WebCore             	0x000000011897aa3d WebCore::Document::dispatchWindowLoadEvent() + 141 (Document.cpp:4032)
24  com.apple.WebCore             	0x00000001189774ad WebCore::Document::implicitClose() + 557 (Document.cpp:2610)
25  com.apple.WebCore             	0x0000000118d091db WebCore::FrameLoader::checkCallImplicitClose() + 155 (FrameLoader.cpp:891)
26  com.apple.WebCore             	0x0000000118d08eae WebCore::FrameLoader::checkCompleted() + 270 (FrameLoader.cpp:838)
27  com.apple.WebCore             	0x0000000118d09025 WebCore::FrameLoader::loadDone() + 21 (FrameLoader.cpp:770)
28  com.apple.WebCore             	0x00000001185b87f9 WebCore::CachedResourceLoader::loadDone(WebCore::CachedResource*, bool) + 121 (CachedResourceLoader.cpp:930)
29  com.apple.WebCore             	0x000000011a273f65 WebCore::SubresourceLoader::notifyDone() + 293 (SubresourceLoader.cpp:457)
30  com.apple.WebCore             	0x000000011a273b76 WebCore::SubresourceLoader::didFinishLoading(double) + 662 (SubresourceLoader.cpp:382)
31  com.apple.WebKit              	0x00000001138456c7 WebKit::WebResourceLoader::didFinishResourceLoad(double) + 151 (WebResourceLoader.cpp:161)
32  com.apple.WebKit              	0x000000011384a793 void IPC::callMemberFunctionImpl<WebKit::WebResourceLoader, void (WebKit::WebResourceLoader::*)(double), std::__1::tuple<double>, 0ul>(WebKit::WebResourceLoader*, void (WebKit::WebResourceLoader::*)(double), std::__1::tuple<double>&&, std::index_sequence<0ul>) + 163 (HandleMessage.h:17)
33  com.apple.WebKit              	0x000000011384a6e8 void IPC::callMemberFunction<WebKit::WebResourceLoader, void (WebKit::WebResourceLoader::*)(double), std::__1::tuple<double>, std::make_index_sequence<1ul> >(std::__1::tuple<double>&&, WebKit::WebResourceLoader*, void (WebKit::WebResourceLoader::*)(double)) + 88 

This has likely to do with ElementIteratorAssertions. It looks like it can hold a NoEventDispatchAssertion, which would explain the crashes.
Comment 20 Chris Dumez 2015-08-15 21:28:39 PDT
CollectionIndexCache hold an ElementIterator (m_current). When this iterator is valid, it prevents JS events from being fired. If I understand it correctly, this seems overkill in the context of HTMLCollections because we already make sure HTMLCollection caches are invalidated when relevant DOM tree modifications happens.

Consider the following case:
<div>
    <p></p>
    <table id="myTable"><tr></tr></table>
</div>

- If I access myTable.rows(), the returned HTMLCollection's CollectionIndexCache will have a valid m_current iterator.
- If the JS then removes the <p> element, we will NOT invalidate the HTMLCollection returned by myTable.rows() (because there is no need). However, we will fire DOMMutation events. When firing the events, we will hit an assertion because the collection iterator in CollectionIndexCache is holding a NoEventDispatchAssertion.

Antti, any thoughts?
Comment 21 Antti Koivisto 2015-08-16 09:22:27 PDT
Yeah, we should probably add a way to construct ElementIterators without the assertion for this as we have a fine-grained invalidation scheme.
Comment 22 Chris Dumez 2015-08-16 09:40:20 PDT
(In reply to comment #21)
> Yeah, we should probably add a way to construct ElementIterators without the
> assertion for this as we have a fine-grained invalidation scheme.

Ok. Also, I believe I found out why this wasn't happening for CachedLiveNodeList. CachedLiveNodeList is using only ElementDescendantIterator (while CachedHTMLCollection is using both ElementDescendantIterator and ElementChildIterator). ElementDescendantIterator has an m_assertion member, however, its constructor doesn't pass the current Element to the ElementIteratorAssertions constructor. Therefore, the ElementIteratorAssertions never takes a NoEventDispatchAssertion.
Comment 23 Antti Koivisto 2015-08-16 09:43:53 PDT
Yeah. It would be good to fix that too.
Comment 24 Chris Dumez 2015-08-16 10:57:10 PDT
Created attachment 259125 [details]
Patch
Comment 25 Chris Dumez 2015-08-16 11:01:37 PDT
Created attachment 259126 [details]
Patch
Comment 26 Chris Dumez 2015-08-16 11:04:47 PDT
Comment on attachment 259126 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=259126&action=review

Still in the process of running all the layout tests in debug.

> Source/WebCore/dom/ElementDescendantIterator.h:120
> +    , m_assertions(current)

Really enable assertions for ElementDescendantIterator.

> Source/WebCore/dom/ElementDescendantIterator.h:233
> +    , m_assertions(current)

Really enable assertions for ElementDescendantConstIterator.

> Source/WebCore/html/CollectionTraversal.h:65
> +            it.dropAssertions();

Drop iterator assertions before returning it to CollectionIndexCache.

> Source/WebCore/html/CollectionTraversal.h:80
> +            it.dropAssertions();

Drop iterator assertions before returning it to CollectionIndexCache.

> Source/WebCore/html/CollectionTraversal.h:142
> +            it.dropAssertions();

Drop iterator assertions before returning it to CollectionIndexCache.

> Source/WebCore/html/CollectionTraversal.h:158
> +            it.dropAssertions();

Drop iterator assertions before returning it to CollectionIndexCache.
Comment 27 Chris Dumez 2015-08-16 11:48:11 PDT
Comment on attachment 259126 [details]
Patch

Ok, I ran ALL the layout tests in debug this time (not just the fast/ folder XD)

Antti, are you ok with the latest changes?
Comment 28 Antti Koivisto 2015-08-16 11:56:53 PDT
Comment on attachment 259126 [details]
Patch

Looks good though I would have done ElementIterator changes in a separate patch.
Comment 29 Antti Koivisto 2015-08-16 11:57:14 PDT
r=me
Comment 30 Chris Dumez 2015-08-16 11:59:21 PDT
Committed r188520: <http://trac.webkit.org/changeset/188520>