Bug 77502 - Optimize owner form lookup
Summary: Optimize owner form lookup
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: Forms (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Kent Tamura
URL: http://www.haun.org/kent/html5/find-f...
Keywords:
Depends on:
Blocks: 58837
  Show dependency treegraph
 
Reported: 2012-01-31 19:45 PST by Kent Tamura
Modified: 2012-03-30 01:36 PDT (History)
6 users (show)

See Also:


Attachments
Patch (9.12 KB, patch)
2012-02-01 00:48 PST, Kent Tamura
ap: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Kent Tamura 2012-01-31 19:45:33 PST
We use HTMLElement::findFormAncestor() to find an owner <form> when we insert a form control to a tree. findFormAncestor() iterates over ancestor nodes until it finds a <form>. So, if the tree doesn't contain no <form> elements, inserting a form control is O(depth).
Comment 1 Kent Tamura 2012-01-31 19:49:11 PST
The page of the URL field shows "106 ms" on my machine.  I confirmed an incoming patch improved it to "8 ms".
Comment 2 Kent Tamura 2012-02-01 00:48:38 PST
Created attachment 124897 [details]
Patch
Comment 3 Alexey Proskuryakov 2012-02-01 10:39:47 PST
Comment on attachment 124897 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=124897&action=review

Does this affect any real web pages? It's too easy to introduce security and correctness bugs by such caching, the benefit should be very material to attempt it. There is obvious memory impact - we already cache form pointer in every HTMLFormControlElement upon creation, and now we will have a separate map too.

I, for one, cannot vouch that ContainerNode::insertedIntoDocument and removedFromDocument cover all cases (are they called recursively? what's the difference with willRemove()?). Having another catch-all callback pair inside Document will be confusing (how will we decide what goes there, and what goes in ContainerNode counterpart?).

We have fairly weak test coverage in this area, especially for form controls that are outside their <form> trees in DOM.

> Source/WebCore/dom/Document.h:521
> +    void didInsertContainerNode(const ContainerNode*);
> +    void didRemoveContainerNode(const ContainerNode*);
> +    HTMLFormElement* nearestAncestorForm(const HTMLElement*);

We don't use const with nodes.
Comment 4 Darin Adler 2012-02-01 10:48:52 PST
Comment on attachment 124897 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=124897&action=review

This makes adding form controls faster by spending a lot of memory and by hurting performance for adding elements that are not inside a form. I’m not sure this is the way to do it. Instead, I think we should look at the possibility of memoizing the information about what form something is in and computing it only when asked.

> Source/WebCore/dom/ContainerNode.cpp:803
> +    document()->didInsertContainerNode(this);

What’s the performance impact of adding this additional work for every container node? Even doing it for every HTML element seems like it would cause a measurable performance slowdown.

Also, since we need this map only for HTML elements, why are we doing this for all other container nodes?

> Source/WebCore/dom/Document.cpp:4625
> +    if (!parent)
> +        return;

How did you test this code path? When is this 0?

> Source/WebCore/dom/Document.cpp:4631
> +        m_ancestorFormMap.set(node, form);

This seems like a vast amount of extra memory use. An entry in as hash table for every single element inside a form!
Comment 5 Kent Tamura 2012-02-01 20:46:16 PST
Thank you for comments.

I think the performance of HTMLElement::findFormAncestor() is not important for now. If we have a deep DOM tree and a lot of form controls, we must have other performance bottlenecks.  I haven't seen real bugs caused by HTMLElement::findFormAncestor() performance.

However, we need to add ancestor lookup for <fieldset> and <datalist> in addition to <form>.
 * https://bugs.webkit.org/show_bug.cgi?id=58837
 * http://www.whatwg.org/specs/web-apps/current-work/multipage/the-button-element.html#the-datalist-element  (See Constraint Validation paragraph)

When we implement these features, the ancestor lookup performance can be a show-stopper. Actually Bug 58837 was blocked by this issue.  I'd like to propose a way to resolve such ancestor lookup performance issue in this bug.
Comment 6 Eric Seidel (no email) 2012-02-16 13:41:14 PST
Comment on attachment 124897 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=124897&action=review

>> Source/WebCore/dom/Document.h:521
>> +    HTMLFormElement* nearestAncestorForm(const HTMLElement*);
> 
> We don't use const with nodes.

.... because they're RefCounted (and thus passing them around as const means they can't be put into a RefPtr).
Comment 7 Alexey Proskuryakov 2012-02-28 12:56:43 PST
Comment on attachment 124897 [details]
Patch

My and Darin's comments sound like r-.
Comment 8 yosin 2012-02-28 20:39:11 PST
Just dumb thought, can we calculation of owner form and HTMLFormElement:m_associatedElements on demand?

Something like:
class FormAssociatedElement ... {
  ...
  enum { Ready, NotReady } m_formState;
  HTMLFormElement* m_form;
};

class HTMLFormElement ... {
  enum { Dirty, NotReady, Ready } m_associateElementsState;
  ...
};

= PROS =
* No slowness for this test. :-)
* Don't need to compute unused m_form and m_associatedElements, e.g. I usually ignore survey form.

= CONS =
* Changes spread into 11 files.
* Total calculation time isn't changed if no dynamic form associated element insertion/deletion.
* Dynamic form associated element addition/deletion requires another calculation of m_associatedElements.
Comment 9 Kent Tamura 2012-02-28 23:04:11 PST
(In reply to comment #8)
> Just dumb thought, can we calculation of owner form and HTMLFormElement:m_associatedElements on demand?

I don't think it will have performance benefit because updating m_associatedElements requires DOM traversal, but it might simplify the code.  FormAssociatedElement.cpp will be smaller, and the HTML parser won't need to track a parent <form>.
Comment 10 yosin 2012-02-28 23:12:33 PST
I suspected slowness of Vector::append when adding 4000 form associate elements. But, it isn't. It seems Vector::append isn't slow.

There are 25 times of Vector::expandCapcity and copy of elements(= 19,423 x 4 = 77,692 byte on 32bit platform).

Here is result:
1: grow to 16
2: grow to 22
3: grow to 29
4: grow to 38
5: grow to 49
6: grow to 63
7: grow to 81
8: grow to 103
9: grow to 131
10: grow to 166
11: grow to 209
12: grow to 263
13: grow to 331
14: grow to 416
15: grow to 522
16: grow to 654
17: grow to 819
18: grow to 1026
19: grow to 1284
20: grow to 1607
21: grow to 2011
22: grow to 2516
23: grow to 3147
24: grow to 3936
25: grow to 4922
Comment 11 Kent Tamura 2012-03-30 01:36:59 PDT
I'll withdraw this bug because the idea in Bug 58837 (searching ancestors for <fieldset> and <legend> in single traversal) looks enough.

Comment #8 - #10 is another issue.  We should discuss it in another bug.