Bug 141192 - Add removeFirst(value) / removeAll(value) methods to WTF::Vector
Summary: Add removeFirst(value) / removeAll(value) methods to WTF::Vector
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Chris Dumez
URL:
Keywords:
Depends on:
Blocks: 141321
  Show dependency treegraph
 
Reported: 2015-02-02 22:33 PST by Chris Dumez
Modified: 2015-02-05 20:23 PST (History)
7 users (show)

See Also:


Attachments
Patch (26.87 KB, patch)
2015-02-03 12:33 PST, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (26.10 KB, patch)
2015-02-03 13:06 PST, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (26.15 KB, patch)
2015-02-03 13:52 PST, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (26.98 KB, patch)
2015-02-03 20:53 PST, Chris Dumez
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Dumez 2015-02-02 22:33:40 PST
Add removeOne(value) / removeAll(value) convenience methods to WTF::Vector to reduce complexity a bit at call sites.

The naming is inspired by QVector:
http://doc.qt.io/qt-5/qvector.html#removeOne
http://doc.qt.io/qt-5/qvector.html#removeAll

I am planning to add overloads for these taking a lambda function for matching in a follow-up patch as well. I have seen quite a few Vector::remove(index) call sites that would benefit from it.
Comment 1 Chris Dumez 2015-02-03 12:33:39 PST
Created attachment 245954 [details]
Patch
Comment 2 Chris Dumez 2015-02-03 13:06:20 PST
Created attachment 245959 [details]
Patch
Comment 3 Chris Dumez 2015-02-03 13:52:25 PST
Created attachment 245963 [details]
Patch
Comment 4 Chris Dumez 2015-02-03 16:08:02 PST
I compared the performance of Vector::removeAll() compared to the old way of index-based traversal + Vector::remove(index).

I used the following input:
Vector<int> speedV = {3, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 3, 3, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 3, 3, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 3, 3, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 3, 3, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 3, 3, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 3, 3, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 3};

I did 10000000 iterations to removed all the 1's.

The results look really good:
Manual: 2478.31 ms
RemoveAll: 2129.9 ms (~17% faster)
Comment 5 Chris Dumez 2015-02-03 18:48:49 PST
(In reply to comment #4)
> I compared the performance of Vector::removeAll() compared to the old way of
> index-based traversal + Vector::remove(index).
> 
> I used the following input:
> Vector<int> speedV = {3, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 3, 3, 1, 2, 1, 2, 1,
> 2, 2, 1, 1, 1, 3, 3, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 3, 3, 1, 2, 1, 2, 1, 2,
> 2, 1, 1, 1, 3, 3, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 3, 3, 1, 2, 1, 2, 1, 2, 2,
> 1, 1, 1, 3, 3, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 3};
> 
> I did 10000000 iterations to removed all the 1's.
> 
> The results look really good:
> Manual: 2478.31 ms
> RemoveAll: 2129.9 ms (~17% faster)

I also tried a Vector of CString because we currently don't use memcpy/memmove for those:
Vector<CString> speedV = {"3", "1", "2", "1", "2", "1", "2", "2", "1", "1", "1", "3", "3", "1", "2", "1", "2", "1", "2", "2", "1", "1", "1", "3", "3", "1", "2", "1", "2", "1", "2", "2", "1", "1", "1", "3", "3", "1", "2", "1", "2", "1", "2", "2", "1", "1", "1", "3", "3", "1", "2", "1", "2", "1", "2", "2", "1", "1", "1", "3", "3", "1", "2", "1", "2", "1", "2", "2", "1", "1", "1", "3"};

I did 1000000 iterations to removed all the 1's.

The results were even better:
Manual: 1716.21 ms
RemoveAll: 581.744 ms

Micro-benchmark: http://pastebin.com/C4q4a2gb
Comment 6 Benjamin Poulain 2015-02-03 19:29:27 PST
Comment on attachment 245963 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=245963&action=review

That looks great to me.

> Source/WTF/ChangeLog:10
> +        QVector.

hehehe, back to the roots of WTF :)

> Source/WTF/ChangeLog:18
> +        (WTF::OverflowHandler>::removeOne):
> +        (WTF::OverflowHandler>::removeAll):

Not a big fan of the name.

The name removeAll() seems to be a better name for Vector::clear().
The name removeOne() seems like a removeAny(). It should be some kind of removeFirstOf().

Maybe the name should hint that the operation is not O(1)?
    findAndRemove()
    findAndRemoveAll()

You should ask Darin for ideas, he always have good feedback on naming.

> Source/WTF/wtf/Vector.h:739
> +    template<typename U>  bool removeOne(const U&);
> +    template<typename U>  unsigned removeAll(const U&);

Two spaces between the template<> and the return type.

> Source/WTF/wtf/Vector.h:1348
> +    m_size -= matchCount;

Orthogonal to your patch but: it may be worth considering shrinking the vector if "matchCount" >= m_size / 2.

> Source/WebCore/html/HTMLFormElement.cpp:-615
> -    unsigned index;
> -    for (index = 0; index < m_associatedElements.size(); ++index) {
> -        if (m_associatedElements[index] == e)
> -            break;
> -    }

Weird

> Source/WebCore/page/SecurityPolicy.cpp:158
> +    if (!list.removeOne(OriginAccessEntry(destinationProtocol, destinationDomain, allowDestinationSubdomains ? OriginAccessEntry::AllowSubdomains : OriginAccessEntry::DisallowSubdomains)))

Maybe split this?

OriginAccessEntry someVeryGoodNameForSecurityStuff(destinationProtocol, destinationDomain, allowDestinationSubdomains ? OriginAccessEntry::AllowSubdomains : OriginAccessEntry::DisallowSubdomains);

> Tools/TestWebKitAPI/Tests/WTF/Vector.cpp:348
> +    EXPECT_EQ(9U, v.size());
> +    EXPECT_TRUE(v == Vector<int>({1, 1, 1, 1, 1, 1, 1, 1, 1}));

I would test the loop doing removeOne() eleven times, brute forcing that all branches are used.

> Tools/TestWebKitAPI/Tests/WTF/Vector.cpp:377
> +    v = {1, 2, 1, 2, 1, 2, 2, 1, 1, 1};

Large match at the end.

> Tools/TestWebKitAPI/Tests/WTF/Vector.cpp:386
> +    v = {3, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 3};

Large match in the middle.

> Tools/TestWebKitAPI/Tests/WTF/Vector.cpp:394
> +

Where is large match at the front?
Comment 7 Chris Dumez 2015-02-03 20:02:37 PST
Comment on attachment 245963 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=245963&action=review

>> Source/WTF/ChangeLog:18
>> +        (WTF::OverflowHandler>::removeAll):
> 
> Not a big fan of the name.
> 
> The name removeAll() seems to be a better name for Vector::clear().
> The name removeOne() seems like a removeAny(). It should be some kind of removeFirstOf().
> 
> Maybe the name should hint that the operation is not O(1)?
>     findAndRemove()
>     findAndRemoveAll()
> 
> You should ask Darin for ideas, he always have good feedback on naming.

For what it's worth, glib seems to be using g_list_remove() / g_list_remove_all() for those:
https://developer.gnome.org/glib/stable/glib-Doubly-Linked-Lists.html#g-list-remove-all

In our case we cannot really call the first one remove(value) because it could be ambiguous with existing remove(index).

findAndRemove*() is clearer in terms of complexity but I am not a fan because it is long (and I have never seen this naming anywhere :).
Comment 8 Chris Dumez 2015-02-03 20:12:00 PST
Comment on attachment 245963 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=245963&action=review

>>> Source/WTF/ChangeLog:18
>>> +        (WTF::OverflowHandler>::removeAll):
>> 
>> Not a big fan of the name.
>> 
>> The name removeAll() seems to be a better name for Vector::clear().
>> The name removeOne() seems like a removeAny(). It should be some kind of removeFirstOf().
>> 
>> Maybe the name should hint that the operation is not O(1)?
>>     findAndRemove()
>>     findAndRemoveAll()
>> 
>> You should ask Darin for ideas, he always have good feedback on naming.
> 
> For what it's worth, glib seems to be using g_list_remove() / g_list_remove_all() for those:
> https://developer.gnome.org/glib/stable/glib-Doubly-Linked-Lists.html#g-list-remove-all
> 
> In our case we cannot really call the first one remove(value) because it could be ambiguous with existing remove(index).
> 
> findAndRemove*() is clearer in terms of complexity but I am not a fan because it is long (and I have never seen this naming anywhere :).

I do think removeFirst(value) would be a bit more explicit than removeOne(value) though. It would be clearer that we delete the *first* occurrence, not just any. And it leaves room for a removeLast(value) if ever useful.
Comment 9 Chris Dumez 2015-02-03 20:53:32 PST
Created attachment 246012 [details]
Patch
Comment 10 Benjamin Poulain 2015-02-03 23:28:30 PST
Did you mean to cq+?
Comment 11 WebKit Commit Bot 2015-02-04 00:39:26 PST
Comment on attachment 246012 [details]
Patch

Clearing flags on attachment: 246012

Committed r179599: <http://trac.webkit.org/changeset/179599>
Comment 12 WebKit Commit Bot 2015-02-04 00:39:31 PST
All reviewed patches have been landed.  Closing bug.