Bug 234201

Summary: [WTF] Introduce TrailingArray
Product: WebKit Reporter: Yusuke Suzuki <ysuzuki>
Component: New BugsAssignee: Yusuke Suzuki <ysuzuki>
Status: RESOLVED FIXED    
Severity: Normal CC: annulen, benjamin, cdumez, cmarcelo, darin, ews-watchlist, gyuyoung.kim, keith_miller, mark.lam, msaboff, ryuan.choi, saam, sergio, tzagallo, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
ews-feeder: commit-queue-
Patch
none
Patch
none
Patch
none
Patch darin: review+, ews-feeder: commit-queue-

Description Yusuke Suzuki 2021-12-11 13:37:55 PST
[WTF] Introduce TrailingArray
Comment 1 Yusuke Suzuki 2021-12-11 13:41:03 PST
Created attachment 446900 [details]
Patch
Comment 2 Yusuke Suzuki 2021-12-11 13:48:15 PST
Created attachment 446901 [details]
Patch
Comment 3 Yusuke Suzuki 2021-12-11 15:34:38 PST
Created attachment 446910 [details]
Patch
Comment 4 Yusuke Suzuki 2021-12-11 21:55:44 PST
Created attachment 446921 [details]
Patch
Comment 5 Yusuke Suzuki 2021-12-11 22:03:14 PST
Created attachment 446922 [details]
Patch
Comment 6 Yusuke Suzuki 2021-12-11 23:47:52 PST
Created attachment 446924 [details]
Patch
Comment 7 Darin Adler 2021-12-14 10:42:30 PST
Comment on attachment 446924 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=446924&action=review

I’m not thrilled by how much code is put in the class template definitions here. Makes it a bit hard to see the overview of the class template.

> Source/WTF/wtf/EmbeddedFixedVector.h:45
> +        return UniqueRef { *new (NotNull, fastMalloc(Base::allocationSize(size))) EmbeddedFixedVector(size) };

Can we do without the type name entirely, and just use the braces?

> Source/WTF/wtf/EmbeddedFixedVector.h:51
> +        unsigned size = std::distance(first, last);

What guarantees the distance can be shortened from size_t to unsigned? Do we want a RELEASE_ASSERT of that? Or is there some way we know this is always safe?

> Source/WTF/wtf/EmbeddedFixedVector.h:52
> +        return UniqueRef { *new (NotNull, fastMalloc(Base::allocationSize(size))) EmbeddedFixedVector(size, first, last) };

Dito.

> Source/WTF/wtf/EmbeddedFixedVector.h:58
> +        unsigned size = std::size(other);

Why call std::size instead of size() here?

Same question about why this fits in unsigned.

> Source/WTF/wtf/EmbeddedFixedVector.h:65
> +        Vector<T, inlineCapacity, OverflowHandler> container = WTFMove(other);

Why is it good to do this move? I’d think we’d want to move the *elements* out of the vector and never move the vector itself, should just be able to use std::move_iterator { other.begin() } below.

Also unclear on why we are using std::begin. It seems to me that we can just use Vector::begin unless we have some reason we want to write code that’s more generic.

Also, I would use auto here for the type if we were keeping this.

> Source/WTF/wtf/EmbeddedFixedVector.h:66
> +        unsigned size = std::size(container);

Same question about narrowing to unsigned.

> Source/WTF/wtf/EmbeddedFixedVector.h:92
> +    EmbeddedFixedVector(unsigned size)

explicit

> Source/WTF/wtf/FixedVector.h:155
> +    Storage* getStorage() { return m_storage.get(); }

WebKit coding style says function should not be named with the word "get" in it. We use get to mean "out arguments" or such.

> Source/WTF/wtf/TrailingArray.h:65
> +    TrailingArray(unsigned size, InputIterator first, InputIterator last)

Seems a little bit strange to take the size as an argument but also call std::distance. I guess I’ll have to look at call sites to understand why.
Comment 8 Yusuke Suzuki 2021-12-17 19:57:55 PST
Comment on attachment 446924 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=446924&action=review

Thanks

>> Source/WTF/wtf/EmbeddedFixedVector.h:45
>> +        return UniqueRef { *new (NotNull, fastMalloc(Base::allocationSize(size))) EmbeddedFixedVector(size) };
> 
> Can we do without the type name entirely, and just use the braces?

It becomes compile error.

>> Source/WTF/wtf/EmbeddedFixedVector.h:51
>> +        unsigned size = std::distance(first, last);
> 
> What guarantees the distance can be shortened from size_t to unsigned? Do we want a RELEASE_ASSERT of that? Or is there some way we know this is always safe?

Use CheckedUInt here.

>> Source/WTF/wtf/EmbeddedFixedVector.h:58
>> +        unsigned size = std::size(other);
> 
> Why call std::size instead of size() here?
> 
> Same question about why this fits in unsigned.

Originally, this function can take C array too. But now I limited it because it makes && handling super hard.
Use other.size() for now.

>> Source/WTF/wtf/EmbeddedFixedVector.h:65
>> +        Vector<T, inlineCapacity, OverflowHandler> container = WTFMove(other);
> 
> Why is it good to do this move? I’d think we’d want to move the *elements* out of the vector and never move the vector itself, should just be able to use std::move_iterator { other.begin() } below.
> 
> Also unclear on why we are using std::begin. It seems to me that we can just use Vector::begin unless we have some reason we want to write code that’s more generic.
> 
> Also, I would use auto here for the type if we were keeping this.

We should perform this move since the parameter is Vector&&. The parameter is saying we should move the vector (not elements of vector).

>> Source/WTF/wtf/EmbeddedFixedVector.h:66
>> +        unsigned size = std::size(container);
> 
> Same question about narrowing to unsigned.

Ditto.

>> Source/WTF/wtf/EmbeddedFixedVector.h:92
>> +    EmbeddedFixedVector(unsigned size)
> 
> explicit

Added

>> Source/WTF/wtf/FixedVector.h:155
>> +    Storage* getStorage() { return m_storage.get(); }
> 
> WebKit coding style says function should not be named with the word "get" in it. We use get to mean "out arguments" or such.

OK, changed.

>> Source/WTF/wtf/TrailingArray.h:65
>> +    TrailingArray(unsigned size, InputIterator first, InputIterator last)
> 
> Seems a little bit strange to take the size as an argument but also call std::distance. I guess I’ll have to look at call sites to understand why.

This is because, at constructor call point, we must know the size of array (because TrailingArray memory should be allocated with this size). So we should have this size information already and we do not need to perform std::distance again (this can be costly, depending on how the iterator of some container is implemented).
This parameter requires the derived implementation to follow the above appropriate style.
Comment 9 Yusuke Suzuki 2021-12-18 02:17:17 PST
Committed r287220 (245383@trunk): <https://commits.webkit.org/245383@trunk>
Comment 10 Radar WebKit Bug Importer 2021-12-18 02:18:17 PST
<rdar://problem/86666683>
Comment 11 Darin Adler 2021-12-19 15:44:43 PST
Comment on attachment 446924 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=446924&action=review

>>> Source/WTF/wtf/EmbeddedFixedVector.h:65
>>> +        Vector<T, inlineCapacity, OverflowHandler> container = WTFMove(other);
>> 
>> Why is it good to do this move? I’d think we’d want to move the *elements* out of the vector and never move the vector itself, should just be able to use std::move_iterator { other.begin() } below.
>> 
>> Also unclear on why we are using std::begin. It seems to me that we can just use Vector::begin unless we have some reason we want to write code that’s more generic.
>> 
>> Also, I would use auto here for the type if we were keeping this.
> 
> We should perform this move since the parameter is Vector&&. The parameter is saying we should move the vector (not elements of vector).

I don’t agree with this.

When you pass an rvalue reference, what it means is that the recipient can take the contents of what is passed, but it does not mean literally that the top level object must be moved. For example, sometimes such functions examine and do not move the argument in some code paths. I think it would be better to move the elements and not unnecessarily move the vector itself.
Comment 12 Yusuke Suzuki 2021-12-20 10:48:52 PST
Comment on attachment 446924 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=446924&action=review

>>>> Source/WTF/wtf/EmbeddedFixedVector.h:65
>>>> +        Vector<T, inlineCapacity, OverflowHandler> container = WTFMove(other);
>>> 
>>> Why is it good to do this move? I’d think we’d want to move the *elements* out of the vector and never move the vector itself, should just be able to use std::move_iterator { other.begin() } below.
>>> 
>>> Also unclear on why we are using std::begin. It seems to me that we can just use Vector::begin unless we have some reason we want to write code that’s more generic.
>>> 
>>> Also, I would use auto here for the type if we were keeping this.
>> 
>> We should perform this move since the parameter is Vector&&. The parameter is saying we should move the vector (not elements of vector).
> 
> I don’t agree with this.
> 
> When you pass an rvalue reference, what it means is that the recipient can take the contents of what is passed, but it does not mean literally that the top level object must be moved. For example, sometimes such functions examine and do not move the argument in some code paths. I think it would be better to move the elements and not unnecessarily move the vector itself.

I think many code is doing that. For example, Vector move constructor.

   973 template<typename T, size_t inlineCapacity, typename OverflowHandler, size_t minCapacity, typename Malloc>
   974 inline Vector<T, inlineCapacity, OverflowHandler, minCapacity, Malloc>::Vector(Vector<T, inlineCapacity, OverflowHandler, minCapacity, Malloc>&& other)
   975 {
   976     swap(other);
   977 }

Since the |this| is empty vector, it is clearing the caller's Vector's storage.

And I think we must move in this case because it was the behavior of RefCountedArray, which is replaced to EmbeddedVector in FixedVector.
If we do not clear here, we will change the semantics (caller vector is not cleared), and it is not intention of this patch.
Comment 13 Darin Adler 2021-12-20 11:06:41 PST
Comment on attachment 446924 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=446924&action=review

>>>>> Source/WTF/wtf/EmbeddedFixedVector.h:65
>>>>> +        Vector<T, inlineCapacity, OverflowHandler> container = WTFMove(other);
>>>> 
>>>> Why is it good to do this move? I’d think we’d want to move the *elements* out of the vector and never move the vector itself, should just be able to use std::move_iterator { other.begin() } below.
>>>> 
>>>> Also unclear on why we are using std::begin. It seems to me that we can just use Vector::begin unless we have some reason we want to write code that’s more generic.
>>>> 
>>>> Also, I would use auto here for the type if we were keeping this.
>>> 
>>> We should perform this move since the parameter is Vector&&. The parameter is saying we should move the vector (not elements of vector).
>> 
>> I don’t agree with this.
>> 
>> When you pass an rvalue reference, what it means is that the recipient can take the contents of what is passed, but it does not mean literally that the top level object must be moved. For example, sometimes such functions examine and do not move the argument in some code paths. I think it would be better to move the elements and not unnecessarily move the vector itself.
> 
> I think many code is doing that. For example, Vector move constructor.
> 
>    973 template<typename T, size_t inlineCapacity, typename OverflowHandler, size_t minCapacity, typename Malloc>
>    974 inline Vector<T, inlineCapacity, OverflowHandler, minCapacity, Malloc>::Vector(Vector<T, inlineCapacity, OverflowHandler, minCapacity, Malloc>&& other)
>    975 {
>    976     swap(other);
>    977 }
> 
> Since the |this| is empty vector, it is clearing the caller's Vector's storage.
> 
> And I think we must move in this case because it was the behavior of RefCountedArray, which is replaced to EmbeddedVector in FixedVector.
> If we do not clear here, we will change the semantics (caller vector is not cleared), and it is not intention of this patch.

Yes, code *can* do that and it’s great to do it when it’s the most efficient way to do things. The Vector move constructor is the perfect example of that.

But I don’t think we should pay any efficiency cost to stick with it; it’s not something you can rely on in library, since it’s how rvalue reference arguments work in the C++ standard library, for example. I suppose we can debate this another time.
Comment 14 Yusuke Suzuki 2021-12-21 03:08:57 PST
Comment on attachment 446924 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=446924&action=review

>>>>>> Source/WTF/wtf/EmbeddedFixedVector.h:65
>>>>>> +        Vector<T, inlineCapacity, OverflowHandler> container = WTFMove(other);
>>>>> 
>>>>> Why is it good to do this move? I’d think we’d want to move the *elements* out of the vector and never move the vector itself, should just be able to use std::move_iterator { other.begin() } below.
>>>>> 
>>>>> Also unclear on why we are using std::begin. It seems to me that we can just use Vector::begin unless we have some reason we want to write code that’s more generic.
>>>>> 
>>>>> Also, I would use auto here for the type if we were keeping this.
>>>> 
>>>> We should perform this move since the parameter is Vector&&. The parameter is saying we should move the vector (not elements of vector).
>>> 
>>> I don’t agree with this.
>>> 
>>> When you pass an rvalue reference, what it means is that the recipient can take the contents of what is passed, but it does not mean literally that the top level object must be moved. For example, sometimes such functions examine and do not move the argument in some code paths. I think it would be better to move the elements and not unnecessarily move the vector itself.
>> 
>> I think many code is doing that. For example, Vector move constructor.
>> 
>>    973 template<typename T, size_t inlineCapacity, typename OverflowHandler, size_t minCapacity, typename Malloc>
>>    974 inline Vector<T, inlineCapacity, OverflowHandler, minCapacity, Malloc>::Vector(Vector<T, inlineCapacity, OverflowHandler, minCapacity, Malloc>&& other)
>>    975 {
>>    976     swap(other);
>>    977 }
>> 
>> Since the |this| is empty vector, it is clearing the caller's Vector's storage.
>> 
>> And I think we must move in this case because it was the behavior of RefCountedArray, which is replaced to EmbeddedVector in FixedVector.
>> If we do not clear here, we will change the semantics (caller vector is not cleared), and it is not intention of this patch.
> 
> Yes, code *can* do that and it’s great to do it when it’s the most efficient way to do things. The Vector move constructor is the perfect example of that.
> 
> But I don’t think we should pay any efficiency cost to stick with it; it’s not something you can rely on in library, since it’s how rvalue reference arguments work in the C++ standard library, for example. I suppose we can debate this another time.

Yeah, agree that clearing is not guaranteed in C++'s convention.
But in many cases, just moving it away would be more simpler than leaving the caller Vector with already moved elements, and many code is using `WTFMove(vector)` as a way to clear the original vector.
And in almost all cases, we will clear the caller's Vector anyway, thus, it does not affect on performance. So, unless we have a specific reason, clearing the passed vector looks safer option to me :)
Comment 15 Darin Adler 2021-12-21 11:41:30 PST
(In reply to Yusuke Suzuki from comment #14)
 But in many cases, just moving it away would be more simpler than leaving
> the caller Vector with already moved elements

OK. If that’s the argument, I am convinced.

> and many code is using
> `WTFMove(vector)` as a way to clear the original vector.

That seems dangerous if we do refactoring and possibly change data types.

I think that we should use std::exchange(vector, { }) for call sites where we need to clear the original vector and reserve std::move(vector) or WTFMove(vector) for call sites where we never do anything other than destroying the old vector or assigning a new value to the old vector and then using it after that.

> And in almost all cases, we will clear the caller's Vector anyway, thus, it
> does not affect on performance.

Sure, if "does not affect performance" is the argument, that is OK with me.

> So, unless we have a specific reason,

The specific reason would be better performance.

> clearing the passed vector looks safer option to me :)

I think it’s a false sense of safety since we refactor and change data types from time to time. I don’t want us to depend on "moved-from things are cleared" when the std::exchange idiom is easily accessible to provide a guarantee that does not rely on different conventions for different classes.