Bug 224726 - Make container naming a bit more consistent
Summary: Make container naming a bit more consistent
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-04-17 20:12 PDT by Sam Weinig
Modified: 2021-04-24 20:13 PDT (History)
3 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Sam Weinig 2021-04-17 20:12:50 PDT
We have a bunch of array-like/adjacent container classes in WTF and we don't have super consistent naming. Currently we have:

Vector/VectorBuffer
- Supports dynamic size changing (append, insert, etc.)
- Supports optional inline capacity
- Size & capacity are stored in struct

SegmentedVector
- Supports dynamic size changing (append, insert, etc.)
- No inline capacity
- Size & capacity are stored in struct
* Uses segments to grow so pointers to element remain valid on growth

ConcurrentVector
- Supports dynamic size changing (append, insert, etc.), though no getting smaller.
- No inline capacity
- Size & capacity are stored in struct
* Uses segments to grow so pointers to element remain valid on growth
* Uses ConcurrentBuffer to store segments for concurrency guarantees

ConcurrentBuffer
- Supports dynamic size changing (append, insert, etc.), though no getting smaller.
- No inline capacity
- Size is stored in header of buffer
* Keeps all allocated buffers (one for each time it was grown) until destructor is called

RefCountedArray
- No dynamic size changing. Size fixed at construction / assignment.
- No inline capacity
- Size is stored in header of buffer
* Use refcount in header of buffer to allow copying to become an alias
* Word sized when empty

FixedVector
- No dynamic size changing. Size fixed at construction / assignment.
- No inline capacity
- Size is stored in header of buffer
* Word sized when empty


[There are others, including Deque, BitVector, FastBitVector, UniqueArray (and perhaps others I am missing) but they are different enough to not seem relevant for this.]

That gives us the following words to play with:

- Vector
- Buffer
- RefCounted
- Fixed
- Array
- Concurrent
- Segmented


Some initial thoughts on potential important traits that should probably play into naming:

- Does it support dynamic size changing?
- What is the size when empty? (e.g. does it just store a pointer to a header + buffer)?
- Does it alias on copy?
- Does it support a concurrency related use case? 


I think we could make these more consistent.
Comment 1 Darin Adler 2021-04-17 20:30:54 PDT
You didn’t mention Deque. Are there any other that are similar/adjacent? ListHashSet? Linked lists?
Comment 2 Sam Weinig 2021-04-18 10:24:43 PDT
(In reply to Darin Adler from comment #1)
> You didn’t mention Deque. Are there any other that are similar/adjacent?
> ListHashSet? Linked lists?

In addition to Deque, there is BitVector and FastBitVector, and I initially felt they were different enough, but that seems silly.

I was also limiting this to just containers that support random access, but again, that now seems silly. 

Non-random access containers we have are:

- SinglyLinkedList
- SinglyLinkedListWithTail
- DoublyLinkedList
- SentinelLinkedList
- ListHashSet


Fortunately here, we have a pretty consistent naming. All non-random access based list containers use the word "List", which I think is appropriate.

For the random access ones (with the exception of Deque, which I think we could see as more as an adaptor for any random access container, though that is not currently how it is implemented), it seems like the most prevelent words we have are "Vector" and "Buffer", with RefCountedArray being the only container not using one of those terms. Given that std::array implies a compile time size, I think RefCountedArray is ultimately more misleading than necessary, and we should probably converge its naming with FixedVector, which is almost identical to it, but doesn't have the refcount based aliasing. 

Speaking of FixedVector, that name kind of feels like an oxymoron to me, as Vector makes me think of something that is not fixed. I kind of like the word "Buffer" as a container that is a capacity + element array (with no separate "used size" being tracked), and we use that for containers like VectorBuffer (arguably not true since it has the m_size for packing), StringBuffer and ConcurrentBuffer (to an extent).

My initial thoughts on names would be:

"Vector" :: A container that conceptually tracks a separate capacity and size.
"Buffer" :: A container that conceptually tracks just capacity.

I think most of the containers follow these rules with the exception of FixedVector and RefCountedArray.

The primary goal both of these is be only a word when empty, so I think following the lead of WordLock, and prefixing them with "Word" would be useful. My proposal would be:

RefCountedArray -> WordRefCountedBuffer
FixedVector -> WordBuffer

Though, now that read the latter, it kind of seems like it is a buffer of words. We have also use the term "Compact" in other containers to emphasize they optimize for space with CompactPointerTuple, CompactRefPtrTuple and CompactUniquePtrTuple, so perhaps:

RefCountedArray -> CompactRefCountedBuffer
FixedVector -> CompactBuffer

?
Comment 3 Darin Adler 2021-04-18 10:30:31 PDT
(In reply to Sam Weinig from comment #2)
> RefCountedArray -> CompactRefCountedBuffer
> FixedVector -> CompactBuffer

Main thing that is lost here is the following:

"He programmer, you are using a Vector, but you never need to resize it, so you should instead use a FixedVector, which can be used the same way but is always more memory efficient."

I don’t think CompactBuffer works very well in that sentence, but I am not sure. Maybe we another goal is to help people think of Vector -> FixedVector in a way analogous to StringBuilder -> String.
Comment 4 Darin Adler 2021-04-18 10:33:07 PDT
And like String, we may need a Buffer/VectorView so that all these functions that currently take Vector can also take any of the types of buffers (fixed vectors) that support pointer plus length since they use contiguous storage. That seems like an important optimized place far less generic than the containers algorithms, but more generic than literally taking Vector just for clarity that the size has to be passed!
Comment 5 Darin Adler 2021-04-18 10:33:58 PDT
I like "Compact" a lot, by the way.

Not sure that "Buffer" to me says "non-resizable vector".
Comment 6 Darin Adler 2021-04-18 10:35:38 PDT
I suppose that FixedVector could technically be resizable like Buffer but right now it’s more restricted than that; unlike other Buffer objects it has to have its size set at creation time. The notion that it’s resizable, just not efficiently, because it tracks only capacity which is == size, is not really how I currently think about it.
Comment 7 Darin Adler 2021-04-18 10:40:07 PDT
Realized that what I describe above is literally std::array_view.
Comment 8 Darin Adler 2021-04-18 10:41:15 PDT
Sorry, I mean that what I describe above is literally std::span (there is no array_view).
Comment 9 Darin Adler 2021-04-18 10:41:55 PDT
They have std::span with "static extent" and "dynamic extent" and I am talking about the one with dynamic extent.

But C++20 and we are still C++17.
Comment 10 Sam Weinig 2021-04-18 10:46:19 PDT
(In reply to Darin Adler from comment #3)
> (In reply to Sam Weinig from comment #2)
> > RefCountedArray -> CompactRefCountedBuffer
> > FixedVector -> CompactBuffer
> 
> Main thing that is lost here is the following:
> 
> "He programmer, you are using a Vector, but you never need to resize it, so
> you should instead use a FixedVector, which can be used the same way but is
> always more memory efficient."
> 
> I don’t think CompactBuffer works very well in that sentence, but I am not
> sure. Maybe we another goal is to help people think of Vector -> FixedVector
> in a way analogous to StringBuilder -> String.

I like that argument. Perhaps then:

RefCountedArray -> RefCountedFixedVector
FixedVector -> FixedVector
Comment 11 Sam Weinig 2021-04-18 10:47:07 PDT
(In reply to Darin Adler from comment #5)
> I like "Compact" a lot, by the way.
> 
> Not sure that "Buffer" to me says "non-resizable vector".

(In reply to Sam Weinig from comment #10)
> (In reply to Darin Adler from comment #3)
> > (In reply to Sam Weinig from comment #2)
> > > RefCountedArray -> CompactRefCountedBuffer
> > > FixedVector -> CompactBuffer
> > 
> > Main thing that is lost here is the following:
> > 
> > "He programmer, you are using a Vector, but you never need to resize it, so
> > you should instead use a FixedVector, which can be used the same way but is
> > always more memory efficient."
> > 
> > I don’t think CompactBuffer works very well in that sentence, but I am not
> > sure. Maybe we another goal is to help people think of Vector -> FixedVector
> > in a way analogous to StringBuilder -> String.
> 
> I like that argument. Perhaps then:
> 
> RefCountedArray -> RefCountedFixedVector
> FixedVector -> FixedVector

Or rather,

RefCountedArray -> CompactRefCountedFixedVector
FixedVector -> CompactFixedVector
Comment 12 Sam Weinig 2021-04-18 10:48:17 PDT
(In reply to Darin Adler from comment #5)
> I like "Compact" a lot, by the way.
> 
> Not sure that "Buffer" to me says "non-resizable vector".

Yeah, I agree. I tried to use Buffer since we use it elsewhere, but we are a bit inconsistent and it also doesn't really say anything specific to me.
Comment 13 Radar WebKit Bug Importer 2021-04-24 20:13:14 PDT
<rdar://problem/77112856>