NEW 224726
Make container naming a bit more consistent
https://bugs.webkit.org/show_bug.cgi?id=224726
Summary Make container naming a bit more consistent
Sam Weinig
Reported 2021-04-17 20:12:50 PDT
We have a bunch of array-like/adjacent container classes in WTF and we don't have super consistent naming. Currently we have: Vector/VectorBuffer - Supports dynamic size changing (append, insert, etc.) - Supports optional inline capacity - Size & capacity are stored in struct SegmentedVector - Supports dynamic size changing (append, insert, etc.) - No inline capacity - Size & capacity are stored in struct * Uses segments to grow so pointers to element remain valid on growth ConcurrentVector - Supports dynamic size changing (append, insert, etc.), though no getting smaller. - No inline capacity - Size & capacity are stored in struct * Uses segments to grow so pointers to element remain valid on growth * Uses ConcurrentBuffer to store segments for concurrency guarantees ConcurrentBuffer - Supports dynamic size changing (append, insert, etc.), though no getting smaller. - No inline capacity - Size is stored in header of buffer * Keeps all allocated buffers (one for each time it was grown) until destructor is called RefCountedArray - No dynamic size changing. Size fixed at construction / assignment. - No inline capacity - Size is stored in header of buffer * Use refcount in header of buffer to allow copying to become an alias * Word sized when empty FixedVector - No dynamic size changing. Size fixed at construction / assignment. - No inline capacity - Size is stored in header of buffer * Word sized when empty [There are others, including Deque, BitVector, FastBitVector, UniqueArray (and perhaps others I am missing) but they are different enough to not seem relevant for this.] That gives us the following words to play with: - Vector - Buffer - RefCounted - Fixed - Array - Concurrent - Segmented Some initial thoughts on potential important traits that should probably play into naming: - Does it support dynamic size changing? - What is the size when empty? (e.g. does it just store a pointer to a header + buffer)? - Does it alias on copy? - Does it support a concurrency related use case? I think we could make these more consistent.
Attachments
Darin Adler
Comment 1 2021-04-17 20:30:54 PDT
You didn’t mention Deque. Are there any other that are similar/adjacent? ListHashSet? Linked lists?
Sam Weinig
Comment 2 2021-04-18 10:24:43 PDT
(In reply to Darin Adler from comment #1) > You didn’t mention Deque. Are there any other that are similar/adjacent? > ListHashSet? Linked lists? In addition to Deque, there is BitVector and FastBitVector, and I initially felt they were different enough, but that seems silly. I was also limiting this to just containers that support random access, but again, that now seems silly. Non-random access containers we have are: - SinglyLinkedList - SinglyLinkedListWithTail - DoublyLinkedList - SentinelLinkedList - ListHashSet Fortunately here, we have a pretty consistent naming. All non-random access based list containers use the word "List", which I think is appropriate. For the random access ones (with the exception of Deque, which I think we could see as more as an adaptor for any random access container, though that is not currently how it is implemented), it seems like the most prevelent words we have are "Vector" and "Buffer", with RefCountedArray being the only container not using one of those terms. Given that std::array implies a compile time size, I think RefCountedArray is ultimately more misleading than necessary, and we should probably converge its naming with FixedVector, which is almost identical to it, but doesn't have the refcount based aliasing. Speaking of FixedVector, that name kind of feels like an oxymoron to me, as Vector makes me think of something that is not fixed. I kind of like the word "Buffer" as a container that is a capacity + element array (with no separate "used size" being tracked), and we use that for containers like VectorBuffer (arguably not true since it has the m_size for packing), StringBuffer and ConcurrentBuffer (to an extent). My initial thoughts on names would be: "Vector" :: A container that conceptually tracks a separate capacity and size. "Buffer" :: A container that conceptually tracks just capacity. I think most of the containers follow these rules with the exception of FixedVector and RefCountedArray. The primary goal both of these is be only a word when empty, so I think following the lead of WordLock, and prefixing them with "Word" would be useful. My proposal would be: RefCountedArray -> WordRefCountedBuffer FixedVector -> WordBuffer Though, now that read the latter, it kind of seems like it is a buffer of words. We have also use the term "Compact" in other containers to emphasize they optimize for space with CompactPointerTuple, CompactRefPtrTuple and CompactUniquePtrTuple, so perhaps: RefCountedArray -> CompactRefCountedBuffer FixedVector -> CompactBuffer ?
Darin Adler
Comment 3 2021-04-18 10:30:31 PDT
(In reply to Sam Weinig from comment #2) > RefCountedArray -> CompactRefCountedBuffer > FixedVector -> CompactBuffer Main thing that is lost here is the following: "He programmer, you are using a Vector, but you never need to resize it, so you should instead use a FixedVector, which can be used the same way but is always more memory efficient." I don’t think CompactBuffer works very well in that sentence, but I am not sure. Maybe we another goal is to help people think of Vector -> FixedVector in a way analogous to StringBuilder -> String.
Darin Adler
Comment 4 2021-04-18 10:33:07 PDT
And like String, we may need a Buffer/VectorView so that all these functions that currently take Vector can also take any of the types of buffers (fixed vectors) that support pointer plus length since they use contiguous storage. That seems like an important optimized place far less generic than the containers algorithms, but more generic than literally taking Vector just for clarity that the size has to be passed!
Darin Adler
Comment 5 2021-04-18 10:33:58 PDT
I like "Compact" a lot, by the way. Not sure that "Buffer" to me says "non-resizable vector".
Darin Adler
Comment 6 2021-04-18 10:35:38 PDT
I suppose that FixedVector could technically be resizable like Buffer but right now it’s more restricted than that; unlike other Buffer objects it has to have its size set at creation time. The notion that it’s resizable, just not efficiently, because it tracks only capacity which is == size, is not really how I currently think about it.
Darin Adler
Comment 7 2021-04-18 10:40:07 PDT
Realized that what I describe above is literally std::array_view.
Darin Adler
Comment 8 2021-04-18 10:41:15 PDT
Sorry, I mean that what I describe above is literally std::span (there is no array_view).
Darin Adler
Comment 9 2021-04-18 10:41:55 PDT
They have std::span with "static extent" and "dynamic extent" and I am talking about the one with dynamic extent. But C++20 and we are still C++17.
Sam Weinig
Comment 10 2021-04-18 10:46:19 PDT
(In reply to Darin Adler from comment #3) > (In reply to Sam Weinig from comment #2) > > RefCountedArray -> CompactRefCountedBuffer > > FixedVector -> CompactBuffer > > Main thing that is lost here is the following: > > "He programmer, you are using a Vector, but you never need to resize it, so > you should instead use a FixedVector, which can be used the same way but is > always more memory efficient." > > I don’t think CompactBuffer works very well in that sentence, but I am not > sure. Maybe we another goal is to help people think of Vector -> FixedVector > in a way analogous to StringBuilder -> String. I like that argument. Perhaps then: RefCountedArray -> RefCountedFixedVector FixedVector -> FixedVector
Sam Weinig
Comment 11 2021-04-18 10:47:07 PDT
(In reply to Darin Adler from comment #5) > I like "Compact" a lot, by the way. > > Not sure that "Buffer" to me says "non-resizable vector". (In reply to Sam Weinig from comment #10) > (In reply to Darin Adler from comment #3) > > (In reply to Sam Weinig from comment #2) > > > RefCountedArray -> CompactRefCountedBuffer > > > FixedVector -> CompactBuffer > > > > Main thing that is lost here is the following: > > > > "He programmer, you are using a Vector, but you never need to resize it, so > > you should instead use a FixedVector, which can be used the same way but is > > always more memory efficient." > > > > I don’t think CompactBuffer works very well in that sentence, but I am not > > sure. Maybe we another goal is to help people think of Vector -> FixedVector > > in a way analogous to StringBuilder -> String. > > I like that argument. Perhaps then: > > RefCountedArray -> RefCountedFixedVector > FixedVector -> FixedVector Or rather, RefCountedArray -> CompactRefCountedFixedVector FixedVector -> CompactFixedVector
Sam Weinig
Comment 12 2021-04-18 10:48:17 PDT
(In reply to Darin Adler from comment #5) > I like "Compact" a lot, by the way. > > Not sure that "Buffer" to me says "non-resizable vector". Yeah, I agree. I tried to use Buffer since we use it elsewhere, but we are a bit inconsistent and it also doesn't really say anything specific to me.
Radar WebKit Bug Importer
Comment 13 2021-04-24 20:13:14 PDT
Note You need to log in before you can comment on or make changes to this bug.