Bug 210659

Summary: [EME][CDMProxy] Sort key status array lexicographically by key IDs
Product: WebKit Reporter: Charlie Turner <cturner>
Component: WebKitGTKAssignee: Charlie Turner <cturner>
Status: RESOLVED FIXED    
Severity: Normal CC: bugs-noreply, calvaris, darin, eric.carlson, ews-watchlist, glenn, jer.noble, philipj, sergio
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch none

Charlie Turner
Reported 2020-04-17 09:46:44 PDT
[EME][CDMProxy] Sort key status array lexicographically by key IDs
Attachments
Patch (4.22 KB, patch)
2020-04-17 10:57 PDT, Charlie Turner
no flags
Patch (3.93 KB, patch)
2020-04-21 08:23 PDT, Charlie Turner
no flags
Charlie Turner
Comment 1 2020-04-17 10:57:34 PDT
Darin Adler
Comment 2 2020-04-17 11:18:56 PDT
Comment on attachment 396773 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=396773&action=review > Source/WebCore/ChangeLog:4 > + [EME][CDMProxy] Sort key status array lexicographically by key IDs > + https://bugs.webkit.org/show_bug.cgi?id=210659 Re-sorting each time we insert something gives us an almost pathologically inefficient data structure. We might get away with it if it is always very small. I think a reasonable data structure here would be a vector that we keep sorted, using binary search for both searching for elements and to find where to insert a new element in the vector. The standard library offers std::binary_search, which I believe can be used for both of these purposes, and Vector offers Vector::insert which can be used to place new elements in the correct place in the vector after using binary_search to find where to insert. > Source/WebCore/ChangeLog:15 > + (WebCore::KeyStore::add): We could use a HashSet here and keep it > + sorted by design, but we also have to keep references to keys from HashSet is not an ordered collection. You may be thinking of non-hashed collections that are ordered, like std::set. > Source/WebCore/ChangeLog:19 > + not clear how HashSet<RefPtr<T>> could be made to compare with > + operator<(T) rather than operator<(RefPtr<T>) (pointer > + comparison). In addition to that, it's often required to search HashSet doesn't use operator< at all. It uses hashing and equality. > Source/WebCore/ChangeLog:23 > + for a key with a Vector<uint8_t> directly. The HashSet API doesn't > + provide a way to search via predicate (like Vector), which means I > + can not feasibly lookup a Vector<uint8_t> if I'm essentially > + storing RefPtr<Vector<uint8_t>> in the set. Instead, a compromise It’s straightforward to search a HashSet via predicate like you would with a Vector. It would be a loop like this: for (auto it = set.begin(), end = set.end(); it != end; ++it) { if (predicate(*it)) return it; } The reason this isn’t offered is that it’s an O(n) operation, and if doing operations like that, then we’d typically use a Vector instead. The set offers O(log n) membership checking, which is usually the reason we use it.
Charlie Turner
Comment 3 2020-04-19 15:54:23 PDT
Comment on attachment 396773 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=396773&action=review Thanks for the review, I feel I can keep the naive insert and resort in this context, but I should clarify why in the ChangeLog. >> Source/WebCore/ChangeLog:4 >> + https://bugs.webkit.org/show_bug.cgi?id=210659 > > Re-sorting each time we insert something gives us an almost pathologically inefficient data structure. We might get away with it if it is always very small. > > I think a reasonable data structure here would be a vector that we keep sorted, using binary search for both searching for elements and to find where to insert a new element in the vector. > > The standard library offers std::binary_search, which I believe can be used for both of these purposes, and Vector offers Vector::insert which can be used to place new elements in the correct place in the vector after using binary_search to find where to insert. > We might get away with it if it is always very small. I should have made it clearer that the collection in this context is typically very small. The common case in practice is for this collection to be of size 1 (one encryption key for all tracks). It's also less common for the collection to be size 2 (one key for audio and one for video) but cases where N > 2 are pathological in themselves, and so in practice the pathology wouldn't be seen. For N=1, the sort call is trivial, and for N=2 it's just a swap. I did consider keeping it sorted, but because of this context the extra complexity didn't seem worth it. >> Source/WebCore/ChangeLog:15 >> + sorted by design, but we also have to keep references to keys from > > HashSet is not an ordered collection. You may be thinking of non-hashed collections that are ordered, like std::set. I'll correct that mistake in the ChangeLog, I am thinking of the ordered variety. >> Source/WebCore/ChangeLog:23 >> + storing RefPtr<Vector<uint8_t>> in the set. Instead, a compromise > > It’s straightforward to search a HashSet via predicate like you would with a Vector. It would be a loop like this: > > for (auto it = set.begin(), end = set.end(); it != end; ++it) { > if (predicate(*it)) > return it; > } > > The reason this isn’t offered is that it’s an O(n) operation, and if doing operations like that, then we’d typically use a Vector instead. The set offers O(log n) membership checking, which is usually the reason we use it. Understood, I shouldn't have been so strong here in my explanation, but loops like that are definitely a warning sign I've picked the wrong data structure :)
Charlie Turner
Comment 4 2020-04-19 15:54:28 PDT
View in context: https://bugs.webkit.org/attachment.cgi?id=396773&action=review Thanks for the review, I feel I can keep the naive insert and resort in this context, but I should clarify why in the ChangeLog. >> Source/WebCore/ChangeLog:4 >> + https://bugs.webkit.org/show_bug.cgi?id=210659 > > Re-sorting each time we insert something gives us an almost pathologically inefficient data structure. We might get away with it if it is always very small. > > I think a reasonable data structure here would be a vector that we keep sorted, using binary search for both searching for elements and to find where to insert a new element in the vector. > > The standard library offers std::binary_search, which I believe can be used for both of these purposes, and Vector offers Vector::insert which can be used to place new elements in the correct place in the vector after using binary_search to find where to insert. > We might get away with it if it is always very small. I should have made it clearer that the collection in this context is typically very small. The common case in practice is for this collection to be of size 1 (one encryption key for all tracks). It's also less common for the collection to be size 2 (one key for audio and one for video) but cases where N > 2 are pathological in themselves, and so in practice the pathology wouldn't be seen. For N=1, the sort call is trivial, and for N=2 it's just a swap. I did consider keeping it sorted, but because of this context the extra complexity didn't seem worth it. >> Source/WebCore/ChangeLog:15 >> + sorted by design, but we also have to keep references to keys from > > HashSet is not an ordered collection. You may be thinking of non-hashed collections that are ordered, like std::set. I'll correct that mistake in the ChangeLog, I am thinking of the ordered variety. >> Source/WebCore/ChangeLog:23 >> + storing RefPtr<Vector<uint8_t>> in the set. Instead, a compromise > > It’s straightforward to search a HashSet via predicate like you would with a Vector. It would be a loop like this: > > for (auto it = set.begin(), end = set.end(); it != end; ++it) { > if (predicate(*it)) > return it; > } > > The reason this isn’t offered is that it’s an O(n) operation, and if doing operations like that, then we’d typically use a Vector instead. The set offers O(log n) membership checking, which is usually the reason we use it. Understood, I shouldn't have been so strong here in my explanation, but loops like that are definitely a warning sign I've picked the wrong data structure :)
Charlie Turner
Comment 5 2020-04-21 08:23:58 PDT
EWS
Comment 6 2020-04-23 07:50:52 PDT
Committed r260568: <https://trac.webkit.org/changeset/260568> All reviewed patches have been landed. Closing bug and clearing flags on attachment 397084 [details].
Note You need to log in before you can comment on or make changes to this bug.