Summary:  [EME][CDMProxy] Sort key status array lexicographically by key IDs  

Product:  WebKit  Reporter:  Charlie Turner <cturner>  
Component:  WebKitGTK  Assignee:  Charlie Turner <cturner>  
Status:  RESOLVED FIXED  
Severity:  Normal  CC:  bugsnoreply, calvaris, darin, eric.carlson, ewswatchlist, glenn, jer.noble, philipj, sergio  
Priority:  P2  
Version:  WebKit Nightly Build  
Hardware:  Unspecified  
OS:  Unspecified  
Attachments: 

Description
Charlie Turner
20200417 09:46:44 PDT
Created attachment 396773 [details]
Patch
Comment on attachment 396773 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=396773&action=review > Source/WebCore/ChangeLog:4 > + [EME][CDMProxy] Sort key status array lexicographically by key IDs > + https://bugs.webkit.org/show_bug.cgi?id=210659 Resorting each time we insert something gives us an almost pathologically inefficient data structure. We might get away with it if it is always very small. I think a reasonable data structure here would be a vector that we keep sorted, using binary search for both searching for elements and to find where to insert a new element in the vector. The standard library offers std::binary_search, which I believe can be used for both of these purposes, and Vector offers Vector::insert which can be used to place new elements in the correct place in the vector after using binary_search to find where to insert. > Source/WebCore/ChangeLog:15 > + (WebCore::KeyStore::add): We could use a HashSet here and keep it > + sorted by design, but we also have to keep references to keys from HashSet is not an ordered collection. You may be thinking of nonhashed collections that are ordered, like std::set. > Source/WebCore/ChangeLog:19 > + not clear how HashSet<RefPtr<T>> could be made to compare with > + operator<(T) rather than operator<(RefPtr<T>) (pointer > + comparison). In addition to that, it's often required to search HashSet doesn't use operator< at all. It uses hashing and equality. > Source/WebCore/ChangeLog:23 > + for a key with a Vector<uint8_t> directly. The HashSet API doesn't > + provide a way to search via predicate (like Vector), which means I > + can not feasibly lookup a Vector<uint8_t> if I'm essentially > + storing RefPtr<Vector<uint8_t>> in the set. Instead, a compromise It’s straightforward to search a HashSet via predicate like you would with a Vector. It would be a loop like this: for (auto it = set.begin(), end = set.end(); it != end; ++it) { if (predicate(*it)) return it; } The reason this isn’t offered is that it’s an O(n) operation, and if doing operations like that, then we’d typically use a Vector instead. The set offers O(log n) membership checking, which is usually the reason we use it. Comment on attachment 396773 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=396773&action=review Thanks for the review, I feel I can keep the naive insert and resort in this context, but I should clarify why in the ChangeLog. >> Source/WebCore/ChangeLog:4 >> + https://bugs.webkit.org/show_bug.cgi?id=210659 > > Resorting each time we insert something gives us an almost pathologically inefficient data structure. We might get away with it if it is always very small. > > I think a reasonable data structure here would be a vector that we keep sorted, using binary search for both searching for elements and to find where to insert a new element in the vector. > > The standard library offers std::binary_search, which I believe can be used for both of these purposes, and Vector offers Vector::insert which can be used to place new elements in the correct place in the vector after using binary_search to find where to insert. > We might get away with it if it is always very small. I should have made it clearer that the collection in this context is typically very small. The common case in practice is for this collection to be of size 1 (one encryption key for all tracks). It's also less common for the collection to be size 2 (one key for audio and one for video) but cases where N > 2 are pathological in themselves, and so in practice the pathology wouldn't be seen. For N=1, the sort call is trivial, and for N=2 it's just a swap. I did consider keeping it sorted, but because of this context the extra complexity didn't seem worth it. >> Source/WebCore/ChangeLog:15 >> + sorted by design, but we also have to keep references to keys from > > HashSet is not an ordered collection. You may be thinking of nonhashed collections that are ordered, like std::set. I'll correct that mistake in the ChangeLog, I am thinking of the ordered variety. >> Source/WebCore/ChangeLog:23 >> + storing RefPtr<Vector<uint8_t>> in the set. Instead, a compromise > > It’s straightforward to search a HashSet via predicate like you would with a Vector. It would be a loop like this: > > for (auto it = set.begin(), end = set.end(); it != end; ++it) { > if (predicate(*it)) > return it; > } > > The reason this isn’t offered is that it’s an O(n) operation, and if doing operations like that, then we’d typically use a Vector instead. The set offers O(log n) membership checking, which is usually the reason we use it. Understood, I shouldn't have been so strong here in my explanation, but loops like that are definitely a warning sign I've picked the wrong data structure :) View in context: https://bugs.webkit.org/attachment.cgi?id=396773&action=review Thanks for the review, I feel I can keep the naive insert and resort in this context, but I should clarify why in the ChangeLog. >> Source/WebCore/ChangeLog:4 >> + https://bugs.webkit.org/show_bug.cgi?id=210659 > > Resorting each time we insert something gives us an almost pathologically inefficient data structure. We might get away with it if it is always very small. > > I think a reasonable data structure here would be a vector that we keep sorted, using binary search for both searching for elements and to find where to insert a new element in the vector. > > The standard library offers std::binary_search, which I believe can be used for both of these purposes, and Vector offers Vector::insert which can be used to place new elements in the correct place in the vector after using binary_search to find where to insert. > We might get away with it if it is always very small. I should have made it clearer that the collection in this context is typically very small. The common case in practice is for this collection to be of size 1 (one encryption key for all tracks). It's also less common for the collection to be size 2 (one key for audio and one for video) but cases where N > 2 are pathological in themselves, and so in practice the pathology wouldn't be seen. For N=1, the sort call is trivial, and for N=2 it's just a swap. I did consider keeping it sorted, but because of this context the extra complexity didn't seem worth it. >> Source/WebCore/ChangeLog:15 >> + sorted by design, but we also have to keep references to keys from > > HashSet is not an ordered collection. You may be thinking of nonhashed collections that are ordered, like std::set. I'll correct that mistake in the ChangeLog, I am thinking of the ordered variety. >> Source/WebCore/ChangeLog:23 >> + storing RefPtr<Vector<uint8_t>> in the set. Instead, a compromise > > It’s straightforward to search a HashSet via predicate like you would with a Vector. It would be a loop like this: > > for (auto it = set.begin(), end = set.end(); it != end; ++it) { > if (predicate(*it)) > return it; > } > > The reason this isn’t offered is that it’s an O(n) operation, and if doing operations like that, then we’d typically use a Vector instead. The set offers O(log n) membership checking, which is usually the reason we use it. Understood, I shouldn't have been so strong here in my explanation, but loops like that are definitely a warning sign I've picked the wrong data structure :) Created attachment 397084 [details]
Patch
Committed r260568: <https://trac.webkit.org/changeset/260568> All reviewed patches have been landed. Closing bug and clearing flags on attachment 397084 [details]. 