RESOLVED FIXED 202851
[Clipboard API] Support writing multiple PasteboardCustomData with SharedBuffers to the pasteboard
https://bugs.webkit.org/show_bug.cgi?id=202851
Summary [Clipboard API] Support writing multiple PasteboardCustomData with SharedBuff...
Wenson Hsieh
Reported 2019-10-11 09:54:37 PDT
Work towards Async Clipboard API support.
Attachments
Patch (71.56 KB, patch)
2019-10-11 11:10 PDT, Wenson Hsieh
no flags
Try to fix GTK/WPE builds (72.14 KB, patch)
2019-10-11 11:18 PDT, Wenson Hsieh
darin: review+
Address review feedback (79.89 KB, patch)
2019-10-13 12:58 PDT, Wenson Hsieh
darin: review+
For EWS (79.61 KB, patch)
2019-10-14 10:49 PDT, Wenson Hsieh
no flags
Wenson Hsieh
Comment 1 2019-10-11 11:10:19 PDT Comment hidden (obsolete)
Wenson Hsieh
Comment 2 2019-10-11 11:18:48 PDT
Created attachment 380770 [details] Try to fix GTK/WPE builds
Darin Adler
Comment 3 2019-10-11 16:04:09 PDT
Comment on attachment 380770 [details] Try to fix GTK/WPE builds View in context: https://bugs.webkit.org/attachment.cgi?id=380770&action=review Some coding style comments. A lot are not about new things. > Source/WebCore/platform/PasteboardCustomData.cpp:48 > +PasteboardCustomData::PasteboardCustomData(const PasteboardCustomData& data) Can’t this be "= default"? > Source/WebCore/platform/PasteboardCustomData.cpp:59 > + const static unsigned currentCustomDataSerializationVersion = 1; Maybe constexpr instead? > Source/WebCore/platform/PasteboardCustomData.cpp:71 > + const static unsigned maxSupportedDataSerializationVersionNumber = 1; Ditto. > Source/WebCore/platform/PasteboardCustomData.cpp:74 > + WTF::Persistence::Decoder decoder { reinterpret_cast<const uint8_t*>(buffer.data()), buffer.size() }; So annoying to have the reinterpret_cast here for an idiom that is not at all specific to pasteboards. Can we find somewhere to put an overload so this Decoder/SharedBuffer interaction can be done cleanly? > Source/WebCore/platform/PasteboardCustomData.cpp:86 > + if (!decoder.decode(version) || version > maxSupportedDataSerializationVersionNumber) > + return { }; > + > + if (!decoder.decode(result.m_origin)) > + return { }; > + > + if (!decoder.decode(result.m_sameOriginCustomStringData)) > + return { }; > + > + if (!decoder.decode(result.m_orderedTypes)) > + return { }; This makes me think our decoder interface is not great. I would much prefer a decoder that would "go numb" once there was an error, and stop doing any work at all; subsequent calls to decode just wouldn’t do anything. Then at the end the caller would check for the error once. Anyway, not new here. > Source/WebCore/platform/PasteboardCustomData.cpp:97 > +static void updateTypes(Vector<String>& types, String type, bool moveToEnd) > +{ > + if (moveToEnd) > + types.removeFirst(type); > + ASSERT(!types.contains(type)); > + types.append(type); > +} The fact that we maintain a vector of types makes it seem wasteful that the we also have three maps. We are going to pay the O(n^2) cost if there are a ton of types so all the hashing in the maps is wasteful: we could instead keep all the data in this same vector. Or if we need the O(n log n) performance then this would need to be a ListHashSet (or whatever) instead of a Vector. Inconvenient that the internal data structures are exposed in the getters of this class so we can't change this around without changing a lot of call sites. > Source/WebCore/platform/PasteboardCustomData.cpp:130 > + if (!m_platformBinaryData.remove(type) && !m_platformStringData.remove(type) && !m_sameOriginCustomStringData.remove(type)) > + return; What guarantees this type will appear in at most one of these three maps? It looks like the functions above can put the same type in both m_platformStringData and m_sameOriginCustomStringData, for example. By the way, if these maps were truly mutually exclusive, then I’d suggest we consider a single map with a more complex value type instead. Although it might make the code more complex at various call sites. > Source/WebCore/platform/PasteboardCustomData.h:39 > + WEBCORE_EXPORT PasteboardCustomData(const String&, Vector<String>&&, HashMap<String, String>&& platformStringData, HashMap<String, String>&& customData, HashMap<String, RefPtr<SharedBuffer>>&&); Since we're using rvalue references for everything else, why not use it for the first argument too? Can save a tiny bit of reference count churn. Also, I think the first argument needs a name. It’s not obvious that it’s an origin. > Source/WebCore/platform/PasteboardCustomData.h:40 > + WEBCORE_EXPORT PasteboardCustomData(const PasteboardCustomData&); Seems we should include both move and copy constructors rather than only copy, since it can result in more efficient code for some common idioms. That’s one thing we get "for free" if we don’t define either the move or copy constructor, but need to do extra work to preserve. I think the struct we had before supported move. If copy is all that’s needed at the moment, sometimes I will delete the move constructor rather than implementing it, promising myself "I will implement it if anyone needs it". That makes me feel better that we won’t end up using copy-and-destroy when we could use a cheaper move. > Source/WebCore/platform/PasteboardCustomData.h:73 > + HashMap<String, RefPtr<SharedBuffer>> m_platformBinaryData; Seems the value type here could be Ref<> instead of RefPtr<>. > Source/WebCore/platform/StaticPasteboard.cpp:88 > + return WTFMove(m_customData); This doesn't guarantee the data is cleared. Might want to use std::exchange instead to guarantee it’s cleared afterward (see below about changing the return type). > Source/WebCore/platform/StaticPasteboard.h:42 > - PasteboardCustomData takeCustomData(); > + PasteboardCustomData&& takeCustomData(); Seems like the old signature is better for a function that definitively does a "take". Returning an rvalue reference is OK if it’s a function where some callers want to call it but not actually take the data, perhaps read it but leave it in place, but if callers are always taking then I think it’s better to return the object, not an rvalue reference. Harder to misuse by accident. > Source/WebCore/platform/ios/AbstractPasteboard.h:52 > #if __IPHONE_OS_VERSION_MIN_REQUIRED >= 110000 Is this conditional still needed? Do we support those old versions of iOS? Also, why isn’t this an "SPI" header? > Source/WebCore/platform/ios/PlatformPasteboardIOS.mm:747 > + return [(id<AbstractPasteboard>)m_pasteboard.get() changeCount]; Seems peculiar that this typecast is both needed, but also guaranteed to be safe and doesn’t require a protocol conformance check. > Source/WebCore/platform/ios/WebItemProviderPasteboard.mm:870 > + auto stagedRegistrationInfoLists = WTFMove(_stagedRegistrationInfoLists); > + return stagedRegistrationInfoLists.autorelease(); I don’t understand why the WTFMove and the local variable are needed. I would expect that this code would do the same thing: return _stagedRegistrationInfoLists.autorelease(); If RetainPtr::autorelease does not nil the pointer out (I think it does) then maybe it’s std::exchange we want, not WTFMove. One sign of that is that it would still be a one-liner: return std::exchange(_stagedRegistrationInfoLists, nil).autorelease(); > Source/WebCore/platform/mac/PlatformPasteboardMac.mm:484 > + auto platformStringIterator = data.platformStringData().find(type); > + if (platformStringIterator != data.platformStringData().end()) { > + [item setString:platformStringIterator->value forType:platformType]; > + continue; > + } This can be done simply with HashMap::get and it might be nicer: auto value = data.platformStringData().get(type); if (!value.isNull()) { [item setString:value forType:platformType]; continue; } The find/end/->value dance is needed for types where copying the value out of the map is expensive or there is no null value. You’ll also notice that I favor shorter variable names; my philosophy is to include adjectives only if they are needed for disambiguation or clarity. > Source/WebCore/platform/mac/PlatformPasteboardMac.mm:490 > + auto platformDataIterator = data.platformBinaryData().find(type); > + if (platformDataIterator != data.platformBinaryData().end()) { > + if (auto platformData = platformDataIterator->value->createNSData()) > + [item setData:platformData.get() forType:platformType]; > + } Same thing is possible here: if (auto buffer = data.platformBinaryData().get(type)) { if (auto platformData = buffer->createNSData()) [item setData: platformData.get() forType():platformType]; } I think the non-iterator code is easier to read. > Source/WebCore/platform/mac/PlatformPasteboardMac.mm:501 > + if (auto item = createPasteboardItem(data)) > + [platformItems addObject:item.get()]; Is "silently omit" the right thing to do when createPasteboardItem returns nil? Not obvious to me what this case is. > Source/WebKit/Shared/WebCoreArgumentCoders.cpp:1645 > + encoder << static_cast<uint64_t>(data.platformBinaryData().size()); Why is this cast needed? Seems that we could just use the appropriate type in the decode function. I guess size_t? Annoying to have to use this idiom.
Wenson Hsieh
Comment 4 2019-10-11 22:10:59 PDT
Comment on attachment 380770 [details] Try to fix GTK/WPE builds View in context: https://bugs.webkit.org/attachment.cgi?id=380770&action=review Thanks for the review! >> Source/WebCore/platform/PasteboardCustomData.cpp:48 >> +PasteboardCustomData::PasteboardCustomData(const PasteboardCustomData& data) > > Can’t this be "= default"? Yes, it can! Changed to = default. >> Source/WebCore/platform/PasteboardCustomData.cpp:59 >> + const static unsigned currentCustomDataSerializationVersion = 1; > > Maybe constexpr instead? Yep, changed to use constexpr. >> Source/WebCore/platform/PasteboardCustomData.cpp:71 >> + const static unsigned maxSupportedDataSerializationVersionNumber = 1; > > Ditto. Changed to constexpr here too. >> Source/WebCore/platform/PasteboardCustomData.cpp:74 >> + WTF::Persistence::Decoder decoder { reinterpret_cast<const uint8_t*>(buffer.data()), buffer.size() }; > > So annoying to have the reinterpret_cast here for an idiom that is not at all specific to pasteboards. Can we find somewhere to put an overload so this Decoder/SharedBuffer interaction can be done cleanly? Yeah, it is a bit unfortunate :/ I added a new helper to SharedBuffer to make this more palatable: WTF::Persistence::Decoder SharedBuffer::decoder() const { return { reinterpret_cast<const uint8_t*>(data()), size() }; } …and used this helper in PasteboardCustomData. >> Source/WebCore/platform/PasteboardCustomData.cpp:97 >> +} > > The fact that we maintain a vector of types makes it seem wasteful that the we also have three maps. We are going to pay the O(n^2) cost if there are a ton of types so all the hashing in the maps is wasteful: we could instead keep all the data in this same vector. Or if we need the O(n log n) performance then this would need to be a ListHashSet (or whatever) instead of a Vector. > > Inconvenient that the internal data structures are exposed in the getters of this class so we can't change this around without changing a lot of call sites. Yeah, that makes sense. I’ll go with keeping all the data in one Vector, and introduce a new class (PasteboardCustomDataEntry) to represent each type and its platform and custom data. I will also refactor PasteboardCustomData’s methods so that we don’t end up exposing the implementation details of PasteboardCustomData. >> Source/WebCore/platform/PasteboardCustomData.cpp:130 >> + return; > > What guarantees this type will appear in at most one of these three maps? > > It looks like the functions above can put the same type in both m_platformStringData and m_sameOriginCustomStringData, for example. > > By the way, if these maps were truly mutually exclusive, then I’d suggest we consider a single map with a more complex value type instead. Although it might make the code more complex at various call sites. Ah, so it was (and remains) valid for the same type to appear in both m_platformStringData and m_sameOriginCustomStringData. That is to say, a particular type may have a certain representation on the platform pasteboard that is different from its representation in custom data (which we expose to pages of the same origin). However, it doesn’t make sense for the same type to appear in both m_platformStringData and m_platformBinaryData. Essentially, a type may map to either a shared buffer (in m_platformBinaryData) or a string (in m_platformStringData) in the platform pasteboard, but not both. Now that I reflect on this some more, a better way to express this would be to replace m_platformBinaryData with m_platformData, a Vector<Variant<RefPtr<SharedBuffer>, String>>. >> Source/WebCore/platform/PasteboardCustomData.h:39 >> + WEBCORE_EXPORT PasteboardCustomData(const String&, Vector<String>&&, HashMap<String, String>&& platformStringData, HashMap<String, String>&& customData, HashMap<String, RefPtr<SharedBuffer>>&&); > > Since we're using rvalue references for everything else, why not use it for the first argument too? Can save a tiny bit of reference count churn. > > Also, I think the first argument needs a name. It’s not obvious that it’s an origin. Sounds good. Made this String&& origin. >> Source/WebCore/platform/PasteboardCustomData.h:40 >> + WEBCORE_EXPORT PasteboardCustomData(const PasteboardCustomData&); > > Seems we should include both move and copy constructors rather than only copy, since it can result in more efficient code for some common idioms. That’s one thing we get "for free" if we don’t define either the move or copy constructor, but need to do extra work to preserve. I think the struct we had before supported move. If copy is all that’s needed at the moment, sometimes I will delete the move constructor rather than implementing it, promising myself "I will implement it if anyone needs it". That makes me feel better that we won’t end up using copy-and-destroy when we could use a cheaper move. Sounds good! Added a move constructor. >> Source/WebCore/platform/PasteboardCustomData.h:73 >> + HashMap<String, RefPtr<SharedBuffer>> m_platformBinaryData; > > Seems the value type here could be Ref<> instead of RefPtr<>. I’ll be replacing all of these members (except an origin) with just: `Vector<PasteboardCustomDataEntry> m_data;`. And each PasteboardCustomDataEntry will have a type, customData string, and a platformData which may either be a Ref<SharedBuffer> or a String. >> Source/WebCore/platform/StaticPasteboard.cpp:88 >> + return WTFMove(m_customData); > > This doesn't guarantee the data is cleared. Might want to use std::exchange instead to guarantee it’s cleared afterward (see below about changing the return type). Good catch — changed to use std::exchange. >> Source/WebCore/platform/StaticPasteboard.h:42 >> + PasteboardCustomData&& takeCustomData(); > > Seems like the old signature is better for a function that definitively does a "take". Returning an rvalue reference is OK if it’s a function where some callers want to call it but not actually take the data, perhaps read it but leave it in place, but if callers are always taking then I think it’s better to return the object, not an rvalue reference. Harder to misuse by accident. Ok! I changed back to PasteboardCustomData. >> Source/WebCore/platform/ios/AbstractPasteboard.h:52 >> #if __IPHONE_OS_VERSION_MIN_REQUIRED >= 110000 > > Is this conditional still needed? Do we support those old versions of iOS? > > Also, why isn’t this an "SPI" header? Good point — all of these conditionals are no longer needed. I've removed them. This isn’t an *SPI.h header because it doesn’t declare any platform SPI, but rather, provides a common protocol (AbstractPasteboard) that both the platform UIPasteboard and the WebItemProviderPasteboard (a thin wrapper around a list of item providers) conforms to. >> Source/WebCore/platform/ios/PlatformPasteboardIOS.mm:747 >> + return [(id<AbstractPasteboard>)m_pasteboard.get() changeCount]; > > Seems peculiar that this typecast is both needed, but also guaranteed to be safe and doesn’t require a protocol conformance check. Oops, yeah, the explicit cast is not needed here. I’ve removed it. >> Source/WebCore/platform/ios/WebItemProviderPasteboard.mm:870 >> + return stagedRegistrationInfoLists.autorelease(); > > I don’t understand why the WTFMove and the local variable are needed. I would expect that this code would do the same thing: > > return _stagedRegistrationInfoLists.autorelease(); > > If RetainPtr::autorelease does not nil the pointer out (I think it does) then maybe it’s std::exchange we want, not WTFMove. One sign of that is that it would still be a one-liner: > > return std::exchange(_stagedRegistrationInfoLists, nil).autorelease(); Good point — it seems that _stagedRegistrationInfoLists.autorelease(); achieves what I wanted here. >> Source/WebCore/platform/mac/PlatformPasteboardMac.mm:484 >> + } > > This can be done simply with HashMap::get and it might be nicer: > > auto value = data.platformStringData().get(type); > if (!value.isNull()) { > [item setString:value forType:platformType]; > continue; > } > > The find/end/->value dance is needed for types where copying the value out of the map is expensive or there is no null value. > > You’ll also notice that I favor shorter variable names; my philosophy is to include adjectives only if they are needed for disambiguation or clarity. Sounds good! Changed to the above. >> Source/WebCore/platform/mac/PlatformPasteboardMac.mm:490 >> + } > > Same thing is possible here: > > if (auto buffer = data.platformBinaryData().get(type)) { > if (auto platformData = buffer->createNSData()) > [item setData: platformData.get() forType():platformType]; > } > > I think the non-iterator code is easier to read. Fixed! >> Source/WebCore/platform/mac/PlatformPasteboardMac.mm:501 >> + [platformItems addObject:item.get()]; > > Is "silently omit" the right thing to do when createPasteboardItem returns nil? Not obvious to me what this case is. Oops, I realized there was no point to this nil check, since createPasteboardItem always returns a non-null NSItemProvider. I removed the if statement. >> Source/WebKit/Shared/WebCoreArgumentCoders.cpp:1645 >> + encoder << static_cast<uint64_t>(data.platformBinaryData().size()); > > Why is this cast needed? Seems that we could just use the appropriate type in the decode function. I guess size_t? Annoying to have to use this idiom. I originally didn’t have this cast; however, when I tried to decode the `size_t` below, I observed a build error like: OpenSource/Source/WebKit/Platform/IPC/Decoder.h:176:31: error: no matching member function for call to 'decode' if (ArgumentCoder<T>::decode(*this, t)) { OpenSource/Source/WebKit/Platform/IPC/Decoder.h:158:15: note: in instantiation of function template specialization 'IPC::Decoder::operator>><unsigned long, nullptr>' requested here *this >> optional; OpenSource/Source/WebKit/Shared/WebCoreArgumentCoders.cpp:1673:18: note: in instantiation of function template specialization 'IPC::Decoder::decode<unsigned long, nullptr>' requested here if (!decoder.decode(binaryDataEntryCount)) Adding size_t encode/decode methods to WebKit::Encoder and WebKit::Decoder fixed this for me. I think this is okay because there’s no case anymore where we would need to run 32-bit WebKit2 clients against a 64-bit web process, so size_t shouldn’t differ between the processes. In any case, I will pursue this in a followup, since there are a number of other places in WebKit where we end up encoding and decoding `size_t`s as `uint64_t`s, and it would indeed be nice to address all of them.
Wenson Hsieh
Comment 5 2019-10-13 12:58:06 PDT
Created attachment 380850 [details] Address review feedback
Darin Adler
Comment 6 2019-10-14 09:57:41 PDT
Comment on attachment 380850 [details] Address review feedback View in context: https://bugs.webkit.org/attachment.cgi?id=380850&action=review > Source/WebCore/platform/PasteboardCustomData.h:38 > +struct PasteboardCustomDataEntry { Might want this to be a nested class instead of a separate top level class. > Source/WebCore/platform/PasteboardCustomData.h:42 > + WEBCORE_EXPORT PasteboardCustomDataEntry& operator=(const PasteboardCustomDataEntry& otherData); Same thing can apply for assignment operator as for copy constructor. Might want a move overload since that can be more efficient than copying, depending on how this is used. > Source/WebCore/platform/SharedBuffer.h:194 > + WTF::Persistence::Decoder decoder() const; It’s great to have this function available. Not so good that SharedBuffer.h now includes PersistentCoders.h. Could we use a forward declaration instead of including PersistentCoders.h or have this be a free function somewhere instead of a member function? > Source/WebCore/platform/ios/PlatformPasteboardIOS.mm:301 > + auto *itemDictionaries = [NSMutableArray arrayWithCapacity:itemLists.count]; For non-ARC code, I believe it’s more efficient to use alloc/init and RetainPtr than to use a method like this one that returns an auto-released object. Although this is slightly more elegant, and matches the idiom in some other existing code. > Source/WebCore/platform/ios/PlatformPasteboardIOS.mm:315 > + auto *itemProviders = [NSMutableArray arrayWithCapacity:itemLists.count]; For non-ARC code, I believe it’s more efficient to use alloc/init and RetainPtr than to use a method like this one that returns an auto-released object. Although this is slightly more elegant, and matches the idiom in some other existing code. > Source/WebCore/platform/ios/PlatformPasteboardIOS.mm:603 > + NSMutableArray *registrationLists = [NSMutableArray arrayWithCapacity:itemData.size()]; For non-ARC code, I believe it’s more efficient to use alloc/init and RetainPtr than to use a method like this one that returns an auto-released object. Although this is slightly more elegant, and matches the idiom in some other existing code. > Source/WebKit/Shared/WebCoreArgumentCoders.cpp:1679 > + ASSERT_NOT_REACHED(); Seems strange to assert in this one place. Other decode failures are equally worth asserting. I would omit this. Or could use a single 3-state value rather than two booleans for hasString and hasBuffer.
Wenson Hsieh
Comment 7 2019-10-14 10:45:51 PDT
Comment on attachment 380850 [details] Address review feedback View in context: https://bugs.webkit.org/attachment.cgi?id=380850&action=review >> Source/WebCore/platform/PasteboardCustomData.h:38 >> +struct PasteboardCustomDataEntry { > > Might want this to be a nested class instead of a separate top level class. Sounds good — refactored this to PasteboardCustomData::Entry. >> Source/WebCore/platform/PasteboardCustomData.h:42 >> + WEBCORE_EXPORT PasteboardCustomDataEntry& operator=(const PasteboardCustomDataEntry& otherData); > > Same thing can apply for assignment operator as for copy constructor. Might want a move overload since that can be more efficient than copying, depending on how this is used. Added a move assignment operator. >> Source/WebCore/platform/SharedBuffer.h:194 >> + WTF::Persistence::Decoder decoder() const; > > It’s great to have this function available. Not so good that SharedBuffer.h now includes PersistentCoders.h. Could we use a forward declaration instead of including PersistentCoders.h or have this be a free function somewhere instead of a member function? Good catch! Replaced the #include above with a forward declaration for Decoder. >> Source/WebCore/platform/ios/PlatformPasteboardIOS.mm:301 >> + auto *itemDictionaries = [NSMutableArray arrayWithCapacity:itemLists.count]; > > For non-ARC code, I believe it’s more efficient to use alloc/init and RetainPtr than to use a method like this one that returns an auto-released object. Although this is slightly more elegant, and matches the idiom in some other existing code. Sounds good! Changed to use RetainPtr. >> Source/WebCore/platform/ios/PlatformPasteboardIOS.mm:315 >> + auto *itemProviders = [NSMutableArray arrayWithCapacity:itemLists.count]; > > For non-ARC code, I believe it’s more efficient to use alloc/init and RetainPtr than to use a method like this one that returns an auto-released object. Although this is slightly more elegant, and matches the idiom in some other existing code. Changed to use RetainPtr. >> Source/WebCore/platform/ios/PlatformPasteboardIOS.mm:603 >> + NSMutableArray *registrationLists = [NSMutableArray arrayWithCapacity:itemData.size()]; > > For non-ARC code, I believe it’s more efficient to use alloc/init and RetainPtr than to use a method like this one that returns an auto-released object. Although this is slightly more elegant, and matches the idiom in some other existing code. Changed to use RetainPtr. >> Source/WebKit/Shared/WebCoreArgumentCoders.cpp:1679 >> + ASSERT_NOT_REACHED(); > > Seems strange to assert in this one place. Other decode failures are equally worth asserting. I would omit this. Or could use a single 3-state value rather than two booleans for hasString and hasBuffer. Okay! I’ve omitted this assertion.
Wenson Hsieh
Comment 8 2019-10-14 10:49:12 PDT
WebKit Commit Bot
Comment 9 2019-10-14 14:52:33 PDT
Comment on attachment 380895 [details] For EWS Clearing flags on attachment: 380895 Committed r251100: <https://trac.webkit.org/changeset/251100>
Ling Ho
Comment 10 2019-10-15 15:20:50 PDT
Note You need to log in before you can comment on or make changes to this bug.