RESOLVED FIXED228161
SharedBuffer::takeData() is a bit dangerous
https://bugs.webkit.org/show_bug.cgi?id=228161
Summary SharedBuffer::takeData() is a bit dangerous
Chris Dumez
Reported 2021-07-21 12:46:28 PDT
SharedBuffer::takeData() is a bit dangerous since SharedBuffer is RefCounted and several object may be sharing ownership of the buffer. Having one owner call takeData() in case ownership is shared leads to bugs such as Bug 228096.
Attachments
Patch (12.33 KB, patch)
2021-07-21 12:51 PDT, Chris Dumez
no flags
Patch (12.38 KB, patch)
2021-07-21 12:59 PDT, Chris Dumez
ews-feeder: commit-queue-
Patch (15.86 KB, patch)
2021-07-21 13:24 PDT, Chris Dumez
no flags
Patch (16.00 KB, patch)
2021-07-21 14:41 PDT, Chris Dumez
no flags
Patch (16.02 KB, patch)
2021-07-21 14:46 PDT, Chris Dumez
no flags
Chris Dumez
Comment 1 2021-07-21 12:51:36 PDT
Alex Christensen
Comment 2 2021-07-21 12:57:06 PDT
Comment on attachment 433948 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=433948&action=review > Source/WebCore/platform/SharedBuffer.cpp:127 > + Vector<uint8_t> data; This should use reserveInitialCapacity and uncheckedAppend.
Darin Adler
Comment 3 2021-07-21 12:58:26 PDT
Comment on attachment 433948 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=433948&action=review > Source/WebCore/platform/SharedBuffer.cpp:127 > + Vector<uint8_t> data; To make sure we don’t over allocate vector capacity, we could add this: data.reserveInitialCapacity(size()); It means we’d iterate the segments twice, but I think that’s better.
Darin Adler
Comment 4 2021-07-21 12:58:41 PDT
Ah, same comment as Alex!
Chris Dumez
Comment 5 2021-07-21 12:59:49 PDT
Darin Adler
Comment 6 2021-07-21 13:01:16 PDT
Might still want to offer a takeData with a RELEASE_ASSERT(buffer.hasOneRef()), because silently copying isn’t so great.
Chris Dumez
Comment 7 2021-07-21 13:03:00 PDT
(In reply to Darin Adler from comment #6) > Might still want to offer a takeData with a > RELEASE_ASSERT(buffer.hasOneRef()), because silently copying isn’t so great. Seems risky to me. Also note that there was no takeData() until I introduced it super recently as an optimization. Getting the optimization sometimes is still better than crashing.
Chris Dumez
Comment 8 2021-07-21 13:24:18 PDT
Alex Christensen
Comment 9 2021-07-21 13:49:52 PDT
Comment on attachment 433951 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=433951&action=review > Source/WTF/wtf/Vector.h:787 > + template<typename U> ALWAYS_INLINE void uncheckedAppend(const U* u, size_t size) { uncheckedAppend<FailureAction::Crash>(u, size); } Do we need this one? Removing it would encourage Span adoption. > Source/WebCore/platform/SharedBuffer.h:107 > + Vector<uint8_t> extractData(); Would it be possible to annotate this with && after () so we can only call it on r-values? As it stands now, I think it's still risky because it will clear the buffer sometimes and sometimes it won't. If we make it clear that this can only be called on r-values, we indicate in the code that this SharedBuffer reference shouldn't be used after this call.
Chris Dumez
Comment 10 2021-07-21 13:53:55 PDT
(In reply to Alex Christensen from comment #9) > Comment on attachment 433951 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=433951&action=review > > > Source/WTF/wtf/Vector.h:787 > > + template<typename U> ALWAYS_INLINE void uncheckedAppend(const U* u, size_t size) { uncheckedAppend<FailureAction::Crash>(u, size); } > > Do we need this one? Removing it would encourage Span adoption. > > > Source/WebCore/platform/SharedBuffer.h:107 > > + Vector<uint8_t> extractData(); > > Would it be possible to annotate this with && after () so we can only call > it on r-values? As it stands now, I think it's still risky because it will > clear the buffer sometimes and sometimes it won't. If we make it clear that > this can only be called on r-values, we indicate in the code that this > SharedBuffer reference shouldn't be used after this call. If we do use &&, then maybe it can simply be an optimized overload of copyData()? Vector<uint8_t> copyData() const &; Vector<uint8_t> copyData() &&; // Moves the data out instead of copying when hasOneRef
Alex Christensen
Comment 11 2021-07-21 14:24:54 PDT
I like it.
Chris Dumez
Comment 12 2021-07-21 14:41:42 PDT
Chris Dumez
Comment 13 2021-07-21 14:44:10 PDT
(In reply to Chris Dumez from comment #10) > (In reply to Alex Christensen from comment #9) > > Comment on attachment 433951 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=433951&action=review > > > > > Source/WTF/wtf/Vector.h:787 > > > + template<typename U> ALWAYS_INLINE void uncheckedAppend(const U* u, size_t size) { uncheckedAppend<FailureAction::Crash>(u, size); } > > > > Do we need this one? Removing it would encourage Span adoption. > > > > > Source/WebCore/platform/SharedBuffer.h:107 > > > + Vector<uint8_t> extractData(); > > > > Would it be possible to annotate this with && after () so we can only call > > it on r-values? As it stands now, I think it's still risky because it will > > clear the buffer sometimes and sometimes it won't. If we make it clear that > > this can only be called on r-values, we indicate in the code that this > > SharedBuffer reference shouldn't be used after this call. > > If we do use &&, then maybe it can simply be an optimized overload of > copyData()? > > Vector<uint8_t> copyData() const &; > Vector<uint8_t> copyData() &&; // Moves the data out instead of copying when > hasOneRef I looked in to that but I don't like that it means I have to keep calling WTFMove() in order the get the optimization. ``` void foo(Ref<SharedBuffer>&& buffer) { auto data = buffer.copyData(); // Doesn't get the optimization. // I have to do this, which is ugly and people are unlikely to do it in practice: auto data = WTFMove(buffer).copyData(); } ```
Darin Adler
Comment 14 2021-07-21 14:45:19 PDT
Please still mark copyData() const.
Chris Dumez
Comment 15 2021-07-21 14:46:17 PDT
Alex Christensen
Comment 16 2021-07-21 15:03:05 PDT
You could make a static function that does it: auto vector = SharedBuffer::copyData(WTFMove(buffer)); I think it's beneficial to have WTFMove because it indicates that buffer shouldn't be used after this call or it will get undefined behavior. There are even tools that check for it. There aren't tools that check for extractData(). I even think WTFMove(buffer).copyData() is good C++.
Chris Dumez
Comment 17 2021-07-21 15:06:06 PDT
(In reply to Alex Christensen from comment #16) > You could make a static function that does it: > auto vector = SharedBuffer::copyData(WTFMove(buffer)); > > I think it's beneficial to have WTFMove because it indicates that buffer > shouldn't be used after this call or it will get undefined behavior. There > are even tools that check for it. There aren't tools that check for > extractData(). I even think WTFMove(buffer).copyData() is good C++. I feel the name "extraData" conveys you shouldn't be using the buffer after calling it. I don't want to over-engineer this thing. WTFMove(buffer).copyData() may be good C++ but it is not common place in WebKit. We made such optimization for String::isolatedCopy() and look how much the optimization is used in WebKit (even though it could be, in many places).
Darin Adler
Comment 18 2021-07-23 12:32:13 PDT
Comment on attachment 433959 [details] Patch Should we do both? Still have the takeData overload for && but have extractData as syntactic sugar so it’s easier to call correctly. Seems like a way to "dip our toes in the water" of the && functions and maybe try to make move(x).take the future idiom.
Chris Dumez
Comment 19 2021-07-23 12:35:46 PDT
(In reply to Darin Adler from comment #18) > Comment on attachment 433959 [details] > Patch > > Should we do both? Still have the takeData overload for && but have > extractData as syntactic sugar so it’s easier to call correctly. Seems like > a way to "dip our toes in the water" of the && functions and maybe try to > make move(x).take the future idiom. In this proposal, would `takeData() &&` do the exact same thing as extractData()? - If so, why it is useful? - If not, I assume because it omits the hasOneRef() check and never copies, then I don't think this is ever safe. Just because something is an r-value reference doesn't mean we're the only ones holding a ref to it. Therefore, we'd be re-introduce the class of bugs I am trying to get rid of with this patch.
Chris Dumez
Comment 20 2021-07-23 14:31:52 PDT
Comment on attachment 433959 [details] Patch Landing as is for now. Will follow-up if needed based on discussion.
Darin Adler
Comment 21 2021-07-23 14:45:33 PDT
(In reply to Chris Dumez from comment #19) > In this proposal, would `takeData() &&` do the exact same thing as > extractData()? Yes. They are synonyms with different syntax. > - If so, why it is useful? We can use it to experiment with the different syntax and to let the static analysis tools know what is going on so they can give errors for any use after moving out before overwriting. Some day, after experimenting with it, we may decide that we prefer it, and we could perhaps remove extractData.
EWS
Comment 22 2021-07-23 15:05:12 PDT
Committed r280260 (239926@main): <https://commits.webkit.org/239926@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 433959 [details].
Radar WebKit Bug Importer
Comment 23 2021-07-23 15:06:15 PDT
Note You need to log in before you can comment on or make changes to this bug.