| Summary: | SharedBuffer::takeData() is a bit dangerous | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | WebKit | Reporter: | Chris Dumez <cdumez> | ||||||||||||
| Component: | WebCore Misc. | Assignee: | Chris Dumez <cdumez> | ||||||||||||
| Status: | RESOLVED FIXED | ||||||||||||||
| Severity: | Normal | CC: | achristensen, benjamin, changseok, cmarcelo, darin, eric.carlson, esprehn+autocc, ews-watchlist, glenn, gyuyoung.kim, jer.noble, mifenton, philipj, sergio, webkit-bug-importer | ||||||||||||
| Priority: | P2 | Keywords: | InRadar | ||||||||||||
| Version: | WebKit Nightly Build | ||||||||||||||
| Hardware: | Unspecified | ||||||||||||||
| OS: | Unspecified | ||||||||||||||
| See Also: |
https://bugs.webkit.org/show_bug.cgi?id=228096 https://bugs.webkit.org/show_bug.cgi?id=234724 |
||||||||||||||
| Attachments: |
|
||||||||||||||
|
Description
Chris Dumez
2021-07-21 12:46:28 PDT
Created attachment 433948 [details]
Patch
Comment on attachment 433948 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=433948&action=review > Source/WebCore/platform/SharedBuffer.cpp:127 > + Vector<uint8_t> data; This should use reserveInitialCapacity and uncheckedAppend. Comment on attachment 433948 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=433948&action=review > Source/WebCore/platform/SharedBuffer.cpp:127 > + Vector<uint8_t> data; To make sure we don’t over allocate vector capacity, we could add this: data.reserveInitialCapacity(size()); It means we’d iterate the segments twice, but I think that’s better. Ah, same comment as Alex! Created attachment 433950 [details]
Patch
Might still want to offer a takeData with a RELEASE_ASSERT(buffer.hasOneRef()), because silently copying isn’t so great. (In reply to Darin Adler from comment #6) > Might still want to offer a takeData with a > RELEASE_ASSERT(buffer.hasOneRef()), because silently copying isn’t so great. Seems risky to me. Also note that there was no takeData() until I introduced it super recently as an optimization. Getting the optimization sometimes is still better than crashing. Created attachment 433951 [details]
Patch
Comment on attachment 433951 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=433951&action=review > Source/WTF/wtf/Vector.h:787 > + template<typename U> ALWAYS_INLINE void uncheckedAppend(const U* u, size_t size) { uncheckedAppend<FailureAction::Crash>(u, size); } Do we need this one? Removing it would encourage Span adoption. > Source/WebCore/platform/SharedBuffer.h:107 > + Vector<uint8_t> extractData(); Would it be possible to annotate this with && after () so we can only call it on r-values? As it stands now, I think it's still risky because it will clear the buffer sometimes and sometimes it won't. If we make it clear that this can only be called on r-values, we indicate in the code that this SharedBuffer reference shouldn't be used after this call. (In reply to Alex Christensen from comment #9) > Comment on attachment 433951 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=433951&action=review > > > Source/WTF/wtf/Vector.h:787 > > + template<typename U> ALWAYS_INLINE void uncheckedAppend(const U* u, size_t size) { uncheckedAppend<FailureAction::Crash>(u, size); } > > Do we need this one? Removing it would encourage Span adoption. > > > Source/WebCore/platform/SharedBuffer.h:107 > > + Vector<uint8_t> extractData(); > > Would it be possible to annotate this with && after () so we can only call > it on r-values? As it stands now, I think it's still risky because it will > clear the buffer sometimes and sometimes it won't. If we make it clear that > this can only be called on r-values, we indicate in the code that this > SharedBuffer reference shouldn't be used after this call. If we do use &&, then maybe it can simply be an optimized overload of copyData()? Vector<uint8_t> copyData() const &; Vector<uint8_t> copyData() &&; // Moves the data out instead of copying when hasOneRef I like it. Created attachment 433958 [details]
Patch
(In reply to Chris Dumez from comment #10) > (In reply to Alex Christensen from comment #9) > > Comment on attachment 433951 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=433951&action=review > > > > > Source/WTF/wtf/Vector.h:787 > > > + template<typename U> ALWAYS_INLINE void uncheckedAppend(const U* u, size_t size) { uncheckedAppend<FailureAction::Crash>(u, size); } > > > > Do we need this one? Removing it would encourage Span adoption. > > > > > Source/WebCore/platform/SharedBuffer.h:107 > > > + Vector<uint8_t> extractData(); > > > > Would it be possible to annotate this with && after () so we can only call > > it on r-values? As it stands now, I think it's still risky because it will > > clear the buffer sometimes and sometimes it won't. If we make it clear that > > this can only be called on r-values, we indicate in the code that this > > SharedBuffer reference shouldn't be used after this call. > > If we do use &&, then maybe it can simply be an optimized overload of > copyData()? > > Vector<uint8_t> copyData() const &; > Vector<uint8_t> copyData() &&; // Moves the data out instead of copying when > hasOneRef I looked in to that but I don't like that it means I have to keep calling WTFMove() in order the get the optimization. ``` void foo(Ref<SharedBuffer>&& buffer) { auto data = buffer.copyData(); // Doesn't get the optimization. // I have to do this, which is ugly and people are unlikely to do it in practice: auto data = WTFMove(buffer).copyData(); } ``` Please still mark copyData() const. Created attachment 433959 [details]
Patch
You could make a static function that does it: auto vector = SharedBuffer::copyData(WTFMove(buffer)); I think it's beneficial to have WTFMove because it indicates that buffer shouldn't be used after this call or it will get undefined behavior. There are even tools that check for it. There aren't tools that check for extractData(). I even think WTFMove(buffer).copyData() is good C++. (In reply to Alex Christensen from comment #16) > You could make a static function that does it: > auto vector = SharedBuffer::copyData(WTFMove(buffer)); > > I think it's beneficial to have WTFMove because it indicates that buffer > shouldn't be used after this call or it will get undefined behavior. There > are even tools that check for it. There aren't tools that check for > extractData(). I even think WTFMove(buffer).copyData() is good C++. I feel the name "extraData" conveys you shouldn't be using the buffer after calling it. I don't want to over-engineer this thing. WTFMove(buffer).copyData() may be good C++ but it is not common place in WebKit. We made such optimization for String::isolatedCopy() and look how much the optimization is used in WebKit (even though it could be, in many places). Comment on attachment 433959 [details]
Patch
Should we do both? Still have the takeData overload for && but have extractData as syntactic sugar so it’s easier to call correctly. Seems like a way to "dip our toes in the water" of the && functions and maybe try to make move(x).take the future idiom.
(In reply to Darin Adler from comment #18) > Comment on attachment 433959 [details] > Patch > > Should we do both? Still have the takeData overload for && but have > extractData as syntactic sugar so it’s easier to call correctly. Seems like > a way to "dip our toes in the water" of the && functions and maybe try to > make move(x).take the future idiom. In this proposal, would `takeData() &&` do the exact same thing as extractData()? - If so, why it is useful? - If not, I assume because it omits the hasOneRef() check and never copies, then I don't think this is ever safe. Just because something is an r-value reference doesn't mean we're the only ones holding a ref to it. Therefore, we'd be re-introduce the class of bugs I am trying to get rid of with this patch. Comment on attachment 433959 [details]
Patch
Landing as is for now. Will follow-up if needed based on discussion.
(In reply to Chris Dumez from comment #19) > In this proposal, would `takeData() &&` do the exact same thing as > extractData()? Yes. They are synonyms with different syntax. > - If so, why it is useful? We can use it to experiment with the different syntax and to let the static analysis tools know what is going on so they can give errors for any use after moving out before overwriting. Some day, after experimenting with it, we may decide that we prefer it, and we could perhaps remove extractData. Committed r280260 (239926@main): <https://commits.webkit.org/239926@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 433959 [details]. |