NEW 145370
Vector's move assignment operator should move the passed-in vector into a temporary object
https://bugs.webkit.org/show_bug.cgi?id=145370
Summary Vector's move assignment operator should move the passed-in vector into a tem...
Zan Dobersek
Reported 2015-05-25 10:35:30 PDT
Vector's move assignment operator should move the passed-in vector into a temporary object
Attachments
Patch (2.09 KB, patch)
2015-05-25 11:07 PDT, Zan Dobersek
no flags
Patch (3.75 KB, patch)
2015-06-07 10:38 PDT, Zan Dobersek
no flags
WIP patch (2.61 KB, patch)
2016-07-25 22:09 PDT, Zan Dobersek
no flags
Patch (3.68 KB, patch)
2016-08-29 11:57 PDT, Zan Dobersek
mcatanzaro: review+
Zan Dobersek
Comment 1 2015-05-25 11:07:57 PDT
Darin Adler
Comment 2 2015-05-26 09:34:18 PDT
Comment on attachment 253688 [details] Patch Anders should review this.
Anders Carlsson
Comment 3 2015-05-26 09:46:52 PDT
Comment on attachment 253688 [details] Patch Patch looks good but need tests.
Alexey Proskuryakov
Comment 4 2015-05-26 22:25:34 PDT
Comment on attachment 253688 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=253688&action=review > Source/WTF/ChangeLog:11 > + into a temporary object. This ensures that the passed-in Vector is returned into > + the initial state (as if that object was default-constructed) by swapping its data > + with the data of the temporary object. Which part of this doesn't happen without this patch?
Darin Adler
Comment 5 2015-05-27 09:12:09 PDT
Comment on attachment 253688 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=253688&action=review >> Source/WTF/ChangeLog:11 >> + with the data of the temporary object. > > Which part of this doesn't happen without this patch? Without the patch, if a and b are vectors, then this: a = WTF::move(b); Swaps a and b, and after the swap b will have whatever happened to be in a, not necessarily an empty vector. Anders explained to me that this is OK from a C++ perspective because move just has to leave the “moved from” object in some state that is safe to destroy, not any particular state. With the patch, b is guaranteed to be an empty vector. This can be more pleasant to program with in many cases and I generally prefer it for our WebKit classes.
Alexey Proskuryakov
Comment 6 2015-05-27 09:53:06 PDT
Makes sense. Sorry, I misread the patch. That said, it's not clear to me whether moving is good for performance when made safe like this. We are adding a lot of default Vector constructors everywhere.
Darin Adler
Comment 7 2015-05-27 10:47:25 PDT
(In reply to comment #6) > That said, it's not clear to me whether moving is good for performance when > made safe like this. We are adding a lot of default Vector constructors > everywhere. I think that’s likely a misunderstanding. The constructor zeros out a pointer and two 32-bit integers and is quite low cost with no branches. The cost worry might actually be the Vector destructor. One downside of this implementation is that we end up compiling in one additional call to the destructor in the common case, and I supposed there is some risk that it won’t all be optimized away even though the compiler can easily prove the pointer is null.
Michael Catanzaro
Comment 8 2015-06-01 18:08:30 PDT
(In reply to comment #5) > With the patch, b is guaranteed to be an empty vector. This can be more > pleasant to program with in many cases and I generally prefer it for our > WebKit classes. If a programmer were to use the vector b after calling WTF::move(b), e.g. to test whether b is empty, I would assume it to be a bug. I don't think any of our code should rely on that behavior, and I'm unsure why Zan wants to introduce it. (Zan is usually right though; let's see what he has to say.)
Darin Adler
Comment 9 2015-06-01 18:17:52 PDT
(In reply to comment #8) > If a programmer were to use the vector b after calling WTF::move(b), e.g. to > test whether b is empty, I would assume it to be a bug. I have often written code where I have a collection in a data member, and I want to move it into a local variable for processing and leave an empty collection in its place, to “consume” and process the collection. In that case I like that the data members is empty after moving out of it. However, in such cases, I am always moving into a newly created object anyway, so it is typically the constructor rather than operator= that is involved.
Sergio Villar Senin
Comment 10 2015-06-02 04:09:30 PDT
(In reply to comment #5) > Comment on attachment 253688 [details] > Without the patch, if a and b are vectors, then this: > > a = WTF::move(b); > > Swaps a and b, and after the swap b will have whatever happened to be in a, > not necessarily an empty vector. Anders explained to me that this is OK from > a C++ perspective because move just has to leave the “moved from” object in > some state that is safe to destroy, not any particular state. Indeed. Also using b after a move is "undefined behavior" according to the spec if I am not wrong, so I guess that the actual implementation should focus on the performance of the move operation instead of caring about the actual contents of the "moved from" object after it.
Zan Dobersek
Comment 11 2015-06-02 05:38:04 PDT
(In reply to comment #10) > (In reply to comment #5) > > Comment on attachment 253688 [details] > > Without the patch, if a and b are vectors, then this: > > > > a = WTF::move(b); > > > > Swaps a and b, and after the swap b will have whatever happened to be in a, > > not necessarily an empty vector. Anders explained to me that this is OK from > > a C++ perspective because move just has to leave the “moved from” object in > > some state that is safe to destroy, not any particular state. > > Indeed. Also using b after a move is "undefined behavior" according to the > spec if I am not wrong, so I guess that the actual implementation should > focus on the performance of the move operation instead of caring about the > actual contents of the "moved from" object after it. Per specification, in most cases in the standard library, the moved-from object is in a valid but unspecified state, meaning that the moved-from object is still usable but is 'empty' directly after the move -- containers have no elements but can be queried for size and have new elements added, and std::unique_ptr<> has its pointer nulled out but can be reset with a pointer to a different object. Accessing some specific element of an empty container or dereferencing a null std::unique_ptr<> would lead to undefined behavior. A case in point is a generic std::swap() implementation: template<typename T> void swap(T& a, T& b) { T temp = std::move(a); a = std::move(b); b = std::move(temp); } Here both objects are reused after they've been moved-from. (This reminded me that we could specialize std::swap() for WebKit containers to call T::swap() directly instead of moving objects around.)
Zan Dobersek
Comment 12 2015-06-02 07:08:11 PDT
Inspecting the generated code on x86_64, produced with Clang 3.6 and -O2, the compiler does a formidable optimizing job. This is the code that's current generated for operator=(Vector&&) for a Vector with no internal capacity, just swapping the resources of the two objects: 00000000002b42a0 <WTF::Vector<JSC::DFG::Node*, 0ul, WTF::CrashOnOverflow, 16ul>::operator=(WTF::Vector<JSC::DFG::Node*, 0ul, WTF::CrashOnOverflow, 16ul>&&)>: 2b42a0: 48 8b 07 mov (%rdi),%rax 2b42a3: 48 8b 0e mov (%rsi),%rcx 2b42a6: 48 89 0f mov %rcx,(%rdi) 2b42a9: 48 89 06 mov %rax,(%rsi) 2b42ac: 8b 47 08 mov 0x8(%rdi),%eax 2b42af: 8b 4e 08 mov 0x8(%rsi),%ecx 2b42b2: 89 4f 08 mov %ecx,0x8(%rdi) 2b42b5: 89 46 08 mov %eax,0x8(%rsi) 2b42b8: 8b 47 0c mov 0xc(%rdi),%eax 2b42bb: 8b 4e 0c mov 0xc(%rsi),%ecx 2b42be: 89 4f 0c mov %ecx,0xc(%rdi) 2b42c1: 89 46 0c mov %eax,0xc(%rsi) 2b42c4: 48 89 f8 mov %rdi,%rax 2b42c7: c3 retq 2b42c8: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1) 2b42cf: 00 This is the code generated with the proposed patch, avoiding any temporary object construction, directly nullifying the moved-from object and simply moving the resources into the moved-to object, freeing the previous data: 00000000002bde10 <WTF::Vector<JSC::DFG::Node*, 0ul, WTF::CrashOnOverflow, 16ul>::operator=(WTF::Vector<JSC::DFG::Node*, 0ul, WTF::CrashOnOverflow, 16ul>&&)>: 2bde10: 53 push %rbx 2bde11: 48 89 fb mov %rdi,%rbx 2bde14: 48 8b 06 mov (%rsi),%rax 2bde17: 48 c7 06 00 00 00 00 movq $0x0,(%rsi) 2bde1e: 8b 4e 08 mov 0x8(%rsi),%ecx 2bde21: c7 46 08 00 00 00 00 movl $0x0,0x8(%rsi) 2bde28: 8b 56 0c mov 0xc(%rsi),%edx 2bde2b: c7 46 0c 00 00 00 00 movl $0x0,0xc(%rsi) 2bde32: 48 8b 3b mov (%rbx),%rdi 2bde35: 48 89 03 mov %rax,(%rbx) 2bde38: 89 4b 08 mov %ecx,0x8(%rbx) 2bde3b: 89 53 0c mov %edx,0xc(%rbx) 2bde3e: 48 85 ff test %rdi,%rdi 2bde41: 74 05 je 2bde48 <WTF::Vector<JSC::DFG::Node*, 0ul, WTF::CrashOnOverflow, 16ul>::operator=(WTF::Vector<JSC::DFG::Node*, 0ul, WTF::CrashOnOverflow, 16ul>&&)+0x38> 2bde43: e8 38 43 f1 ff callq 1d2180 <WTF::fastFree(void*)@plt> 2bde48: 48 89 d8 mov %rbx,%rax 2bde4b: 5b pop %rbx 2bde4c: c3 retq 2bde4d: 0f 1f 00 nopl (%rax)
Zan Dobersek
Comment 13 2015-06-07 10:38:55 PDT
Created attachment 254442 [details] Patch Now with a simple test case.
Darin Adler
Comment 14 2015-06-07 17:46:42 PDT
(In reply to comment #12) > Inspecting the generated code on x86_64, produced with Clang 3.6 and -O2, > the compiler does a formidable optimizing job. So if the move assignment operator was used alone, this means that the change would make the code bigger. It’s worth investigating further to see that when we include the variable being assigned from going out of scope if the total code size is bigger or not. Maybe not critical before landing this change, but something I am wondering about.
Darin Adler
Comment 15 2015-06-07 17:52:04 PDT
Comment on attachment 254442 [details] Patch I like this change, but I think Anders should make the call. Anders, could you review+?
Zan Dobersek
Comment 16 2015-09-05 01:21:58 PDT
(In reply to comment #14) > (In reply to comment #12) > > Inspecting the generated code on x86_64, produced with Clang 3.6 and -O2, > > the compiler does a formidable optimizing job. > > So if the move assignment operator was used alone, this means that the > change would make the code bigger. It’s worth investigating further to see > that when we include the variable being assigned from going out of scope if > the total code size is bigger or not. Maybe not critical before landing this > change, but something I am wondering about. If the Vector that's being assigned from is a temporary local object, then the compiler will likely optimize the move assignment down to just moving the resources of the temporary object into the target object, and destroying the resources previously held by the target object. The process in this case is pretty similar to just swapping the resources between the two objects (which is what the move assignment operator is doing until this patch lands). This is a very simple example that shows a local object being moved into a static object: static Vector<uint8_t*> s_vec; static void vectorTest() { Vector<uint8_t*> vec(4, nullptr); s_vec = WTF::move(vec); } Clang generates this -- the code allocates the necessary memory and moves all the information into the static object, deallocating previous data, if any: 0000000000536c60 <JSC::vectorTest()>: 536c60: 50 push %rax 536c61: bf 20 00 00 00 mov $0x20,%edi 536c66: e8 c5 39 cc ff callq 1fa630 <WTF::fastMalloc(unsigned long)@plt> 536c6b: 48 85 c0 test %rax,%rax 536c6e: 74 0a je 536c7a <JSC::vectorTest()+0x1a> 536c70: 0f 57 c0 xorps %xmm0,%xmm0 536c73: 0f 11 40 10 movups %xmm0,0x10(%rax) 536c77: 0f 11 00 movups %xmm0,(%rax) 536c7a: 48 8b 3d b7 61 38 00 mov 0x3861b7(%rip),%rdi # 8bce38 <JSC::s_vec> 536c81: 48 89 05 b0 61 38 00 mov %rax,0x3861b0(%rip) # 8bce38 <JSC::s_vec> 536c88: c7 05 ae 61 38 00 04 movl $0x4,0x3861ae(%rip) # 8bce40 <JSC::s_vec+0x8> 536c8f: 00 00 00 536c92: c7 05 a8 61 38 00 04 movl $0x4,0x3861a8(%rip) # 8bce44 <JSC::s_vec+0xc> 536c99: 00 00 00 536c9c: 48 85 ff test %rdi,%rdi 536c9f: 74 06 je 536ca7 <JSC::vectorTest()+0x47> 536ca1: 58 pop %rax 536ca2: e9 59 3a cc ff jmpq 1fa700 <WTF::fastFree(void*)@plt> 536ca7: 58 pop %rax 536ca8: c3 retq 536ca9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) Again, this example is probably not tough to optimize for any compiler, but it shows the behavior we're hoping for in this specific case. Overall the move assignment operators will now generate larger code because of the extra destructor, but that's The Right Thing(TM) to do.
Zan Dobersek
Comment 17 2016-07-25 22:09:20 PDT
Created attachment 284563 [details] WIP patch
Zan Dobersek
Comment 18 2016-08-29 11:57:59 PDT
Michael Catanzaro
Comment 19 2016-09-08 20:33:16 PDT
Comment on attachment 287301 [details] Patch (Please give Anders some time to object before landing.)
Anders Carlsson
Comment 20 2016-09-09 07:26:31 PDT
If this is done on a vector with inline capacity, won't it add an extra copy?
Note You need to log in before you can comment on or make changes to this bug.