WebCore/platform/text/TextDecoder.cpp: void TextDecoder::reset(const TextEncoding& encoding) { m_encoding = encoding; m_encoding is a TextEncoding, which has a default assignment operator which will be implemented by gcc as a memcpy. The following patch: commit 58f7c694c743381477042501d9069b2cda497751 Author: antti@apple.com <antti@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc> Date: Sun Nov 9 04:11:26 2008 +0000 WebCore: 2008-11-08 Antti Koivisto <antti@apple.com> Reviewed by Sam Weinig. Fix https://bugs.webkit.org/show_bug.cgi?id=22141 REGRESSION: Safari error page is not fully styled when loaded from cache Reset text decoder on flush so it does not pass through the BOM when it is reused. Test: fast/encoding/css-cached-bom.html * loader/TextResourceDecoder.cpp: (WebCore::TextResourceDecoder::flush): LayoutTests: 2008-11-08 Antti Koivisto <antti@apple.com> Reviewed by Sam Weinig. Test for https://bugs.webkit.org/show_bug.cgi?id=22141 REGRESSION: Safari error page is not fully styled when loaded from cache * fast/encoding/css-cached-bom.html: Added. * fast/encoding/css-cached-bom-expected.txt: Added. * fast/encoding/resources/css-cached-bom-frame.html: Added. * fast/encoding/resources/utf-16-little-endian.css: Added. git-svn-id: http://svn.webkit.org/repository/webkit/trunk@38240 268f45cc-cd09-0410-ab3c-d52691b4dbfc diff --git a/WebCore/loader/TextResourceDecoder.cpp b/WebCore/loader/TextResourceDecoder.cpp index 4a0caa0..2064393 100644 --- a/WebCore/loader/TextResourceDecoder.cpp +++ b/WebCore/loader/TextResourceDecoder.cpp @@ -793,6 +793,7 @@ String TextResourceDecoder::flush() { String result = m_decoder.decode(m_buffer.data(), m_buffer.size(), true, m_contentType == XML, m_sawError); m_buffer.clear(); + m_decoder.reset(m_decoder.encoding()); return result; } Added a call to m_decoder.reset(m_decoder.encoding()); which means that m_encoding = encoding; is assignment to itself, which is the equivalent of memcpy'ing with the same buffer as the source and destination, which is invalid. I think the solution here would either be to identity check the assignment: if (&m_encoding != &encoding) ... Or to to offer some flavor or reset() which does not reset the encoding. Valgrind: ==29911== Source and destination overlap in memcpy(0xC9D7F34, 0xC9D7F34, 6) ==29911== at 0x79A2F22: memcpy (mc_replace_strmem.c:402) ==29911== by 0x8806E6F: WebCore::TextDecoder::reset(WebCore::TextEncoding const&) (TextDecoder.cpp:44) ==29911== by 0x84AEE0B: WebCore::TextResourceDecoder::flush() (TextResourceDecoder.cpp:896) ==29911== by 0x85B2FCA: WebCore::FrameLoader::write(char const*, int, bool) (FrameLoader.cpp:1039) ==29911== by 0x85B8163: WebCore::FrameLoader::endIfNotLoadingMainResource() (FrameLoader.cpp:1093) ==29911== by 0x85B81AE: WebCore::FrameLoader::end() (FrameLoader.cpp:1078) ==29911== by 0x85BA257: WebCore::FrameLoader::init() (FrameLoader.cpp:293) ==29911== by 0x83DAA49: WebCore::Frame::init() (Frame.cpp:216) ==29911== by 0x805D581: WebFrameImpl::InitMainFrame(WebViewImpl*) (webframe_impl.cc:319) ==29911== by 0x8084DBD: WebView::Create(WebViewDelegate*, WebPreferences const&) (webview_impl.cc:268) ==29911== by 0x8A3E54C: WebViewHost::Create(_GtkWidget*, WebViewDelegate*, WebPreferences const&) (webview_host_gtk.cc:25) ==29911== by 0x8A410D0: TestShell::Initialize(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (test_shell_gtk.cc:349)
Created attachment 27074 [details] Patch attempt for skipping assignment in reset().
Comment on attachment 27074 [details] Patch attempt for skipping assignment in reset(). This is the wrong level to fix this at. The TextEncoding class's assignment operator needs to deal with this. Generally, C++ allows you to self-assign, and it's the assignment operator's responsibility to deal with that. I don't see how memcpy is involved at all in the compiler-generated assignment operator. A TextEncoding has a single data member, a const char*, so I can't imagine how memcpy gets involved. Is this something specific to some particular platform? Perhaps you're using a misconfigured development tool that is complaining about something that's not a real problem.
(In reply to comment #2) > (From update of attachment 27074 [details] [review]) > This is the wrong level to fix this at. The TextEncoding class's assignment > operator needs to deal with this. Generally, C++ allows you to self-assign, and > it's the assignment operator's responsibility to deal with that. > > I don't see how memcpy is involved at all in the compiler-generated assignment > operator. A TextEncoding has a single data member, a const char*, so I can't > imagine how memcpy gets involved. Is this something specific to some particular It is not a single member: const char* m_name; UChar m_backslashAsCurrencySymbol; The compiler uses memcpy() to implement the copy. I don't think I am crazy or have misconfigured development tools: Dump of assembler code for function _ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE: 0x08809df8 <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+0>: push ebp 0x08809df9 <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+1>: mov ebp,esp 0x08809dfb <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+3>: sub esp,0x18 0x08809dfe <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+6>: mov edx,DWORD PTR [ebp+0x8] 0x08809e01 <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+9>: mov DWORD PTR [esp+0x8],0x6 0x08809e09 <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+17>: mov eax,DWORD PTR [ebp+0xc] 0x08809e0c <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+20>: mov DWORD PTR [esp+0x4],eax 0x08809e10 <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+24>: mov DWORD PTR [esp],edx 0x08809e13 <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+27>: call 0x8052e84 <memcpy@plt> From the disassembly, you can easily see the 6 constant passed to memcpy, this of course is sizeof(TextEncoding), a pointer and a UChar. > platform? Perhaps you're using a misconfigured development tool that is > complaining about something that's not a real problem. >
(In reply to comment #3) > The compiler uses memcpy() to implement the copy. I don't think I am crazy or > have misconfigured development tools: Your compiler uses memcpy, so that means that in the opinion of the folks implementing the compiler, it's OK to use memcpy when both arguments point to the same address. But your development tool considers that a dangerous overlapping memcpy. Those are the misconfigured development tools. Those tools need to agree. Either memcpy is illegal for this case, and hence the C++ compile must not generate that code because self-assignment is legal and supported in C++. Or memcpy is legal for this case, and your overlapping memcpy tool needs to understand that an not complain. Of course you're not "crazy"! I know you are trying to do the right thing here. But checking before an assignment that the new value isn't the same as the old is *not* the right thing.
Sounds like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667
(In reply to comment #5) > Sounds like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667 > Thanks Dan. So I guess it's a GCC bug. I could implement a custom operator=() that did the obvious thing, that would probably avoid the memcpy() on the structure assignment. It's not really a big deal, but it seems nice to avoid the poorly generated code. Darin?
(In reply to comment #6) > (In reply to comment #5) > > Sounds like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667 Good to know! > I could implement a custom operator=() that did the obvious thing, that would > probably avoid the memcpy() on the structure assignment. It's not really a big > deal, but it seems nice to avoid the poorly generated code. I think it's OK to do that either as a performance optimization, or to make it easier to use the tool that detects overlapping memcpy. Neither seems a compelling important reason to make the change to me. We normally optimize performance in three ways: 1) when we can measure the difference, 2) choosing good algorithms, 3) cultivating habits that make the code fast. I'm not sure this qualifies for any of those. On the other hand, if this would make it easier to use that overlapping memcpy tool, then we could put the code in on that basis. I suspect we might have to add it to classes other than TextEncoding too in the future. Either way is OK.
I'm going to close this as invalid, and leave the Chromium bug for tracking this: http://code.google.com/p/chromium/issues/detail?id=7097 Probably chromium will just add this to the "suppressions" list. If you believe WebKit still needs to change here, feel free to reopen the bug.