Bug 23501 - Overlapping memcpy in TestDecoder::reset
Summary: Overlapping memcpy in TestDecoder::reset
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-01-23 05:26 PST by Dean McNamee
Modified: 2009-05-13 17:30 PDT (History)
1 user (show)

See Also:


Attachments
Patch attempt for skipping assignment in reset(). (1.20 KB, patch)
2009-01-27 08:46 PST, Dean McNamee
darin: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dean McNamee 2009-01-23 05:26:02 PST
WebCore/platform/text/TextDecoder.cpp:
void TextDecoder::reset(const TextEncoding& encoding)
{
    m_encoding = encoding;

m_encoding is a TextEncoding, which has a default assignment operator which will be implemented by gcc as a memcpy.  The following patch:

commit 58f7c694c743381477042501d9069b2cda497751
Author: antti@apple.com <antti@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Date:   Sun Nov 9 04:11:26 2008 +0000

    WebCore:
    
    2008-11-08  Antti Koivisto  <antti@apple.com>
    
            Reviewed by Sam Weinig.
    
            Fix https://bugs.webkit.org/show_bug.cgi?id=22141
            REGRESSION: Safari error page is not fully styled when loaded from cache
    
            Reset text decoder on flush so it does not pass through the BOM when it is reused.
    
            Test: fast/encoding/css-cached-bom.html
    
            * loader/TextResourceDecoder.cpp:
            (WebCore::TextResourceDecoder::flush):
    
    LayoutTests:
    
    2008-11-08  Antti Koivisto  <antti@apple.com>
    
            Reviewed by Sam Weinig.
    
            Test for https://bugs.webkit.org/show_bug.cgi?id=22141
            REGRESSION: Safari error page is not fully styled when loaded from cache
    
            * fast/encoding/css-cached-bom.html: Added.
            * fast/encoding/css-cached-bom-expected.txt: Added.
            * fast/encoding/resources/css-cached-bom-frame.html: Added.
            * fast/encoding/resources/utf-16-little-endian.css: Added.
    
    
    
    git-svn-id: http://svn.webkit.org/repository/webkit/trunk@38240 268f45cc-cd09-0410-ab3c-d52691b4dbfc

diff --git a/WebCore/loader/TextResourceDecoder.cpp b/WebCore/loader/TextResourceDecoder.cpp
index 4a0caa0..2064393 100644
--- a/WebCore/loader/TextResourceDecoder.cpp
+++ b/WebCore/loader/TextResourceDecoder.cpp
@@ -793,6 +793,7 @@ String TextResourceDecoder::flush()
 {
     String result = m_decoder.decode(m_buffer.data(), m_buffer.size(), true, m_contentType == XML, m_sawError);
     m_buffer.clear();
+    m_decoder.reset(m_decoder.encoding());
     return result;
 }


Added a call to m_decoder.reset(m_decoder.encoding());

which means that m_encoding = encoding; is assignment to itself, which is the equivalent of memcpy'ing with the same buffer as the source and destination, which is invalid.

I think the solution here would either be to identity check the assignment:

if (&m_encoding != &encoding)
    ...

Or to to offer some flavor or reset() which does not reset the encoding.

Valgrind:

==29911== Source and destination overlap in memcpy(0xC9D7F34, 0xC9D7F34, 6)
==29911==    at 0x79A2F22: memcpy (mc_replace_strmem.c:402)
==29911==    by 0x8806E6F: WebCore::TextDecoder::reset(WebCore::TextEncoding const&) (TextDecoder.cpp:44)
==29911==    by 0x84AEE0B: WebCore::TextResourceDecoder::flush() (TextResourceDecoder.cpp:896)
==29911==    by 0x85B2FCA: WebCore::FrameLoader::write(char const*, int, bool) (FrameLoader.cpp:1039)
==29911==    by 0x85B8163: WebCore::FrameLoader::endIfNotLoadingMainResource() (FrameLoader.cpp:1093)
==29911==    by 0x85B81AE: WebCore::FrameLoader::end() (FrameLoader.cpp:1078)
==29911==    by 0x85BA257: WebCore::FrameLoader::init() (FrameLoader.cpp:293)
==29911==    by 0x83DAA49: WebCore::Frame::init() (Frame.cpp:216)
==29911==    by 0x805D581: WebFrameImpl::InitMainFrame(WebViewImpl*) (webframe_impl.cc:319)
==29911==    by 0x8084DBD: WebView::Create(WebViewDelegate*, WebPreferences const&) (webview_impl.cc:268)
==29911==    by 0x8A3E54C: WebViewHost::Create(_GtkWidget*, WebViewDelegate*, WebPreferences const&) (webview_host_gtk.cc:25)
==29911==    by 0x8A410D0: TestShell::Initialize(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (test_shell_gtk.cc:349)
Comment 1 Dean McNamee 2009-01-27 08:46:50 PST
Created attachment 27074 [details]
Patch attempt for skipping assignment in reset().
Comment 2 Darin Adler 2009-01-27 09:19:34 PST
Comment on attachment 27074 [details]
Patch attempt for skipping assignment in reset().

This is the wrong level to fix this at. The TextEncoding class's assignment operator needs to deal with this. Generally, C++ allows you to self-assign, and it's the assignment operator's responsibility to deal with that.

I don't see how memcpy is involved at all in the compiler-generated assignment operator. A TextEncoding has a single data member, a const char*, so I can't imagine how memcpy gets involved. Is this something specific to some particular platform? Perhaps you're using a misconfigured development tool that is complaining about something that's not a real problem.
Comment 3 Dean McNamee 2009-01-27 09:36:31 PST
(In reply to comment #2)
> (From update of attachment 27074 [details] [review])
> This is the wrong level to fix this at. The TextEncoding class's assignment
> operator needs to deal with this. Generally, C++ allows you to self-assign, and
> it's the assignment operator's responsibility to deal with that.
> 
> I don't see how memcpy is involved at all in the compiler-generated assignment
> operator. A TextEncoding has a single data member, a const char*, so I can't
> imagine how memcpy gets involved. Is this something specific to some particular

It is not a single member:

        const char* m_name;
        UChar m_backslashAsCurrencySymbol;

The compiler uses memcpy() to implement the copy.  I don't think I am crazy or have misconfigured development tools:

Dump of assembler code for function _ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE:
0x08809df8 <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+0>:     push   ebp
0x08809df9 <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+1>:     mov    ebp,esp
0x08809dfb <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+3>:     sub    esp,0x18
0x08809dfe <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+6>:     mov    edx,DWORD PTR [ebp+0x8]
0x08809e01 <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+9>:     mov    DWORD PTR [esp+0x8],0x6
0x08809e09 <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+17>:    mov    eax,DWORD PTR [ebp+0xc]
0x08809e0c <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+20>:    mov    DWORD PTR [esp+0x4],eax
0x08809e10 <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+24>:    mov    DWORD PTR [esp],edx
0x08809e13 <_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+27>:    call   0x8052e84 <memcpy@plt>

From the disassembly, you can easily see the 6 constant passed to memcpy, this of course is sizeof(TextEncoding), a pointer and a UChar.

> platform? Perhaps you're using a misconfigured development tool that is
> complaining about something that's not a real problem.
> 

Comment 4 Darin Adler 2009-01-27 09:51:07 PST
(In reply to comment #3)
> The compiler uses memcpy() to implement the copy.  I don't think I am crazy or
> have misconfigured development tools:

Your compiler uses memcpy, so that means that in the opinion of the folks implementing the compiler, it's OK to use memcpy when both arguments point to the same address.

But your development tool considers that a dangerous overlapping memcpy.

Those are the misconfigured development tools. Those tools need to agree. Either memcpy is illegal for this case, and hence the C++ compile must not generate that code because self-assignment is legal and supported in C++. Or memcpy is legal for this case, and your overlapping memcpy tool needs to understand that an not complain.

Of course you're not "crazy"! I know you are trying to do the right thing here.

But checking before an assignment that the new value isn't the same as the old is *not* the right thing.
Comment 5 Dan Kegel 2009-01-27 10:28:23 PST
Sounds like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667
Comment 6 Dean McNamee 2009-01-27 10:51:03 PST
(In reply to comment #5)
> Sounds like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667
> 

Thanks Dan.  So I guess it's a GCC bug.

I could implement a custom operator=() that did the obvious thing, that would probably avoid the memcpy() on the structure assignment.  It's not really a big deal, but it seems nice to avoid the poorly generated code.  Darin?
Comment 7 Darin Adler 2009-01-27 12:13:58 PST
(In reply to comment #6)
> (In reply to comment #5)
> > Sounds like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

Good to know!

> I could implement a custom operator=() that did the obvious thing, that would
> probably avoid the memcpy() on the structure assignment. It's not really a big
> deal, but it seems nice to avoid the poorly generated code.

I think it's OK to do that either as a performance optimization, or to make it easier to use the tool that detects overlapping memcpy. Neither seems a compelling important reason to make the change to me. We normally optimize performance in three ways: 1) when we can measure the difference, 2) choosing good algorithms, 3) cultivating habits that make the code fast. I'm not sure this qualifies for any of those.

On the other hand, if this would make it easier to use that overlapping memcpy tool, then we could put the code in on that basis. I suspect we might have to add it to classes other than TextEncoding too in the future.

Either way is OK.
Comment 8 Eric Seidel (no email) 2009-05-13 17:30:24 PDT
I'm going to close this as invalid, and leave the Chromium bug for tracking this: http://code.google.com/p/chromium/issues/detail?id=7097

Probably chromium will just add this to the "suppressions" list.

If you believe WebKit still needs to change here, feel free to reopen the bug.