<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>23501</bug_id>
          
          <creation_ts>2009-01-23 05:26:02 -0800</creation_ts>
          <short_desc>Overlapping memcpy in TestDecoder::reset</short_desc>
          <delta_ts>2009-05-13 17:30:24 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>WebCore Misc.</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>PC</rep_platform>
          <op_sys>OS X 10.5</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>WONTFIX</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>0</everconfirmed>
          <reporter name="Dean McNamee">deanm</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>darin</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>106738</commentid>
    <comment_count>0</comment_count>
    <who name="Dean McNamee">deanm</who>
    <bug_when>2009-01-23 05:26:02 -0800</bug_when>
    <thetext>WebCore/platform/text/TextDecoder.cpp:
void TextDecoder::reset(const TextEncoding&amp; encoding)
{
    m_encoding = encoding;

m_encoding is a TextEncoding, which has a default assignment operator which will be implemented by gcc as a memcpy.  The following patch:

commit 58f7c694c743381477042501d9069b2cda497751
Author: antti@apple.com &lt;antti@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc&gt;
Date:   Sun Nov 9 04:11:26 2008 +0000

    WebCore:
    
    2008-11-08  Antti Koivisto  &lt;antti@apple.com&gt;
    
            Reviewed by Sam Weinig.
    
            Fix https://bugs.webkit.org/show_bug.cgi?id=22141
            REGRESSION: Safari error page is not fully styled when loaded from cache
    
            Reset text decoder on flush so it does not pass through the BOM when it is reused.
    
            Test: fast/encoding/css-cached-bom.html
    
            * loader/TextResourceDecoder.cpp:
            (WebCore::TextResourceDecoder::flush):
    
    LayoutTests:
    
    2008-11-08  Antti Koivisto  &lt;antti@apple.com&gt;
    
            Reviewed by Sam Weinig.
    
            Test for https://bugs.webkit.org/show_bug.cgi?id=22141
            REGRESSION: Safari error page is not fully styled when loaded from cache
    
            * fast/encoding/css-cached-bom.html: Added.
            * fast/encoding/css-cached-bom-expected.txt: Added.
            * fast/encoding/resources/css-cached-bom-frame.html: Added.
            * fast/encoding/resources/utf-16-little-endian.css: Added.
    
    
    
    git-svn-id: http://svn.webkit.org/repository/webkit/trunk@38240 268f45cc-cd09-0410-ab3c-d52691b4dbfc

diff --git a/WebCore/loader/TextResourceDecoder.cpp b/WebCore/loader/TextResourceDecoder.cpp
index 4a0caa0..2064393 100644
--- a/WebCore/loader/TextResourceDecoder.cpp
+++ b/WebCore/loader/TextResourceDecoder.cpp
@@ -793,6 +793,7 @@ String TextResourceDecoder::flush()
 {
     String result = m_decoder.decode(m_buffer.data(), m_buffer.size(), true, m_contentType == XML, m_sawError);
     m_buffer.clear();
+    m_decoder.reset(m_decoder.encoding());
     return result;
 }


Added a call to m_decoder.reset(m_decoder.encoding());

which means that m_encoding = encoding; is assignment to itself, which is the equivalent of memcpy&apos;ing with the same buffer as the source and destination, which is invalid.

I think the solution here would either be to identity check the assignment:

if (&amp;m_encoding != &amp;encoding)
    ...

Or to to offer some flavor or reset() which does not reset the encoding.

Valgrind:

==29911== Source and destination overlap in memcpy(0xC9D7F34, 0xC9D7F34, 6)
==29911==    at 0x79A2F22: memcpy (mc_replace_strmem.c:402)
==29911==    by 0x8806E6F: WebCore::TextDecoder::reset(WebCore::TextEncoding const&amp;) (TextDecoder.cpp:44)
==29911==    by 0x84AEE0B: WebCore::TextResourceDecoder::flush() (TextResourceDecoder.cpp:896)
==29911==    by 0x85B2FCA: WebCore::FrameLoader::write(char const*, int, bool) (FrameLoader.cpp:1039)
==29911==    by 0x85B8163: WebCore::FrameLoader::endIfNotLoadingMainResource() (FrameLoader.cpp:1093)
==29911==    by 0x85B81AE: WebCore::FrameLoader::end() (FrameLoader.cpp:1078)
==29911==    by 0x85BA257: WebCore::FrameLoader::init() (FrameLoader.cpp:293)
==29911==    by 0x83DAA49: WebCore::Frame::init() (Frame.cpp:216)
==29911==    by 0x805D581: WebFrameImpl::InitMainFrame(WebViewImpl*) (webframe_impl.cc:319)
==29911==    by 0x8084DBD: WebView::Create(WebViewDelegate*, WebPreferences const&amp;) (webview_impl.cc:268)
==29911==    by 0x8A3E54C: WebViewHost::Create(_GtkWidget*, WebViewDelegate*, WebPreferences const&amp;) (webview_host_gtk.cc:25)
==29911==    by 0x8A410D0: TestShell::Initialize(std::basic_string&lt;wchar_t, std::char_traits&lt;wchar_t&gt;, std::allocator&lt;wchar_t&gt; &gt; const&amp;) (test_shell_gtk.cc:349)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>107190</commentid>
    <comment_count>1</comment_count>
      <attachid>27074</attachid>
    <who name="Dean McNamee">deanm</who>
    <bug_when>2009-01-27 08:46:50 -0800</bug_when>
    <thetext>Created attachment 27074
Patch attempt for skipping assignment in reset().</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>107199</commentid>
    <comment_count>2</comment_count>
      <attachid>27074</attachid>
    <who name="Darin Adler">darin</who>
    <bug_when>2009-01-27 09:19:34 -0800</bug_when>
    <thetext>Comment on attachment 27074
Patch attempt for skipping assignment in reset().

This is the wrong level to fix this at. The TextEncoding class&apos;s assignment operator needs to deal with this. Generally, C++ allows you to self-assign, and it&apos;s the assignment operator&apos;s responsibility to deal with that.

I don&apos;t see how memcpy is involved at all in the compiler-generated assignment operator. A TextEncoding has a single data member, a const char*, so I can&apos;t imagine how memcpy gets involved. Is this something specific to some particular platform? Perhaps you&apos;re using a misconfigured development tool that is complaining about something that&apos;s not a real problem.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>107205</commentid>
    <comment_count>3</comment_count>
    <who name="Dean McNamee">deanm</who>
    <bug_when>2009-01-27 09:36:31 -0800</bug_when>
    <thetext>(In reply to comment #2)
&gt; (From update of attachment 27074 [review])
&gt; This is the wrong level to fix this at. The TextEncoding class&apos;s assignment
&gt; operator needs to deal with this. Generally, C++ allows you to self-assign, and
&gt; it&apos;s the assignment operator&apos;s responsibility to deal with that.
&gt; 
&gt; I don&apos;t see how memcpy is involved at all in the compiler-generated assignment
&gt; operator. A TextEncoding has a single data member, a const char*, so I can&apos;t
&gt; imagine how memcpy gets involved. Is this something specific to some particular

It is not a single member:

        const char* m_name;
        UChar m_backslashAsCurrencySymbol;

The compiler uses memcpy() to implement the copy.  I don&apos;t think I am crazy or have misconfigured development tools:

Dump of assembler code for function _ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE:
0x08809df8 &lt;_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+0&gt;:     push   ebp
0x08809df9 &lt;_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+1&gt;:     mov    ebp,esp
0x08809dfb &lt;_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+3&gt;:     sub    esp,0x18
0x08809dfe &lt;_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+6&gt;:     mov    edx,DWORD PTR [ebp+0x8]
0x08809e01 &lt;_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+9&gt;:     mov    DWORD PTR [esp+0x8],0x6
0x08809e09 &lt;_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+17&gt;:    mov    eax,DWORD PTR [ebp+0xc]
0x08809e0c &lt;_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+20&gt;:    mov    DWORD PTR [esp+0x4],eax
0x08809e10 &lt;_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+24&gt;:    mov    DWORD PTR [esp],edx
0x08809e13 &lt;_ZN7WebCore11TextDecoder5resetERKNS_12TextEncodingE+27&gt;:    call   0x8052e84 &lt;memcpy@plt&gt;

From the disassembly, you can easily see the 6 constant passed to memcpy, this of course is sizeof(TextEncoding), a pointer and a UChar.

&gt; platform? Perhaps you&apos;re using a misconfigured development tool that is
&gt; complaining about something that&apos;s not a real problem.
&gt; 

</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>107207</commentid>
    <comment_count>4</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2009-01-27 09:51:07 -0800</bug_when>
    <thetext>(In reply to comment #3)
&gt; The compiler uses memcpy() to implement the copy.  I don&apos;t think I am crazy or
&gt; have misconfigured development tools:

Your compiler uses memcpy, so that means that in the opinion of the folks implementing the compiler, it&apos;s OK to use memcpy when both arguments point to the same address.

But your development tool considers that a dangerous overlapping memcpy.

Those are the misconfigured development tools. Those tools need to agree. Either memcpy is illegal for this case, and hence the C++ compile must not generate that code because self-assignment is legal and supported in C++. Or memcpy is legal for this case, and your overlapping memcpy tool needs to understand that an not complain.

Of course you&apos;re not &quot;crazy&quot;! I know you are trying to do the right thing here.

But checking before an assignment that the new value isn&apos;t the same as the old is *not* the right thing.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>107213</commentid>
    <comment_count>5</comment_count>
    <who name="Dan Kegel">dank</who>
    <bug_when>2009-01-27 10:28:23 -0800</bug_when>
    <thetext>Sounds like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667
</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>107216</commentid>
    <comment_count>6</comment_count>
    <who name="Dean McNamee">deanm</who>
    <bug_when>2009-01-27 10:51:03 -0800</bug_when>
    <thetext>(In reply to comment #5)
&gt; Sounds like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667
&gt; 

Thanks Dan.  So I guess it&apos;s a GCC bug.

I could implement a custom operator=() that did the obvious thing, that would probably avoid the memcpy() on the structure assignment.  It&apos;s not really a big deal, but it seems nice to avoid the poorly generated code.  Darin?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>107229</commentid>
    <comment_count>7</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2009-01-27 12:13:58 -0800</bug_when>
    <thetext>(In reply to comment #6)
&gt; (In reply to comment #5)
&gt; &gt; Sounds like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

Good to know!

&gt; I could implement a custom operator=() that did the obvious thing, that would
&gt; probably avoid the memcpy() on the structure assignment. It&apos;s not really a big
&gt; deal, but it seems nice to avoid the poorly generated code.

I think it&apos;s OK to do that either as a performance optimization, or to make it easier to use the tool that detects overlapping memcpy. Neither seems a compelling important reason to make the change to me. We normally optimize performance in three ways: 1) when we can measure the difference, 2) choosing good algorithms, 3) cultivating habits that make the code fast. I&apos;m not sure this qualifies for any of those.

On the other hand, if this would make it easier to use that overlapping memcpy tool, then we could put the code in on that basis. I suspect we might have to add it to classes other than TextEncoding too in the future.

Either way is OK.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>121177</commentid>
    <comment_count>8</comment_count>
    <who name="Eric Seidel (no email)">eric</who>
    <bug_when>2009-05-13 17:30:24 -0700</bug_when>
    <thetext>I&apos;m going to close this as invalid, and leave the Chromium bug for tracking this: http://code.google.com/p/chromium/issues/detail?id=7097

Probably chromium will just add this to the &quot;suppressions&quot; list.

If you believe WebKit still needs to change here, feel free to reopen the bug.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>27074</attachid>
            <date>2009-01-27 08:46:50 -0800</date>
            <delta_ts>2009-01-27 09:19:34 -0800</delta_ts>
            <desc>Patch attempt for skipping assignment in reset().</desc>
            <filename>x.diff</filename>
            <type>text/plain</type>
            <size>1224</size>
            <attacher name="Dean McNamee">deanm</attacher>
            
              <data encoding="base64">ZGlmZiAtLWdpdCBhL1dlYkNvcmUvQ2hhbmdlTG9nIGIvV2ViQ29yZS9DaGFuZ2VMb2cKaW5kZXgg
ZDY0ZjI2MS4uNmExM2ExNiAxMDA2NDQKLS0tIGEvV2ViQ29yZS9DaGFuZ2VMb2cKKysrIGIvV2Vi
Q29yZS9DaGFuZ2VMb2cKQEAgLTEsMyArMSwxNSBAQAorMjAwOS0wMS0yNyAgRGVhbiBNY05hbWVl
ICA8ZGVhbm1AY2hyb21pdW0ub3JnPgorCisgICAgICAgIFJldmlld2VkIGJ5IE5PQk9EWSAoT09Q
UyEpLgorCisgICAgICAgIEFsbG93IFRleHREZWNvZGVyOjpyZXNldCgpIHRvIGJlIGdpdmVuIGl0
cyBvd24gZW5jb2RpbmcuICBUaGlzIGF2b2lkcworICAgICAgICBhbiBhbGlhc2VkIGFzc2lnbm1l
bnQsIHdoaWNoIGNhbiBjYXVzZSBhbiBvdmVybGFwcGluZyBtZW1jcHkoKS4KKworICAgICAgICBo
dHRwczovL2J1Z3Mud2Via2l0Lm9yZy9zaG93X2J1Zy5jZ2k/aWQ9MjM1MDEKKworICAgICAgICAq
IHBsYXRmb3JtL3RleHQvVGV4dERlY29kZXIuY3BwOgorICAgICAgICAoV2ViQ29yZTo6VGV4dERl
Y29kZXI6OnJlc2V0KToKKwogMjAwOS0wMS0yNyAgQXJpeWEgSGlkYXlhdCAgPGFyaXlhLmhpZGF5
YXRAdHJvbGx0ZWNoLmNvbT4KIAogICAgICAgICBSdWJiZXItc3RhbXBlZCBieSBTaW1vbiBIYXVz
bWFubi4KZGlmZiAtLWdpdCBhL1dlYkNvcmUvcGxhdGZvcm0vdGV4dC9UZXh0RGVjb2Rlci5jcHAg
Yi9XZWJDb3JlL3BsYXRmb3JtL3RleHQvVGV4dERlY29kZXIuY3BwCmluZGV4IGUzOWE2YjcuLjg5
ODhlN2IgMTAwNjQ0Ci0tLSBhL1dlYkNvcmUvcGxhdGZvcm0vdGV4dC9UZXh0RGVjb2Rlci5jcHAK
KysrIGIvV2ViQ29yZS9wbGF0Zm9ybS90ZXh0L1RleHREZWNvZGVyLmNwcApAQCAtNDEsNyArNDEs
OSBAQCBUZXh0RGVjb2Rlcjo6VGV4dERlY29kZXIoY29uc3QgVGV4dEVuY29kaW5nJiBlbmNvZGlu
ZykKIAogdm9pZCBUZXh0RGVjb2Rlcjo6cmVzZXQoY29uc3QgVGV4dEVuY29kaW5nJiBlbmNvZGlu
ZykKIHsKLSAgICBtX2VuY29kaW5nID0gZW5jb2Rpbmc7CisgICAgLy8gRG9uJ3QgcGVyZm9ybSBh
biBhbGlhc2VkIGFzc2lnbm1lbnQgZm9yIHRkLnJlc2V0KHRkLmVuY29kaW5nKCkpOworICAgIGlm
ICgmbV9lbmNvZGluZyAhPSAmZW5jb2RpbmcpCisgICAgICAgIG1fZW5jb2RpbmcgPSBlbmNvZGlu
ZzsKICAgICBtX2NvZGVjLmNsZWFyKCk7CiAgICAgbV9jaGVja2VkRm9yQk9NID0gZmFsc2U7CiAg
ICAgbV9udW1CdWZmZXJlZEJ5dGVzID0gMDsK
</data>
<flag name="review"
          id="12995"
          type_id="1"
          status="-"
          setter="darin"
    />
          </attachment>
      

    </bug>

</bugzilla>