<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>46892</bug_id>
          
          <creation_ts>2010-09-30 06:12:33 -0700</creation_ts>
          <short_desc>webkitpy.common.system.executive_unittest.ExecutiveTest.test_run_command_with_unicode fails on Windows</short_desc>
          <delta_ts>2010-10-01 14:07:34 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>Tools / Tests</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Windows XP</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc>http://build.webkit.org/builders/Windows%20Debug%20%28Tests%29/builds/20495/steps/webkitpy-test/logs/stdio</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>InRadar, PlatformOnly</keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Adam Roben (:aroben)">aroben</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>abarth</cc>
    
    <cc>eric</cc>
    
    <cc>webkit.review.bot</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>287507</commentid>
    <comment_count>0</comment_count>
    <who name="Adam Roben (:aroben)">aroben</who>
    <bug_when>2010-09-30 06:12:33 -0700</bug_when>
    <thetext>Here&apos;s the failure:


ERROR: Validate that it is safe to pass unicode() objects
----------------------------------------------------------------------
Traceback (most recent call last):
  File &quot;/home/buildbot/slave/win-debug-tests/build/WebKitTools/Scripts/webkitpy/common/system/executive_unittest.py&quot;, line 63, in test_run_command_with_unicode
    output = executive.run_command([&quot;echo&quot;, &quot;-n&quot;, unicode_tor])
  File &quot;/home/buildbot/slave/win-debug-tests/build/WebKitTools/Scripts/webkitpy/common/system/executive.py&quot;, line 292, in run_command
    close_fds=self._should_close_fds())
  File &quot;/usr/lib/python2.5/subprocess.py&quot;, line 594, in __init__
    errread, errwrite)
  File &quot;/usr/lib/python2.5/subprocess.py&quot;, line 1091, in _execute_child
    raise child_exception
TypeError: execv() arg 2 must contain only strings</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>287510</commentid>
    <comment_count>1</comment_count>
    <who name="Adam Roben (:aroben)">aroben</who>
    <bug_when>2010-09-30 06:13:42 -0700</bug_when>
    <thetext>&lt;rdar://problem/8496639&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>287516</commentid>
    <comment_count>2</comment_count>
    <who name="Adam Roben (:aroben)">aroben</who>
    <bug_when>2010-09-30 06:31:10 -0700</bug_when>
    <thetext>I&apos;m not really sure what to do with this one. Seems likely to be a limitation of Cygwin Python.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>287625</commentid>
    <comment_count>3</comment_count>
    <who name="Adam Barth">abarth</who>
    <bug_when>2010-09-30 10:25:44 -0700</bug_when>
    <thetext>Maybe we need to encode in UTF8?  Sounds like a question for Eric.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>287649</commentid>
    <comment_count>4</comment_count>
    <who name="Eric Seidel (no email)">eric</who>
    <bug_when>2010-09-30 10:47:02 -0700</bug_when>
    <thetext>Looks like we&apos;ll need to encode the args for windows.

See:

        args = map(unicode, args)  # Popen will throw an exception if args are non-strings (like int())
in POpen.run_command

I don&apos;t know what encoding cygwin would want things passed to it on the command line, but utf8 sounds like a good try.

Just add an if block after that line which is:

# cygwin&apos;s os.execv in YOUR_PYTHON_VERSION doesn&apos;t seem to like unicode strings.
if sys.platform == &quot;cygwin&quot;:
    args = [arg.encode(&quot;utf8&quot;) for arg in args]


If this is version specific, we&apos;ll need to add a version check there too.

I&apos;ve long thought we should consider making a POpen subclass which we use everywhere to do fixups like this.  Maybe now is the time.  Popen.__init__ should be able to take unicode strings.  It has to for python 3.0 for sure!</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>287650</commentid>
    <comment_count>5</comment_count>
    <who name="Eric Seidel (no email)">eric</who>
    <bug_when>2010-09-30 10:47:55 -0700</bug_when>
    <thetext>Actually, my comment is subtly wrong.  str is not really a &quot;string&quot;.  It was renamed to bytearray() in python 3.x.  unicode is the one true string in any version of python.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>287789</commentid>
    <comment_count>6</comment_count>
    <who name="Adam Roben (:aroben)">aroben</who>
    <bug_when>2010-09-30 13:53:11 -0700</bug_when>
    <thetext>(In reply to comment #4)
&gt; Looks like we&apos;ll need to encode the args for windows.
&gt; 
&gt; See:
&gt; 
&gt;         args = map(unicode, args)  # Popen will throw an exception if args are non-strings (like int())
&gt; in POpen.run_command
&gt; 
&gt; I don&apos;t know what encoding cygwin would want things passed to it on the command line, but utf8 sounds like a good try.

It looks like Cygwin ends up calling mbstowcs, which uses the current code page. I wonder how we can get access to that from Python...</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>288230</commentid>
    <comment_count>7</comment_count>
    <who name="Adam Roben (:aroben)">aroben</who>
    <bug_when>2010-10-01 06:57:45 -0700</bug_when>
    <thetext>(In reply to comment #6)
&gt; (In reply to comment #4)
&gt; &gt; Looks like we&apos;ll need to encode the args for windows.
&gt; &gt; 
&gt; &gt; See:
&gt; &gt; 
&gt; &gt;         args = map(unicode, args)  # Popen will throw an exception if args are non-strings (like int())
&gt; &gt; in POpen.run_command
&gt; &gt; 
&gt; &gt; I don&apos;t know what encoding cygwin would want things passed to it on the command line, but utf8 sounds like a good try.
&gt; 
&gt; It looks like Cygwin ends up calling mbstowcs, which uses the current code page. I wonder how we can get access to that from Python...

Even if we were to do this, we&apos;d run into the problem of code pages not being able to encode certain characters. For instance, the code page on my machine is 437, which corresponds to the cp437 codec in Python. It is not able to handle the strings in the failing test.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>288231</commentid>
    <comment_count>8</comment_count>
    <who name="Adam Roben (:aroben)">aroben</who>
    <bug_when>2010-10-01 06:58:59 -0700</bug_when>
    <thetext>So I think we&apos;ve hit a limitation of Cygwin here. Cygwin is not using UTF-16 everywhere like Windows normally does, and thus is not able to handle characters that don&apos;t have a representation in the current code page.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>288268</commentid>
    <comment_count>9</comment_count>
    <who name="Eric Seidel (no email)">eric</who>
    <bug_when>2010-10-01 09:03:07 -0700</bug_when>
    <thetext>We should just encode with replacement characters then.  Or some sort of reasonable fallback.  And have the test expect said reasonable fallback on cygwin.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>288412</commentid>
    <comment_count>10</comment_count>
      <attachid>69501</attachid>
    <who name="Adam Roben (:aroben)">aroben</who>
    <bug_when>2010-10-01 12:26:07 -0700</bug_when>
    <thetext>Created attachment 69501
Encode Executive command arguments using UTF-8 on Cygwin</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>288415</commentid>
    <comment_count>11</comment_count>
      <attachid>69501</attachid>
    <who name="Adam Barth">abarth</who>
    <bug_when>2010-10-01 12:31:48 -0700</bug_when>
    <thetext>Comment on attachment 69501
Encode Executive command arguments using UTF-8 on Cygwin

Sorry abou the copy/paste.  It&apos;s a long standing problem that we need to unify these two codepaths in executive.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>288425</commentid>
    <comment_count>12</comment_count>
    <who name="Adam Roben (:aroben)">aroben</who>
    <bug_when>2010-10-01 12:43:37 -0700</bug_when>
    <thetext>Committed r68913: &lt;http://trac.webkit.org/changeset/68913&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>288471</commentid>
    <comment_count>13</comment_count>
    <who name="WebKit Review Bot">webkit.review.bot</who>
    <bug_when>2010-10-01 14:07:34 -0700</bug_when>
    <thetext>http://trac.webkit.org/changeset/68913 might have broken GTK Linux 32-bit Debug</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>69501</attachid>
            <date>2010-10-01 12:26:07 -0700</date>
            <delta_ts>2010-10-01 12:31:48 -0700</delta_ts>
            <desc>Encode Executive command arguments using UTF-8 on Cygwin</desc>
            <filename>bug-46892-20101001152606.patch</filename>
            <type>text/plain</type>
            <size>3523</size>
            <attacher name="Adam Roben (:aroben)">aroben</attacher>
            
              <data encoding="base64">ZGlmZiAtLWdpdCBhL1dlYktpdFRvb2xzL0NoYW5nZUxvZyBiL1dlYktpdFRvb2xzL0NoYW5nZUxv
ZwppbmRleCBkMzNiY2NkYTU4Nzg0MzA4YzJhMjlmOGViZTZkOWZjYTEyMTQ5OWRiLi5lNzFkOGRh
ZjY0ZGIyMjU5N2U4MTkyNzI0N2I4NDQxYTk3ZDg0NjY1IDEwMDY0NAotLS0gYS9XZWJLaXRUb29s
cy9DaGFuZ2VMb2cKKysrIGIvV2ViS2l0VG9vbHMvQ2hhbmdlTG9nCkBAIC0xLDUgKzEsMjkgQEAK
IDIwMTAtMTAtMDEgIEFkYW0gUm9iZW4gIDxhcm9iZW5AYXBwbGUuY29tPgogCisgICAgICAgIEVu
Y29kZSBFeGVjdXRpdmUgY29tbWFuZCBhcmd1bWVudHMgdXNpbmcgVVRGLTggb24gQ3lnd2luCisK
KyAgICAgICAgQ3lnd2luJ3MgUHl0aG9uJ3Mgb3MuZXhlY3YgZG9lc24ndCBzdXBwb3J0IHVuaWNv
ZGUgY29tbWFuZCBhcmd1bWVudHMuCisgICAgICAgIEN5Z3dpbidzIGV4ZWN2IGV4cGVjdHMgYXJn
dW1lbnRzIHRvIGJlIGVuY29kZWQgdXNpbmcgdGhlIGN1cnJlbnQgY29kZQorICAgICAgICBwYWdl
LiBCdXQgY29kZSBwYWdlcyBhcmUgbGltaXRlZCBpbiB3aGF0IGNoYXJhY3RlcnMgdGhleSBjYW4g
aGFuZGxlLAorICAgICAgICBhbmQgb3VyIHRlc3RzIGluY2x1ZGUgY2hhcmFjdGVycyB0aGF0IHRo
ZSBFbmdsaXNoIGNvZGUgcGFnZSBjYW4ndAorICAgICAgICBoYW5kbGUuICBTbyBmb3Igbm93IHdl
J2xsIGp1c3QgZW5jb2RlIGV2ZXJ5dGhpbmcgaW4gVVRGLTggb24gQ3lnd2luLAorICAgICAgICB3
aGljaCBjYW4gaGFuZGxlIGFsbCBjaGFyYWN0ZXJzIGJ1dCBtaWdodCBjb25mdXNlIHNvbWUgY29t
bWFuZHMsIGZvcgorICAgICAgICBleHBlZGllbmN5J3Mgc2FrZS4gSSdtIHN1cmUgd2UnbGwgcnVu
IGludG8gY2FzZXMgd2hlcmUgVVRGLTggaXNuJ3QKKyAgICAgICAgZ29vZCBlbm91Z2gsIGJ1dCB3
ZSBjYW4gZGVhbCB3aXRoIHRoYXQgd2hlbiB0aGUgcHJvYmxlbSBhcmlzZXMuCisKKyAgICAgICAg
UmV2aWV3ZWQgYnkgTk9CT0RZIChPT1BTISkuCisKKyAgICAgICAgRml4ZXMgPGh0dHA6Ly93ZWJr
aXQub3JnL2IvNDY4OTI+IDxyZGFyOi8vcHJvYmxlbS84NDk2NjM5PgorICAgICAgICB3ZWJraXRw
eS5jb21tb24uc3lzdGVtLmV4ZWN1dGl2ZV91bml0dGVzdC5FeGVjdXRpdmVUZXN0LnRlc3RfcnVu
X2NvbW1hbmRfd2l0aF91bmljb2RlCisgICAgICAgIGZhaWxzIG9uIFdpbmRvd3MKKworICAgICAg
ICAqIFNjcmlwdHMvd2Via2l0cHkvY29tbW9uL3N5c3RlbS9leGVjdXRpdmUucHk6CisgICAgICAg
IChFeGVjdXRpdmUuX3J1bl9jb21tYW5kX3dpdGhfdGVlZF9vdXRwdXQpOgorICAgICAgICAoRXhl
Y3V0aXZlLnJ1bl9jb21tYW5kKToKKyAgICAgICAgT24gQ3lnd2luLCBlbmNvZGUgYXJndW1lbnRz
IHVzaW5nIFVURi04LgorCisyMDEwLTEwLTAxICBBZGFtIFJvYmVuICA8YXJvYmVuQGFwcGxlLmNv
bT4KKwogICAgICAgICBEb24ndCBhc3N1bWUgQWNjZXNzaWJsZU9iamVjdEZyb21FdmVudCBzdWNj
ZWVkcwogCiAgICAgICAgIEZpeGVzIDxodHRwOi8vd2Via2l0Lm9yZy9iLzQ0MTM2PiA8cmRhcjov
L3Byb2JsZW0vODMyMTY4ND4gQ3Jhc2ggaW4KZGlmZiAtLWdpdCBhL1dlYktpdFRvb2xzL1Njcmlw
dHMvd2Via2l0cHkvY29tbW9uL3N5c3RlbS9leGVjdXRpdmUucHkgYi9XZWJLaXRUb29scy9TY3Jp
cHRzL3dlYmtpdHB5L2NvbW1vbi9zeXN0ZW0vZXhlY3V0aXZlLnB5CmluZGV4IDdjMDBmMjI0NzE3
Y2Q3NjFjYjUxNTZlZjkwNTI1MzczMGM3YWUwYjMuLjIxNmNmNThhNzk0NGNmNTE0YjRmNWE4NTBm
OTNjZDhkZTJhOGFiYWMgMTAwNjQ0Ci0tLSBhL1dlYktpdFRvb2xzL1NjcmlwdHMvd2Via2l0cHkv
Y29tbW9uL3N5c3RlbS9leGVjdXRpdmUucHkKKysrIGIvV2ViS2l0VG9vbHMvU2NyaXB0cy93ZWJr
aXRweS9jb21tb24vc3lzdGVtL2V4ZWN1dGl2ZS5weQpAQCAtMTAzLDYgKzEwMywxMyBAQCBjbGFz
cyBFeGVjdXRpdmUob2JqZWN0KToKIAogICAgIGRlZiBfcnVuX2NvbW1hbmRfd2l0aF90ZWVkX291
dHB1dChzZWxmLCBhcmdzLCB0ZWVkX291dHB1dCk6CiAgICAgICAgIGFyZ3MgPSBtYXAodW5pY29k
ZSwgYXJncykgICMgUG9wZW4gd2lsbCB0aHJvdyBhbiBleGNlcHRpb24gaWYgYXJncyBhcmUgbm9u
LXN0cmluZ3MgKGxpa2UgaW50KCkpCisgICAgICAgIGlmIHN5cy5wbGF0Zm9ybSA9PSAnY3lnd2lu
JzoKKyAgICAgICAgICAgICMgQ3lnd2luJ3MgUHl0aG9uJ3Mgb3MuZXhlY3YgZG9lc24ndCBzdXBw
b3J0IHVuaWNvZGUgY29tbWFuZAorICAgICAgICAgICAgIyBhcmd1bWVudHMsIGFuZCBuZWl0aGVy
IGRvZXMgQ3lnd2luJ3MgZXhlY3YgaXRzZWxmLgorICAgICAgICAgICAgIyBGSVhNRTogVXNpbmcg
VVRGLTggaGVyZSB3aWxsIGNvbmZ1c2UgV2luZG93cy1uYXRpdmUgY29tbWFuZHMKKyAgICAgICAg
ICAgICMgd2hpY2ggd2lsbCBleHBlY3QgYXJndW1lbnRzIHRvIGJlIGVuY29kZWQgdXNpbmcgdGhl
IGN1cnJlbnQgY29kZQorICAgICAgICAgICAgIyBwYWdlLgorICAgICAgICAgICAgYXJncyA9IFth
cmcuZW5jb2RlKCd1dGYtOCcpIGZvciBhcmcgaW4gYXJnc10KICAgICAgICAgY2hpbGRfcHJvY2Vz
cyA9IHN1YnByb2Nlc3MuUG9wZW4oYXJncywKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgICAgc3Rkb3V0PXN1YnByb2Nlc3MuUElQRSwKICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgICAgICAgICAgICAgc3RkZXJyPXN1YnByb2Nlc3MuU1RET1VULApAQCAtMjgxLDYg
KzI4OCwxMyBAQCBjbGFzcyBFeGVjdXRpdmUob2JqZWN0KToKICAgICAgICAgYXNzZXJ0KGlzaW5z
dGFuY2UoYXJncywgbGlzdCkgb3IgaXNpbnN0YW5jZShhcmdzLCB0dXBsZSkpCiAgICAgICAgIHN0
YXJ0X3RpbWUgPSB0aW1lLnRpbWUoKQogICAgICAgICBhcmdzID0gbWFwKHVuaWNvZGUsIGFyZ3Mp
ICAjIFBvcGVuIHdpbGwgdGhyb3cgYW4gZXhjZXB0aW9uIGlmIGFyZ3MgYXJlIG5vbi1zdHJpbmdz
IChsaWtlIGludCgpKQorICAgICAgICBpZiBzeXMucGxhdGZvcm0gPT0gJ2N5Z3dpbic6CisgICAg
ICAgICAgICAjIEN5Z3dpbidzIFB5dGhvbidzIG9zLmV4ZWN2IGRvZXNuJ3Qgc3VwcG9ydCB1bmlj
b2RlIGNvbW1hbmQKKyAgICAgICAgICAgICMgYXJndW1lbnRzLCBhbmQgbmVpdGhlciBkb2VzIEN5
Z3dpbidzIGV4ZWN2IGl0c2VsZi4KKyAgICAgICAgICAgICMgRklYTUU6IFVzaW5nIFVURi04IGhl
cmUgd2lsbCBjb25mdXNlIFdpbmRvd3MtbmF0aXZlIGNvbW1hbmRzCisgICAgICAgICAgICAjIHdo
aWNoIHdpbGwgZXhwZWN0IGFyZ3VtZW50cyB0byBiZSBlbmNvZGVkIHVzaW5nIHRoZSBjdXJyZW50
IGNvZGUKKyAgICAgICAgICAgICMgcGFnZS4KKyAgICAgICAgICAgIGFyZ3MgPSBbYXJnLmVuY29k
ZSgndXRmLTgnKSBmb3IgYXJnIGluIGFyZ3NdCiAgICAgICAgIHN0ZGluLCBzdHJpbmdfdG9fY29t
bXVuaWNhdGUgPSBzZWxmLl9jb21wdXRlX3N0ZGluKGlucHV0KQogICAgICAgICBzdGRlcnIgPSBz
dWJwcm9jZXNzLlNURE9VVCBpZiByZXR1cm5fc3RkZXJyIGVsc2UgTm9uZQogCg==
</data>
<flag name="review"
          id="59206"
          type_id="1"
          status="+"
          setter="abarth"
    />
          </attachment>
      

    </bug>

</bugzilla>