WebKit Bugzilla
NEW
50843
gtk-ews having trouble with non-ascii characters
https://bugs.webkit.org/show_bug.cgi?id=50843
Adam Barth
Reported
2010-12-10 13:57:47 PST
../../JavaScriptCore/wtf/TCPageMap.h: In function ‘size_t WTF::fastMallocSize(const void*)’:
Traceback (most recent call last):
  File "/mnt/git/webkit-gtk-ews/WebKitTools/Scripts/webkitpy/tool/bot/queueengine.py", line 108, in run
    if not self._delegate.process_work_item(work_item):
  File "/mnt/git/webkit-gtk-ews/WebKitTools/Scripts/webkitpy/tool/commands/queues.py", line 362, in process_work_item
    if not self.review_patch(patch):
  File "/mnt/git/webkit-gtk-ews/WebKitTools/Scripts/webkitpy/tool/commands/earlywarningsystem.py", line 92, in review_patch
    if not self._can_build():
  File "/mnt/git/webkit-gtk-ews/WebKitTools/Scripts/webkitpy/tool/commands/earlywarningsystem.py", line 53, in _can_build
    "--no-update"])
  File "/mnt/git/webkit-gtk-ews/WebKitTools/Scripts/webkitpy/tool/commands/queues.py", line 96, in run_webkit_patch
    return self._tool.executive.run_and_throw_if_fail(webkit_patch_args)
  File "/mnt/git/webkit-gtk-ews/WebKitTools/Scripts/webkitpy/common/system/executive.py", line 141, in run_and_throw_if_fail
    exit_code = self._run_command_with_teed_output(args, child_stdout)
  File "/mnt/git/webkit-gtk-ews/WebKitTools/Scripts/webkitpy/common/system/executive.py", line 126, in _run_command_with_teed_output
    teed_output.write(output_line)
  File "/mnt/git/webkit-gtk-ews/WebKitTools/Scripts/webkitpy/common/system/deprecated_logging.py", line 55, in write
    file.write(bytes)
  File "/mnt/git/webkit-gtk-ews/WebKitTools/Scripts/webkitpy/common/system/deprecated_logging.py", line 55, in write
    file.write(bytes)
  File "/usr/lib/python2.6/codecs.py", line 691, in write
    return self.writer.write(data)
  File "/usr/lib/python2.6/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 50: ordinal not in range(128)
Eric Seidel (no email)
Comment 1
2010-12-10 14:05:28 PST
I'm not sure how to solve this. I remember explicitly moving tee() to operate on bytes instead of unicode strings long ago. This seems to suggest that the logging module is using a codecs.open'd log file and trying to decode the byte stream we're sending to it.
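A minimal sketch of the failure mode suspected above, written for Python 3 (the bots in this report ran Python 2.6). Python 3 refuses to mix bytes and text, so the hypothetical helper `py2_style_write` emulates Python 2's implicit ASCII decode by hand. `safe_write` is likewise an illustrative name and not part of webkitpy; it shows one possible way to decode explicitly before writing.

```python
import codecs
import io


def py2_style_write(writer, data):
    """Emulate Python 2: a byte string handed to a codecs writer is
    implicitly decoded as ASCII before being re-encoded."""
    if isinstance(data, bytes):
        data = data.decode("ascii")  # raises on any byte >= 0x80
    writer.write(data)


def safe_write(writer, data, encoding="utf-8"):
    """Decode explicitly, replacing undecodable bytes instead of raising."""
    if isinstance(data, bytes):
        data = data.decode(encoding, errors="replace")
    writer.write(data)


raw = io.BytesIO()
writer = codecs.getwriter("utf-8")(raw)  # stands in for a codecs.open'd log file
line = "In function \u2018f\u2019:\n".encode("utf-8")  # bytes, as tee() receives them

try:
    py2_style_write(writer, line)
    failed = False
except UnicodeDecodeError:
    failed = True

safe_write(writer, line)
```

Under these assumptions the first call fails on the curly quote, while the second writes the line through unchanged.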
Eric Seidel (no email)
Comment 2
2010-12-10 14:07:39 PST
Maybe it doesn't make sense to write bytes to std out?
Leandro Pereira
Comment 3
2010-12-14 09:49:44 PST
(In reply to comment #0)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 50:
> ordinal not in range(128)
BTW, having the same problem with EFL-EWS.

(In reply to comment #2)
> Maybe it doesn't make sense to write bytes to std out?
Maybe only saving to a log file and printing the file name would help? Less noise on EWS output, and debuggable whenever needed.
Eric Seidel (no email)
Comment 4
2010-12-14 11:53:45 PST
We haven't yet been able to produce a minimal reduction. However, we used:
    setenv LANG en_US.US-ASCII
to work around the issue on the gtk-ews for the moment.
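The same workaround could in principle be applied per child process rather than for the whole shell. The sketch below assumes that approach; `run_with_lang` is a hypothetical helper, not something in webkitpy, and the child command is just a stand-in that reports the LANG value it sees.

```python
import os
import subprocess
import sys


def run_with_lang(args, lang="C"):
    """Run a child process with LANG overridden and return its stdout as bytes."""
    env = dict(os.environ)
    env["LANG"] = lang
    # LC_ALL and LC_MESSAGES take precedence over LANG, so drop them.
    env.pop("LC_ALL", None)
    env.pop("LC_MESSAGES", None)
    return subprocess.check_output(args, env=env)


# Stand-in for the real build command: a child that prints its LANG.
child = [sys.executable, "-c", "import os; print(os.environ['LANG'])"]
output = run_with_lang(child, "en_US.US-ASCII")
```

Scoping the override to the build step would leave the locale of the rest of the bot untouched, at the cost of threading an `env` argument through the call that spawns the compiler.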
Eric Seidel (no email)
Comment 5
2010-12-14 11:54:07 PST
gcc seems to like to print fancy quotes in recent versions
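A quick check of that theory, assuming gcc emitted the quotes as UTF-8: the left single quotation mark (U+2018) encodes to the bytes 0xe2 0x80 0x98, and in the first line of the report it sits exactly where the traceback says decoding failed.

```python
# First line of the report, with the curly quotes gcc printed.
message = (
    "../../JavaScriptCore/wtf/TCPageMap.h: In function "
    "\u2018size_t WTF::fastMallocSize(const void*)\u2019:"
)
encoded = message.encode("utf-8")  # assumption: the bot's locale was UTF-8

try:
    encoded.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)
print("offending bytes:", encoded[error.start:error.start + 3])
```

The error reports byte 0xe2 at position 50, matching the original traceback.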
Leandro Pereira
Comment 6
2010-12-14 12:24:07 PST
(In reply to comment #4)
> However we used:
> setenv LANG en_US.US-ASCII
> to work around the issue on the gtk-ews for the moment.
Using the same workaround on EFL-EWS. Seems it's working.
Eric Seidel (no email)
Comment 7
2011-06-20 17:12:34 PDT
I'm struggling to reproduce this with a minimal example. I'm not sure how we're hitting this. I could see that we might hit a decoding error with run_and_throw_if_fail(cmd, silent=True), because /dev/null is opened without any encoding. But I don't see how we hit this case. What stream are we opening with encoding of ascii? The logging stream? Maybe the python on that system doesn't correctly default to utf8? What's the LANG value before we override it to US-ASCII?
Martin Robinson
Comment 8
2011-06-20 20:18:31 PDT
(In reply to comment #7)
> But I don't see how we hit this case. What stream are we opening with encoding of ascii? The logging stream? Maybe the python on that system doesn't correctly default to utf8?
Looks like you can figure out the default encoding in Python by running this in the REPL:
    import sys
    sys.getdefaultencoding()
Eric Seidel (no email)
Comment 9
2011-06-20 22:47:52 PDT
On both my Mac and on Linux, sys.getdefaultencoding() returns 'ascii'. It does seem like that must be the encoding we're hitting. The question is what file is opened with default encoding? I assume it must be stderr/stdout. But why?
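To narrow down which stream carries which encoding, the values Python consults can be printed side by side. This is only a diagnostic sketch; `describe_encodings` is a made-up helper name. Note that on Python 3 `sys.getdefaultencoding()` returns 'utf-8', whereas the Python 2 interpreters discussed here return 'ascii'.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings Python consults when text meets a byte stream."""
    return {
        "default": sys.getdefaultencoding(),
        # Replaced or redirected streams may lack an encoding attribute.
        "stdout": getattr(sys.stdout, "encoding", None),
        "stderr": getattr(sys.stderr, "encoding", None),
        "locale": locale.getpreferredencoding(False),
    }


info = describe_encodings()
for name, value in sorted(info.items()):
    print("%-8s %s" % (name, value))
```

Running this on the bot before and after the LANG override would show whether the locale, rather than the interpreter default, decides the stream encoding.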
Eric Seidel (no email)
Comment 10
2011-06-27 10:16:45 PDT
If I were able to reproduce this, it would be easy to fix. But I failed to make a reduced Python script on our EC2 bots. I could probably just edit a WebKit file and wait for a whole build to fail, but I was too lazy to try that.