Bug 128971 - [GTK] webkit 2.3.5 build failure with python 3.3.4
Summary: [GTK] webkit 2.3.5 build failure with python 3.3.4
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC Other
: P3 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-02-18 07:04 PST by Patrick Welche
Modified: 2016-01-02 01:40 PST (History)
6 users (show)

See Also:


Attachments
build fix (1.23 KB, patch)
2014-02-27 16:37 PST, Patrick Welche
no flags Details | Formatted Diff | Diff
Patch (1.17 KB, patch)
2015-09-15 12:06 PDT, Alberto Garcia
mcatanzaro: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Patrick Welche 2014-02-18 07:04:36 PST
Trying to build webkit 2.3.5 with python 3.3.4, I see

  GEN      DerivedSources/WebCore/XMLViewerJS.h
Traceback (most recent call last):
  File "./Source/JavaScriptCore/inspector/scripts/cssmin.py", line 44, in <module>
    sys.stdout.write(cssminify(sys.stdin.read()))
  File "/usr/pkg/lib/python3.3/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 642: ordinal not in range(128)
GNUmakefile:81815: recipe for target 'DerivedSources/WebCore/XMLViewerCSS.h' failed


(Also, GNU bison 3.0.2 isn't happy with

DerivedSources/WebCore/CSSGrammar.y:62.1-12: warning: deprecated directive, use '%pure-parser' [-Wdeprecated]
 %pure_parser
 ^^^^^^^^^^^^
)
Comment 1 Patrick Welche 2014-02-19 15:07:57 PST
Changing component to WebCore as that is were the build fault happens. It seems as though webkitgtk 2.3.5 contains

m4_define([webkit_user_agent_major_version], [538])
m4_define([webkit_user_agent_minor_version], [15])

version 538 of webkit?
Comment 2 Alberto Garcia 2014-02-27 04:02:08 PST
Probably a duplicate of bug 128927, can you try with the patch attached there?
Comment 3 Patrick Welche 2014-02-27 06:28:15 PST
It might be similar, but it isn't the same: you had the bug building documentation, so the python program involved was gtkdoc.py, here I am building WebCore, and the python program involved is cssmin.py.

sys.stdout.write(cssminify(sys.stdin.read()))

has to be imbued with UTF-8 somehow?
Comment 4 Alberto Garcia 2014-02-27 06:33:30 PST
(In reply to comment #3)
> sys.stdout.write(cssminify(sys.stdin.read()))
>
> has to be imbued with UTF-8 somehow?

I think so, see this:

http://bugs.python.org/issue4947
Comment 5 Patrick Welche 2014-02-27 15:05:57 PST
Interesting! That bug seems to revolve around Python 2.7. I am surprised by:

That bug:
>>> type(sys.stdout)
<type 'file'>
>>> sys.stdout.encoding
'UTF-8'

My Python 3.3.4:
>>> type(sys.stdout)
<class '_io.TextIOWrapper'>
>>> sys.stdout.encoding
'646'

Presumably 646 is the ISO successor to ASCII which would explain the message...
It seems that Python 2.7 had a problem even when UTF-8 was set.
Comment 6 Patrick Welche 2014-02-27 15:16:29 PST
P.S. on my Python 3.3.4 installation

>>> sys.getdefaultencoding()
'utf-8'
Comment 7 Patrick Welche 2014-02-27 16:06:09 PST
Actually, it makes far more sense not to change the python, but to change the file that it processes. It seems that the only unicode characters in XMLViewer.css which cssmin.py processes are there because someone thought it would look pretty to change " to unicode double quotes in the licence. I think I can live with the ugly but practical ascii ".
Comment 8 Patrick Welche 2014-02-27 16:37:03 PST
Created attachment 225428 [details]
build fix

Pretty quotes in licence break Python stdin.
Comment 9 Alberto Garcia 2014-03-04 04:29:32 PST
(In reply to comment #8)
> Created an attachment (id=225428) [details]
> build fix
> 
> Pretty quotes in licence break Python stdin.

Although that fixes the build issue, I think I'd rather fix the python
code so that it will not happen again in the future.
Comment 10 Patrick Welche 2014-03-04 15:21:46 PST
Why not do both? Apply this trivial patch now, so that webkitgtk 2.4 won't have the problem. Then ponder the better solution.
Comment 11 Vicente Olivert Riera 2015-09-09 06:30:24 PDT
Ping.

Is there any strong reason to not change those unicode double quotes by the ascii ones in the upstream code?
Comment 12 Alberto Garcia 2015-09-15 03:49:01 PDT
(In reply to comment #11)
> Is there any strong reason to not change those unicode double quotes
> by the ascii ones in the upstream code?

I'm not a big fan of sweeping bugs under the rug rather than fixing
them for good, but this has been certainly open for too long and the
change is really not important.

So I'm not against changing the quotes. The patch would need a
ChangeLog entry (you can use 'Tools/Scripts/prepare-ChangeLog').

Still I would like to know what's the problem exactly. How do you
reproduce this problem and how come it doesn't break for everyone?
Comment 13 Michael Catanzaro 2015-09-15 05:44:33 PDT
CC Darin: question for you at the bottom of this comment.

It's because cssmin.py uses the following shebang:

#!/usr/bin/python

For about four years now, this has declared that the script will run correctly under both python2 and python3 [1], but our script is only compatible with python2. (Yes, this was a compatibility break requiring changing all the shebangs of all python files everywhere. Note that /usr/bin/python is a symlink to /usr/bin/python3 on Arch, but to /usr/bin/python2 on other distros, so the bug presumably only occurs for Arch users. I have no comment on whether any of this was a good idea. ;)

The script must be modified to work with python3 in addition to python2 (the worst way to do this would be to ensure that no files it process include Unicode characters; the best way would be for someone who groks python to just fix the broken input stream), or the shebang must be changed to use /usr/bin/python2 instead of /usr/bin/python.

I'm quite impressed that this is apparently the only such compatibility issue in our build.

Warning: OS X used to not provide /usr/bin/python2. Presumably they do nowadays, since otherwise it will be a pain to fix the shebang, but maybe they don't. (I've CCed Darin to find out if we still need to support this case.) Without a /usr/bin/python2, you'd have to find the right python2 binary at the CMake step, then generate the python script using a CMake template file to substitute the right python binary into the shebang. In Autotools land you would do that using AM_PATH_PYTHON. Probably easier just to fix the script to work with python3.

[1] https://www.python.org/dev/peps/pep-0394/
Comment 14 Michael Catanzaro 2015-09-15 05:45:05 PDT
Forgot to CC Darin: question for you in the comment above.
Comment 15 Vicente Olivert Riera 2015-09-15 05:50:59 PDT
(In reply to comment #13)
> the best way would be for someone who groks python to
> just fix the broken input stream), or the shebang must be changed to use
> /usr/bin/python2 instead of /usr/bin/python.

I'd suggest #!/usr/bin/env python2
Comment 16 Michael Catanzaro 2015-09-15 08:03:26 PDT
(In reply to comment #15)
> (In reply to comment #13)
> > the best way would be for someone who groks python to
> > just fix the broken input stream), or the shebang must be changed to use
> > /usr/bin/python2 instead of /usr/bin/python.
> 
> I'd suggest #!/usr/bin/env python2

Note that various distributions (e.g. Debian) prohibit using env for installed scripts, as the script could then be run in an unexpected interpreter. For scripts only needed for the build, it's probably fine.

But again, we can only do that if all supported versions of OS X provide a python2 in $PATH.
Comment 17 Alberto Garcia 2015-09-15 08:17:53 PDT
(In reply to comment #13)

> It's because cssmin.py uses the following shebang:
>
> #!/usr/bin/python
>
> For about four years now, this has declared that the script will run
> correctly under both python2 and python3 [1], but our script is only
> compatible with python2.

Is it really like that?

These work fine in my computer:

$ python2 --version
Python 2.7.10
$ python2 ./Source/WebInspectorUI/Scripts/cssmin.py < ./Source/WebCore/xml/XMLViewer.css > XMLViewer.min.css

$ python3 --version
Python 3.4.3+
$ python3 ./Source/WebInspectorUI/Scripts/cssmin.py < ./Source/WebCore/xml/XMLViewer.css > XMLViewer.min.css
Comment 18 Michael Catanzaro 2015-09-15 09:36:02 PDT
Good find Berto. I tested Python 2.7.10 and 3.4.2. sys.stdout.encoding returns UTF-8 in both versions. But in Patrick's report above, it returns 646 in Python 3.3.4. I am presuming the version number is responsible for this difference, and not some other difference in configuration, but I can't find any documentation about this change, not even in the Python 3.4 release notes, so I'm not sure.

Anyway, the script *appears* to be incompatible with earlier versions of Python 3. I would therefore suggest RESOLVED -> OBSOLETE for this bug. If you're using older Python 3 you can continue to replace the quotes like you're already doing.
Comment 19 Alberto Garcia 2015-09-15 12:06:56 PDT
Created attachment 261222 [details]
Patch

(In reply to comment #18)
> Good find Berto. I tested Python 2.7.10 and
> 3.4.2. sys.stdout.encoding returns UTF-8 in both versions. But in
> Patrick's report above, it returns 646 in Python 3.3.4. I am
> presuming the version number is responsible for this difference

Nah, it's not a problem with the Python version, it's the locale of
the build environment:

$ LC_ALL=C python2 ./Source/WebInspectorUI/Scripts/cssmin.py < ./Source/WebCore/xml/XMLViewer.css > /dev/null

$ LC_ALL=C python3 ./Source/WebInspectorUI/Scripts/cssmin.py < ./Source/WebCore/xml/XMLViewer.css > /dev/null
Traceback (most recent call last):
  File "./Source/WebInspectorUI/Scripts/cssmin.py", line 46, in <module>
    sys.stdout.write(cssminify(sys.stdin.read()))
  File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 642: ordinal not in range(128)

So if the locale is set to ascii, then python 3 expects the standard
input to be ascii, and otherwise it complains.

We can tell people to use UTF-8 for building WebKit, but we can also
force the stdin enconding using this patch.
Comment 20 Alberto Garcia 2016-01-02 01:40:46 PST
Committed r194498: <http://trac.webkit.org/changeset/194498>