Bug 192909

Summary: webkitpy: Autoinstall package URLs have changed
Product: WebKit Reporter: Simon Fraser (smfr) <simon.fraser>
Component: Tools / TestsAssignee: Jonathan Bedard <jbedard>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, bburg, commit-queue, dbates, ews-watchlist, glenn, jbedard, lforschler, simon.fraser, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=192954
Attachments:
Description Flags
Patch
none
Patch
none
Patch ap: review+, commit-queue: commit-queue-

Description Simon Fraser (smfr) 2018-12-19 18:35:27 PST
_install_selenium() calls get_latest_pypi_url() which uses an old URL: https://pypi.python.org/pypi/

 128$ $ curl https://pypi.python.org/pypi/selenium/json
<html><head><title>301 Moved Permanently</title></head><body><center><h1>301 Moved Permanently</h1></center></body></html>[smfr@simfra /Volumes/Data/Development/system/webkit/OpenSource]

and urllib2.urlopen(json_url) just hangs.

The right URL seems to be https://pypi.org/pypi/%s/json which works with curl, but if I change the script to this, urllib2.urlopen(json_url) still hangs.

We also reference pypi.python.org in lots of other places.
Comment 1 Radar WebKit Bug Importer 2018-12-19 18:36:24 PST
<rdar://problem/46860359>
Comment 2 Radar WebKit Bug Importer 2018-12-19 18:36:24 PST
<rdar://problem/46860360>
Comment 3 Simon Fraser (smfr) 2018-12-19 18:37:55 PST
This means no-one can run webkitpy tests if they haven't before.
Comment 4 Jonathan Bedard 2018-12-20 10:49:23 PST
Yikes....Looking into this, looks like there are actually 3 problems:

1) As Simon points out, pypi.python.org is defunct, it's now pypi.org
2) Something weird is going on with both urllib2 and requests, those libraries have a hard time communicating with pypi.org, even though curl can do so just fine
3) (Unique to Selenium) Selenium tries to communicate with pypi before it checks if it's actually installed. Which is a bit crazy since if we have a good version of Selenium, we can just skip all the networks stuff.
Comment 5 Jonathan Bedard 2018-12-20 11:19:46 PST
(In reply to Jonathan Bedard from comment #4)
> Yikes....Looking into this, looks like there are actually 3 problems:
> 
> 1) As Simon points out, pypi.python.org is defunct, it's now pypi.org
> 2) Something weird is going on with both urllib2 and requests, those
> libraries have a hard time communicating with pypi.org, even though curl can
> do so just fine

Turns out, this is actually partially #1 (all of our absolute package URLs are old) and partially the fact that our Selenium didn't define a timeout.

> 3) (Unique to Selenium) Selenium tries to communicate with pypi before it
> checks if it's actually installed. Which is a bit crazy since if we have a
> good version of Selenium, we can just skip all the networks stuff.
Comment 6 Jonathan Bedard 2018-12-20 11:28:45 PST
Created attachment 357837 [details]
Patch
Comment 7 Jonathan Bedard 2018-12-20 11:44:02 PST
Created attachment 357843 [details]
Patch
Comment 8 Jonathan Bedard 2018-12-20 11:59:47 PST
Created attachment 357846 [details]
Patch
Comment 9 Jonathan Bedard 2018-12-20 12:53:06 PST
Just realized that the only reason our bots haven't hit this is that they have cached versions of the libraries. This patch changes the urls in the cache, so this is almost certainly going to break any EWS machine it runs on until it lands.
Comment 10 Alexey Proskuryakov 2018-12-20 13:03:31 PST
Comment on attachment 357846 [details]
Patch

rs=me
Comment 11 WebKit Commit Bot 2018-12-20 13:13:49 PST
Comment on attachment 357846 [details]
Patch

Rejecting attachment 357846 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-02', 'validate-changelog', '--check-oops', '--non-interactive', 357846, '--port=mac']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

No handlers could be found for logger "webkitpy.common.system.autoinstall"
Traceback (most recent call last):
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch", line 44, in <module>
    from webkitpy.tool.main import WebKitPatch
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/main.py", line 41, in <module>
    from webkitpy.common.net.statusserver import StatusServer
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/statusserver.py", line 33, in <module>
    from webkitpy.common.net.bugzilla.attachment import Attachment
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/bugzilla/__init__.py", line 4, in <module>
    from .bugzilla import Bugzilla
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/bugzilla/bugzilla.py", line 49, in <module>
    from webkitpy.common.net.credentials import Credentials
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/credentials.py", line 40, in <module>
    from webkitpy.thirdparty.autoinstalled import keyring
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/__init__.py", line 100, in find_module
    self._install_keyring()
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/__init__.py", line 124, in _install_keyring
    "keyring-7.3.1/keyring")
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/__init__.py", line 257, in _install
    installer.install(url=url, url_subpath=url_subpath, target_name=target_name)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/system/autoinstall.py", line 517, in install
    raise Exception(message)
Exception: Error auto-installing the keyring package to:
 "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/keyring"
 --> Inner message: Could not download Python modules from URL "None".
 Make sure you are connected to the internet.
 You must be connected to the internet when downloading needed modules for the first time.
 --> Inner message: <urlopen error [SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:590)>

Full output: https://webkit-queues.webkit.org/results/10495748
Comment 12 Daniel Bates 2018-12-20 13:27:23 PST
Comment on attachment 357846 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=357846&action=review

> Tools/ChangeLog:11
> +        We should check for Selenium before asking PyPi for the latest version.

Why?

> Tools/Scripts/webkitpy/thirdparty/__init__.py:237
> +        if os.path.isfile(os.path.join(_AUTOINSTALLED_DIR, 'selenium', '__init__.py')):
> +            import selenium.webdriver
> +            if AutoinstallImportHook.greater_than_equal_to_version(minimum_version, selenium.webdriver.__version__):
> +                return
> +
>          try:
>              url, url_subpath = self.get_latest_pypi_url('selenium')
>          except urllib2.URLError:
> -            minimum_version = '3.5.0'
> -            if os.path.isfile(os.path.join(_AUTOINSTALLED_DIR, 'selenium', '__init__.py')):
> -                import selenium.webdriver
> -                if AutoinstallImportHook.greater_than_equal_to_version(minimum_version, selenium.webdriver.__version__):
> -                    sys.stderr.write('\nFailed to find latest selenium, falling back to existing {} version\n'.format(selenium.webdriver.__version__))
> -                    return
> -
>              # URL for installing the minimum required version.
> -            url = 'https://pypi.python.org/packages/ac/d7/1928416439d066c60f26c87a8d1b78a8edd64c7d05a0aa917fa97a8ee02d/selenium-3.5.0.tar.gz#986702fdd0e2aec6a4eda134678b8b3f'
> +            url = 'https://files.pythonhosted.org/packages/ac/d7/1928416439d066c60f26c87a8d1b78a8edd64c7d05a0aa917fa97a8ee02d/selenium-3.5.0.tar.gz'
>              url_subpath = 'selenium-{}/selenium'.format(minimum_version)
>              sys.stderr.write('\nFailed to find latest selenium, falling back to minimum {} version\n'.format(minimum_version))
>          self._install(url=url, url_subpath=url_subpath)

...now we never auto update to the latest Selenium. Disregarding a person that manually updates into _AUTOINSTALLED_DIR (do people do that or expected to do that?), we are forever stuck with 3.5.0.
Comment 13 Daniel Bates 2018-12-20 13:28:40 PST
Comment on attachment 357846 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=357846&action=review

> Tools/Scripts/webkitpy/thirdparty/__init__.py:222
> +
> +        installer = AutoInstaller(prepend_to_search_path=True, target_dir=self._fs.join(_AUTOINSTALLED_DIR, "urllib3"))
> +        installer.install(url="https://files.pythonhosted.org/packages/b1/53/37d82ab391393565f2f831b8eedbffd57db5a718216f82f1a8b4d381a1c1/urllib3-1.24.1.tar.gz", url_subpath="urllib3-1.24.1")

Weird that this _install_selenium function installs URLlib. Seems like a violation of separation of responsibilities.
Comment 14 Jonathan Bedard 2018-12-20 13:37:10 PST
(In reply to Daniel Bates from comment #12)
> Comment on attachment 357846 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=357846&action=review
> 
> > Tools/ChangeLog:11
> > +        We should check for Selenium before asking PyPi for the latest version.
> 
> Why?

Because otherwise every instantiation of test-webkitpy would need to check for a newer version of Selenium. Which 1) is not how the other libraries work and 2) would mean that any scripts using Selenium (namely test-webkitpy) would need to check for new versions of Selenium even if a version was already installed. Checking for the latest version is a network operation, there is no reason a script needs to do this ever time.

> 
> > Tools/Scripts/webkitpy/thirdparty/__init__.py:237
> > +        if os.path.isfile(os.path.join(_AUTOINSTALLED_DIR, 'selenium', '__init__.py')):
> > +            import selenium.webdriver
> > +            if AutoinstallImportHook.greater_than_equal_to_version(minimum_version, selenium.webdriver.__version__):
> > +                return
> > +
> >          try:
> >              url, url_subpath = self.get_latest_pypi_url('selenium')
> >          except urllib2.URLError:
> > -            minimum_version = '3.5.0'
> > -            if os.path.isfile(os.path.join(_AUTOINSTALLED_DIR, 'selenium', '__init__.py')):
> > -                import selenium.webdriver
> > -                if AutoinstallImportHook.greater_than_equal_to_version(minimum_version, selenium.webdriver.__version__):
> > -                    sys.stderr.write('\nFailed to find latest selenium, falling back to existing {} version\n'.format(selenium.webdriver.__version__))
> > -                    return
> > -
> >              # URL for installing the minimum required version.
> > -            url = 'https://pypi.python.org/packages/ac/d7/1928416439d066c60f26c87a8d1b78a8edd64c7d05a0aa917fa97a8ee02d/selenium-3.5.0.tar.gz#986702fdd0e2aec6a4eda134678b8b3f'
> > +            url = 'https://files.pythonhosted.org/packages/ac/d7/1928416439d066c60f26c87a8d1b78a8edd64c7d05a0aa917fa97a8ee02d/selenium-3.5.0.tar.gz'
> >              url_subpath = 'selenium-{}/selenium'.format(minimum_version)
> >              sys.stderr.write('\nFailed to find latest selenium, falling back to minimum {} version\n'.format(minimum_version))
> >          self._install(url=url, url_subpath=url_subpath)
> 
> ...now we never auto update to the latest Selenium. Disregarding a person
> that manually updates into _AUTOINSTALLED_DIR (do people do that or expected
> to do that?), we are forever stuck with 3.5.0.

That's the point. More accurately, we never update farther than the minimum supported version. If someone adds a feature which requires a newer version of Selenium, we can bump the minimum version.
Comment 15 Jonathan Bedard 2018-12-20 13:39:00 PST
(In reply to Daniel Bates from comment #13)
> Comment on attachment 357846 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=357846&action=review
> 
> > Tools/Scripts/webkitpy/thirdparty/__init__.py:222
> > +
> > +        installer = AutoInstaller(prepend_to_search_path=True, target_dir=self._fs.join(_AUTOINSTALLED_DIR, "urllib3"))
> > +        installer.install(url="https://files.pythonhosted.org/packages/b1/53/37d82ab391393565f2f831b8eedbffd57db5a718216f82f1a8b4d381a1c1/urllib3-1.24.1.tar.gz", url_subpath="urllib3-1.24.1")
> 
> Weird that this _install_selenium function installs URLlib. Seems like a
> violation of separation of responsibilities.

_install_buildbot does the exact same thing with Jinja, SQLAlchemy and Twisted. Selenium needs urllib3, so we need to make sure it's installed.
Comment 16 Jonathan Bedard 2018-12-20 13:41:00 PST
(In reply to WebKit Commit Bot from comment #11)
> Comment on attachment 357846 [details]
> Patch
> 
> Rejecting attachment 357846 [details] from commit-queue.
> 
> Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch',
> '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-02',
> 'validate-changelog', '--check-oops', '--non-interactive', 357846,
> '--port=mac']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit
> 
> No handlers could be found for logger "webkitpy.common.system.autoinstall"
> Traceback (most recent call last):
>   File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch", line 44, in
> <module>
>     from webkitpy.tool.main import WebKitPatch
>   File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/main.py", line
> 41, in <module>
>     from webkitpy.common.net.statusserver import StatusServer
>   File
> "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/statusserver.py",
> line 33, in <module>
>     from webkitpy.common.net.bugzilla.attachment import Attachment
>   File
> "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/bugzilla/
> __init__.py", line 4, in <module>
>     from .bugzilla import Bugzilla
>   File
> "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/bugzilla/
> bugzilla.py", line 49, in <module>
>     from webkitpy.common.net.credentials import Credentials
>   File
> "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/credentials.py",
> line 40, in <module>
>     from webkitpy.thirdparty.autoinstalled import keyring
>   File
> "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/__init__.py",
> line 100, in find_module
>     self._install_keyring()
>   File
> "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/__init__.py",
> line 124, in _install_keyring
>     "keyring-7.3.1/keyring")
>   File
> "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/__init__.py",
> line 257, in _install
>     installer.install(url=url, url_subpath=url_subpath,
> target_name=target_name)
>   File
> "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/system/autoinstall.
> py", line 517, in install
>     raise Exception(message)
> Exception: Error auto-installing the keyring package to:
>  "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/
> keyring"
>  --> Inner message: Could not download Python modules from URL "None".
>  Make sure you are connected to the internet.
>  You must be connected to the internet when downloading needed modules for
> the first time.
>  --> Inner message: <urlopen error [SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1
> alert protocol version (_ssl.c:590)>
> 
> Full output: https://webkit-queues.webkit.org/results/10495748

As I mentioned in Comment 9, this patch updated the auto installed cache on the EWS machine and now that machine is broken until this patch lands.
Comment 17 Jonathan Bedard 2018-12-20 13:45:14 PST
Committed r239465: <https://trac.webkit.org/changeset/239465>
Comment 18 Jonathan Bedard 2018-12-20 13:51:36 PST
(In reply to Daniel Bates from comment #12)
> ...
> 
> ...now we never auto update to the latest Selenium. Disregarding a person
> that manually updates into _AUTOINSTALLED_DIR (do people do that or expected
> to do that?), we are forever stuck with 3.5.0.

Since this change was about preventing a serious problem where test-webkitpy would not work in some circumstances, I think discussing the precise behavior of Selenium's auto-install should be done here <https://bugs.webkit.org/show_bug.cgi?id=192954>.

I'm not necessarily against auto-updating Selenium, it's just that the previous auto-updating behavior was actively harmful to engineering workflows.
Comment 19 BJ Burg 2018-12-20 14:43:43 PST
I've been bitten by auto-updated PyPI packages before, in particular pytest, urllib3, and and selenium. Is there some expected benefit from dependencies randomly changing out from under us?
Comment 20 Jonathan Bedard 2018-12-20 15:48:25 PST
(In reply to Brian Burg from comment #19)
> I've been bitten by auto-updated PyPI packages before, in particular pytest,
> urllib3, and and selenium. Is there some expected benefit from dependencies
> randomly changing out from under us?

With the exception of Selenium, none of these packages actually auto-update, they are just auto-installed. Basically, this whole part of webkitpy is a re-implementation of virtualenv.
Comment 21 Jonathan Bedard 2018-12-20 15:56:17 PST
For anyone stumbling upon this bug in the future:

BE VERY CAREFUL MAKING CHANGES LIKE THIS

As it turns out, all of this auto-install stuff is basically a house of cards in our infrastructure. This change brought down commit queue, not because the previous code was working (it wasn't) but because the machines on commit queue had cached the previous version of the libraries and the version of TLS used by urllib2 on Sierra is rejected server-side. We manually repaired the bots, but further changes to the auto-installed urls would bring down commit queue.