Migrate create_venv to Python

This commit started as a way to add a method for importing first
party (ChromeOS owned) Python modules inside virtualenvs.  That
feature would have added another Python script for create_venv to
spawn, and create_venv was already stretching the limits of a
comfortably maintainable shell script, so the best course of action
was to port all of create_venv to Python.

As a side effect of being written in Python, unit tests are now easier
to write, and the new cros_venv package has a number of unit tests.
Coupled with the expensive VM tests hosted elsewhere, this should
provide good coverage for virtualenv code, which will be crucial for
all of its users.

This Python rewrite is basically 100% backward compatible with the
previous version, with a few disclaimers:

- The locking mechanism changed from the flock utility to a Python
  implementation of a symlink-based lock.

  The flock utility was not used because it would be very awkward to
  call from Python.

  The new locking mechanism uses symlink locks instead of an flock(2)
  based mechanism because flock(2) has quite a few gotchas, is
  slightly trickier to implement, and extremely complicated to unit
  test (due to one of said gotchas).  As a bonus, the new symlink
  implementation includes the process id of the owning process, which
  may be useful.

- Recursive expansion of requirement files has been removed.  This
  feature is not used yet, and after careful consideration I decided
  that the feature was not worth it.

  The cost of removing the feature (the benefit of the feature) is
  that first party packages that depend on other first party
  packages (for example, Autotest depending on chromite) will need to
  keep their requirements file up to date with their dependency's
  requirements file.

  The cost of the feature is the maintenance of some dozen lines of
  ad-hoc text processing code and a negligible performance cost on
  every run.

  However, the main motivation is that pip's handling of recursive
  relative paths (and thus, our handling of recursive relative paths)
  is insane.  Relative paths are resolved relative to the current
  directory, regardless of the location of any recursive requirements
  files.  This is completely unmanageable, and it is better to throw
  out the entire feature than to rely on this insanity.

  Having to keep requirement files in sync across repositories is
  annoying, but quite simple, and any errors would be easy to debug.
  The subtleties of some twice-recursive relative path or some other
  insane situation that breaks due to some rename that's otherwise
  completely unrelated would be FUN.
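
The symlink-based lock described in the first item above can be
sketched roughly as follows.  This is a minimal illustration, not the
cros_venv implementation: the SymlinkLock class, its method names, and
the demo.lock path are all hypothetical.  It relies only on the fact
that creating a symlink fails if the path already exists, and it
stores the owning process id as the link target.

```python
import errno
import os
import tempfile


class SymlinkLock(object):
    """Hypothetical symlink-based lock; not the cros_venv code."""

    def __init__(self, path):
        self._path = path

    def acquire(self):
        """Try to take the lock; return True on success, False if held."""
        try:
            # Creating a symlink fails with EEXIST if the path exists,
            # so at most one process can create it.  The link target
            # records the owner's process id.
            os.symlink(str(os.getpid()), self._path)
        except OSError as e:
            if e.errno == errno.EEXIST:
                return False
            raise
        return True

    def owner_pid(self):
        """Return the pid stored in the lock, or None if unlocked."""
        try:
            return int(os.readlink(self._path))
        except OSError as e:
            if e.errno == errno.ENOENT:
                return None
            raise

    def release(self):
        """Drop the lock by removing the symlink."""
        os.unlink(self._path)


# Demo in a throwaway directory.
lock_path = os.path.join(tempfile.mkdtemp(), 'demo.lock')
lock = SymlinkLock(lock_path)
first = lock.acquire()                      # lock is free
second = SymlinkLock(lock_path).acquire()   # already held
owner = lock.owner_pid()
lock.release()
after_release = lock.owner_pid()
```

Because the lock is just a file system entry, a unit test can check
it with ordinary file operations, and a stale lock can be traced back
to a process via the stored pid.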

The original motivation for the migration to Python was to add a
method for importing first party (ChromeOS owned) Python modules
inside virtualenvs.  Such a feature was not added.  If such a feature
is still needed, it can be recovered from a patchset.

Currently, we set up the import path for first party modules by
patching sys.path in very creative ways.  This tends to cause
problems.

In the original virtualenv design, first party packages would be
handled by using pip -e to install the packages inside the virtualenv
in editable mode.  However, the implementation of this is wonky,
buggy, and generally considered a second-class citizen to installing
packages "properly".  The blocking issue is that pip -e must copy the
entire source tree internally, all for writing a metadata file and
what amounts to a symlink.  This takes a considerable amount of time
for large packages such as chromite (the .git directory is copied
also).  This is not trivially fixed upstream, and the editable
installs feature is not considered a top priority.

Thus, the reason for adding our own import path patching is to work
around pip while solving our existing woes.

A test concept of the feature was implemented using .pth files, which
are simple files that contain paths to add to Python's import path.

The problem with this implementation is that our code is run in a lot
of really weird configurations.  Having a single .pth file may not be
good enough.  While relative paths are supported (and sane), some of
the places our code is run do not use the same file system hierarchy.
Also, there is no simple way to handle recursive requirements.

Thus, going forward, the standard way to support first party imports
is a small bit of sys.path patching code in the respective package's
__init__.py file.  This enables the use of Python for handling weird
environments as needed; a little logic goes a long way.  A lot of
other things also just work due to __init__.py file semantics.  The
details are in the README added by this commit.

However, the migration to Python is still, I think, a good thing,
so I am keeping this commit, removing the .pth handling parts.

BUG=None
TEST=Run unit tests and virtualenv VM tests

Change-Id: Ibd817b889acb74d62d7ce7ebc9c017341e2d3580
Reviewed-on: https://chromium-review.googlesource.com/444924
Commit-Ready: Allen Li <ayatane@chromium.org>
Tested-by: Allen Li <ayatane@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>
13 files changed
README.md

infra_virtualenv README

This repository provides a common Python virtualenv interface that infra code (such as chromite) can depend on. At this point, it is experimental and not yet used in production.

Virtualenv users should create a requirements.txt file listing the packages that they need and use the wrapper scripts (described below) to create the virtualenv and run commands within it.

To add packages to this repository, run:

$ pip wheel -w path/to/pip_packages -r path/to/requirements.txt

Commit the changes and make a CL.

For example, for chromite, from within chromite/virtualenv, run:

$ pip wheel -w pip_packages -r requirements.txt

Wrapper scripts

create_venv creates or updates a virtualenv using a requirements.txt file.

$ create_venv .venv requirements.txt

To run the virtualenv python, use:

$ .venv/bin/python

NOTE: it is not generally safe to run the other scripts in .venv/bin due to the hard-coded paths in the virtualenv. Instead of running .venv/bin/pip for example, use .venv/bin/python -m pip.

Here’s a complete example:

$ echo mock==2.0.0 > requirements.txt
$ ./create_venv .venv requirements.txt
$ .venv/bin/python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.prefix  # This points to the virtualenv now
'/usr/local/google/home/ayatane/src/chromiumos/infra_virtualenv/.venv'
>>> import mock

Adding arbitrary directories to import path

NOTE: Do not use this for third party dependencies (stuff not owned by ChromiumOS)! This should only be used to set up imports for stuff we own. For example, importing python-MySQL should NOT use this, but importing chromite from Autotest may use this.

This should be handled by the minimum amount of code in the package's __init__.py file.

Example:

"""Autotest package."""

import sys

# Use the minimum amount of logic to find the path to add
_chromite_parent = 'site-packages'
sys.path.append(_chromite_parent)

A solid understanding of the Python import system is recommended (link is for Python 3, but is informative).

In brief, __init__.py is executed whenever the package is imported. The package is imported before any submodule or subpackage is imported. The package is only imported once per Python process; future imports get the “cached” “singleton” package object. Thus, __init__.py will modify sys.path exactly once and is guaranteed to be run before anything in that package is used.
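
The import semantics above can be checked with a small self-contained sketch. It builds a throwaway package in a temporary directory; the names demo_pkg, sub and first_party are hypothetical and exist only for this illustration. The generated __init__.py locates the directory relative to its own file rather than the current directory, which is an assumption about what "the minimum amount of logic" might look like, not a requirement of this repository.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Hypothetical layout, created in a temp dir for the demo:
#   root/demo_pkg/__init__.py   <- patches sys.path
#   root/demo_pkg/sub.py
#   root/first_party/           <- directory the package adds
root = os.path.realpath(tempfile.mkdtemp())
pkg_dir = os.path.join(root, 'demo_pkg')
first_party = os.path.join(root, 'first_party')
os.makedirs(pkg_dir)
os.makedirs(first_party)

with open(os.path.join(pkg_dir, '__init__.py'), 'w') as f:
    f.write(textwrap.dedent('''\
        """Demo package."""
        import os
        import sys

        # Locate the directory relative to this file, not the cwd.
        _first_party = os.path.join(
            os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
            'first_party')
        sys.path.append(_first_party)
    '''))
with open(os.path.join(pkg_dir, 'sub.py'), 'w') as f:
    f.write('VALUE = 1\n')

sys.path.insert(0, root)
importlib.invalidate_caches()

# Importing a submodule imports the package first; the later imports
# reuse the cached package object and do not rerun __init__.py.
import demo_pkg.sub
import demo_pkg
from demo_pkg import sub

# How many times the package patched sys.path.
patched_count = sys.path.count(first_party)
```

After the three imports, patched_count is 1, which matches the claim that __init__.py runs once per process and before any submodule is used.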

Background for __init__.py recommended usage

(Updated on 2017-02-21)

Previously, we set up the import path for first party modules by patching sys.path in very creative ways. This tends to cause problems.

In the original virtualenv design, first party packages would be handled by using pip -e to install the packages inside the virtualenv in editable mode. However, the implementation of this is wonky, buggy, and generally considered a second-class citizen to installing packages “properly”. The blocking issue is that pip -e must copy the entire source tree internally, all for writing a metadata file and what amounts to a symlink. This takes a considerable amount of time for large packages such as chromite (the .git directory is copied also). This is not trivially fixed upstream, and the editable installs feature is not considered a top priority.

Thus, the reason for adding our own import path patching is to work around pip while solving our existing woes.

A test concept of the feature was implemented using .pth files, which are simple files that contain paths to add to Python’s import path.
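
For readers unfamiliar with .pth files, here is a minimal sketch of the mechanism using only the standard library. It is not the test concept itself; the directory and file names (site_dir, first_party, first_party.pth) are hypothetical. Python normally processes .pth files found in site directories at startup; the sketch calls site.addsitedir() to trigger the same processing on a temporary directory.

```python
import os
import site
import sys
import tempfile

# Hypothetical layout, created in a temp dir for the demo:
#   root/site_dir/first_party.pth   <- lists one relative path
#   root/first_party/               <- directory to expose to imports
root = os.path.realpath(tempfile.mkdtemp())
site_dir = os.path.join(root, 'site_dir')
first_party = os.path.join(root, 'first_party')
os.makedirs(site_dir)
os.makedirs(first_party)

# Each line of a .pth file is a path; relative paths are resolved
# against the directory that contains the .pth file.
with open(os.path.join(site_dir, 'first_party.pth'), 'w') as f:
    f.write('../first_party\n')

# addsitedir() appends site_dir to sys.path and processes the .pth
# files inside it, adding each listed path that exists.
site.addsitedir(site_dir)

pth_added = first_party in sys.path
```

The path is fixed text in a file, so there is no place to put logic for environments with a different file system hierarchy.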

The problem with this implementation is that our code is run in a lot of really weird configurations. Having a single .pth file may not be good enough. While relative paths are supported (and sane), some of the places our code is run do not use the same file system hierarchy. Also, there is no simple way to handle recursive requirements.

Thus, going forward, the standard way to support first party imports is a small bit of sys.path patching code in the respective package’s __init__.py file. This enables the use of Python's full power for handling weird environments as needed; a little logic goes a long way. A lot of other things also just work due to __init__.py file semantics: for example, recursive requirements.