The early days of Recursion saw a small group of talented people working quickly to deliver big ideas. When you have to move fast, your choices are driven by the need for speed rather than by long-term stability. One consequence is dependency hell, particularly in our Python stack for legacy packages. To Recursion’s credit, we are now creating smaller, leaner packages with dependency pins that are broad for packages and exact for services. However, we had some workhorse libraries that could not easily be decommissioned or refactored.
Conda has a strong foothold in Python stacks for both data science and life science, which makes it a natural fit for managing Recursion’s Python packages and environments. As demands for new features grew, some of our workhorse environments took on a life of their own. We reached a point where adding a new dependency meant at least a half-day excursion fighting dependency pins and Conda’s somewhat opaque resolution output.
Here’s an example that a full-stack intern came across when making a single-line change to add the recursion_sec package to this environment file:
In total, the environment file included roughly 60 different packages ranging from data science utilities to web framework tooling. Conda output for this environment was over 5000 lines with a lot of red herring messages. For example, this snippet shows two package dependencies that share the exact same requirement for jemalloc, yet Conda reports them as conflicting:
In this case, the problem was a conflict with a single pinned internal package, but Conda buried this information among irrelevant cruft. The worst part is that the offending library, recursion_log, did not even show up as a package conflict itself. The closest we get to the problem is this snippet, where our logging library shows up as a dependency of colorama when Conda falsely claims the colorama dependency is in conflict:
Our recursion_sec library wanted a newer version of our recursion_log library, but Conda reported nearly everything about our environment except this. If we use a tool like pip-compile from the pip-tools package instead, the problem becomes very clear:
What’s worse than a single developer fighting for half a day to adjust a Python environment is that anyone who wanted to use the new environment would run conda env create and take a half-hour coffee break while waiting for Conda to resolve it. One of the mantras that our CTO, Ben Mabey, has been repeating lately is “slow down, so we can move fast.” This is in direct contrast to the “hurry up and wait” mentality, which led to the problem we are now dissecting.
We decided to address this problem once we could dedicate more resources to DevEx (developer experience). With additional support, fixing this became cost effective. We worked with teams on education and put tooling in place to create Python environments that follow good design principles. You can read more about our guiding principles in our colleague Eric Hurst’s personal blog post. The experience also raised the question: is this the best a Python resolver can do?
First, we gathered all of the major players to write a benchmark. We also wanted to explore the ergonomics of tools available in the Python space. We prioritized porting our internal packages to a private PyPI server. Porting was fairly simple since all of our internal packages are pure Python and do not require linking to system-level libraries, an area where Conda excels at making things easy for the user.
https://github.com/jazzband/pip-tools
Pip-tools provides the pip-compile and pip-sync commands. We mostly explored pip-compile, which compiles a Python package spec file (*.txt) from a source spec file that lists dependencies (either a setup.py or a *.in file)[1]. This can be used to generate “locked” requirements files, which can then be installed using pip. Most of the tools below offer a similar feature.
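As a rough illustration of that workflow, the sketch below writes a small requirements.in and builds the pip-compile call that would lock it into a requirements.txt. The package names and helper functions are placeholders invented for this example, and pip-compile is only invoked if it is actually installed.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def write_requirements_in(directory, requirements):
    """Write the loose, hand-edited source spec (requirements.in)."""
    source = Path(directory) / "requirements.in"
    source.write_text("\n".join(requirements) + "\n")
    return source


def pip_compile_command(source):
    """Build the pip-compile call that locks `source` into a *.txt file."""
    target = source.with_suffix(".txt")
    return ["pip-compile", "--output-file", str(target), str(source)]


def compile_lockfile(source):
    """Run pip-compile when available; return the lockfile path, else None."""
    if shutil.which("pip-compile") is None:
        return None  # pip-tools is not installed here
    subprocess.run(pip_compile_command(source), check=True)
    return source.with_suffix(".txt")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Placeholder requirements, chosen only for illustration.
        spec = write_requirements_in(workdir, ["requests>=2.20", "numpy"])
        print(" ".join(pip_compile_command(spec)))
```

The resulting requirements.txt can then be installed with pip install -r requirements.txt.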
https://pipenv.pypa.io/en/latest/
Pipenv was once touted as the officially recognized way to manage environments within the Python community, but the project has suffered a few setbacks. Pipenv uses a relatively new standard input file, the Pipfile[2]. A Pipfile is a TOML file that lets users cleanly declare separate PyPI repositories, default dependencies, and dev dependencies.
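To make the three Pipfile sections concrete, here is a minimal sketch that parses an example Pipfile with Python's TOML reader. The second source URL and every package name are placeholders, not taken from our environments, and the summarize_pipfile helper is hypothetical.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older interpreters

# Illustrative Pipfile; the "internal" source and all packages are placeholders.
EXAMPLE_PIPFILE = """
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[[source]]
name = "internal"
url = "https://pypi.example.com/simple"
verify_ssl = true

[packages]
requests = ">=2.20"
numpy = "*"

[dev-packages]
pytest = "*"
"""


def summarize_pipfile(text):
    """Return the repositories, default deps, and dev deps a Pipfile declares."""
    data = tomllib.loads(text)
    return {
        "sources": [source["name"] for source in data.get("source", [])],
        "default": sorted(data.get("packages", {})),
        "dev": sorted(data.get("dev-packages", {})),
    }


if __name__ == "__main__":
    print(summarize_pipfile(EXAMPLE_PIPFILE))
```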
Poetry is a dependency and virtual environment management tool for Python. Poetry provides isolated environments, either by creating a Python virtual environment during the package install process or by detecting if it’s being run in an existing Python virtual environment and installing into that environment[3]. Poetry can also be used to publish packages directly where other tools would need something like twine[4], Conda, or flit[5].
https://docs.conda.io/en/latest/
Conda is a dependency and environment management tool developed and maintained by Anaconda Inc. Conda is one of the few tools on this list that can also act as a Python version manager: without any additional tools, it can create and manage virtual environments with different Python versions. Conda is the only tool on this list that also has its own package repository format. A Conda environment can install packages from its own repository format or from a PyPI repository. We include both examples in our benchmark results. To the best of our knowledge, there is no easy way of generating a “lockfile” for Conda. We simulate this by generating a virtual environment and then dumping the resulting environment to a YAML file.
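One way to approximate that simulated lockfile is sketched below: create the environment from a spec file, then export the resolved environment to YAML. This is an assumption about how such a step could look rather than the exact commands used in our benchmark; the helper names, environment name, and file paths are placeholders.

```python
import shutil
import subprocess
from pathlib import Path


def conda_lock_commands(env_name, spec_file):
    """Build the create and export commands for a simulated Conda lockfile."""
    create = ["conda", "env", "create", "--name", env_name, "--file", str(spec_file)]
    export = ["conda", "env", "export", "--name", env_name]
    return create, export


def simulate_lockfile(env_name, spec_file, lock_file):
    """Resolve the environment, then dump the result to `lock_file` as YAML."""
    if shutil.which("conda") is None:
        return False  # Conda is not installed here
    create, export = conda_lock_commands(env_name, spec_file)
    subprocess.run(create, check=True)
    exported = subprocess.run(export, check=True, capture_output=True, text=True)
    Path(lock_file).write_text(exported.stdout)
    return True


if __name__ == "__main__":
    # Placeholder names; only prints the commands, nothing is installed.
    for command in conda_lock_commands("demo-env", "environment.yml"):
        print(" ".join(command))
```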
https://github.com/mamba-org/mamba
Mamba is a C++ re-implementation of Conda that boasts raw performance boosts and parallelism, leading to a much faster experience. Mamba is a newer project and part of a wider set of tools aimed at improving the Conda user experience. At the time of writing, it does not offer all of Conda’s functionality one-for-one, but the project is evolving quickly. Similar to Conda, Mamba can install packages from either a Conda or a PyPI repository.
https://docs.python.org/3/library/venv.html
Python has shipped with its own virtual environment management package, Venv[6], since Python 3.3. One Venv limitation is that it is inherently tied to a Python installation, so any instance can only work with that particular Python installation. This does not prevent a user from installing separate versions of Python on their machine, but it does mean that each one will have to use its own Venv to manage virtual environments tied to that Python interpreter.
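The coupling to a single interpreter is easy to see with the standard library's venv module. The following minimal sketch creates an environment in a temporary directory and reads back the pyvenv.cfg file that records which Python installation it belongs to; the helper names are ours, and pip is left out so the example runs quickly and offline.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_env(target):
    """Create a virtual environment tied to the interpreter running this code."""
    venv.create(target, with_pip=False)  # skip pip to stay fast and offline
    return Path(target) / "pyvenv.cfg"


def read_cfg(cfg_path):
    """Parse the 'key = value' lines of pyvenv.cfg into a dict."""
    pairs = {}
    for line in cfg_path.read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            pairs[key.strip()] = value.strip()
    return pairs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        cfg = read_cfg(create_env(Path(workdir) / "env"))
        print("interpreter running this script:", sys.version.split()[0])
        print("version recorded by the environment:", cfg.get("version"))
        print("home of the parent installation:", cfg.get("home"))
```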
https://virtualenv.pypa.io/en/latest/
Virtualenv is an alternative to Venv that predates it. Because it is separate from the core Python library, it can be iterated on more rapidly. Virtualenv has a few extra bells and whistles, and it integrates well with Pyenv.
https://github.com/pyenv/pyenv
Pyenv is both a Python version manager and, with the help of virtualenv[7] or the built-in Venv, a virtual environment manager. Pyenv is installed independently of the Python on your system, so it can manage multiple versions of Python and lets users switch between them easily. It also supports Python implementations and distributions beyond CPython, such as PyPy, Anaconda, IronPython, and Jython.
We are excited to share with the community a transparent, reproducible, and hopefully unbiased utility for comparing Python environment resolution tooling: the Python Environment Resolution Profiler, or PERP for short. Alongside this project, we are also open-sourcing a separate repository that contains a few benchmarks to be used in conjunction with PERP. These benchmarks are specifications necessary to create the same Python environment using each of the tools listed above. We would be remiss not to mention a tool called DepHell, which made translating between toolsets less cumbersome. The benchmarks cover a few of our standard Python use cases:
The PERP code runs everything in Docker containers and will generate lockfiles for tools that support it.
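For readers who want a feel for what such a profiler measures, here is a minimal timing sketch. It is not PERP's implementation: the function names are hypothetical, the image name in the usage comment is a placeholder, and the demo times a trivial Python command so that it runs without Docker or any resolver installed.

```python
import subprocess
import sys
import time


def in_container(image, command):
    """Wrap a command so it runs in a throwaway Docker container."""
    return ["docker", "run", "--rm", image, *command]


def time_command(command):
    """Run a command once and report wall-clock seconds and exit status."""
    start = time.perf_counter()
    completed = subprocess.run(command, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return {"seconds": elapsed, "returncode": completed.returncode}


if __name__ == "__main__":
    # Stand-in workload; a real run might instead time something like
    # in_container("python:3.8", ["pip", "install", "-r", "requirements.txt"]).
    result = time_command([sys.executable, "-c", "pass"])
    print(f"{result['seconds']:.3f}s, exit code {result['returncode']}")
```

Running each tool in a fresh container keeps cached packages from one run from flattering the next, which matters when the quantity being compared is resolution time.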
Below are the results of our three benchmark cases run against both Python 3.7 and 3.8. All of our benchmarks were run on n1-standard-16 machines in Google Cloud Platform using a runtime environment provisioned with 4 cores and 16 GB memory.
One thing we learned through this process is that raw metrics are not the only thing to consider when choosing your tooling. It was instructive to deal with errors in each tool and to see what each one outputs when package requirements conflict. Additionally, several features made certain tools more attractive for our use case: lockfile support, the ability to resolve environment variables (important when dealing with private PyPI servers), and the ability to split main and dev requirements. Stay tuned to find out which tool we picked and why!
Our objective in sharing this profiler and its associated benchmarks is multi-faceted. First and foremost, we hope to provide an educational artifact that demonstrates by example what a basic workflow looks like in each of these tools and how they map to one another. We also want to motivate maintainers and developers to improve performance by offering a reproducible benchmark that compares similar tools fairly. Lastly, we see this as just a starting point and are asking the wider community for ways to improve our methodology. If this is all fascinating to you, we are hiring!
Ayla Khan is a Senior Software Engineer and Dan Maljovec is an Infrastructure Engineer at Recursion.