Virtual environments in Python
“Python applications will often use packages and modules that don’t come as part of the standard library. Applications will sometimes need a specific version of a library, because the application may require that a particular bug has been fixed or the application may be written using an obsolete version of the library’s interface.
This means it may not be possible for one Python installation to meet the requirements of every application. If application A needs version 1.0 of a particular module but application B needs version 2.0, then the requirements are in conflict and installing either version 1.0 or 2.0 will leave one application unable to run.
The solution for this problem is to create a virtual environment, a self-contained directory tree that contains a Python installation for a particular version of Python, plus a number of additional packages.” - from the Python Tutorial
In this tutorial we discuss two major methods for creating and managing virtual environments in Python: venv and Miniconda.
Venv
To create a virtual environment, run the venv module as a script with the directory path. For example, the following command uses the venv module (-m venv) to create tutorial-env in the current directory:
python3 -m venv tutorial-env
This creates the tutorial-env directory if it does not exist, along with directories inside it containing a copy of (or a symlink to) the Python interpreter and various supporting files. Once we’ve created a virtual environment, we can activate it on Linux or macOS with:
source tutorial-env/bin/activate
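One way to confirm that activation worked is to ask the interpreter itself. The following is a minimal sketch; the function name in_virtualenv is our own and not part of any library. It relies on the fact that, inside an environment created with venv, sys.prefix points at the environment directory while sys.base_prefix points at the Python installation the environment was created from.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix is the environment directory and
    # sys.base_prefix is the installation it was created from.
    # Outside a venv, the two values are equal.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("inside a virtual environment:", in_virtualenv())
    print("sys.prefix:", sys.prefix)
```

Note that this check applies to environments created with venv. A Conda environment is a separate Python installation, so both values are the same there.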
After activating the environment, we can use pip to install the required packages:
pip install <package1> <package2>==<version> "<package3>>=<version>" ...
For example, the following Bash script creates tutorial-env if it does not exist, and installs redis and gitpython after activation:
#!/bin/bash
if [ ! -d tutorial-env ]; then
    python3 -m venv tutorial-env
fi
source tutorial-env/bin/activate
pip install redis gitpython
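The "create only if missing" step of this script can also be done from Python with the standard library's venv module. The sketch below covers only the creation step, not activation or package installation. The helper name ensure_venv and the use of pyvenv.cfg as an "already created" marker are our own choices for illustration.

```python
import venv
from pathlib import Path


def ensure_venv(env_dir: str, with_pip: bool = True) -> bool:
    """Create a virtual environment at env_dir unless one already exists.

    Returns True if the environment was created, False if it was already there.
    """
    env_path = Path(env_dir)
    # venv writes a pyvenv.cfg file at the root of every environment it creates,
    # so we treat its presence as the sign that the environment exists.
    if (env_path / "pyvenv.cfg").is_file():
        return False
    venv.create(env_path, with_pip=with_pip)
    return True
```

Calling ensure_venv("tutorial-env") mirrors the if block of the Bash script: the first call creates the environment and returns True, and later calls return False without touching it.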
To uninstall packages, use:
pip uninstall <package1> <package2> ...
To keep a list of the packages installed in the env (commonly called requirements.txt), use:
pip freeze > requirements.txt
To install packages from a requirements.txt file, use:
pip install -r requirements.txt
We can use the deactivate command to deactivate an env, and rm -rf <env_path> to remove the env.
Miniconda
Conda is a strong choice for creating virtual environments. It is an open source package and environment management system that quickly installs, runs, and updates packages and their dependencies. Miniconda is a minimal installer that includes Conda. To start using Conda, follow the official installation instructions to install Miniconda (or Anaconda, if you want most of the scientific packages preinstalled) on your operating system.
Once Miniconda is installed, use conda init <shell-name> to initialize Conda, and run conda config --set auto_activate_base false to stop the base environment from being activated automatically. We can also use conda update conda to update Conda.
To add Conda autocompletion in a Bash terminal, copy conda-bash-completion to /usr/share/bash-completion/completions/conda. On macOS with zsh, copy conda-zsh-completion to ~/miniconda3/zsh-completion/_conda and add the following to the ~/.zshrc file:
fpath+="/Users/${USER}/miniconda3/zsh-completion" && compinit
Usage
To create a new environment, use the conda create command with the name of the environment and the required packages:
conda create --name <env_name> <package1> <package2=version> ...
# or
conda create --prefix <env_path> <package1> <package2=version> ...
We can also add the --yes flag to set up the environment without a confirmation prompt. We use --prefix <env_path> to set up the environment in a specific path. Note that we cannot use --prefix and --name at the same time.
To see the list of environments, use the following:
conda env list
To activate an environment, type:
conda activate <env_name>
# or
conda activate <env_path>
Note that for Conda versions prior to 4.6 we need to use source activate instead of conda activate to activate an env.
Now, use the following to install new packages in the activated env:
conda install <package1> <package2=version> ...
To uninstall packages use:
conda uninstall <package1> <package2=version> ...
To see the list of installed packages within the env, use:
conda list
To keep a list of the packages installed in the env (commonly called requirements.txt), use:
conda list --export > requirements.txt
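The file written by conda list --export is not in pip's format: comment lines start with # and each package line has the form name=version=build. As a rough illustration, the sketch below parses such text into tuples. The names CondaSpec and parse_conda_export are hypothetical helpers of our own, and the sample lines (including the build strings) are made up for the example.

```python
from typing import List, NamedTuple


class CondaSpec(NamedTuple):
    name: str
    version: str
    build: str


def parse_conda_export(text: str) -> List[CondaSpec]:
    """Parse the text produced by `conda list --export` into (name, version, build) tuples."""
    specs = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the "#" header comments at the top of the file.
        if not line or line.startswith("#"):
            continue
        parts = line.split("=")
        if len(parts) != 3:
            raise ValueError(f"unexpected line: {raw!r}")
        specs.append(CondaSpec(*parts))
    return specs


# Illustrative sample only; real version and build strings will differ.
SAMPLE = """\
# platform: linux-64
numpy=1.26.4=py312_example_0
redis-py=5.0.1=pyh_example_0
"""

if __name__ == "__main__":
    for spec in parse_conda_export(SAMPLE):
        print(spec.name, spec.version, spec.build)
```

Because of this format difference, a requirements.txt produced by Conda is meant to be consumed by Conda rather than by pip install -r.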
We can deactivate the env with:
conda deactivate
And remove the deactivated virtual environment with:
conda env remove --name <env_name>
# or
conda env remove --prefix <env_path>
To remove cache files we can use:
conda clean --all
We can regenerate an env from a requirements.txt file:
conda create --name <env_name> --file requirements.txt
Note that most Conda options have a short form that can be used instead of the long form. For instance, -n, -p, -c, -y, and -e can be used instead of --name, --prefix, --channel, --yes, and --export, respectively. You can find more information by using conda -h or conda <command> -h.
Conda channels
Whenever we use conda create or conda install without mentioning a channel name, the Conda package manager searches its default channels to install the packages. If you are looking for specific packages that are not in the default channels, you have to name the channels explicitly:
conda create --name <env_name> --channel <channel1> --channel <channel2> ... <package1> <package2> ...
For example, the following creates new_env and installs r-sf, shapely, and bioconductor-biobase from the r, conda-forge, and bioconda channels:
conda create --name new_env --channel r --channel conda-forge --channel bioconda r-sf shapely bioconductor-biobase
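When environments are created from a script, the repeated --channel flags are easy to get wrong by hand. The following sketch builds the argument list programmatically. The function name conda_create_command is our own, and the sketch only assembles the command; it does not run Conda.

```python
import shlex
from typing import List, Sequence


def conda_create_command(
    env_name: str, channels: Sequence[str], packages: Sequence[str]
) -> List[str]:
    """Build the argument list for `conda create` with one --channel flag per channel."""
    command = ["conda", "create", "--name", env_name]
    for channel in channels:
        # Repeat the flag once per channel, keeping the caller's order.
        command += ["--channel", channel]
    command += list(packages)
    return command


if __name__ == "__main__":
    cmd = conda_create_command(
        "new_env",
        ["r", "conda-forge", "bioconda"],
        ["r-sf", "shapely", "bioconductor-biobase"],
    )
    # Print the command as it would be typed in a shell.
    print(shlex.join(cmd))
```

On a machine where Conda is installed, the resulting list could be passed to subprocess.run(cmd, check=True) to actually create the environment.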
Ideally, you should create one environment per project, include all the required packages when you create the environment, and use a single channel as much as possible. It is also important to use the same channels when updating the environment as were used to create it.
Conda packages
To find the required packages, we can visit anaconda.org and search for packages to find their full names and the corresponding channels. Another option is the conda search command. Note that we need to search the right channel to find packages that are not in the default channels. For example:
conda search --channel bioconda biobase
HPC workflow
On an HPC cluster, we first need to load the miniconda3 module to be able to use conda. We can use the following as a template for building a new environment:
module load miniconda3
if [ ! -d <env_path> ]; then
    conda create --yes --prefix <env_path> <package1> <package2> ...
fi
source activate <env_path>
To keep Conda packages and caches somewhere other than the home directory, we can update the Conda paths by exporting new values for CONDA_PKGS_DIRS and CONDA_ENVS_DIRS, or by creating or updating the ~/.condarc file. See the Conda configuration documentation to learn more.
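If Conda is launched from a Python script (for example, a job-preparation script), the same two variables can be set for the child process only. The sketch below is one possible way to do that. The function name conda_dirs_env and the pkgs and envs subdirectory names are our own assumptions, not something Conda requires.

```python
import os
from pathlib import Path
from typing import Dict


def conda_dirs_env(base_dir: str) -> Dict[str, str]:
    """Return a copy of the current environment variables with Conda's
    package cache and environment directories redirected under base_dir."""
    base = Path(base_dir)
    env = dict(os.environ)
    # The "pkgs" and "envs" subdirectory names are an arbitrary choice for this sketch.
    env["CONDA_PKGS_DIRS"] = str(base / "pkgs")
    env["CONDA_ENVS_DIRS"] = str(base / "envs")
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when invoking conda, which leaves the variables of the calling shell unchanged.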