18. Python packages and environments#

18.1. Introduction#

This chapter provides an introduction to installing Python packages and using Python environments.

In the first part of the chapter you will learn how to

  • search the database of Python packages

  • install Python packages from the Python Package Index (PyPI)

  • create virtual Python environments to separate different projects

In the second part, we provide additional information for users of the Anaconda distribution, in particular

  • using environments within conda

  • installing packages with conda

  • interplay between conda and pip

We mention pyenv as an advanced tool at the very end.

This chapter does not comment on the creation of Python packages.

18.1.1. Shell commands in the Jupyter notebook#

This chapter is written in a Jupyter notebook. This is helpful for the reader because the notebook can be executed, so the commands can be replayed and varied easily.

In this particular chapter, we have lots of interaction with the shell of the operating system, and need to know two things:

  1. We use the exclamation mark (!) to tell Jupyter to send the following command to the shell (rather than interpret it in the Python environment of this notebook). Here is an example:

!date
Wed Jan  5 11:55:11 CET 2022
  2. If we modify shell variables (such as the PATH), the changes are only visible within the same shell command line, because every command starting with ! runs in its own shell. This leads to repetition of some commands, which is mildly annoying (and is not needed if the same commands are used outside the Jupyter Notebook).

    Here is an example to illustrate the issue: First we set a variable value and then we display it:

!export NEW_VAR="test" && echo $NEW_VAR
test

The && operator instructs the shell to carry out the command to the right of && if the command on the left succeeded.

Repeating the “echo” command, we find the variable is not defined anymore:

!echo $NEW_VAR

So if we want to make use of those variable values, we need to repeat the setting of them:

!export NEW_VAR="test" && echo "The value of NEW_VAR is $NEW_VAR."
The value of NEW_VAR is test.

We need to make use of this when activating virtual environments (below).
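
The same behaviour can be reproduced outside Jupyter. The following is a minimal sketch, not a description of how Jupyter works internally: it starts two separate shell processes from Python and shows that a variable exported in the first is unknown in the second. It assumes a POSIX shell is available as sh (Linux/OSX) and that the made-up variable name NEW_VAR_DEMO is not already set.

```python
import subprocess


def run_in_shell(command):
    """Run command in a fresh shell process and return what it printed."""
    result = subprocess.run(
        ["sh", "-c", command], capture_output=True, text=True, check=True
    )
    return result.stdout.rstrip("\n")


# Set and read the variable within one shell process: the value is visible.
first = run_in_shell('export NEW_VAR_DEMO="test" && echo $NEW_VAR_DEMO')

# A second, independent shell process knows nothing about the variable.
second = run_in_shell("echo $NEW_VAR_DEMO")

print(repr(first))
print(repr(second))
```

Each call plays the role of one ! line in the notebook, which is why the activation command has to be repeated in front of every command below.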

18.1.2. Prerequisites#

We assume that you already have Python installed on your system, and that you are using Python 3. If you haven’t got Python 3 yet, then either install the Anaconda distribution, follow the instructions of the Hitchhiker’s Guide to Python, or take some other action.

Check that you have Python installed:

!python --version
Python 3.9.7

We also assume you have a somewhat recent Python version (3.8 and above).
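
The version requirement can also be checked from within Python. This is a small sketch; the function name is_recent_enough is chosen for this illustration only.

```python
import sys

MINIMUM = (3, 8)  # minimum (major, minor) version assumed in this chapter


def is_recent_enough(version_info=sys.version_info, minimum=MINIMUM):
    """Compare the (major, minor) part of a version against the minimum."""
    return tuple(version_info[:2]) >= minimum


print(sys.version.split()[0], is_recent_enough())
```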

The commands below are tested on the Linux and OSX operating systems. If you use Windows, please check the corresponding commands here: https://packaging.python.org/en/latest/tutorials/installing-packages/

18.2. Python virtual environments#

Before we install packages (be it our own or those from somebody else), we should create a new virtual environment. This is good practice because

  • we can delete it when we don’t need it any more

  • we cannot break other Python projects we may be working on

  • we are free to choose the versions of particular libraries (it could be that one application needs version 2.x of a library and another application needs version 1.8: if the two applications are installed in different environments, then this is no problem)

18.2.1. Creating virtual environments#

We can create a virtual environment using this command:

!python -m venv myvirtualenv

This command creates a subdirectory with name myvirtualenv in the current directory which contains the new virtual environment:

!ls -dl myvirtualenv/
drwxr-xr-x  6 fangohr  staff  192 Jan  5 11:55 myvirtualenv/

The virtual environment will use the same Python interpreter that we used above when creating it. On Linux/OSX systems, we can find out which interpreter this is by using the which command:

!which python
/Users/fangohr/anaconda3/bin/python

If the basics are sufficient for you, you can skip to the next section about activating a virtual environment.

We can also be more specific, and choose a particular Python interpreter when using the -m venv command to force use of a particular Python version. On a Mac (OSX), if python3 was installed via brew, there is a python executable in /usr/local/bin/python3. To force creation of a virtual environment using this interpreter, we could use

/usr/local/bin/python3 -m venv myvirtualenv

There is no need to know what is happening inside this folder, but as we are curious, we’ll have a very brief look anyway:

!ls myvirtualenv/
bin        include    lib        pyvenv.cfg

The pyvenv.cfg file contains information about the Python interpreter we are using:

!cat myvirtualenv/pyvenv.cfg
home = /Users/fangohr/anaconda3/bin
include-system-site-packages = false
version = 3.9.7
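
Virtual environments can also be created from within Python through the venv module of the standard library. The following sketch creates a throw-away environment in a temporary directory and reads its pyvenv.cfg file into a dictionary. The directory name demo-env and the helper read_pyvenv_cfg are made up for this illustration, and pip is deliberately not installed into the environment to keep the example fast.

```python
import tempfile
import venv
from pathlib import Path


def read_pyvenv_cfg(env_dir):
    """Return the 'key = value' pairs of pyvenv.cfg as a dictionary."""
    settings = {}
    cfg_file = Path(env_dir) / "pyvenv.cfg"
    for line in cfg_file.read_text(encoding="utf-8").splitlines():
        key, separator, value = line.partition("=")
        if separator:  # ignore lines without an equals sign
            settings[key.strip()] = value.strip()
    return settings


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "demo-env"      # hypothetical environment name
    venv.create(env_dir, with_pip=False)  # same idea as: python -m venv --without-pip
    cfg = read_pyvenv_cfg(env_dir)

for key, value in cfg.items():
    print(f"{key} = {value}")
```

The exact set of keys depends on the Python version; home and include-system-site-packages are the ones shown in the output above.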

Out of interest, we can check the total disk usage of that subdirectory:

!du -hs myvirtualenv/
 15M	myvirtualenv/

To use this virtual environment, we need to activate it.

18.2.2. Activating a virtual environment#

To activate the virtual environment, we need to know in which folder it is installed (in our case in myvirtualenv). On Linux and OSX, we run the following shell command:

!source myvirtualenv/bin/activate

This changes the PATH variable, which the operating system uses to search for the python executable: it puts the directory that contains the python interpreter of our virtual environment at the beginning of the PATH variable. We can check that this works by using the which command again:

!source myvirtualenv/bin/activate && which python
/Users/fangohr/git/introduction-to-python-for-computational-science-and-engineering/book/myvirtualenv/bin/python

(As outlined in the introduction, the repetition of the activate command is only necessary because we work from a notebook here: if you go through these steps in a shell, you can ignore this, and write which python straight away.)
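
The effect of putting a directory at the beginning of PATH can be illustrated without a real virtual environment. The sketch below assumes a Linux/OSX system and uses a made-up directory fake-env/bin that contains an empty executable called python; shutil.which performs the same kind of lookup as the which command.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for myvirtualenv/bin with an executable "python".
    fake_bin = Path(tmp) / "fake-env" / "bin"
    fake_bin.mkdir(parents=True)
    fake_python = fake_bin / "python"
    fake_python.write_text("#!/bin/sh\n")
    fake_python.chmod(fake_python.stat().st_mode | stat.S_IXUSR)

    original_path = os.environ.get("PATH", "")
    # "Activation": the environment's bin directory goes to the front of PATH.
    activated_path = str(fake_bin) + os.pathsep + original_path

    before = shutil.which("python", path=original_path)
    after = shutil.which("python", path=activated_path)

print("before:", before)
print("after: ", after)
```

Because the search stops at the first match, the python in the directory at the front of PATH wins over any other python further back.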

18.2.3. Using the virtual environment#

Once we have activated the virtual environment, we can use it as we would use the default Python environment provided on the system, for example to install some Python packages.
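
From within Python, we can check whether the running interpreter belongs to a virtual environment created with venv: in that case sys.prefix points to the environment, while sys.base_prefix points to the installation the environment was created from. A small sketch (the function name is chosen for this illustration):

```python
import sys


def in_virtual_environment():
    """True if the running interpreter belongs to a venv-style environment."""
    return sys.prefix != sys.base_prefix


print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("inside a virtual environment:", in_virtual_environment())
```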

18.2.4. Name of the virtual environment#

We have used myvirtualenv as the name of the virtual environment. In general, the name can be chosen freely. Commonly used names include env or venv. Occasionally, the environment is installed in a hidden subdirectory (such as .env or .venv).

We have not used the name venv for pedagogical reasons: to avoid confusion with the venv module.

18.3. Python Package Index (PyPI)#

The Python Package Index provides a searchable web interface (https://pypi.org) that lists all Python packages registered with PyPI.

PyPI is the standard way of distributing (open source) Python packages, and is commonly used in science and engineering as well.

18.3.1. Installing packages with pip#

The command to install one or more of these packages is pip. We will activate our virtual environment and install some example packages:

!source myvirtualenv/bin/activate && pip install cowsay
Collecting cowsay
  Using cached cowsay-4.0-py2.py3-none-any.whl (24 kB)
Installing collected packages: cowsay
Successfully installed cowsay-4.0
WARNING: You are using pip version 21.2.3; however, version 21.3.1 is available.
You should consider upgrading via the '/Users/fangohr/git/introduction-to-python-for-computational-science-and-engineering/book/myvirtualenv/bin/python -m pip install --upgrade pip' command.

As we get a warning that suggests upgrading the pip package itself, we follow the suggestion and upgrade pip:

!source myvirtualenv/bin/activate && pip install --upgrade pip
Requirement already satisfied: pip in ./myvirtualenv/lib/python3.9/site-packages (21.2.3)
Collecting pip
  Using cached pip-21.3.1-py3-none-any.whl (1.7 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 21.2.3
    Uninstalling pip-21.2.3:
      Successfully uninstalled pip-21.2.3
Successfully installed pip-21.3.1

With cowsay installed, we can try it out from the command line:

!source myvirtualenv/bin/activate && cowsay Hellooo World
  _____________
| Hellooo World |
  =============
             \
              \
                ^__^
                (oo)\_______
                (__)\       )\/\
                    ||----w |
                    ||     ||

We can confirm the list of packages we have installed (together with their version number) using pip list:

!source myvirtualenv/bin/activate && pip list
Package    Version
---------- -------
cowsay     4.0
pip        21.3.1
setuptools 57.4.0
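
A similar list can be obtained from within Python using importlib.metadata from the standard library (Python 3.8 and above). This is a sketch of the idea, not a re-implementation of pip list; the function name installed_packages is chosen for this example.

```python
from importlib import metadata


def installed_packages():
    """Return a sorted list of (name, version) pairs visible to this interpreter."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with incomplete metadata
            pairs.add((name, dist.version))
    return sorted(pairs, key=lambda pair: pair[0].lower())


for name, version in installed_packages():
    print(f"{name:<20} {version}")
```

The list describes the interpreter that runs the code, so it only reflects the virtual environment if Python was started from within it.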

18.3.2. Learn more about an installed package using pip show#

Once a package is installed, we can use pip show to learn more about it:

!source myvirtualenv/bin/activate && pip show cowsay
Name: cowsay
Version: 4.0
Summary: The famous cowsay for GNU/Linux is now available for python
Home-page: https://github.com/VaasuDevanS/cowsay-python
Author: Vaasudevan Srinivasan
Author-email: vaasuceg.96@gmail.com
License: GNU-GPL
Location: /Users/fangohr/git/introduction-to-python-for-computational-science-and-engineering/book/myvirtualenv/lib/python3.9/site-packages
Requires: 
Required-by: 
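
Some of this information is also accessible from Python through importlib.metadata. The sketch below returns a few fields for an installed package and None for a package that is not installed; the helper name describe and the selection of fields are choices made for this illustration.

```python
from importlib import metadata


def describe(package):
    """Return some metadata of an installed package, or None if it is not installed."""
    try:
        meta = metadata.metadata(package)
    except metadata.PackageNotFoundError:
        return None
    return {
        "Name": meta["Name"],
        "Version": meta["Version"],
        "Summary": meta["Summary"],
        # requires() returns None if the package declares no dependencies
        "Requires": metadata.requires(package) or [],
    }


print(describe("cowsay"))  # None unless cowsay is installed for this interpreter
```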

For packages that are not installed yet, we need to search https://pypi.org to learn more about them. This includes the list of available versions (under “release history”).

(There are command line tools such as pip-search that can help to find package names, but they do not provide the same depth of information as the web page [at the time of writing]).

18.3.3. Uninstalling packages with pip#

To uninstall a package, we use pip uninstall. (The -y is short for yes and tells pip uninstall not to ask for confirmation before cowsay is uninstalled.)

!source myvirtualenv/bin/activate && pip uninstall -y cowsay 
Found existing installation: cowsay 4.0
Uninstalling cowsay-4.0:
  Successfully uninstalled cowsay-4.0
!source myvirtualenv/bin/activate && pip list
Package    Version
---------- -------
pip        21.3.1
setuptools 57.4.0

18.3.4. Installing packages with additional dependencies#

As a second example, we’ll install the wikipedia package. We will see that it needs additional python packages as dependencies, which will be installed automatically:

!source myvirtualenv/bin/activate && pip install wikipedia
Collecting wikipedia
  Using cached wikipedia-1.4.0.tar.gz (27 kB)
  Preparing metadata (setup.py) ... done
Collecting beautifulsoup4
  Using cached beautifulsoup4-4.10.0-py3-none-any.whl (97 kB)
Collecting requests<3.0.0,>=2.0.0
  Using cached requests-2.27.0-py2.py3-none-any.whl (63 kB)
Collecting idna<4,>=2.5
  Using cached idna-3.3-py3-none-any.whl (61 kB)
Collecting charset-normalizer~=2.0.0
  Using cached charset_normalizer-2.0.10-py3-none-any.whl (39 kB)
Collecting urllib3<1.27,>=1.21.1
  Using cached urllib3-1.26.7-py2.py3-none-any.whl (138 kB)
Collecting certifi>=2017.4.17
  Using cached certifi-2021.10.8-py2.py3-none-any.whl (149 kB)
Collecting soupsieve>1.2
  Using cached soupsieve-2.3.1-py3-none-any.whl (37 kB)
Using legacy 'setup.py install' for wikipedia, since package 'wheel' is not installed.
Installing collected packages: urllib3, soupsieve, idna, charset-normalizer, certifi, requests, beautifulsoup4, wikipedia
    Running setup.py install for wikipedia ... done
Successfully installed beautifulsoup4-4.10.0 certifi-2021.10.8 charset-normalizer-2.0.10 idna-3.3 requests-2.27.0 soupsieve-2.3.1 urllib3-1.26.7 wikipedia-1.4.0
!source myvirtualenv/bin/activate && python -c "import wikipedia; print(wikipedia.summary('cowsay'))"
cowsay is a program that generates ASCII art pictures of a cow with a message. It can also generate pictures using pre-made images of other animals, such as Tux the Penguin, the Linux mascot. It is written in Perl. There is also a related program called cowthink, with cows with thought bubbles rather than speech bubbles. .cow files for cowsay exist which are able to produce different variants of "cows", with different kinds of "eyes", and so forth. It is sometimes used on IRC, desktop screenshots, and in software documentation. It is more or less a joke within hacker culture, but has been around long enough that its use is rather widespread. In 2007, it was highlighted as a Debian package of the day.

It is worth noting that if we uninstall wikipedia, the dependencies that wikipedia needs (such as beautifulsoup4) are not uninstalled:

!source myvirtualenv/bin/activate && pip uninstall -y wikipedia
Found existing installation: wikipedia 1.4.0
Uninstalling wikipedia-1.4.0:
  Successfully uninstalled wikipedia-1.4.0
!source myvirtualenv/bin/activate && pip list
Package            Version
------------------ ---------
beautifulsoup4     4.10.0
certifi            2021.10.8
charset-normalizer 2.0.10
idna               3.3
pip                21.3.1
requests           2.27.0
setuptools         57.4.0
soupsieve          2.3.1
urllib3            1.26.7

This can lead to an accumulation of (partly unneeded) Python packages. For this reason too, it is good practice to create a virtual environment from scratch when starting a new project, and to discard it afterwards.
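
To get a rough idea which installed packages are not needed by any other installed package, we can compare package names with the declared requirements. The following is a simplified sketch under several assumptions: requirement strings are parsed with a basic regular expression rather than a full parser, optional dependencies (those marked with extra ==) are ignored, and all function names are made up for this illustration. The example data is modelled on the pip output above, after wikipedia has been uninstalled.

```python
import re
from importlib import metadata


def normalise(name):
    """Normalise a package name so that Charset_Normalizer matches charset-normalizer."""
    return re.sub(r"[-_.]+", "-", name).lower()


def required_name(requirement):
    """Extract the package name from a requirement string such as 'idna<4,>=2.5'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalise(match.group(1)) if match else None


def not_required_by_others(requires_map):
    """Return the packages in requires_map that no other listed package depends on."""
    needed = set()
    for requirements in requires_map.values():
        for requirement in requirements:
            if re.search(r"extra\s*==", requirement):  # optional dependency
                continue
            name = required_name(requirement)
            if name:
                needed.add(name)
    return sorted(n for n in map(normalise, requires_map) if n not in needed)


def installed_requires_map():
    """Collect the declared requirements of all packages visible to this interpreter."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            result[name] = dist.requires or []
    return result


# Illustrative input: the environment after 'pip uninstall wikipedia'.
after_uninstall = {
    "beautifulsoup4": ["soupsieve>1.2"],
    "certifi": [],
    "charset-normalizer": [],
    "idna": [],
    "pip": [],
    "requests": ["urllib3<1.27,>=1.21.1", "certifi>=2017.4.17",
                 "charset-normalizer~=2.0.0", "idna<4,>=2.5"],
    "setuptools": [],
    "soupsieve": [],
    "urllib3": [],
}
print(not_required_by_others(after_uninstall))
# prints ['beautifulsoup4', 'pip', 'requests', 'setuptools']
```

In this example, beautifulsoup4 and requests show up as left-over candidates next to pip and setuptools. Calling not_required_by_others(installed_requires_map()) applies the same check to the packages of the running interpreter; the result is only a hint, as a package may be listed simply because we installed it on purpose.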

18.3.5. Installing particular versions with pip#

Occasionally, we need to install a particular version of a package. For example, imagine we need version 3.0 of cowsay. In that case, we can use the == operator to specify this requirement:

!source myvirtualenv/bin/activate && pip install cowsay==3.0
Collecting cowsay==3.0
  Using cached cowsay-3.0-py2.py3-none-any.whl (19 kB)
Installing collected packages: cowsay
Successfully installed cowsay-3.0
!source myvirtualenv/bin/activate && cowsay --version
3.0

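18.3.6. Upgrading a pip-installed package#

To upgrade an installed package to the latest available version, we use the -U option of pip install (short for --upgrade):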

!source myvirtualenv/bin/activate && pip install -U cowsay
Requirement already satisfied: cowsay in ./myvirtualenv/lib/python3.9/site-packages (3.0)
Collecting cowsay
  Using cached cowsay-4.0-py2.py3-none-any.whl (24 kB)
Installing collected packages: cowsay
  Attempting uninstall: cowsay
    Found existing installation: cowsay 3.0
    Uninstalling cowsay-3.0:
      Successfully uninstalled cowsay-3.0
Successfully installed cowsay-4.0
!source myvirtualenv/bin/activate && cowsay --version
4.0

Let’s remove cowsay again:

!source myvirtualenv/bin/activate && pip uninstall -y cowsay
Found existing installation: cowsay 4.0
Uninstalling cowsay-4.0:
  Successfully uninstalled cowsay-4.0

18.3.7. Installing a package from GitHub#

If we want to install the latest development version of the (Python) cowsay package, we have two options.

The first option is to pip install directly from GitHub. The GitHub repository is at https://github.com/VaasuDevanS/cowsay-python

!source myvirtualenv/bin/activate && pip install git+https://github.com/VaasuDevanS/cowsay-python.git
Collecting git+https://github.com/VaasuDevanS/cowsay-python.git
  Cloning https://github.com/VaasuDevanS/cowsay-python.git to /private/var/folders/wc/d1lyft3x2jn29b6yffrzh4vw0000gq/T/pip-req-build-7gxyjbt6
  Running command git clone --filter=blob:none -q https://github.com/VaasuDevanS/cowsay-python.git /private/var/folders/wc/d1lyft3x2jn29b6yffrzh4vw0000gq/T/pip-req-build-7gxyjbt6
  Resolved https://github.com/VaasuDevanS/cowsay-python.git to commit 767c09425d813b80d67cdebba02ce387ca2eb4e8
  Preparing metadata (setup.py) ... done
Using legacy 'setup.py install' for cowsay, since package 'wheel' is not installed.
Installing collected packages: cowsay
    Running setup.py install for cowsay ... done
Successfully installed cowsay-4.0
!source myvirtualenv/bin/activate && cowsay --version
4.0
!source myvirtualenv/bin/activate && pip uninstall -y cowsay 
Found existing installation: cowsay 4.0
Uninstalling cowsay-4.0:
  Successfully uninstalled cowsay-4.0

The second option is to clone the git repository to our local machine, and then to install the package from that local directory:

!cd /tmp && git clone https://github.com/VaasuDevanS/cowsay-python.git
Cloning into 'cowsay-python'...
remote: Enumerating objects: 170, done.
remote: Counting objects: 100% (82/82), done.
remote: Compressing objects: 100% (40/40), done.
remote: Total 170 (delta 41), reused 77 (delta 40), pack-reused 88
Receiving objects: 100% (170/170), 79.19 KiB | 772.00 KiB/s, done.
Resolving deltas: 100% (72/72), done.
!source myvirtualenv/bin/activate && cd /tmp/cowsay-python && pip install .
Processing /private/tmp/cowsay-python
  Preparing metadata (setup.py) ... done
Using legacy 'setup.py install' for cowsay, since package 'wheel' is not installed.
Installing collected packages: cowsay
    Running setup.py install for cowsay ... done
Successfully installed cowsay-4.0
!source myvirtualenv/bin/activate && cowsay --version
4.0

18.3.8. Pip install a user-editable package from a local directory#

This example carries on from the git clone example above.

If we pip-install python packages, these are normally installed in the directory tree in the virtual environment. For example:

!ls myvirtualenv/lib/python3.*/site-packages/cowsay
__init__.py   __pycache__   main.py
__main__.py   characters.py test.py

If we intend to edit the python files in the package (for example because we want to develop it further, or explore it), and we want those edits to be visible in the ‘installed’ package, we can ask pip to carry out an editable install using the -e flag:

!source myvirtualenv/bin/activate && pip uninstall -y cowsay
Found existing installation: cowsay 4.0
Uninstalling cowsay-4.0:
  Successfully uninstalled cowsay-4.0
!source myvirtualenv/bin/activate && cd /tmp/cowsay-python && pip install -e .
Obtaining file:///private/tmp/cowsay-python
  Preparing metadata (setup.py) ... done
Installing collected packages: cowsay
  Running setup.py develop for cowsay
Successfully installed cowsay-4.0

In this case, only a link to our local package is created:

!ls -l myvirtualenv/lib/python3.*/site-packages/cowsay*
-rw-r--r--  1 fangohr  staff  28 Jan  5 11:55 myvirtualenv/lib/python3.9/site-packages/cowsay.egg-link
!cat myvirtualenv/lib/python3.*/site-packages/cowsay*
/private/tmp/cowsay-python
.
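
To see from which location Python would import a package, we can ask the import system directly. The sketch below uses importlib.util.find_spec; the helper name module_location is made up for this example. For a regularly installed package we would expect a path inside site-packages, and for an editable install a path inside the local source directory.

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module or package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


print(module_location("json"))    # a standard-library package, always available
print(module_location("cowsay"))  # None unless cowsay is installed for this interpreter
```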

18.3.9. Advanced pip use: freeze, -r requirements.txt and creating reproducible environments#

If you want to record (and later re-use) a combination of python packages with their specific version numbers, you can use the pip freeze command to provide such a list.

!source myvirtualenv/bin/activate && pip freeze
beautifulsoup4==4.10.0
certifi==2021.10.8
charset-normalizer==2.0.10
-e git+https://github.com/VaasuDevanS/cowsay-python.git@767c09425d813b80d67cdebba02ce387ca2eb4e8#egg=cowsay
idna==3.3
requests==2.27.0
soupsieve==2.3.1
urllib3==1.26.7

We can re-direct the output into a file (which by convention is called requirements.txt):

!source myvirtualenv/bin/activate && pip freeze > requirements.txt
!cat requirements.txt
beautifulsoup4==4.10.0
certifi==2021.10.8
charset-normalizer==2.0.10
-e git+https://github.com/VaasuDevanS/cowsay-python.git@767c09425d813b80d67cdebba02ce387ca2eb4e8#egg=cowsay
idna==3.3
requests==2.27.0
soupsieve==2.3.1
urllib3==1.26.7
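
The format of such a file is simple enough to inspect with a few lines of Python. The following sketch separates the pinned name==version entries from all other lines (such as the editable -e entry). It handles only the simple cases shown here and is not a full parser for the requirements file format; the function name parse_pinned is chosen for this illustration.

```python
def parse_pinned(text):
    """Split requirements text into a {name: version} dict and a list of other lines."""
    pinned, other = {}, []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):  # skip blank lines and comments
            continue
        name, separator, version = line.partition("==")
        if separator and not line.startswith("-"):
            pinned[name.strip()] = version.strip()
        else:
            other.append(line)  # options such as -e, URLs, unpinned names
    return pinned, other


example = """\
beautifulsoup4==4.10.0
certifi==2021.10.8
-e git+https://github.com/VaasuDevanS/cowsay-python.git@767c09425d813b80d67cdebba02ce387ca2eb4e8#egg=cowsay
idna==3.3
"""
pinned, other = parse_pinned(example)
print(pinned)
print(other)
```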

We can now create a new virtual environment, and install all the packages listed in the requirements.txt file into this new virtual environment:

!python -m venv myvirtualenv-copy
!source myvirtualenv-copy/bin/activate && pip install -r requirements.txt
Obtaining cowsay from git+https://github.com/VaasuDevanS/cowsay-python.git@767c09425d813b80d67cdebba02ce387ca2eb4e8#egg=cowsay (from -r requirements.txt (line 4))
  Cloning https://github.com/VaasuDevanS/cowsay-python.git (to revision 767c09425d813b80d67cdebba02ce387ca2eb4e8) to ./myvirtualenv-copy/src/cowsay
  Running command git clone -q https://github.com/VaasuDevanS/cowsay-python.git /Users/fangohr/git/introduction-to-python-for-computational-science-and-engineering/book/myvirtualenv-copy/src/cowsay
  Running command git rev-parse -q --verify 'sha^767c09425d813b80d67cdebba02ce387ca2eb4e8'
  Running command git fetch -q https://github.com/VaasuDevanS/cowsay-python.git 767c09425d813b80d67cdebba02ce387ca2eb4e8
  Resolved https://github.com/VaasuDevanS/cowsay-python.git to commit 767c09425d813b80d67cdebba02ce387ca2eb4e8
Collecting beautifulsoup4==4.10.0
  Using cached beautifulsoup4-4.10.0-py3-none-any.whl (97 kB)
Collecting certifi==2021.10.8
  Using cached certifi-2021.10.8-py2.py3-none-any.whl (149 kB)
Collecting charset-normalizer==2.0.10
  Using cached charset_normalizer-2.0.10-py3-none-any.whl (39 kB)
Collecting idna==3.3
  Using cached idna-3.3-py3-none-any.whl (61 kB)
Collecting requests==2.27.0
  Using cached requests-2.27.0-py2.py3-none-any.whl (63 kB)
Collecting soupsieve==2.3.1
  Using cached soupsieve-2.3.1-py3-none-any.whl (37 kB)
Collecting urllib3==1.26.7
  Using cached urllib3-1.26.7-py2.py3-none-any.whl (138 kB)
Installing collected packages: urllib3, soupsieve, idna, charset-normalizer, certifi, requests, cowsay, beautifulsoup4
  Running setup.py develop for cowsay
Successfully installed beautifulsoup4-4.10.0 certifi-2021.10.8 charset-normalizer-2.0.10 cowsay-4.0 idna-3.3 requests-2.27.0 soupsieve-2.3.1 urllib3-1.26.7
WARNING: You are using pip version 21.2.3; however, version 21.3.1 is available.
You should consider upgrading via the '/Users/fangohr/git/introduction-to-python-for-computational-science-and-engineering/book/myvirtualenv-copy/bin/python -m pip install --upgrade pip' command.

It is good practice to use the freeze command to store the list of packages and versions required for important projects (such as scientific publications, reports, and theses), and the requirements.txt file should be archived together with the data and software.

It is even better practice if the creation of the virtual environment is done in a scripted way (based on a requirements.txt file, which should be part of the archived [and version-controlled] files of the analysis), and all the required processing/simulation/analysis is then carried out in that environment.
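
One possible way to script the creation of the environment is sketched below, using only the standard library. The function names and the idea of returning the path to the environment's interpreter are choices made for this illustration; the sketch assumes that pip can reach the package index when requirements are given.

```python
import os
import subprocess
import venv
from pathlib import Path


def env_python(env_dir):
    """Path of the Python interpreter inside the virtual environment env_dir."""
    if os.name == "nt":  # Windows uses a different layout
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def build_environment(env_dir, requirements=None, with_pip=True):
    """Create a fresh virtual environment and optionally install pinned requirements."""
    # clear=True removes an existing environment of that name first
    venv.create(env_dir, clear=True, with_pip=with_pip)
    python = env_python(env_dir)
    if requirements is not None:
        # Using the environment's own interpreter makes activation unnecessary.
        subprocess.run(
            [str(python), "-m", "pip", "install", "-r", str(requirements)],
            check=True,
        )
    return python
```

A call such as build_environment("analysis-env", "requirements.txt") (with a hypothetical environment name) would then recreate the environment, and the returned interpreter path can be used to run the analysis scripts.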

In practice, achieving full and guaranteed reproducibility is difficult. There are a variety of problems that could occur, for example the disappearance of the pypi.org service. How to achieve full reproducibility is an active research area, and deserves a separate chapter or book.

In any case, recording the Python packages used is a very good first step.

18.3.10. Deactivate a virtual environment#

To deactivate a virtual environment, use the deactivate command.

!source myvirtualenv/bin/activate && deactivate && which python
/Users/fangohr/anaconda3/bin/python

18.3.11. Deleting a virtual environment#

To completely remove the virtual environment, we can delete the subfolder in which it was installed:

!rm -rf myvirtualenv

18.3.12. Further reading#

  • Installing Python packages: https://packaging.python.org/en/latest/tutorials/installing-packages

  • venv module documentation: https://docs.python.org/3/library/venv.html

18.4. Anaconda#

18.4.1. Introduction#

The Anaconda software distribution brings its own packaging, which is controlled through the conda command.

Anaconda is well known as a Python distribution, but is in no way limited to Python packages: conda is a generic package manager. Together with the community-run conda-forge project, there is a multitude of packages available. A particular bonus of the conda packages is that they can be provided for the three major operating systems in use (Linux, OSX, Windows).

Conda provides conda packages for some Python packages that are available from the Python Package Index (PyPI). One thus needs to ask: should I install a package using the conda command (conda install spyder) or through pip (pip install spyder)? See below for the answer.

conda provides its own (conda) environments (see https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html). They show many similarities with our (basic) discussion of (Python) virtual environments. We have no space to discuss the conda environments further here.

The following comments are meant to be helpful for those who have installed their Python interpreter through Anaconda. If you are not using Anaconda, you can ignore this section.

18.4.2. Can I use Python virtual environments when using the Anaconda distribution?#

This can be done and is a good way to create virtual environments. (All the examples above in this chapter use the Python 3 interpreter from an Anaconda installation on an OSX system.)

18.4.3. Should I install a Python package through conda or pip?#

The typical scenario is that one installs Anaconda, and most Python packages needed are available: somewhat standard tools such as numpy, scipy, matplotlib, pandas, jupyter, ipython and spyder already come with the Anaconda distribution. Then some package is missing that needs to be installed additionally.

For example, the package xarray: this can be installed through conda or through pip.

Experience-based rough guidance is as follows if working within an Anaconda installation of Python:

  • avoid mixing pip installs with conda installs

  • if conda can install the required packages, then use that

  • if conda cannot install the required packages, we have to use pip. In that case:

    • install the requirements that need to come from conda (if any)

    • then install the desired packages through pip

    • after having used pip, do not use conda again to install more packages.

      The reason for this is that conda and pip cannot interact perfectly, and so the changes that one package manager made may be overridden or accidentally repeated slightly differently by the other one.

A more detailed discussion is available on the Anaconda blog.
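
Before installing with pip, it can be useful to know whether the running interpreter lives in a conda environment. The following heuristic is a sketch based on two observations about conda: its environments contain a conda-meta directory in their prefix, and activating an environment sets the CONDA_PREFIX variable. The function name looks_like_conda is made up for this illustration.

```python
import os
import sys
from pathlib import Path


def looks_like_conda(prefix=sys.prefix):
    """Heuristic: a conda environment has a conda-meta directory in its prefix."""
    return (Path(prefix) / "conda-meta").is_dir()


print("interpreter prefix is a conda environment:", looks_like_conda())
print("active conda environment:", os.environ.get("CONDA_PREFIX"))
```

Inside a virtual environment created with venv, sys.prefix points to the virtual environment, so the function reports False there even if the underlying interpreter came from Anaconda. In that situation pip installs do not touch the conda installation.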

18.4.4. Can I create a conda environment, and then create python virtual environments from this?#

Yes.

This is also a way to work with different Python versions.

For example: create a conda (!) environment providing Python 3.8:

conda create -y -n python38 python=3.8

Then

conda activate python38

and then create a virtual environment using

python -m venv myvirtualenv38

18.5. Managing many different environments - pyenv#

If you use many different Python environments, possibly with different interpreter versions, you may want to learn about pyenv (home page at https://github.com/pyenv/pyenv).

Pyenv can install a multitude of Python interpreters and create virtual environments for each of those. It is further possible to define on a per-directory basis which environment should be used in that directory. This is convenient when using different environments for different projects, as one does not need to manually activate the virtual environments.

Tidy up: remove files created in this section

!rm -rf /tmp/cowsay-python
!rm -rf myvirtualenv-copy
!rm -f requirements.txt