Computational Science and Data Science

Hans Fangohr

Installation of Python, Spyder, Numpy, Sympy, Scipy, Pytest, Matplotlib via Anaconda (2021)


The most recent version of this document is available here.

Introduction

These notes are provided primarily for students of graduate schools IMPRS and DASHH, staff and students at the Max Planck Institute for the Structure and Dynamics of Matter and others at DESY, as well as students at the University of Southampton (United Kingdom).

The objective of these introductory notes is to help readers install Python on their own computers, and to support their learning of programming, computational science and data science, and subsequently their studies, particularly in natural sciences, mathematics, engineering, and computer science.

In short, we suggest using the Anaconda Python distribution.

By the nature of the information provided, the content is likely to become partially outdated over time. For reference: this mini-introduction was written in September 2016, when Anaconda 4.1 was available and Python 3.5 was the default Python provided. It was revised in March 2021, when Anaconda 2020.11 and Python 3.8 were the defaults.

What is what: Python, Python packages, Spyder, Anaconda

Python

Python is a programming language in which we write computer programs. These programs are stored in text files that have the ending .py, for example hello.py which may contain:

print("Hello World")

Python is also a computer program (the technical term is "interpreter") which executes Python programs, such as hello.py. On Windows, the Python interpreter is called python.exe, and from a command window we could execute the hello.py program by typing:

python.exe hello.py

On Linux and OS X operating systems, the Python interpreter program is called python, so we can run the program hello.py as:

python hello.py

(This also works on Windows as the operating system does not need the .exe extension.)
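If several Python installations exist on one machine, it can be unclear which interpreter actually runs a program. The following is a minimal sketch of an extended hello.py that reports the interpreter's version and location; the function name describe_interpreter is our own choice for illustration, and only the standard library is used.

```python
import sys


def describe_interpreter():
    """Return a one-line description of the running Python interpreter."""
    # sys.version_info holds (major, minor, micro, ...); keep the first three.
    version = ".".join(str(part) for part in sys.version_info[:3])
    # sys.executable is the path of the interpreter binary running this code.
    return f"Python {version} at {sys.executable}"


if __name__ == "__main__":
    print("Hello World")
    print(describe_interpreter())
```

After installing Anaconda, the reported path would normally point into the Anaconda installation folder.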

Python packages

For scientific computing and computational modelling, we need additional libraries (sometimes called packages) that are not part of the Python standard library. These allow us, for example, to create plots, operate on matrices, and use specialised numerical methods.

The packages we often need include:

  • numpy (NUMeric Python): matrices and linear algebra
  • pandas: Python data science tools (Series and Dataframes)
  • scipy (SCIentific Python): many numerical routines
  • matplotlib (PLOTting LIBrary): creating plots of data

We also use in this training:

  • sympy (SYMbolic Python): symbolic computation
  • pytest (Python TESTing): a code testing framework

The packages numpy, scipy, pandas and matplotlib are essential components of computational work with Python and are widely used.

Sympy has a special role as it allows SYMbolic computation rather than numerical computation.
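To illustrate the difference between symbolic and numerical computation, here is a small sketch. It assumes sympy is installed (as it is with Anaconda); the function names are hypothetical and chosen only for this example.

```python
import math

import sympy


def numerical_square_of_root():
    """Square the floating-point square root of 2 (subject to rounding error)."""
    return math.sqrt(2) ** 2


def symbolic_square_of_root():
    """Square the exact, symbolic square root of 2."""
    return sympy.sqrt(2) ** 2


def symbolic_derivative():
    """Differentiate x*sin(x) with respect to x, returning a sympy expression."""
    x = sympy.Symbol("x")
    return sympy.diff(x * sympy.sin(x), x)


if __name__ == "__main__":
    print("numerical:", numerical_square_of_root())
    print("symbolic: ", symbolic_square_of_root())
    print("d/dx x*sin(x) =", symbolic_derivative())
```

The numerical result differs slightly from 2 because of floating-point rounding, whereas the symbolic result is exactly 2. The derivative is returned as an expression rather than as a number.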

The pytest package and tool support regression testing and test-driven development. This is generally important, and particularly so in best-practice software engineering for computational studies and research.
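As a rough idea of what a test looks like, the sketch below defines a small function together with two test functions. The function mean and the file name test_mean.py are hypothetical examples. pytest collects functions whose names start with test_ from files named test_*.py, so saving this as test_mean.py and running pytest test_mean.py should execute both tests.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if len(values) == 0:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


def test_mean_of_known_values():
    # A plain assert is all pytest needs; it reports details on failure.
    assert mean([1, 2, 3, 4]) == 2.5


def test_mean_rejects_empty_input():
    # Written without pytest helpers so the file also runs with plain Python.
    try:
        mean([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty input")
```

Rerunning such tests after every change to the code is what makes it a regression test: a change that breaks previously working behaviour is noticed immediately.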

Spyder

Spyder (home page) is a powerful interactive development environment for the Python language with advanced editing, interactive testing, debugging and introspection features. There is a separate blog entry providing a summary of key features of Spyder, which is also available as Spyder's tutorial from inside Spyder (Help -> Spyder tutorial).

The name SPYDER derives from "Scientific PYthon Development EnviRonment".

We will use it as the main environment to learn about Python, programming and computational science and engineering.

Useful features include

  • provision of the IPython (Qt) console as an interactive prompt, which can display plots inline
  • ability to execute snippets of code from the editor in the console
  • continuous parsing of files in the editor, and provision of visual warnings about potential errors
  • step-by-step execution
  • variable explorer

Anaconda

Anaconda is a Python distribution. Python distributions provide the Python interpreter, together with a list of Python packages and sometimes other related tools, such as editors. To be more precise, Anaconda is not limited to packaging Python packages, but initially emerged to cater for Python-based applications and packages.

The packages provided by the Anaconda Python distribution include all of those that we need, and for that reason we suggest using Anaconda here.

A key part of the Anaconda Python distribution is Spyder, an interactive development environment for Python, including an editor.

Installation

In general, the installation of the Python interpreter (from source/binaries) is fairly straightforward, but installation of additional packages can be a bit tedious.

Instead of doing this manually, we suggest installing the Anaconda Python distribution using these installation instructions. The distribution provides the Python interpreter itself and all packages we need.

The Anaconda Python distribution is available free of charge for Windows, OS X and Linux operating systems.

For Windows and OS X you are given a choice whether to download the graphical installer or the text-based installer. If you don't know what the terminal (OS X) or command prompt (Windows) is, then you are better advised to choose the graphical version.

If you are using Linux, you probably want what is called "Linux" and not what is "Linux POWER". The "Linux" target refers to the common x86 architecture.

Download the installer, start it, and follow instructions. Accept default values as suggested.

During the installation, you may have the option to install additional editing environments. You don't need to install these for this course, but it shouldn't do any harm either.

If you are using Linux and you are happy to use the package manager of your distribution (you will know who you are), then you may be better advised to install the required packages individually rather than installing the whole Anaconda distribution.

Test your installation

Once you have installed Anaconda or the Python distribution of your choice, you can download a testing program and execute it.

Running the tests with Spyder

  1. Start Spyder

    This can be done either by typing spyder in a terminal or inside the Anaconda Prompt, or by starting Spyder through the Anaconda Navigator.

    At the time of the last revision of these notes (March 2021), the current version of Spyder is 4.1.

    Spyder may ask you if you want to install kite. This is not necessary for the course.

  2. Download the testing file.

  3. Open the file in Spyder via File -> Open.

  4. Then execute the file via Run -> Run.

    If you get a pop up window, you can accept the default settings and click on the run button.

You should see output similar to this in the lower right window of Spyder (you may also see a plot appearing):

Running using Python 3.8.5 (default, Sep  4 2020, 02:22:02)
[Clang 10.0.0 ]
Testing Python version-> py3.8 OK
Testing numpy...      -> numpy OK
Testing scipy...      -> scipy OK
Testing pandas...     -> pandas OK
Testing matplotlib... -> pylab OK
Testing sympy...      -> sympy OK
Testing pytest...     -> pytest OK

If the test program produces these outputs, there is a very good chance that Python and the six listed packages are installed correctly.

Running the tests from the console

  1. Open a console:

    • Windows: type cmd in the search box
    • Mac OS X: Start the Terminal application that is located in the Utilities folder in Applications
    • Linux: start one of the shells you have available, or a terminal emulator such as xterm.
  2. Download the testing file to your machine.

  3. Change directory into the folder you have downloaded the file to, and type:

    python test-python-installation-2021.py
    

If all the tests pass, you should see output similar to this:

Running using Python 3.8.5 (default, Sep  4 2020, 02:22:02)
[Clang 10.0.0 ]
Testing Python version-> py3.8 OK
Testing numpy...      -> numpy OK
Testing scipy...      -> scipy OK
Testing pandas...     -> pandas OK
Testing matplotlib... -> pylab OK
Testing sympy...      -> sympy OK
Testing pytest...     -> pytest OK

Missing packages

If you install Python in other ways than through the Anaconda distribution and, for example, have installed only the numpy, scipy and matplotlib packages, the program's output would look similar to this:

Testing numpy...      -> numpy OK
Testing scipy...      -> scipy OK
Testing matplotlib... -> pylab OK
Testing sympy...      Could not import 'sympy' -> fail
Testing pytest...     Could not import 'pytest' -> fail
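The following sketch shows how an import check of this kind might work. It is not the actual testing program offered for download, only an illustration built on the standard library's importlib module; the function names check_packages and report are our own.

```python
import importlib


def check_packages(names):
    """Try to import each named package; return a dict mapping name -> bool."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            # Also covers ModuleNotFoundError, a subclass of ImportError.
            results[name] = False
    return results


def report(results):
    """Format one status line per package, loosely modelled on the output above."""
    lines = []
    for name, ok in results.items():
        label = f"Testing {name}...".ljust(22)
        outcome = f"-> {name} OK" if ok else f"Could not import '{name}' -> fail"
        lines.append(label + outcome)
    return lines


if __name__ == "__main__":
    packages = ["numpy", "scipy", "pandas", "matplotlib", "sympy", "pytest"]
    for line in report(check_packages(packages)):
        print(line)
```

A package reported as missing can then be installed with the package manager you are using, for example conda in an Anaconda installation.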

Updating packages in the Anaconda installation

To update, for example, spyder and python, follow these steps:

  1. Open a terminal (see step 1 in Running the tests from the console)

  2. Update the conda program (this manages the updating) by typing the following command into the console:

    conda update conda
    

    Confirm updates if asked to do so. More than one package may be listed to be updated.

  3. Update individual packages, for example spyder:

    conda update spyder
    

This introductory page from the Anaconda team may contain useful material to get started with Anaconda.
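After an update, it can be useful to confirm which versions are installed. The sketch below uses importlib.metadata from the standard library, which is available from Python 3.8 onwards; the helper name installed_version is hypothetical, and the package list is simply the one used in these notes.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ["numpy", "scipy", "pandas", "matplotlib", "sympy", "pytest", "spyder"]:
        version = installed_version(name)
        print(f"{name}: {version if version is not None else 'not installed'}")
```

This reads the installed package metadata without importing the packages themselves, so it runs quickly even for large packages.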

Further reading

To learn more about Anaconda, try the documents and introductory tutorials offered at https://docs.anaconda.com/anaconda/ .
