Computational Science and Data Science

Hans Fangohr

Installation of Python, Spyder, Numpy, Sympy, Scipy, Pytest, Matplotlib via Anaconda (2023)


This is the most recent version of the installation instructions. (Older versions from 2014/2013, when we used Python 2 (!), are available here.)

Introduction

These notes are provided primarily for students of graduate schools IMPRS and DASHH, staff and students at the Max Planck Institute for the Structure and Dynamics of Matter and others at DESY, as well as students at the University of Southampton (United Kingdom).

The objective of these introductory notes is to help readers install Python on their own computers, and to support their learning of programming, computational science and data science, and subsequently their studies, particularly in natural sciences, mathematics, engineering, and computer science.

In short, we suggest using the Anaconda Python distribution.

By the nature of the information provided, the content is likely to become partially outdated over time. For reference: this mini-introduction was written in September 2016, when Anaconda 4.1 was available and Python 3.5 was the default Python provided. It was last revised in December 2022, when conda was at version 22.9.0 and Python 3.9.13 was the default interpreter.

What is what: Python, Python packages, Spyder, Anaconda

Python

Python is a programming language in which we write computer programs. These programs are stored in text files that have the ending .py, for example hello.py which may contain:

print("Hello World")

Python is also a computer program (the technical term is "interpreter") which executes Python programs, such as hello.py. On Windows, the Python interpreter is called python.exe, and from a command window we could execute the hello.py program by typing:

python.exe hello.py

On Linux and OS X operating systems, the Python interpreter program is called python, so we can run the program hello.py as:

python hello.py

(This also works on Windows as the operating system does not need the .exe extension.)
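If several Python installations exist on one machine, it can be unclear which interpreter actually runs a program. As a minimal sketch (the variable names are chosen here for illustration only), a program can report its own interpreter using the standard library:

```python
import sys

# Path of the interpreter program that is executing this file,
# for example a python or python.exe inside the Anaconda folder.
interpreter_path = sys.executable

# Version of the Python language implemented by that interpreter.
version = sys.version_info

print("Interpreter:", interpreter_path)
print(f"Python version: {version.major}.{version.minor}.{version.micro}")
```

Saving this in a file and running it in the same way as hello.py shows which installation the python command refers to.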

Python packages

For scientific computing and computational modelling, we need additional libraries (sometimes called packages) that are not part of the Python standard library. These allow us, for example, to create plots, operate on matrices, and use specialised numerical methods.

The packages we often need include:

  • numpy (NUMeric Python): matrices and linear algebra
  • pandas: Python data science tools (Series and Dataframes)
  • scipy (SCIentific Python): many numerical routines
  • matplotlib (PLOTting LIBrary): creating plots of data

We also use in this training:

  • sympy (SYMbolic Python): symbolic computation
  • pytest (Python TESTing): a code testing framework

The packages numpy, scipy, pandas and matplotlib are essential components of computational work with Python and are widely used.
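To give a flavour of what such a package offers, here is a small sketch that uses numpy to solve a system of two linear equations. The matrix and right-hand side are made-up illustration values, not part of the course material:

```python
import numpy as np

# Coefficient matrix and right-hand side of the system
#   2x +  y = 3
#    x + 3y = 5
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Solve A @ x = b for the unknown vector x.
x = np.linalg.solve(A, b)
print(x)

# Check the solution by substituting it back into the equations.
print(np.allclose(A @ x, b))
```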

Sympy has a special role as it allows SYMbolic computation rather than numerical computation.
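The difference between numerical and symbolic computation can be illustrated with a short sketch (assuming sympy is installed). Squaring the square root of 2 with floating point numbers gives a result that is only approximately 2, whereas sympy keeps the square root as an exact symbolic object:

```python
import math
import sympy

# Numerical computation: floating point rounding means the result
# is very close to 2, but not exactly 2.
numeric = math.sqrt(2) ** 2
print(numeric == 2)   # False

# Symbolic computation: sympy represents sqrt(2) exactly,
# so squaring it gives exactly 2.
symbolic = sympy.sqrt(2) ** 2
print(symbolic == 2)  # True

# Symbolic computation also works with unknowns, for example
# differentiating x**2 with respect to x.
x = sympy.Symbol("x")
derivative = sympy.diff(x**2, x)
print(derivative)     # 2*x
```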

The pytest package and tool supports regression testing and test driven development -- this is generally important, and particularly so in best practice software engineering for computational studies and research.
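As a minimal sketch of what a test looks like, consider the following file. The function mean and the file name test_mean.py are hypothetical examples, not part of the course material. pytest collects functions whose names start with test_ and reports a failure whenever an assert statement does not hold:

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def test_mean_of_integers():
    assert mean([1, 2, 3]) == 2


def test_mean_of_floats():
    # Compare floating point numbers with a tolerance rather than ==.
    assert abs(mean([0.5, 1.5]) - 1.0) < 1e-12
```

If this is saved as test_mean.py, the tests can be run from a terminal with the command pytest test_mean.py.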

Spyder

Spyder (home page) is a powerful interactive development environment for the Python language with advanced editing, interactive testing, debugging and introspection features. There is a separate blog entry providing a summary of key features of Spyder, which is also available as Spyder's tutorial from inside Spyder (Help -> Spyder tutorial).

The name SPYDER derives from "Scientific PYthon Development EnviRonment" (SPYDER).

We will use it as the main environment to learn about Python, programming and computational science and engineering.

Useful features include

  • provision of the IPython (Qt) console as an interactive prompt, which can display plots inline
  • ability to execute snippets of code from the editor in the console
  • continuous parsing of files in editor, and provision of visual warnings about potential errors
  • step-by-step execution
  • variable explorer

Anaconda

Anaconda is a Python distribution. Python distributions provide the Python interpreter, together with a list of Python packages and sometimes other related tools, such as editors. To be more precise, Anaconda is not limited to packaging Python packages, but initially emerged to cater for Python-based applications and packages.

The packages provided by the Anaconda Python distribution include all of those that we need, and for that reason we suggest using Anaconda here.

A key part of the Anaconda Python distribution is Spyder, an interactive development environment for Python, including an editor.

Installation

In general, the installation of the Python interpreter (from source/binaries) is fairly straightforward, but installation of additional packages can be a bit tedious.

Instead of doing this manually, we suggest installing the Anaconda Python distribution using these installation instructions. The distribution provides the Python interpreter itself and all packages we need.

The Anaconda Python distribution is available as a free download for Windows, OS X and Linux operating systems.

For Windows and OS X you are given a choice whether to download the graphical installer or the text based (command line) installer. If you don't know what the terminal (OS X) or command prompt (Windows) is, then you are better advised to choose the graphical version.

If you are using Linux, you probably want what is called "x86" (unless you own a "Power8" or "Power9" machine, etc.)

If you have a Mac with an M1 or M2 processor, select the M1 option. Either the graphical or the command line installer is fine.

Download the installer, start it, and follow instructions. Accept default values as suggested.

During the installation, you may have the option to install additional editing environments. You don't need to install these for this course, but it should not do any harm either.

If you are using Linux and you are happy to use the package manager of your distribution (you will know who you are), then you may be better advised to install the required packages individually rather than installing the whole Anaconda distribution.

Test your installation

Once you have installed Anaconda or the Python distribution of your choice, you can download a testing program and execute it.

Running the tests with Spyder

  1. Start Spyder

    This can be done either by typing spyder in a terminal or inside the Anaconda Prompt, or by starting Spyder through the Anaconda Navigator.

    Spyder should be installed when you install the Anaconda distribution. However, if Spyder has not been installed (observed once on a Mac M2 in December 2022), then use the command conda install spyder to install it.

    The current version of Spyder is 5.3 (at the time of writing).

    Spyder may ask you if you want to install kite. This is not necessary for the course.

  2. Download the testing file.

  3. Open the file in Spyder via File -> Open.

  4. Then execute the file via Run -> Run.

    If you get a pop up window, you can accept the default settings and click on the run button.

    You should see output similar to this in the lower right window of Spyder (you may also see a plot appearing):

    Running using Python 3.9.13 (main, Aug 25 2022, 18:24:45)
    [Clang 12.0.0 ]
    Testing Python version-> Python     OK 3.9.13
    Testing numpy...      -> numpy      OK 1.21.5
    Testing scipy...      -> scipy      OK 1.9.1
    Testing pandas...     -> pandas     OK 1.4.4
    Testing matplotlib... -> matplotlib OK 3.5.2
    Testing sympy...      -> sympy      OK 1.10.1
    Testing pytest...     -> pytest     OK 7.1.2
    Completed 2022-12-27 16:21:47.249747 in 17.4 seconds.
    

If the test program produces these outputs, there is a very good chance that Python and the six listed packages are installed correctly.

Running the tests from the console

  1. Open a console:

    • Windows: type cmd in the search box
    • Mac OS X: Start the Terminal application that is located in the Utilities folder in Applications
    • Linux: start one of the shells you have available, or an xterm or so.
  2. Download the testing file to your machine.

  3. Change directory into the folder you have downloaded the file to, and type:

    python test-python-installation.py
    

If all the tests pass, you should see output similar to this:

Running using Python 3.9.13 (main, Aug 25 2022, 18:24:45)
[Clang 12.0.0 ]
Testing Python version-> Python     OK 3.9.13
Testing numpy...      -> numpy      OK 1.21.5
Testing scipy...      -> scipy      OK 1.9.1
Testing pandas...     -> pandas     OK 1.4.4
Testing matplotlib... -> matplotlib OK 3.5.2
Testing sympy...      -> sympy      OK 1.10.1
Testing pytest...     -> pytest     OK 7.1.2
Completed 2022-12-27 16:21:47.249747 in 17.4 seconds.

Missing packages

If you install Python by means other than the Anaconda distribution and, for example, you have only installed the numpy, scipy, pandas and matplotlib packages, the program's output might be:

Testing Python version-> Python     OK 3.9.13
Testing numpy...      -> numpy      OK 1.21.5
Testing scipy...      -> scipy      OK 1.9.1
Testing pandas...     -> pandas     OK 1.4.4
Testing matplotlib... -> matplotlib OK 3.5.2
Testing sympy...      Could not import 'sympy' -> fail
Testing pytest...     Could not import 'pytest' -> fail
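The kind of check that produces output like the above can be sketched as follows. This is not the actual testing program; it is a simplified illustration, and the function name check_package is hypothetical. It assumes that each package reports its version in an attribute called __version__, which is a common convention but not guaranteed:

```python
import importlib


def check_package(name):
    """Try to import the package `name`.

    Returns (True, version) if the import succeeds,
    and (False, None) if the package cannot be imported.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return False, None
    # Assumption: the package exposes its version as __version__.
    return True, getattr(module, "__version__", "unknown")


for name in ["numpy", "scipy", "pandas", "matplotlib", "sympy", "pytest"]:
    ok, version = check_package(name)
    if ok:
        print(f"Testing {name}... OK {version}")
    else:
        print(f"Testing {name}... Could not import '{name}' -> fail")
```

A package reported as failed here can usually be added to an Anaconda installation with conda install followed by the package name, in the same way as described for Spyder above.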

Updating packages in the Anaconda installation

To update, for example, spyder and python, follow these steps:

  1. Open a terminal (see step 1 in Running the tests from the console)

  2. Update the conda program (this manages the updating) by typing the following command into the console:

    conda update conda
    

    Confirm updates if asked to do so. More than one package may be listed to be updated.

  3. Update individual packages, for example spyder:

    conda update spyder
    

This introductory page from the Anaconda team may contain useful material to get started with Anaconda.

Further reading

To learn more about Anaconda, try the documents and introductory tutorials offered at https://docs.anaconda.com/anaconda/ .
