p-norms, and how to compute them in principle¶
Disclaimer: this notebook is not a tutorial on norms, or numerics or symbolic methods or Python. It just brings a few of these things together, in the context of the $p$-norms. And it contains a large number of $\infty$ symbols. Other than that, only read on if you have nothing better to do.
What is a norm?¶
Given a vector, which could be a tuple of numbers (such as a position vector, or a phase space vector in $R^n$) or a function (in a function vector space), we have different ways available to measure the length or norm of the vector. These are called norms - this Wikipedia entry provides a useful overview.
For a vector $\vec{a}= (a_1, a_2, a_3) \in R^3$, the definition of the widely spread $p$-norm is $||\vec{a}||_p = \left(|a_1|^p + |a_2|^p + |a_3|^p\right)^\frac{1}{p}$ which we can also write as $||\vec{a}||_p = \sqrt[p]{|a_1|^p + |a_2|^p + |a_3|^p}$. (See also Wikipedia entry on $p$-norm at http://en.wikipedia.org/wiki/Norm_(mathematics)#p-norm).
2-norm¶
Probably most commonly used is the $p$-norm $||\phantom{x}||_p$ with $p=2$.
Suppose, we have a vector $\vec{a} = (1, 2, 3)$, then the 2-norm of this vector is written as $||\vec{a}||_2$ and computed as $||\vec{a}||_2 = \sqrt{|1|^2 + |2|^2 + |3|^2}$. Let's implement this in Python (writing code that is easy to digest and not optimised for performance):
import math
def pnorm_2(a):
"""Given a sequence a, returns the 2-norm of a."""
tmp = 0
for elem in a:
tmp = tmp + abs(elem)**2
return math.sqrt(tmp)
Let's test this with our example:
a = [1, 2, 3]
print(pnorm_2(a))
Let's compare this floating point number with the exact result. We use the occasion to introduce the symbolic python package sympy
:
import sympy
a_pnorm_2 = sympy.sqrt(1**2 + 2**2 + 3**2)
print(a_pnorm_2)
So the exact answer is that $||\vec{a}||_2 = \sqrt{14}$, and that is what we expect because $1 + 4 + 9 = 14$. We can ask symbolic python to convert $\sqrt{14}$ to a floating point number:
print(a_pnorm_2.evalf()) # evalf() evaluates the symbolic expression to a number
We briefly cross check that symbolic and numeric computation give the same result (we expect 0 to be printed):
print(pnorm_2(a) - a_pnorm_2.evalf())
So why is this norm $||\vec{a}||_2 = \sqrt{|1|^2 + |2|^2 + |3|^2}$ called 2-norm? Because we can write $\sqrt{x} = \sqrt[2]{x}$ as $x^{1/2}$, or in general we can write $\sqrt[p]{x} = x^{1/p}$. The 2-norm is the special case of the $p$-norm where $p=2$.
The meaning of this so called 2-norm, is that it measures Euclidian length of a vector. Imagine we have a vector pointing from (0, 0) to (1, 1), i.e. along the diagonal of the unit square with corners (0, 0), (0, 1), (1, 0) and (1, 1), then the length of the diagonal is $\sqrt{2}$:
pnorm_2([1, 1])
Okay, so we have digested what the 2-norm is, and written some Python code to compute it. As the 2-norm is one case of the more generic $p$-norm, let us write a more generic pnorm()
function that works for all values of p
:
def pnorm(a, p=2):
"""Returns p-norm of sequence a. By default, p=2.
p must be a positive integer.
"""
tmp = 0
for elem in a:
tmp = tmp + abs(elem) ** p
return tmp ** (1.0 / p)
Does it what we expect for $p=2$? (Yes)
pnorm(a, p=2)
1-norm¶
The 1-norm is also known as the Taxicab or Manhattan norm, see http://en.wikipedia.org/wiki/Norm_(mathematics)#Taxicab_norm_or_Manhattan_norm. In an euclidean vector space (such as $R^n$), it measures the distance from the origin to the point the vector points to assuming we can only move along orthogonal paths that are aligned with the orthogonal basis of the space. So for a vector $\vec{a} = (1, 2, 3)$, the distance should be 6 (that's one unit in x-direction, two in y, and three in z-direction):
pnorm(a, p=1)
Maximum or infinity norm¶
The $\infty$-norm or maximum norm is another somewhat commonly used measure. For a vector such as $\vec{a} = (1, 2, 3)$, it simply identifies the largest (i.e. maximum) element, so that $|| \vec{a}||_\infty = 3$.
Why is this maximum norm also called infinity norm? Because it is the $p$-norm with $p=\infty$. The argument goes such that we compute $|| \vec{a}||_\infty = \sqrt[\infty]{|1|^\infty + |2|^\infty + |3|^\infty}$, and that - in this case - $3^\infty$ grows a lot faster than $1^\infty$ and $2^\infty$, so that we can ignore $1^\infty$ (which evaluates just to 1 anyway) and we can also ignore $2^\infty$ in comparison to $3^\infty$. Once we have made this leap, we compute the infinite root of 3 to the power of infinity, i.e. $\sqrt[\infty]{3^\infty} = 3$. So we get the maximum component of $\vec{a}$, having computed it through the $p$-Norm, with $p=\infty$.
Some notes on infinity in numerical computations (and the $\infty$-norm)¶
Can we use our Python code to compute the maximum norm, though?
The first question is how to express the concept of infinity in numbers. There are some useful notes available at http://pymotw.com/2/math on the IEEE infinity flag for floating point numbers in Python, noting in particular an interesting inconsistency that with x = 1e160; y = x*x
the variable y
will hold the flag for infinity, but using x = 1e160; y = x**2
will create an OverflowError.
In any case, we can use the following expression to create a floating point variable with the IEEE Floating Point Representation of the infinity flag:
inf = float('inf')
Let's investigate this new object:
print("My type is {}.".format(type(inf)))
print("My value is {}.".format(inf))
This seems a good place for a health warning: the inf flag has been introduced to record particular types of errors (for example overflows), and it is not intended to be used for calculations.
Although the IEEE standard does try to show sensible behaviour. Here are some examples:
inf + inf # twice infinity is still infinity
10 + inf # any finite number plus infinity should be infinity
inf - inf # this is a difficult one as it depends on what inifinty means exactly
The answer "nan
" stands for Not A Number (NAN), and basically tells us that the computer cannot answer the question we have given to it.
The three examples above (happen to) do what we expect. So can we just use infinity as the order $p$ in our Pyton function to compute the p-norm? (No!)
pnorm(a, p=inf)
We expect 3.0 to be the correct answer, so 1.0 is wrong. So in short, we cannot just put the IEEE infinity representation in this equation to get the right result.
We can try to analyse this further
tmp1 = 1**inf + 2**inf + 3**inf # Adding up infinity flags,
print("tmp1 = {}".format(tmp1))
tmp2 = tmp1 ** (1./inf) # infinite root of infinity
print("tmp2 = {}".format(tmp2))
So it seems to boil down to the computation of $\sqrt[\infty]{\infty}$ and the IEEE floating point behaviour appears to be to report 1.0
.
Mathematically, we expect $\sqrt[\infty]{a^\infty} = ( a^\infty )^{\frac{1}{\infty}} = a $ to hold.
So in particular, we should find that
$\sqrt[\infty]{1^\infty} = (1^\infty )^{\frac{1}{\infty}} = 1$
and
$\sqrt[\infty]{2^\infty} = (2^\infty )^{\frac{1}{\infty}} = 2$
What is the behaviour if we (inappropriately!) try to do these computations with the infinity flag?
print((1.0**inf) ** (1/inf)) # expect 1.0
print((2.0**inf) ** (1/inf)) # expect 2.0
So this is not working.
Just to re-iterate: it is generally not a good idea to try to numerically evaluate definitions that depend on infinity, and it is also not right to attempt to carry out arithmetic operations using infinity, in particular not with the IEEE infinity flag. We are really just being curious here, exploring the behaviour of the system - knowing that it is not meant to work like this.
A better implementation of the p-norm function might be the following, where we treat the $p=\infty$ case separately:
import numpy
def pnorm(a, p=2):
"""Returns p-norm of sequence a. By default, p=2.
p must be a positive integer.
"""
if math.isinf(p):
# take absolute value for all numbers in a
a_abs = [abs(x) for x in a]
# return largest absolute value
return max(a_abs)
else:
tmp = 0
for elem in a:
tmp = tmp + abs(elem) ** p
return tmp ** (1.0 / p)
pnorm(a, float("inf"))
In any case, what triggered this whole notebook was the discussion during a coding session last Friday wether symbolic packages could compute the infinity norm correctly from the given p-norm definition. Here is the answer for our example of $\vec{a} = [1, 2, 3]$:
p = sympy.Symbol('p')
infnorm = (1**p + 2**p + 3**p) ** (1 / p)
print(infnorm)
Let's substitute $p$ with $\infty$, and we get the (wrong) result:
infnorm.subs(p, sympy.oo) #oo is infinity in sympy
To get the right result, we need to use the limit of $p \rightarrow \infty$.
In fact, we should never have written $|| \vec{a}||_\infty = \sqrt[\infty]{|1|^\infty + |2|^\infty + |3|^\infty}$, above. I use this opportunity to apologise profoundly to all mathematicians reading this for the loose use of infinity on this page, and the lack of limits above.
It would be better to say it like this: $|| \vec{a}||_\infty = \lim\limits_{p \rightarrow \infty} \sqrt[p]{|1|^p + |2|^p + |3|^p}$, i.e. we take the limit of $p$ approaching infinity. We have in fact used this picture above already, where we said "$2^\infty$ grows slower than $3^\infty$": what we mean is of course that $2^p$ grows slower than $2^p$ as we increase $p$.
We can also take this limit using the symbolic package, to (finally) arrive at the correct result:
sympy.limit(infnorm, p, sympy.oo) # take limit of infnorm for p -> infinity
Let's emphasise (again) that there is no need at all to carry out the calculations above; other than the curiosity regarding our computational (numeric and symbolic) capabilities and incapabilities.
About this document ...¶
This blog entry has been produced using the incredible IPython Notebook. When I completed the little entry, I was concerned I may have invested a lot of time (that reads "wasted my time") putting this together. After all, what is the value? Well, it tries to explain the idea of the p-norms, and at the same time allows to explore a little how representations of infinity are used in computational tools such as Python (based on CPython here) and symbolic packages such as sympy. Is it justified to go through all the effort of putting a webpage together that documents this? How much fun is one possibly allowed to have?
Interestingly, it didn't take that much time! It would have taken me a lot of time if I had to compile a C programme to explore how the infinity flag behaves, and if I had to record, understand and summarise this behaviour, before I could start to write it up in some way. I would then have to include code snippets and output from the program, and if I wanted to double check I got everything right, I would have to compile my test programs again, etc.
Using the IPython Notebook, I benefitted from (i) Python which - through its interpreter - makes testing little code snippets a lot easier than using a compiled language, and (ii) the IPython Notebook which helped greatly in writing this up: the explorative computing [that's the code cells with their output] are automatically integrated in the document, and can be executed within a second (select Cell -> Run All
).
This reflects patters we see in reseach and teaching increasingly: having the code, the documentation (including latex equations and URLs), and resulting plots in one place gives a significiant boost in reproducibility, but is also more efficient than having to put together code, data and figures manually in a report, say.