Evidently, functions play a ubiquitous role in a functionally oriented language. So, it is a good idea to spend a bit more time with them. In a nutshell, the contents of this lesson may be summarized as: "with very few exceptions, functions just behave in the natural way one should expect them to behave."
In particular, this means:
Now, all this may sound very obvious. Actually, it is not, as there are many programming languages which do not provide us with functions that have all these properties. Let us just look at the third point to give an example. In C, one could write:
Anonymous entities in C |
---|
|
The major differences between OCaml functions and the mathematical concept of a function as a mapping are:
In mathematics, f(5) will always be the same value, no matter where the evaluation of f at the point 5 occurs in a line of reasoning. In OCaml, there are functions for which this is not the case, such as Random.int: (int -> int).
Evaluating an OCaml function may also have side effects: for example, Unix.unlink: (string -> unit) may delete a file.
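
To make the first difference concrete, here is a minimal sketch. The name square is made up for this illustration; Random.self_init and Random.int are standard library functions.

```ocaml
(* A function in the mathematical sense: the same argument always
   yields the same result. *)
let square x = x * x

(* Random.int is not such a function: two evaluations with the same
   argument may well give different results. *)
let () =
  Random.self_init ();
  let first = Random.int 1000 and second = Random.int 1000 in
  Printf.printf "square 5 = %d, and again %d\n" (square 5) (square 5);
  Printf.printf "Random.int 1000 = %d, and again %d\n" first second
```

All we can rely on for Random.int 1000 is that the result lies between 0 and 999.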
From the C example above, we see that there is a fundamental difference between functions and other values in C: functions cannot be constructed anonymously. Actually, this is just a symptom of a much deeper problem: the C programming language does not have any "functions"! What do I mean by that? Even though C terminology calls constructions like "function_number_five" above a "function", technically speaking it is not one: it is merely a pointer to a block of memory holding machine code instructions that go through a sequence of operations and can be called with arguments. From our point of view, this is a subroutine rather than a proper function.
It is very instructive to look at the history of the Perl programming language in order to understand what is going wrong here. Indeed, the Perl developers got this issue of anonymous values wrong over and over again, first for arrays, then for file handles, then for regular expressions, and at some point in time had to repair it. This is one reason why the Perl programming language is so uneven.
Initially, Perl did not have a concept of arrays as other programming languages do: instead of a container value to which one could refer as this array, Perl provided a unique concept of "plurality", that is, there are means to refer to that value (a single one), or those values (a collection). Now, this concept of plurality did not allow one to talk about e.g. using an array of arrays to represent a matrix: in no useful sense could those values be composed of elements each of which represented these other values. Putting a "group" of values into another group of values just gives a larger group in Perl. One could give collections of values a name, but one could not treat them as a separate new entity which could be passed around freely in a program.
So, what people did here was to introduce a concept of "symbolic references", which roughly means that they introduced a way to interpret an ordinary string as a variable name, so that one could then just pass around the string name of an array and use that whenever needed. This seemed to somewhat solve the problem, but surely, it was neither elegant nor overly bright: when one wanted to write a function that created a matrix, one had to invent internal-only unique name strings for every row, as this method really introduced a set of new variables whose values had to be the rows. This led to all sorts of problems: it was difficult to ensure that, on the one hand, making a new matrix did not overwrite a pre-existing one and, on the other hand, the entries of a matrix were freed properly once it was no longer in use.
Making an array of arrays in old perl (schematically) |
---|
|
What's changed with modern Perl is that they introduced two new concepts: first, that of a proper array in the sense of "this collection of values", second, this new container data type comes with an "anonymous array constructor", that is, a way to create an array without having to give it a name.
What changed with Perl5 |
---|
|
This example serves us as a double warning, all the more so as, with Perl in particular, the very same problem re-occurred in other guises (as mentioned), where the initial (or even present) approaches were about as bad as these "symbolic references". First, one must admit that it is surprisingly easy to be confused about the issue of why and how to properly support anonymous composite values in a programming language. Second, it usually is a bad idea to think too little about the underlying abstract concepts. In the end, it will just turn out that one cannot avoid them anyway, and inventing ad-hoc approaches that do not get the fundamental idea right might bite back badly.
We already have seen how to create anonymous functions in OCaml in the last lesson. As a reminder:
Reminder: Anonymous functions in OCaml |
---|
|
Here, we created a list of three functions mapping integers to integers. As the type of such a function is int -> int, the type of that particular list has to be (int -> int) list. Note that the parentheses around (fun ...) are actually necessary here (but admittedly for somewhat unfortunate reasons).
This settles another point: how to put functions into lists (or arrays, tuples, ...), and how to retrieve them and evaluate them. Whenever we want to evaluate a function, we just put the argument after the function. This is a very important general rule and we will have more to say about this very soon.
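
A minimal sketch of such a list and its use might look as follows. The three functions and the names some_functions, doubled and results are chosen freely for this illustration.

```ocaml
(* A list of three anonymous functions; its type is (int -> int) list.
   The parentheses around each (fun ...) are needed. *)
let some_functions = [ (fun x -> x + 1); (fun x -> x * 2); (fun x -> x * x) ]

(* Retrieve the second function and evaluate it by putting the
   argument after it. *)
let doubled = (List.nth some_functions 1) 10

(* Evaluate every function in the list at the same argument. *)
let results = List.map (fun f -> f 5) some_functions
```

Without the parentheses, the body of the first fun would extend across the semicolons and swallow the rest of the list.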
What does it mean then that the behaviour of a function, when evaluated, must not depend on where that evaluation takes place? Actually, this is something very simple and natural, and it may just be that we are not yet familiar enough with functions for it not to sound strange. So, let us demonstrate the concept with arrays, where it is just the same: Suppose I give the following definition:
Arrays and Scope |
---|
|
If we go back to our list of operators from the first lesson, we see that there is a difference between "being the same" and "being equal", and we can express this difference in meaning in OCaml by using either the "==" or the "=" comparison operator, the former testing for same-ness. Now, in our situation, we get (look sharp):
Arrays and Scope, continued |
---|
|
So, my_array_v1 and my_array_v2 may look very similar at first, but actually, they are structurally very different: in the first case, the first and second entry of this array are the very same entity, while in the second case, they are not. So, if I just replaced my_array_v2.(0).(1) with 500, I would get what I asked for, but if I did this with my_array_v1.(0).(1), then my_array_v1.(1).(1) would change accordingly!
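
The situation can be sketched as follows. The concrete entries are an assumption made for this illustration; only the names my_array_v1, my_array_v2, a1 and a2 are taken from the text.

```ocaml
(* a1 and a2 are only visible inside this let ... in. *)
let my_array_v1 =
  let a1 = [| 1; 2; 3 |] and a2 = [| 4; 5; 6 |] in
  [| a1; a1; a2 |]

(* Here every row is written out separately. *)
let my_array_v2 = [| [| 1; 2; 3 |]; [| 1; 2; 3 |]; [| 4; 5; 6 |] |]

(* Both arrays are equal ... *)
let equal_contents = (my_array_v1 = my_array_v2)          (* true *)
(* ... but only in v1 are rows 0 and 1 the very same entity. *)
let same_rows_v1 = (my_array_v1.(0) == my_array_v1.(1))   (* true *)
let same_rows_v2 = (my_array_v2.(0) == my_array_v2.(1))   (* false *)

(* Changing row 0 of v1 also shows up in row 1. *)
let () = my_array_v1.(0).(1) <- 500
let changed_too = my_array_v1.(1).(1)                      (* 500 *)

(* In v2, the rows are independent. *)
let () = my_array_v2.(0).(1) <- 500
let unchanged = my_array_v2.(1).(1)                        (* 2 *)
```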
This may not be as unexpected if we manipulated the rows in the scope where a1, a2 are defined, but it might require some getting used to that, in fact, these row entities survive beyond the scope where they are defined by being referenced in the final array-of-arrays value. After the definition of my_array_v1, the names a1 and a2 are gone forever - they were only valid locally. But through the value returned from that scope, we retain a handle on them.
It is just the same with functions. To demonstrate this, let us look at a function which is returned from a local scope that contained a variable which entered in the definition of that function. Here is an equivalent of the last example using functions instead of arrays:
Functions and Scope |
---|
|
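
A minimal sketch of such a definition could read as follows. The names shift and offset are made up for this illustration.

```ocaml
(* offset is a local name that is gone after the definition, but the
   function returned from that scope still knows its value. *)
let shift =
  let offset = 100 in
  fun x -> x + offset

let shifted = shift 5   (* 105 *)
```

Just as with the rows a1 and a2, the name offset cannot be used any more after the definition, yet its value lives on inside shift.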
I suppose there should be little debate that the particular behaviour of scoping issues and anonymously created arrays is both desirable and useful. But then, what would be more natural than to demand that with anonymously created functions, where these issues arise in just the same form again, things work just in the same way?
In fact, we already have seen something like this in the last lesson: remember the "vivify_polynomial" function which briefly showed up in the Horner example and the exercises? Here as well, we define a function which retains some memory over values from the scope it was defined in. This is a very powerful technique that allows us to do all sorts of tricks. To give a very simple example, we can define functions such as the following that map a number to a "machine" adding that number to another number:
Functions and Scope: another simple example |
---|
|
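
One way to write such a definition is sketched here. The names make_adder and add_five are hypothetical.

```ocaml
(* Map a number n to a "machine" that adds n to its argument. *)
let make_adder n = fun x -> x + n

(* The machine obtained from the number five remembers that number. *)
let add_five = make_adder 5
let eight = add_five 3             (* 8 *)

(* A machine may also be used right away without giving it a name. *)
let twelve = (make_adder 10) 2     (* 12 *)
```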
Actually, it's the simplest thing in the world. Or maybe not. Just for the sake of looking over the fence, here is the equivalent in LISP, in a form which is understood by both the Emacs interpreter as well as by a proper Common LISP system, such as Bruno Haible's CLISP:
The same example in LISP (Common LISP and Emacs LISP) |
---|
|
Not infrequently, one encounters the situation that, given a function of multiple arguments, we want to fix some of them and treat it as a function of fewer arguments. For example, kinetic energy is a function of mass and velocity. Usually, we want to consider a given body with fixed mass. Then, it is just a function of velocity. This can be modeled very nicely and directly with these techniques:
Example: fixing parameters |
---|
|
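
A sketch of the idea, with hypothetical names and an arbitrarily chosen mass of 2.0:

```ocaml
(* Kinetic energy as a function of mass and velocity: E = m v^2 / 2. *)
let kinetic_energy mass velocity = 0.5 *. mass *. velocity *. velocity

(* Fixing the mass leaves a function of the velocity alone. *)
let energy_of_body = kinetic_energy 2.0

let e = energy_of_body 3.0   (* 0.5 * 2 * 3 * 3 = 9.0 *)
```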
Let us consider something more convoluted, but actually very useful: as we have seen, we can map values like the number five above to functions that "know" about that value. Of course, the initial values can also be functions, and there are some very interesting mathematical concepts which may be understood in terms of mappings from functions to functions which we can capture that way. Actually, what we can do is limited by the fact that all we can do in order to get information about what's "inside" a function is to call it. So, we will not be able to model, say, symbolic derivatives, in this way. We can, however, model taking a derivative, or gradient, numerically. Let us try this for the one-dimensional gradient of a function which is given as a float -> float function f in OCaml.

Now, how does one compute such a gradient? We will have to evaluate the function f, as that is all we can do to f anyway. It will not help us to know f only at one place if we want to know something about how strongly it varies. So, we have to evaluate it at least at two places. And two is already sufficient to get a numerical approximation for the gradient. But we may want to do more in some situations. For the sake of simplicity, let us be content with only two evaluations here. What we furthermore have to know is the distance epsilon between the locations where we evaluate f. We would get the proper gradient by taking the limit where epsilon goes to zero, but we clearly cannot do this on the machine. So we will content ourselves with something much more primitive and just choose epsilon=10^(-8). (There are deeper reasons for choosing a value just about this large.) So, when we know epsilon and the function f, we can then make a function that maps a position x0 to the numerically determined derivative of f at x0 with step width epsilon. Let us have a try:
Defining the gradient - first try |
---|
|
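
A first try along these lines might look as follows. This is only a sketch under assumptions: the name grad_v1 is hypothetical, and taking epsilon and f together as a pair, as well as evaluating f symmetrically around x0, are choices made for this illustration.

```ocaml
(* Take the pair (epsilon, f) and return the function that maps x0 to
   a two-point estimate of the derivative of f at x0. The two points
   x0 - epsilon/2 and x0 + epsilon/2 are a distance epsilon apart. *)
let grad_v1 (epsilon, f) =
  let epsilon_half = 0.5 *. epsilon in
  let inv_epsilon = 1.0 /. epsilon in
  fun x0 -> (f (x0 +. epsilon_half) -. f (x0 -. epsilon_half)) *. inv_epsilon

(* The derivative of x^2 is 2x, so this should be close to 6.0. *)
let square_slope = grad_v1 (1.0e-8, (fun x -> x *. x))
let approx = square_slope 3.0
```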
Here, we made epsilon a parameter, which is slightly more useful than having it fixed as 10^(-8). One could always apply the parameter fixing technique shown above then. In particular, we may then introduce a function grad_with_epsilon_fixed which takes epsilon as a parameter and returns a gradient-taking function f that uses this epsilon:
Defining the gradient - a slight generalization |
---|
|
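
Such a generalization can be sketched as follows. The helper grad_v1, which takes the pair (epsilon, f), is an assumption of this sketch; grad_with_epsilon_fixed is the name used in the text.

```ocaml
(* Assumed helper: the gradient function for a pair (epsilon, f). *)
let grad_v1 (epsilon, f) =
  fun x0 ->
    (f (x0 +. 0.5 *. epsilon) -. f (x0 -. 0.5 *. epsilon)) /. epsilon

(* Fix epsilon first; the result is a gradient-taking function. *)
let grad_with_epsilon_fixed epsilon = fun f -> grad_v1 (epsilon, f)

let grad = grad_with_epsilon_fixed 1.0e-8

(* The derivative of sin at 0 is cos 0 = 1. *)
let slope = grad sin 0.0
```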
But if this is useful - a function that allows us to specify epsilon beforehand, why don't we just eliminate the intermediate pair, as well as the function taking that pair as an argument?
Defining the gradient - second try (simplification) |
---|
|
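
Without the intermediate pair, the definition might be sketched like this. The name grad_v2 is the one used in the text; the body, with its symmetric two-point formula, is an assumption.

```ocaml
(* Map epsilon to a function which maps f to its numerical gradient. *)
let grad_v2 epsilon =
  let epsilon_half = 0.5 *. epsilon in
  let inv_epsilon = 1.0 /. epsilon in
  fun f ->
    fun x0 -> (f (x0 +. epsilon_half) -. f (x0 -. epsilon_half)) *. inv_epsilon

(* The derivative of x^1.5 is 1.5 * sqrt x, which is 3 at x = 4. *)
let slope = grad_v2 1.0e-8 (fun x -> x ** 1.5) 4.0
```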
Recall what we have seen earlier: "Whenever we want to evaluate a function, we just put the argument after the function". This works here as well. grad_v2 1.0e-8 is a function, so if we just put the argument (fun x -> x**1.5) behind it, we evaluate the function for this argument. Extra parentheses are not necessary here. This is indeed the preferred notational style in OCaml for evaluating functions which themselves are the result of a function evaluation. Simple and convenient. And actually, it is not really worse for us than passing a pair, as there is no reason at the conceptual level why one should treat epsilon and f as belonging together in a pair. But then, we may just as well write e.g.:
Evaluation by putting arguments after functions |
---|
|
Here, evaluating epsilon_half and inv_epsilon is not really a lot of computational effort, so it does not hurt if we do not pre-compute these values, but compute 1/epsilon whenever we actually do compute a gradient value. Let me just put this at the innermost level to demonstrate something else, as we then get a nice repetitive (fun ... (fun ... structure:
Repetitive fun after fun |
---|
|
Remember that let something = fun argument -> body can be re-written as let something argument = body without a change in meaning? (Actually, there may be a difference if one looks very closely at memory requirements and execution speed. But even if there is, there should be no reason why it should be - an optimizer should be able to recognize this situation.) We actually can even do this repeatedly. Look how nicely all this then simplifies:
Multiple arguments at the left hand side of a let |
---|
|
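
The two notations can be put side by side in a sketch. The names grad_v3 and grad_v4 are hypothetical; both definitions are meant to describe the same function.

```ocaml
(* Nested anonymous functions, with everything at the innermost level. *)
let grad_v3 =
  fun epsilon ->
    fun f ->
      fun x0 ->
        let epsilon_half = 0.5 *. epsilon in
        let inv_epsilon = 1.0 /. epsilon in
        (f (x0 +. epsilon_half) -. f (x0 -. epsilon_half)) *. inv_epsilon

(* The same with all arguments at the left hand side of the let. *)
let grad_v4 epsilon f x0 =
  let epsilon_half = 0.5 *. epsilon in
  let inv_epsilon = 1.0 /. epsilon in
  (f (x0 +. epsilon_half) -. f (x0 -. epsilon_half)) *. inv_epsilon
```

Both can be used in just the same way, for example as grad_v4 1.0e-8 exp 0.0.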
Of course, one could just as well throw out the inv_epsilon here - there no longer is any point in carrying this around. But that's a very minor issue. Now, let us look at the type of any of those functions: OCaml reports it as "float -> (float -> float) -> float -> float". Here, one should know that for functions returning functions, the convention for types is that "a -> b -> c" will always mean a -> (b -> c), that is, parentheses have to be inserted to the right. So, fully parenthesized, this type would read: "float -> ((float -> float) -> (float -> float))". Indeed, we map a float (namely, epsilon) to a function mapping float -> float functions to other such functions - which is just what the gradient does.

As we have seen, there seems to be a kind of duality between functions whose arguments are tuples and functions which produce other functions: one may regard a function such as addition either as a mapping from a pair of numbers to a number, or alternatively, as a mapping from numbers to mappings from numbers to numbers, where every number N is mapped to the increase-by-N function. Both ways to express addition contain the same amount of information, but the advantage of the purely functional point of view is that it can do without any notion of tuples and such. In fact, it is easy to define a notion of a tuple purely in terms of such functions producing functions.
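
Both views of addition, and one possible way to build a pair out of functions alone, can be sketched as follows. All names here are made up for this illustration.

```ocaml
(* Addition as a mapping from a pair of numbers to a number ... *)
let add_pair (a, b) = a + b
(* ... and as a mapping from a number a to the increase-by-a function. *)
let add_curried a = fun b -> a + b

(* Converting between the two views loses no information. *)
let curry f = fun a -> fun b -> f (a, b)
let uncurry f = fun (a, b) -> f a b

(* A pair made of functions only: it hands both components to a
   selector function, which picks the one it wants. *)
let make_pair a b = fun selector -> selector a b
let first p = p (fun a _ -> a)
let second p = p (fun _ b -> b)

let p = make_pair 3 4
let sum = first p + second p   (* 7 *)
```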
For the example at hand, this means that we could just as well re-interpret this a bit more superficially as a function taking a float and a float -> float function f, as well as a position x0, and producing a float, which is the corresponding numerical epsilon-approximation to the gradient of f at x0.
Functions like these whose values are again functions are known as "higher-order functions", and the number of arguments one may feed into such a function is generally called the "arity" of that function. However, that terminology may be considered slightly misleading, as the major distinction is whether a language supports functions properly or not, rather than the maximal arity of a function.
This convenient way to deal with functions of multiple arguments as functions mapping functions to functions comes at a small price, however. First of all, there are no functions of zero arguments: whenever a "function" does not really depend on an argument, the best we can do is to make clear that we deal with an evaluation by using () as a pseudo-argument, which of course is not used in the computation. (How could it be anyway?) Second, we cannot really have "variable argument" functions. That is, while in Common LISP, there are functions that can be evaluated with an arbitrary number of arguments such as (gcd 60 80 100), something similar cannot exist in OCaml. Usually, neither really is a noticeable problem.
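
Both points can be illustrated with a short sketch. The names roll_die, gcd2 and gcd_list are hypothetical, and passing a list in place of a variable number of arguments is merely one common substitute, not something the language prescribes.

```ocaml
(* A "function of zero arguments" takes () as its pseudo-argument. *)
let roll_die () = 1 + Random.int 6

(* Greatest common divisor of two non-negative integers. *)
let rec gcd2 a b = if b = 0 then a else gcd2 b (a mod b)

(* In place of (gcd 60 80 100), we pass all the numbers as one list. *)
let gcd_list numbers = List.fold_left gcd2 0 numbers

let g = gcd_list [60; 80; 100]   (* 20 *)
```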
One should note that the vast majority of OCaml's library functions use precisely this style to pass multiple arguments. The order of arguments usually is that of most reasonable increasing specialization. For example, String.concat takes as arguments a string representing a "glue sequence" and a list of strings, and produces a new string which is all the strings from the list concatenated to one another, with the glue sequence placed between any two adjacent strings. One most likely would want to use this with a given glue sequence, such as "\n", or ":", or ", and ", etc. on a variety of string lists, so it seems appropriate to have the glue string as the first argument.
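
Fixing the glue sequence first can be sketched like this. The names join_with_commas and join_lines are made up for this illustration; String.concat is the library function discussed above.

```ocaml
(* Fix the glue string; what remains maps string lists to strings. *)
let join_with_commas = String.concat ", "
let join_lines = String.concat "\n"

let colours = join_with_commas ["red"; "green"; "blue"]
(* "red, green, blue" *)
let two_lines = join_lines ["first"; "second"]
(* "first\nsecond" *)
```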
A very interesting library function which I wanted to point out is "Array.init". This will map a length N and a function mapping an index to the corresponding array element to an array that consists of the values of that function for all indices from 0 up to and including N-1:
Array.init example |
---|
|
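
A small sketch of its use, with names chosen freely for this illustration:

```ocaml
(* The array of the squares of the indices 0, 1, 2, 3, 4. *)
let squares = Array.init 5 (fun i -> i * i)   (* [|0; 1; 4; 9; 16|] *)

(* The element function may also ignore the index altogether. *)
let zeros = Array.init 3 (fun _ -> 0.0)       (* [|0.; 0.; 0.|] *)
```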
Unfortunately, we cannot yet fully understand the type of this function - just as that of quite a few other library functions. But with a little bit of intuition on what they should do, this should not be a problem: if we just use them, they will normally work as expected.
One more notational detail: instead of fun x -> fun y -> body, we may also for anonymous functions always write fun x y -> body, and likewise for larger numbers of arguments.
Actually, all this really was cheating a bit. From this lesson, one may get the impression that OCaml behaves in a much more systematic way than it actually does. I want to warn my audience that there indeed are quite a few dark corners where things do not work out as expected.
Now that we have become somewhat proficient with using OCaml from within an Emacs shell, it is perhaps appropriate to point out some other ways to interact with the OCaml interpreter that sometimes are more useful. The most important one is Emacs' caml-mode. As the audience may have noticed, whenever we load a .ml file into an Emacs buffer, this will activate Emacs' caml major mode. As in every Emacs major mode, Control-h m will give a short list of the most important mode-specific keystrokes. What's interesting is that we can just use Control-c Control-e whenever the cursor is somewhere on a lengthy OCaml expression to send this expression to an OCaml sub-process attached to Emacs and see what it evaluates to. Emacs will even be so nice as to first start that OCaml process for us if necessary. Another useful keystroke is Control-c Control-h, which will show the part of the OCaml documentation belonging to the function the cursor is on. Also useful is Control-c tab for auto-completion.
There is another Emacs mode for editing OCaml, the so-called tuareg-mode, usually available as a separate package. This claims to be more intelligent than bare caml-mode, but to some extent this will also mean that it will more easily show strange opinions on OCaml code.
Within Emacs, there are multiple ways to access the OCaml documentation: either via Control-c Control-h in caml-mode, or via the info system, by navigating to the OCaml entry. (The Emacs info system can be opened with Control-h i.) Another useful utility to navigate the OCaml documentation is the ocamlbrowser program, as this can often also refer directly to the implementation of a given function.
Define a function that counts the number of occurrences of a given character (such as space, tab, newline, etc.) in a string.
Define a "scalar product" function on float arrays.
(Hint: this is structurally somewhat similar to the Horner's method example in the last lesson: Here, we walk through two arrays, remembering a partial sum in every step.)
Define a function that uses Random.float to numerically determine the value of the integral of a float -> float function on a given interval. (Note: Random.float might give random numbers of too low quality for serious applications.)
Define a function that maps a nonnegative integer number N to a floating-point unit matrix (in the form of an array of arrays).
Define a function that generalizes Array.init to matrices: given a number of rows and columns, and a function mapping a row and column number to the corresponding entry, it will produce a matrix.
Define a function that multiplies matrices of the aforementioned form.
Use Array.init and String.concat (and maybe some other library functions which you will find in the documentation of the Array, List, and String modules) to define a plotting function that behaves as follows:
Array.init example |
---|
|