With very few exceptions, our discussion of OCaml so far has covered only the "core language" and some of the standard libraries. We can by now write and even compile small OCaml scripts, but we cannot yet build our own libraries. It is now time to change this. In this lesson, we will not always be as detailed in our explanations as in the previous ones, but rather take the pragmatic route and see that we get going - even though this inevitably means that we will occasionally have to deal with exotic situations which we cannot fully understand using only the material presented here. On the other hand, some very basic and general things have to be covered in great detail, as it is essential to develop an understanding of the underlying machinery.
We first have to take a closer look at the compiler. One should keep in mind that OCaml actually is not one language, but two, which, however, are syntactically almost the same (up to toplevel directives, that is). For many compiled languages, the process of turning an idea into working code will utilize files of many different types, some of them source files, some of them intermediate files. This is just as true for a C compiler (.c, .h, .o, .so, .s) as it is for, say, TeX (.tex, .dvi, .aux, .log), and also holds for OCaml. For some systems, file endings are just a convention, while others enforce certain names. OCaml is of the latter type: a file named something.ml will - for example - always be treated as an OCaml source file. What other file types does the compiler know about?
OCaml file types | |
---|---|
Type (machine code equivalent given in parens if different) | Meaning |
.ml | OCaml source file |
.mli | OCaml interface file |
.cmi | Compiled interface |
.cmo (.cmx) | Compiled object |
.cma (.cmxa) | Compiled object archive (library) |
.c | C source code |
.o | C object code |
.so | C shared object library |
It is nice that the OCaml compiler knows about C source files as well and will call the C compiler when necessary. One detail worth knowing is that the manpage of the ocamlc(1) compiler is incomplete. One may for example wish to explicitly specify which C compiler to use (say, gcc or Intel's icc, or use a C compiler wrapper such as mpicc). This is possible, and ocamlc --help as well as the OCaml online documentation tell us that we can use the -cc option for this, but this is not mentioned in the manpage. Also note that the order of objects given to the compiler does matter.
What is this .mli interface thing about? A .ml file provides a compilation unit. We may now choose not to export all the definitions in that compilation unit to the outside world, or make some type definitions opaque (that is, we just tell the outside about the existence of a given type, but not its complete realization). This is what the .mli file is for. Furthermore, we may want to put extra documentation into that file - which is provided in the form of comments that adhere to certain conventions. ocamldoc will then allow us to automatically generate HTML and LaTeX documentation for our module. A .mli file is more or less just the list of variable types and type definitions which the toplevel prints out if we load a .ml file. It can be auto-generated from a .ml file (for later editing) via ocamlc -i code.ml. The fine print says that we actually often can also go without such an interface file, but as a matter of good practice, we should always provide one.
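The effect of an interface file can be tried out within a single source file, since a module signature plays the same role for a nested module as a .mli file does for a compilation unit. The following is only a sketch: the names COUNTER, Counter and helper are made up for illustration and do not come from any library.

```ocaml
(* The signature corresponds to the .mli file: it lists what is exported. *)
module type COUNTER = sig
  type t                      (* opaque: the representation stays hidden *)
  val create : unit -> t
  val incr : t -> t
  val value : t -> int
end

(* The structure corresponds to the .ml file. *)
module Counter : COUNTER = struct
  type t = int
  let create () = 0
  let helper n = n + 1        (* not in the signature, hence not exported *)
  let incr c = helper c
  let value c = c
end

let () =
  let c = Counter.incr (Counter.incr (Counter.create ())) in
  Printf.printf "counter = %d\n" (Counter.value c)
```

Outside of Counter, an expression such as Counter.helper 1 or Counter.create () + 1 is rejected by the type checker, as neither the helper function nor the fact that t is an int is visible through the signature.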
The behaviour of OCaml with respect to re-compilation of modules often is somewhat picky. It will include cryptographic hash fingerprints in compiled interface .cmi definitions and a few other places, and as a consequence, if library/module B uses some independent module A, which undergoes a change (even if this is only the addition of one more function), then B will complain that its idea of the interface of A no longer matches reality, that is, B has to be recompiled because A was. Such behaviour is somewhat unexpected, in particular to C programmers, and there have been long discussions whether this makes sense and is a good thing or not. As one may guess, it does not exactly make life easy for module maintainers and in a sense, OCaml tries to be "holier than CVS version control" here.
Due to very similar issues (especially with component dependencies), OCaml may feel quite a bit unnatural for seasoned C developers when it comes to writing Makefiles. Indeed, many newcomers seem to experience major difficulties here. Hence, one normally is much better off using a pre-existing tool that deals with most of this makefile complexity: OCamlMakefile. This is a Makefile that is to be included in our own Makefiles and provides quite a lot of intelligence that does most of the really dirty work. In Debian, it's part of the ocaml-tools package. OCamlMakefile cooperates quite well with ocamlfind, which is a package and dependency management system for OCaml. (Objective Caml provides a simple library location and loading framework right out of the box, as can be seen by giving a directive like #load "unix.cma" to the toplevel, but findlib is much more flexible.) The ocamlfind Debian package is called ocaml-findlib.
When installing ocamlfind, one may want to make a few adjustments, especially if multiple users in the Unix group ocaml are supposed to be able to install libraries system-wide. On the author's system, this looks as follows:
/etc/ocamlfind.conf |
---|
|
Further adjustments may have to be made to the file /usr/lib/ocaml/3.08.3/ld.conf (this is unfortunate and should not be, as configuration files should always reside under /etc in Debian):
/usr/lib/ocaml/3.08.3/ld.conf |
---|
|
The /usr/local/lib/ocaml structure was then re-built in such a way that the stublibs directory is a direct subdirectory of it. /usr/local/lib/ocaml is owned by user root, group ocaml, and has mode 2775. Packages are installed as direct subdirectories.
In the project we are working on right now at the University of Southampton, there is a catch-it-all module which collects small useful snippets that are not present in the OCaml standard library and do not justify creating an individual new module either. In this module, one can find functions for degree to radian conversion just as well as a function to generate a random number with gaussian distribution. The directory looks like this:
In the snippets module's directory |
---|
|
Note that by default, the all: target will build both a bytecode and a native-code library. Other interesting dependencies we may want to include here are "doc" (auto-generate documentation) and "top" (build a toplevel) - maybe even "native-code" to build a fast compiled standalone executable. These are the most common ones; see the OCamlMakefile documentation for information on what else there is.
Note that we furthermore define a "mrproper" symbolic target for complete cleanup. This is nice and convenient. The name, by the way, was taken from the Linux kernel source makefiles. Now, if this builds correctly, we can use the power of OCamlMakefile and findlib to install it with a simple make libinstall and remove it again with a make libuninstall. If some other package now depended on snippets, we would add snippets to the PACKS= line in the Makefile, as well as to the requires line of the META file, which may then e.g. look like this: requires = "snippets qhull mt19937". The META file is used by findlib.
This more or less tells us how to build and install simple OCaml modules. "Simple" in the sense that we do not use sophisticated foreign language interface techniques, but just basic plain OCaml.
If we want to use such an installed package from the toplevel, the magical incantations are:
Loading findlib packages from the toplevel - example |
---|
|
...where the open just imports all the symbols into our namespace so that we can refer to them directly instead of having to use names such as Snippets.deg_rad. Note that the compiler does not understand toplevel directives. So, whenever we have the situation that some piece of OCaml code (a small script, say) is to be fed both into the toplevel and the compiler, it makes sense to add a small toplevel loader wrapper script which just contains the package loading directives, plus a final #use "mycode.ml";; directive which then loads the "interesting" code.
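The difference between qualified names and names imported via open can be sketched without the installed package. The Snippets module below is a hypothetical stand-in defined in place; only the name deg_rad is taken from the text above, and both its body and the assumption that it converts degrees to radians are ours.

```ocaml
(* Hypothetical stand-in for the installed snippets package. *)
module Snippets = struct
  let pi = 4.0 *. atan 1.0
  (* Assumed meaning: convert an angle from degrees to radians. *)
  let deg_rad deg = deg *. pi /. 180.0
end

(* Qualified access works without any open. *)
let right_angle = Snippets.deg_rad 90.0

(* After the open, the same function is reachable by its short name. *)
open Snippets
let straight_angle = deg_rad 180.0

let () = Printf.printf "%.4f %.4f\n" right_angle straight_angle
```

As this file contains no toplevel directives, it can be fed to the toplevel and to the compiler alike.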
Even if we had the most elegant, most efficient, most effective language available, its value would be reduced greatly if it did not come with an interface to C. The very simple reason for this is that nowadays, a lot of important functionality is available in the form of C libraries - especially if speed matters for the task. Sooner or later, we want to tap that resource, and hence, every language must provide a C interface, just as every serious programmer should be somewhat proficient with C (even if he does not use it often).
Concerning foreign language interfaces in general, there are different levels of sophistication, and the answer to the question of what can be achieved depends just as much on the abilities of the language as on those of the programmers of both the interface code and the code to be interfaced. In fact, many things can go wrong, and should insurmountable problems arise, they are practically always a consequence of bad design. So, it pays to spend quite some time thinking about foreign language issues, no matter if one takes on the role of library implementor, language designer, or interface code writer.
One very important point to keep in mind is that it is very easy to build large and complex systems by combining components which were never intended to interoperate well. As a rule of thumb, the amount of internal interface friction in a software project with N components usually is proportional to N^a, with the exponent a being closer to 2 than to 1. So, the most problematic moments in the development of an N-component application are whenever there is a version update/change of at least one component. From this perspective, the philosophy underlying the Debian GNU/Linux system, namely to provide a stable platform as well as a large library of components whose behaviour is and stays frozen for long times (up to important bugfixes), comes as a blessing, and presumably such a concept of behavioural (version) stability and reliability should find wider recognition as a vital quality factor.
Another important general observation: designing the interface to a foreign language library is a task that often needs a lot of thought and sometimes a certain amount of experience. One of the major problems is that the fundamental philosophy underlying the two languages which are to be bridged is different. (If it were not, at least one of them would be completely unnecessary.) So, the big question is how to catch the spirit - the key ideas - underlying the piece of code to be interfaced and map this in the best possible way to something that feels smooth from the perspective of the new language. There is no patent recipe for this, but there are a few common observations one should know about. First of all, it is often a good idea to keep a foreign language interface as direct and as low level as possible. While it may be tempting to put more intelligence into the interface and employ the power of the new language to make things more convenient, this is a double-edged sword, as it may easily lead to a violation of the principle of least surprise. In particular if the library in question is well known and frequently used in its natural environment, one must assume that many users of the interface will have expectations that were shaped by the original behaviour. In fact, the author of this lesson still remembers quite well the shock of finding out that there is a subtle difference in the behaviour of Perl's built-in fork() and Unix fork(2): in case of a fork failure, the latter returns -1 (and sets errno appropriately), while the former returns undef! That may certainly have been well meant, but such surprises may have disastrous consequences. Another aspect is: the simpler the interface, the less effort it takes to adjust it to new versions.
If, however, the target language gives strong safety guarantees (type safety, bounds checks, crash stability and such), one must assume that the user of the interfaced library will expect those safety belts to also work with that particular library. The same holds for automatic dynamical resource management, such as garbage collection. So, one usually would like to provide at least these features in a manually written interface - but again, depending on the situation, there may be exceptions (e.g. if this turns out to be prohibitively complicated, or if it is important to stay at the lowest level, or if it greatly simplifies things, or just if it's too small a problem to be worth the effort).
Writing interface code manually often is a tedious task with many repetitive steps. True, there are situations where one should use as much intelligence and wisdom as possible, but there also are situations where one has to interface dozens of functions in a uniform way. Then, it is often a good idea not to write all that code by hand, but to use a program that automatically generates that part of the interface code.
Nowadays, there are tools such as the Simplified Wrapper and Interface Generator (SWIG) that can help a lot.
Just because something can be done in principle, it need not be a good idea. If something looks challenging, this may be for a variety of reasons, one of them being that the author whose work we decided to build on had chosen an excessively inelegant and clumsy approach, maybe because he did not fully understand the nature of the problem. (This is not necessarily a fault of the author. If we only did things we understand well all the time, there would be virtually no progress at all! Hence, we desperately need brave souls that tackle problems which nobody understands properly, and many a breakthrough was achieved only after a lot of confusion.) Nevertheless, one should always keep the question in mind: does it really have to be that way? Can't we achieve the same or a better result with a more elegant approach, maybe using some other piece of code?
In particular, whenever you have the impression that you have to fight against the original author, as his intentions do not at all match yours, and his code evolves in a different direction than the one you are interested in, better look for an alternative.
As mentioned above, there are different levels of sophistication in the art of calling foreign functions - from just executing a C function to print a value to stderr to writing code that allows one to turn C callback functions into callback functions in the higher-level language. Other exotic applications may include using the C compiler at run time to map C code strings into dynamically loaded shared objects, or telling a C library to use the dynamic memory management of the higher-level language instead of malloc(3)/free(3) in order to put its dynamic state into serializable strings (cf. PCRE).
Quite in general, one may want a good foreign function interface to provide the following capabilities:
main()
function.mpicc
).
.so
libraries (that are to be used in
conjunction) utilize code written in the extension language at the
same time.
Fortunately, OCaml provides us with a quite powerful C interface that allows us to do most of the things mentioned above - even if this sometimes seems to be just by accident. (For example, it is possible to turn OCaml code into a C-linkable .so shared library that almost behaves like any other C library, which is great, but this does not work for 64-bit x86 code.) One problem, however, is that at present, the documentation is somewhat scattered, and there are important things to be known which are not well documented at all.
After these general remarks, it is perhaps appropriate to look at something more practical and explain the details by means of a few typical examples. What one should know about the C interface is:
.c
source file arguments
and know what to do with them. (See above.)
alien-funcall
and extern-alien
to
directly call code from a C library (i.e. all the wrapping information
is provided in Lisp, not C).
Gc.full_major ();;
"
will result in a program segfault. ("Garbage collection" customarily
is abbreviated as "GC".)
But let us have a look at the perhaps simplest example of a C function exported to OCaml. We will wrap this up in a dedicated package, which we call "c_examples". We start out by creating a corresponding directory, into which we put the following files:
Directory Structure |
---|
|
META |
|
Makefile |
|
c_examples.mli |
|
c_examples.ml |
|
c_examples_impl.c |
|
Next, to demonstrate that this actually works, we do the following:
|
Now that we have seen that this actually works, let us look in some more detail at what is going on here. Clearly, the .mli file just specifies what to export - as we only provide one interfaced function anyway, this is very straightforward. In the .ml file, we declare our function as external, i.e. implemented by a piece of code adhering to C linking conventions, whose linker name we give as well. The implementation of that function takes as argument an OCaml value, and has to return an OCaml value. Here, this is supposed to encode an integer, and we need conversion functions to map OCaml values to C values and back. For int, this is pretty straightforward, but we have to keep in mind that an OCaml integer is one bit shorter than a machine word, so on a 32-bit system the OCaml integer range is strictly smaller than the C integer range!
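How large OCaml integers are on a given machine can be checked from OCaml itself. The following small sketch only uses values from the standard library (Sys.word_size, Sys.int_size, max_int, min_int); Sys.int_size requires a reasonably recent compiler.

```ocaml
let () =
  (* One bit of every machine word is reserved, so ints are one bit short. *)
  Printf.printf "word size: %d bits, int size: %d bits\n"
    Sys.word_size Sys.int_size;
  Printf.printf "max_int = %d\n" max_int;
  (* Arithmetic wraps around silently at the boundary. *)
  Printf.printf "max_int + 1 = min_int: %b\n" (max_int + 1 = min_int)
```

Note that on a 64-bit system, the comparison with C depends on which C type is meant: an OCaml integer is still narrower than a machine word there, but typically wider than a C int.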
Note the use of the CAMLparamX() and CAMLreturn macros to declare and handle entities of type value that represent OCaml values. These are necessary to live in harmony with the garbage collector. Indeed, there are more of this type, the next most important ones being the CAMLlocalX() macros. More about this later.
We will now proceed to extend our example with further definitions that demonstrate a few basic techniques. First, let us see how to wrap up higher-order functions, how to pass floating-point numbers, and how to add primitive debugging facilities to the C code: we add the following definitions and then rebuild:
Extending our example |
---|
c_examples.mli |
|
c_examples.ml |
|
c_examples_impl.c |
|
Indeed, after rebuilding:
|
The process of wrapping up and unwrapping values from one language for another language sometimes is called marshalling, but nowadays this more often refers to "serialization", that is, mapping a piece of data with potentially complex structure to a string to store it and retrieve it later on. (Incidentally, OCaml provides a Marshal library which is all about serialization.) As we see, the names of the functions and macros we use to do the mapping are somewhat non-uniform, but so are their internal mechanics: Int_val is not much more than a very simple bit-shifting macro, while copy_double will have to heap-allocate space to hold a double-float value.
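To illustrate the "serialization" meaning of the word, here is a minimal round trip through the Marshal library just mentioned. The sample data are made up; Marshal.to_string and Marshal.from_string are the standard library functions.

```ocaml
let () =
  let original = [ (1, "one"); (2, "two"); (3, "three") ] in
  (* Serialize the whole structure into a string. *)
  let bytes = Marshal.to_string original [] in
  (* Deserialization is not type-checked: the annotation is our promise. *)
  let restored : (int * string) list = Marshal.from_string bytes 0 in
  Printf.printf "%d bytes, round trip ok: %b\n"
    (String.length bytes) (restored = original)
```

The type annotation on restored deserves attention: if it does not match what was written, the program is likely to crash rather than to raise a clean error.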
One should know that higher order functions can only be wrapped in this way if they have up to five arguments. (Usually, this is enough.) Other techniques have to be used for functions with more arguments.
Let us try something slightly more challenging next: we want to wrap some functions from a library other than libc or libm and pass around strings. Let us use the low level X11 library xlib here. In particular, we want to be able to open and close a connection to an X display and obtain the X server vendor identification string. We hence add the following to our example:
Extending our example: xlib |
---|
Makefile |
|
c_examples.mli |
|
c_examples.ml |
|
c_examples_impl.c |
|
Testing the xlib functions |
---|
|
This is already quite nice, but it opens up new questions. If we "lose" an Xlib pointer, it will be garbage collected, but the connection stays open. We might instead prefer to have that particular case handled in such a way that an accidentally forgotten active Xlib connection that is garbage collected will be closed automatically. Furthermore, we might even want to have Xlib functions that are called on an inactive/invalid display raise an exception. All this can indeed be implemented, and will be our next major example. But before we consider this, let us make an excursion that explains some more of the background mechanics underlying the low level implementation and in particular the C interfaces of many functional languages.
If we look under the hood of all the fancy syntax and ignore code generator issues for now, the relevant questions at the lowest level are: how are the fundamental data types implemented and mapped to machine data types, and what conventions are in place that have to be respected? One important component in this game is the Garbage Collector, which will from time to time scan the heap (= all the memory managed by the language where values can reside) and recycle pieces of data that have become unreachable and hence ballast.
What type information has to be available at run time? At the very least, the system has to be able to find out whether a certain OCaml value, stored in a given region of memory, contains references to other OCaml values or not. The Garbage Collector has to know this so that it can scan all the memory that has been allocated in our running program for "live" objects and declare all other data as "dead", that is, unreachable. This evidently means that the memory representation of an OCaml array (or tuple, say), which may reference (i.e. contain pointers to) other OCaml values, must contain information about the length of the array (or tuple). One could imagine that from the perspective of the garbage collector, the world of hierarchically constructed types is much simpler, and that indeed, arrays and tuples might even have precisely the same representation in memory: Both represent vectors of OCaml values, and even if they behave very differently from the programmer's perspective, there is no reason why they should not be just the same internally: after all, the question what one can do with these data is resolved entirely at compile time.
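For the actual OCaml implementation, this can be checked with the Obj module, which exposes the run-time representation of values. Obj is an internal interface that regular code should not use, and what it reveals may change between compiler versions, so the following is a sketch for exploration only.

```ocaml
(* Print the header information the garbage collector sees for a block. *)
let describe name (v : Obj.t) =
  Printf.printf "%s: tag=%d size=%d\n" name (Obj.tag v) (Obj.size v)

let () =
  let tuple = Obj.repr (10, 20, 30) in
  let array = Obj.repr [| 10; 20; 30 |] in
  describe "tuple" tuple;
  describe "array" array
```

Both values should be reported with the same tag and a size of three words. Arrays of floating-point numbers are a special case that does not follow this pattern.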
So, we may imagine an internal data representation scheme where all constant-time addressable vectors (tuples, arrays) appear as a region of memory that contains a single header word (or at most a few words) that provides length information, followed by pointers to the contents. This actually would be quite similar to the way data are represented internally in the GCL (GNU Common Lisp) system (see object.h in the GCL sources, especially the definition of "union lispunion"), only that the structure of the header is a little bit more complicated, and we retain enough information to derive the actual concrete type at run time - which we have to, as LISP is dynamically typed. Non-compiler scripting languages like Perl or Python, which also are dynamically typed, use similar approaches, but typically are way more verbose in their internal value data structures (see e.g. the corresponding definition of typedef struct _object (...) PyObject and the corresponding comments in the Python sources), and frequently include in particular a reference count, as they usually do not have a proper garbage collection (which, by the way, is a shame, given the existence of the very powerful Boehm-Demers-Weiser garbage collector library).
Suppose we stick with such a scheme where every value is represented by a pointer to a piece of memory that holds all the data. Whenever we pass even the smallest piece of data - like an ordinary machine integer number - into a function, the system first has to do dynamic memory allocation to obtain space where to put the number, adorn it with some header that says, basically, "there is only this single one word of data, and it is not a pointer to further values", and then pass a pointer to that piece of memory. The recipient will then have to look up the number through that pointer. Now, this "boxing" and "unboxing" is quite a lot of time-consuming overhead, as it is ubiquitous and hence has to be done over and over again. Therefore, it is evidently desirable to have a compiler that is intelligent enough to avoid unnecessary boxing (maybe via inlining) for purely internal functions that are not visible to the outside. However, when calling a function from an independent binary-code library, we presumably will have to go through this boxing and unboxing.
Imagine creating something as simple as an array of one million integers. If OCaml used the scheme suggested right above, we would require two data words (32 bit on 32 bit machines) to represent every integer, and have an array of pointers, so we would need three words of memory to encode a single word of data! Clearly, this is a highly unsatisfactory situation. (Indeed, this is just exactly what happens with GCL: see!) Can this be avoided? Actually, one might think so, as we have all the type information available at compile time that allows us to discern what's a pointer to a value and what's just raw data. But consider the following example:
|
(Incidentally, this is also a nice example that shows that the complexity of the type of an expression can grow at least exponentially with the size of the expression.) What code would the compiler have to generate so that the garbage collector can know which entries of all the tuples in this example hold raw data, and which hold pointers to tuple values? If you think about it long enough, you will come to the conclusion that actually, we require one bit of information for every tuple slot. We might conceive collecting these in a bit-vector which we place right after the tuple header word. This may indeed be possible, but would make the garbage collector somewhat clumsy. The approach usually taken instead makes use of the observation that pointers to values are aligned to divisible-by-four addresses. That is, the two least significant bits of these pointers are unused, and always zero. Suppose now we implement the following scheme: value references will not take the form of ordinary memory pointers to the address where the referenced value lies, but instead be pointers to that address plus one. When we want to use this as a pointer, we use CPU instructions with fixed-offset addressing that cancel this off-by-one. (This is not a problem for CISC CPUs, which have such addressing modes in their assembly language opcode set, and also not a problem for superscalar RISC CPUs, which just have to do one more offset calculation on one of their integer units - and actually, speeding up offset calculations is just one of the major reasons why they do have more than one integer unit (and usually just one memory access unit) in parallel.) We now declare that every tuple entry whose least significant bit is a "one" is such a special pointer, and everything that has a zero as its least significant bit is an "immediate value", that is, the word itself carries all the data.
In particular, we may encode false and true as the binary values 0b00 and 0b10, respectively. The integer N we encode as N*2. Addition and subtraction will still work as usual, but when we multiply or divide, we have to do one additional bit-shifting operation (which usually is quite little effort in comparison to the multiplication). This means that we will not be able to discern the memory representations of, say, 1 and true, but this does not matter, as it is of no relevance to the garbage collector, and all conflicts that may happen have been prevented by the compile-time type checking. Likewise, we can encode single characters as immediate values. Functions such as Char.code then may be just eliminated by the compiler.
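The arithmetic on such immediate values can be simulated with ordinary OCaml integers standing in for machine words. This sketch models the scheme just described (N stored as N*2), not OCaml's real representation; all function names are made up for illustration.

```ocaml
(* Immediate integers in the scheme described above: N is stored as N*2. *)
let encode n = n lsl 1
let decode w = w asr 1
let is_immediate w = w land 1 = 0

(* Sums of encoded words are already correctly encoded: 2a + 2b = 2(a+b). *)
let add x y = x + y
let sub x y = x - y

(* Products need one corrective shift: (2a * 2b) / 2 = 2(a*b). *)
let mul x y = (x * y) asr 1

(* Quotients lose the factor 2, which has to be put back: (2a / 2b) * 2. *)
let div x y = (x / y) lsl 1

let () =
  let a = encode 6 and b = encode (-7) in
  Printf.printf "%d %d %d %d\n"
    (decode (add a b)) (decode (sub a b)) (decode (mul a b))
    (decode (div (encode 42) a))
```

Running this prints -1 13 -42 7. A real implementation would have to be more careful about overflow in the intermediate product, which this sketch ignores.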
Such a pointer tagging scheme is what most functional compiler systems use nowadays. There are, however, differences in the tagging schemes implemented. CMUCL/SBCL for example align all memory cells to 8-byte boundaries and use the three least significant bits as type tags to discern cons cells, characters, structures, arrays, etc. See object.tex in the CVS sources. OCaml chooses to use a least significant bit of 1 to denote integers, which is very unusual. Other systems may implement other slight variations on the general subject, such as using high bits instead of low bits. An interesting but very useful curiosity that works without any extra pointer tag bits is the Boehm-Demers-Weiser conservative Garbage Collector for C, which comes as a drop-in malloc() replacement - indeed this is so efficient that some quite reasonable functional languages (Bigloo Scheme, for example) decided not to implement their own GC, but rely on this library instead. How can this possibly work? Basically: if something looks like a pointer, we just assume it could be a pointer and scan the corresponding region.
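Which OCaml values are immediate and which are pointers to heap blocks can again be observed through the internal Obj module. The helper name show is ours; Obj.is_int and Obj.is_block are the actual library functions, and the usual warning applies that Obj is not meant for everyday code.

```ocaml
(* Report whether a value is an immediate word or a pointer to a block. *)
let show name (v : Obj.t) =
  Printf.printf "%-6s immediate=%b block=%b\n" name (Obj.is_int v) (Obj.is_block v)

let () =
  show "int" (Obj.repr 42);
  show "char" (Obj.repr 'x');
  show "bool" (Obj.repr true);
  (* A float stored on its own lives in a heap block of its own. *)
  show "float" (Obj.repr 3.14);
  show "tuple" (Obj.repr (1, 2))
```

Integers, characters and booleans should be reported as immediate values, whereas the float and the tuple are blocks.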
Now, one might say that if a CPU were especially designed to support functional languages, it should provide extra type tag bits for every value. With modern 64-bit CPUs, there usually is little need for fast full 64-bit integers, so we may well afford providing only 62-bit arithmetic, and pointers will not use the full 64-bit address range anyway due to MMU limitations. (Typically, a page will consist of 1024 8-byte entries, hence use up 13 address bits. The usual three-level MMUs then only can use 10*3+13=43 address bits. Seen that way, going to 48 instead of 64 bits may have been more reasonable.) What is slightly special about OCaml is that its implementors have deliberately chosen to use internal representations that do not allow one to re-derive enough type information to print the value in a meaningful way. Internally, there is no distinction between "false" and "0", say. This is somewhat unfortunate, as it means that there is no way to implement an ad-hoc polymorphic debug-printing function of type 'a -> string that just prints out some OCaml value in a meaningful way - similar to Perl's Data::Dumper.
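That values of different types share one and the same run-time representation can be made visible with Obj.magic, which reinterprets a value without converting it. This is strictly a demonstration: Obj.magic defeats the type checker and has no place in production code.

```ocaml
let () =
  (* Booleans, unit and characters are all just small integers internally. *)
  Printf.printf "false -> %d, true -> %d\n"
    (Obj.magic false : int) (Obj.magic true : int);
  Printf.printf "() -> %d, 'A' -> %d\n"
    (Obj.magic () : int) (Obj.magic 'A' : int);
  (* The generic comparison sees no difference between false and 0. *)
  Printf.printf "false = 0 at run time: %b\n" (Obj.repr false = Obj.repr 0)
```

A would-be generic printer that is handed such a word has no means to tell whether it should print false, 0 or ().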
Even though we might not be meant to know what is going on in this file, it is nevertheless worthwhile to have a look at file:///usr/lib/ocaml/3.08.3/caml/mlvalues.h to see how some of the low-level definitions work. Note that we are not supposed to rely on that particular realization, as this may change in the future!
What are the ML-specific macros such as CAMLprim, CAMLparam1, CAMLlocal1, CAMLreturn for? Roughly speaking, CAMLprim has to do with exporting our functions properly for OCaml. The CAMLparam/CAMLlocal macros are required for garbage collection. We can imagine situations where some allocated piece of memory "almost becomes garbage" in the sense that all references to it are lost, except some that are passed into C. As a garbage collection may be triggered at almost any point in time, we must make sure that even if we are inside C code that holds the last references to a given value, the garbage collector will know that this value still is in use. In the c_ex_x_open_display_v1 C function in our example, we first introduce a variable of type value named block, which is made visible to the GC. Then, we allocate a block which will have a header tag (Abstract_tag) that tells the GC that this region of memory does not hold a string, an array or tuple, or any other kind of value OCaml may want to deal with in some special way: it will just contain "raw data", and it will always be up to our code to interpret this in the proper way. We generally are requested to use the Field and Store_field macros to retrieve data from and store data into the fields of a block, if these fields contain ML values. For C data (like pointers) stored in an abstract block, we must not use Store_field, as this would tell the GC to keep track of the value in that slot and regard it as a ML value that has to be scanned - with disastrous consequences. Hence, we introduce our own Store_c_field macro to make explicit that we do want to store a value without making the GC worry about it. This macro actually is implemented in a somewhat hackish way and perhaps should rather be part of official OCaml, but at the time of this writing, it is not.
Every entry in a block will be large enough to hold one ML value, and in our example, we implicitly use the slightly dangerous assumption that a value is at least as large as a pointer. (However, this actually seems to be true on all platforms.) Note that if we were to construct a float array from within C on a 32-bit system, we would have to allocate a block with twice as many value slots as the number of entries of our floating-point array, and we should use the Double_field and Store_double_field macros to access them. Here, our payload data is either a C Display pointer, or a null pointer, denoting an invalid display. The other C implementations of functions operating on X displays will extract that value and handle it in an appropriate way. Note that the example code is sometimes a bit more verbose than strictly necessary. This is just to clarify the general structure.
Quite often, when we wrap C-controlled resources in such a way, we may want to provide means by which the resource is freed automatically should it become garbage. In general, it is a good idea not to rely on such GC finalization as the sole mechanism to free resources, but to at least provide an explicit means of de-allocation. Depending on the resource, we may even want to consider it an error that should be reported if it ever ends up being GC-finalized. One way to implement finalization would be to use the Gc.finalise function on x_display values, register x_close_display as a finalizer, and wrap up our raw x_open_display function accordingly on the OCaml side. We may also provide a finalizer written in C. This will be shown in the next example. In addition, we will make sure that using x_server_vendor on an invalid X display will raise a special exception defined by us, which will provide both a human-readable problem description and an OCaml tag telling us what went wrong.
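The Gc.finalise approach described above might be sketched on the OCaml side as follows. This is only a minimal model under stated assumptions: the record type display and the function raw_open_display are hypothetical stand-ins for the abstract x_display type and the C-implemented x_open_display, so that the sketch runs without any C code. It shows the pattern of combining explicit de-allocation with a finalizer as a safety net, and of counting GC-triggered closes so that they could be reported as leaks.

```ocaml
(* Hypothetical stand-in for the C-backed x_display value. *)
type display = { name : string; mutable is_open : bool }

(* Stand-in for the raw x_open_display primitive implemented in C. *)
let raw_open_display name = { name = name; is_open = true }

(* Explicit de-allocation; safe to call more than once. *)
let close_display d =
  if d.is_open then d.is_open <- false

(* How many displays had to be closed by the GC instead of by the user. *)
let closed_by_gc = ref 0

(* Open a display and register a finalizer as a safety net. *)
let open_display name =
  let d = raw_open_display name in
  Gc.finalise
    (fun d ->
       if d.is_open then begin
         incr closed_by_gc;   (* could also be logged as a resource leak *)
         close_display d
       end)
    d;
  d
```

Because close_display is idempotent, it does no harm if the finalizer runs on a display that the user has already closed explicitly.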
Extending our example: finalization and exceptions |
---|
c_examples.mli |
|
c_examples.ml |
|
c_examples_impl.c |
|
Testing the new xlib functions |
---|
|
Let us briefly discuss what's new. First, we are now using alloc_final() to allocate our custom-data blocks. We need to allocate one extra value entry and use the second slot (having index 1) for our data, as a pointer to the finalization function will go into slot 0. Actually, this is not 100% true: alloc_final() is just a legacy compatibility function for the more general and more flexible alloc_custom() function, which also allows us to specify other custom functions that handle, say, serialization to strings, hashing, and comparison. None of this makes much sense for X display pointers, so we just stay with the simpler approach. The other two parameters control how frequently the GC is called after allocating entities of this type. The first value is a measure of the relative amount of resources used by this entity (we just use 1 here), the second one is a measure of how many of these we allow the system to allocate before the GC has to be called in order to try to reclaim some that may have become garbage. These "ten allocations per GC" can be seen at the end of our transcript.
Concerning exception handling, we first have to introduce an exception on the OCaml side, and register this with a special name, so that we can locate it from within C by that name. Then, we build the argument tuple - we could have used alloc with the special tag denoting a tuple, but alloc_tuple is a more convenient shorthand. Note that the parentheses in the exception definition are mandatory! If we only needed less sophisticated exception handling, we might prefer the much simpler raise_with_string instead.
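The OCaml side of this arrangement might look roughly like the following sketch. The tag type x_error_kind, the exception name X_error, and the registration string "c_examples_x_error" are assumptions made for illustration, not the names used in the original listing. The point is the parenthesized argument type: it declares a constructor with a single argument that is a tuple, which is the shape of value that C code builds with alloc_tuple.

```ocaml
(* Hypothetical tag type saying what went wrong. *)
type x_error_kind = Invalid_display | Other_problem

(* The parentheses make this a constructor with ONE argument, a tuple. *)
exception X_error of (string * x_error_kind)

(* Register the exception under a name that C code can look up.
   The name string is an assumption made for this sketch. *)
let () =
  Callback.register_exception "c_examples_x_error"
    (X_error ("", Other_problem))

(* Handle the exception exactly as if it had been raised from C. *)
let describe (f : unit -> unit) =
  try f (); "ok" with
  | X_error (msg, Invalid_display) -> "invalid display: " ^ msg
  | X_error (msg, Other_problem) -> "other problem: " ^ msg
```

With the parenthesized form, a ready-made tuple can be passed directly, as in raise (X_error arg) where arg is a (string * x_error_kind) pair. Without the parentheses, the constructor would take two separate arguments and this would be rejected by the type checker.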
As we have seen, the functional way of thinking about the decomposition of an algorithmic problem into sub-tasks, each realized by a specialized little helper function, is very natural. Consequently, we find such a style of programming also in some C libraries. One typical application is the specification of C callback handlers. Suppose that we have a C library that provides an opaque C structure called - say - "animal", for which we can register a C callback function that is called whenever this animal has to make a sound. Typically, the C library implementor will have anticipated that the library user might need more flexibility than what can be provided by registering just a C function. With our background in functional programming, we may now see it this way: C "functions" are not functions, but just routines. A proper function is a piece of code specifying what to do, plus maybe some extra contextual information. We generally just called this a "proper function" so far. If one wants to emphasize the role of the contextual data grouped together with the function, this is sometimes called a "closure". A parameter to a callback-setting function that provides such context, which is then passed to the registered callback whenever it is executed, is a "closure parameter". This sounds a bit convoluted - but we will see an example soon.
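To make the distinction between a routine with a closure parameter and a proper function concrete, here is a small sketch in pure OCaml. The record type c_style_callback and all names in it are hypothetical; it merely imitates what a C library does when it stores a routine pointer together with a user-supplied context pointer.

```ocaml
(* C style: a context-free routine plus a separately stored context,
   which the library hands back on every call ("closure parameter"). *)
type 'ctx c_style_callback = {
  routine : 'ctx -> string -> string;
  context : 'ctx;
}

let call_c_style cb animal_name = cb.routine cb.context animal_name

let explicit =
  { routine = (fun sound animal_name -> animal_name ^ " says " ^ sound);
    context = "woof" }

(* OCaml style: the context is captured by the function itself. *)
let make_sound_callback sound =
  fun animal_name -> animal_name ^ " says " ^ sound

let implicit = make_sound_callback "woof"
```

Both call_c_style explicit "dog" and implicit "dog" compute the same string. The difference is only in who carries the context around: the caller in the first case, the function value in the second.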
When we wrap up such a library for OCaml, we will usually want to match the spirit of the original callback approach as closely as possible. On the OCaml side, we will not require a closure parameter to the callback function, as OCaml already has proper functions. On the other hand, we have to use a callback wrapper on the C side that uses the closure parameter to pass around the OCaml function.
The details are best studied by looking at an example. Note that one should pay very close attention here - making things smooth for the user of our code will require some tricky magic under the hood.
This is the C library we want to interface (and a C example):
The animal library |
---|
animal.h : |
|
animal.c : |
|
animal_example.c : |
|
In order to wrap this, we make the following further modifications:
Extending our example: Lifting callback-setting from C to OCaml |
---|
Makefile |
|
c_examples.mli |
|
c_examples.ml |
|
c_examples_impl.c |
|
Quite an example, indeed. Let us see now that this really works as advertised:
Running the Callback example |
---|
|
The key idea is: We have to provide a C data pointer when we register our callback function. Actually, what we want to pass here is the ML function, but this may be moved around in memory by the GC. So, we have to pass a pointer to a C memory region holding the ML function. But then, we have to make sure that the GC will recognize this C-allocated memory as a location that holds an ML value, which should be treated as a root for heap scanning, and updated if the value is moved around. Therefore, we have to register_global_root it - and unregister and free it once we get rid of the object for which we registered the callback. The reader should take the time to think this through.
Actually, this unfortunately means that we will encounter an ugly problem if the callback function we register is a closure containing the object for which we registered the callback. The reason is that the callback-holding object will be responsible for removing the global GC root in its finalizer - but if we make this object accessible through that global GC root, it never will be finalized. In other words, if we write code like the following, this means asking for trouble:
Running the Callback example |
---|
|
(XXX Actually, if I run this e.g. as (test 1000), I get a segfault, but strangely, the address reported for the callback function is always the same. This should not be possible! Something is wrong with this discussion. Have to investigate!)
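The root-retention problem described above can be modelled in pure OCaml, independently of the C code. In the following sketch, everything is a hypothetical stand-in: the animal record replaces the C-backed object, and the global_roots table plays the role of the memory registered with register_global_root. It only illustrates why the finalizer cannot run once the callback closes over the object. It does not reproduce or explain the segfault mentioned in the note above.

```ocaml
(* Hypothetical stand-in for the OCaml value wrapping a C animal. *)
type animal = { id : int; mutable calls : int }

(* Plays the role of the global GC roots registered from C:
   whatever is stored here stays reachable. *)
let global_roots : (int, unit -> string) Hashtbl.t = Hashtbl.create 8

let finalized = ref 0

let make_animal id =
  let a = { id = id; calls = 0 } in
  (* The finalizer is responsible for removing the root again. *)
  Gc.finalise
    (fun a -> Hashtbl.remove global_roots a.id; incr finalized)
    a;
  a

let set_callback a f = Hashtbl.replace global_roots a.id f

let trouble () =
  let a = make_animal 1 in
  (* The closure captures [a], so the root keeps [a] alive. *)
  set_callback a (fun () ->
    a.calls <- a.calls + 1;
    Printf.sprintf "animal %d has spoken %d time(s)" a.id a.calls)

let () =
  trouble ();
  Gc.full_major ()   (* [a] is still reachable, so no finalizer runs *)
```

After trouble returns, no user code holds the animal any more, yet it stays reachable through the table entry that only its own finalizer would remove. One way out is to require an explicit call that removes the callback before the object is dropped.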
This brings us about as far as we want (or have) to go with our discussion of the C interface. Let us conclude this lesson with this final pearl: a module providing functionality that allows us to specify a (Real d-dimensional Space -> Real k-dimensional Space) function in the form of a string containing C code. This will then be put into a C source code file, compiled, dynamically loaded, and linked from within OCaml, so that we can c_register a string and in the end obtain a very fast float array -> float array OCaml function implementing this computation!
Documentation of the (very small) ML interface is still lacking, especially concerning re-use of the output vector, and error checking should be improved (catching compiler errors as well as making sure the wrapped function comes with array length bounds checks). Nevertheless, this is a self-contained example showing many of the techniques we have discussed here, plus a few new ones, in particular: using the module system to define a weak hash table, using this to keep track of the C-wrapped functions that are still in use, and using dynamic loading of C libraries.
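As a rough idea of what "using the module system to define a weak hash table" can look like, here is a minimal sketch based on the standard library's Weak.Make functor. The wrapped record and the register function are hypothetical and much simpler than what the actual module needs. They only show how a weak table can keep track of entries without keeping them alive itself.

```ocaml
(* Hypothetical record describing one C-wrapped function. *)
type wrapped = { symbol : string; mutable uses : int }

(* A weak hash set: entries disappear once nothing else references them. *)
module Wrapped_set = Weak.Make (struct
  type t = wrapped
  let equal a b = a.symbol = b.symbol
  let hash a = Hashtbl.hash a.symbol
end)

let registry = Wrapped_set.create 16

(* Return the already registered entry for [symbol], or add a new one. *)
let register symbol =
  let w = Wrapped_set.merge registry { symbol = symbol; uses = 0 } in
  w.uses <- w.uses + 1;
  w
```

Here Wrapped_set.merge returns the existing entry if an equal one is still alive, so registering the same symbol twice yields the same physical value as long as the first result is still referenced somewhere.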
The complete c_examples module we discussed above (for completeness)
One very simple way to demonstrate this is to use the shell command "ulimit -v 200000" to artificially limit virtual memory size to 200000 KB and then start GCL. If we try to define a vector of 20 million values, this would require about 80 MB of RAM. If we provide the initial value, all entries will point to the same entity. But look what happens if we start putting different numbers into different places:
GCL and memory management |
---|
|
In comparison, MzScheme in the same situation |
|
So, this hints at GCL doing heap allocation of integers, while MzScheme does not. Neither do quite a few other functional systems.