The idea of a computer automatically programming itself is a very old, desirable and elusive goal. Automatic Programming has, in the past, had a very bad reputation. This is probably because, intuitively, it would seem that writing software which is able to write software should be easier than writing software able to prove theorems, or paint pictures, etc. However, computer programming is a very difficult task, which involves intelligence, creativity, understanding, cunning and guile. In short, it is as difficult to get a computer to program as it is to get it to do anything else. So, when early attempts at automatic programming largely failed to deliver what they promised, people began to avoid the term automatic programming, and generally stayed away from the subject.

Many AI techniques disguise the fact that, to some extent, they are performing automated programming. If we think about decision tree learning techniques, for example, the end product is a decision tree, which if it is to be used to actually make decisions for us, has to be "executed", i.e., information about the system is going to be given as input, and an answer will be computed. In this case, the program doing the computing is fairly simple - decision trees can be easily translated into a bunch of "if-then" statements. It's a similar situation with Artificial Neural Networks. So, many AI techniques are kind-of doing automated programming. The Genetic Programming (GP) people are more explicit about this - they state clearly that their GP engines program software automatically.

As you will have guessed from the name, the way Genetic Programming engines generate programs is by following an evolutionary approach. The general approach is as follows. The user specifies the task to be undertaken (or problem to be solved) using an evaluation function to express what the evolved programs should do. They also specify what kind of things the programs will be able to use during the computation, e.g., whether or not they will be able to multiply two numbers together. Then, an initial population of programs is generated at random. Each program is translated, compiled and executed, and how well it performs with respect to the task is assessed. This enables the calculation of a fitness value for each of the programs, and the best of them are chosen for reproduction. Programs are combined or mutated into offspring, which are added to the next generation of programs. This process repeats until a termination condition is met.

This leaves open the following questions, which we look at in this lecture:

Obviously, there have been many answers to these questions, and some approaches to genetic programming have failed where others have succeeded. A big decision, as with most AI techniques is the choice of representation, because many of the questions above depend on the way in which programs are represented. Hence, we will look at this question first.

17.1 A Graphical Representation of Programs

We look here at how to represent programs in a way that enables one program to be combined with another. To do this, we will need to be able to remove parts of a program in a meaningful way, and to add new parts to programs. We are used to writing programs procedurally as a series of instructions: do this, then do that, if this is true, then do this and that, etc. Those of us lucky enough to have programmed in Prolog also know how to write programs declaratively, so that we specify what we want, and the Prolog interpreter finds answers for us. In both cases, the programs are just lines of code. Therefore, one possible way for two programs to be combined would be to jumble up the lines of code. With the bit strings in genetic algorithms, it made sense to keep regions of the string intact, rather than jumble up the bits randomly, as the valuable parts of the solution may be contained in those substrings. The same is true with our programs: presumably we would want to pass on long regions of code to the offspring program.

So, randomly combining lines is ruled out, and an approach similar to the cross-over routines for GAs is required. However, this approach is still problematic, as the combined programs would often not make any sense and hence a compiler would not be able to compile them. At the very least, we need our offspring programs to be compilable, and our representation scheme needs to take this into account. For this reason, most GP algorithms use a graphical representation of programs, in which each branch and sub-branch of the tree is syntactically self-contained, so that, if we combine programs by chopping off and adding subtrees, the resulting programs will still be syntactically valid.

The first thing we will do to find a good representation scheme is to say that our programs are going to be thought of as functions: they take in a set of values and output a single value. Such programs are called result producing branches and will typically form part of larger program structures which we will skip over for now. The next thing we will do is to specify that our graphs will be trees (i.e., no cycles), and that each node of the graph itself represents a function or an input to a function such as a variable or a constant. Demonstrating the graphical representation is easiest by example. Suppose we wanted a function which added together two numbers, X and Y, then took the square root of this sum, and, if the answer is less than 10, output X, otherwise output X divided by Y. The following graph represents a program that would do this for us:

Here, we see that there is an IFLTE node, which stands for "If-Less-Than-Else". The first and second nodes below this (counting from left to right), are the values which are tested: if the first is less than the second, then the IFLTE function returns the value of the third node below it, and if the first is not less than the second, the IFLTE function returns the value of the fourth node below it. The four nodes below IFLTE themselves represent functions, for instance the square root node takes a single input, which is the output of the plus function. The plus function takes X and Y as input and outputs their sum. We see that the second node is just the number 10 (because we are checking that the root of X+Y is less than 10). Also, the third node below IFLTE is X, which says that if the property is true, then output X, else output the division of X by Y.
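To make the tree representation concrete, here is a minimal sketch in Python. The nested-tuple encoding, the evaluate function and the names PROGRAM and env are our own illustrative choices, not taken from any particular GP system. The program encoded is the one described in words above.

```python
import math

# A program is either a terminal (a variable name such as "X", or a numeric
# constant) or a tuple (function_name, child, child, ...).
PROGRAM = ("IFLTE",
           ("SQRT", ("+", "X", "Y")),   # first value to compare
           10,                          # second value to compare
           "X",                         # returned if first < second
           ("/", "X", "Y"))             # returned otherwise


def evaluate(node, env):
    """Recursively evaluate a program tree; env maps variable names to values."""
    if isinstance(node, str):           # variable terminal
        return env[node]
    if not isinstance(node, tuple):     # constant terminal
        return node
    op, *args = node
    if op == "IFLTE":
        first, second, then_branch, else_branch = args
        # Only the branch that is selected gets evaluated.
        if evaluate(first, env) < evaluate(second, env):
            return evaluate(then_branch, env)
        return evaluate(else_branch, env)
    values = [evaluate(arg, env) for arg in args]
    if op == "+":
        return values[0] + values[1]
    if op == "/":
        return values[0] / values[1]
    if op == "SQRT":
        return math.sqrt(values[0])
    raise ValueError(f"unknown function: {op}")


print(evaluate(PROGRAM, {"X": 30, "Y": 6}))     # sqrt(36) = 6 < 10, so X: 30
print(evaluate(PROGRAM, {"X": 120, "Y": 24}))   # sqrt(144) = 12, so X / Y: 5.0
```

Because every subtree is itself a valid program, any subtree can be cut out or swapped in without breaking the syntax, which is the property the genetic operators rely on.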

As discussed below, the functions such as addition, multiplication, etc., and the constants allowed in the evolved programs are defined in advance by the user. Also, the programming constructs, such as if-then-else nodes and for-loop nodes, are specified in advance. There is no agreed upon formalism for the programming constructs, and how they are defined will affect the expressiveness of the programs containing them. For instance, below is an alternative representation for the above function:

Here, we see that the IFLTE node has been replaced by a more general IF node, which returns the value of the second node if the first node returns "true", else it returns the value of the third node below it. We see how this can be more expressive, because any boolean function, not just <, can be substituted into the program. Higher expressivity means that more programs can be found as potential solutions to the problem at hand. However, it also means that the search space of programs gets bigger, so solutions may not be found as quickly. Hence, we should think hard about what our evolved programs should do: if they only need to check whether one number is less than another, then we can perfectly well get by using just an IFLTE node. If, however, we need to check for equality, divisibility, and lots of other things, then perhaps we should use the second representation scheme.
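The trade-off can be seen in a small sketch, again assuming programs are encoded as nested tuples (an encoding chosen here purely for illustration). With a general IF node the test becomes an ordinary subtree, so any boolean function in the function set can appear there. The names FUNCTIONS, "==" and "DIVIDES" are hypothetical examples of what a user might supply.

```python
import math

# Hypothetical function set: arithmetic plus several boolean tests.
FUNCTIONS = {
    "+": lambda a, b: a + b,
    "/": lambda a, b: a / b,
    "SQRT": math.sqrt,
    "<": lambda a, b: a < b,
    "==": lambda a, b: a == b,
    "DIVIDES": lambda a, b: b % a == 0,   # True when a divides b exactly
}


def evaluate(node, env):
    """Evaluate a nested-tuple program; env maps variable names to values."""
    if isinstance(node, str):
        return env[node]
    if not isinstance(node, tuple):
        return node
    op, *args = node
    if op == "IF":
        condition, then_branch, else_branch = args
        chosen = then_branch if evaluate(condition, env) else else_branch
        return evaluate(chosen, env)
    return FUNCTIONS[op](*(evaluate(arg, env) for arg in args))


# The same function as before, written with IF and an explicit "<" node.
ALT_PROGRAM = ("IF", ("<", ("SQRT", ("+", "X", "Y")), 10), "X", ("/", "X", "Y"))

# Swapping in a different boolean test needs no new programming construct.
DIVISIBILITY_PROGRAM = ("IF", ("DIVIDES", "Y", "X"), "X", ("/", "X", "Y"))

print(evaluate(ALT_PROGRAM, {"X": 30, "Y": 6}))
print(evaluate(DIVISIBILITY_PROGRAM, {"X": 7, "Y": 2}))
```

Each boolean function added to FUNCTIONS enlarges the set of programs that can be expressed, and with it the space the GP engine has to search.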

17.2 Specifying the Program Task

Now that we know the kind of representation scheme to use, we can look at the question of how we will evolve programs in that representation. Remember that we are going to try to evolve a program to undertake a task. So, we must specify what that task is, in order for the fitness function to determine which programs are doing well at the task, and for the GP engine to know when to stop. As with Genetic Algorithms, this is often one of the hardest parts of working with GPs. One possibility for specifying the task is simply to give a set of input-output pairs. Then, a program will calculate an output for each of the inputs, and will get a certain number of them right, i.e., the program outputs the same as in the user-given input-output pair. The evaluation function is then defined as the proportion of inputs for which the correct output is generated.
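As a rough sketch of such an evaluation function, the Python below scores a candidate by the proportion of input-output pairs it gets right. Candidate programs are stood in for by plain Python functions, and the names proportion_correct, CASES and candidate_a to candidate_c are invented for this example. Treating a crashing program as simply wrong on that case is one possible design choice, not the only one.

```python
def proportion_correct(program, cases):
    """Fraction of (inputs, expected) pairs for which the program is right."""
    correct = 0
    for inputs, expected in cases:
        try:
            if program(*inputs) == expected:
                correct += 1
        except (ArithmeticError, ValueError):
            pass  # a program that crashes on a case scores nothing for it
    return correct / len(cases)


# User-supplied input-output pairs; the hidden target here is x + y.
CASES = [((1, 2), 3), ((2, 2), 4), ((0, 5), 5), ((3, 4), 7)]


def candidate_a(x, y):
    return x + y


def candidate_b(x, y):
    return x * y


def candidate_c(x, y):
    return x // (y - 2)   # divides by zero whenever y == 2


print(proportion_correct(candidate_a, CASES))   # 1.0
print(proportion_correct(candidate_b, CASES))   # 0.25
print(proportion_correct(candidate_c, CASES))   # 0.0
```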

For example, in this paper:

Discovery of Understandable Math Formulas Using Genetic Programming

the author evolves a series of programs which can find the highest common factor (HCF) of two integers (the point of the research is to show how this can be done so that the programs produced can be understood). To check how well the programs are doing with respect to finding the HCF of two integers, a set of triples (X,Y,Z) is supplied, where the HCF of X and Y is Z. Towards the end of the evolutionary process, the programs were getting all the examples correct, and when the programs were analysed, they did indeed calculate the HCF as per its mathematical specification. Notice how similar this is to machine learning programs being given sets of positive and negative examples to learn over (which reinforces the fact that we can see evolutionary approaches to AI as machine learning efforts).

Genetic programming has been used for applications where the real reason to evolve programs is to enjoy the output from the programs, i.e., the artefacts produced by the programs are more interesting than the programs themselves. For these kinds of applications, a more sophisticated fitness function is often required. For example, in their paper "Learning to Colour Greyscale Images", Penousal Machado, Andre Dias and Amilcar Cardoso used a GP approach to generate programs able to take a greyscale image and colour it in. They tried a variety of fitness functions to gauge how well the colouring in process had done. These functions used information such as pixel hue and intensity. For example, the following scary piece of mathematics was used as a fitness function:

This is obviously very specific to the task at hand. As with Genetic Algorithms, the specification of the evaluation function is nearly always problem specific.

17.3 Other Specifications

In addition to giving details of what task the evolved programs are supposed to undertake, the GP user must specify some more details before they can start a session. These include:

The function set is the set of functions, such as addition, multiplication, taking square roots etc., which will be the component parts of the evolved programs. As with the evaluation function, the set of functions will be hand-crafted for the particular task. For instance, if you are evolving a program to control a robot as it tries to find its way out of a maze, the functions will include things like turning left, turning right, going forward, etc. If you are evolving functions to manipulate images (see the application to generating art below), the functions will involve mathematical functions such as sine and cosine, and pixel functions, such as finding pixel colours, hues and intensities, and setting pixel colours, hues and intensities. The function set also includes the set of programmatic functions such as if-then-else and for-loops. It is often instructive to see if good programs can be evolved with, for example, while-loops but no for-loops.

The terminal set contains all the variables and constants which will appear in the evolved programs. This will typically include some numbers which may be randomly generated, and, as with the function set, it will include problem specific details. In a robot controlling scenario, it may be that movement functions are parameterised by directions such as left, right, forward, backwards, which would form part of the terminal set for that GP application. Similarly, in a graphics application, constants such as pi might be put into the terminal set. The terminal set is so called because, in the tree representations, the constants and variables are found at the end of the branches.

There are many possibilities for how the search will proceed, and the user should tweak various parameters to optimise the performance of the GP engine. The main consideration will be the size of the population, as this will affect the GP the most: larger populations will mean fewer generations in the time available, but will mean greater diversity within the population of programs. Given that the programs will grow as they evolve, another important parameter will be a cap on the length of the programs that can be produced. One criticism of GP approaches is that the programs produced are too large and complicated to be understood, so, if being able to understand the resulting programs is a consideration, the length of the programs should be kept relatively small. Other parameters will control various probabilities, including the probability that each genetic operator (see later) will be employed.

The ways to specify when the GP engine should stop are very similar to those for Genetic Algorithms. One possibility is to let the process run for a certain amount of time, or until it has produced a certain number of generations, then take the best individual produced in any generation. Many GP implementations enable the user to monitor the process and click on the stop button when it appears that the fitness of the individuals has reached a plateau. Alternatively, the user may specify that populations are continually produced until an individual which is above a certain fitness is produced.
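The specifications described above (function set, terminal set, search parameters and termination condition) could be gathered together roughly as follows. This is only a sketch: the class name GPSettings, the field names and every numeric value are assumptions made for illustration, and real GP packages organise their settings in their own ways.

```python
import math
import operator
from dataclasses import dataclass, field

# Function set: name -> (implementation, number of arguments).
FUNCTION_SET = {
    "+": (operator.add, 2),
    "*": (operator.mul, 2),
    "SQRT": (math.sqrt, 1),
}

# Terminal set: the variables and constants allowed at the leaves.
TERMINAL_SET = ["X", "Y", math.pi, 1.0]


@dataclass
class GPSettings:
    population_size: int = 500        # illustrative value
    max_program_size: int = 50        # cap on the number of nodes per program
    operator_probabilities: dict = field(
        default_factory=lambda: {
            "crossover": 0.85,
            "reproduction": 0.10,
            "mutation": 0.05,
        }
    )
    max_generations: int = 50         # stop after this many generations...
    target_fitness: float = 1.0       # ...or once an individual scores this well


def should_stop(settings, generation, best_fitness):
    """Termination test: generation budget used up, or a fit enough individual found."""
    return (generation >= settings.max_generations
            or best_fitness >= settings.target_fitness)


settings = GPSettings()
print(should_stop(settings, generation=3, best_fitness=0.4))    # False
print(should_stop(settings, generation=50, best_fitness=0.4))   # True
```

Keeping max_program_size small is the knob to reach for when the evolved programs need to stay readable.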

17.4 Evolving New Populations

Genetic Programming engines begin by generating an initial population of programs randomly. Each tree is generated by randomly choosing a function from the function set for every internal node in the tree, and setting the inputs to this to be constants and variables as terminals, again chosen randomly. Then, terminal points are chosen to be altered by adding in functions, etc. The programs will have many different shapes and sizes, subject to the maximum program size parameter specified by the user. Care must be taken to make sure that the input types to functions are correct. After the seeding of the initial population, individuals are selected to produce offspring. The production process uses one of a set of genetic operators, as described below. The old population is killed off, and the process is started again with the new population.
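One simple way to generate random trees is sketched below. Note that this is the commonly used "grow" style of recursive generation rather than the exact two-stage procedure described above, and that it caps tree depth instead of program length. The names FUNCTIONS, TERMINALS, random_tree and the probability p_terminal are assumptions made for the example. All functions here take and return numbers, so the type-correctness issue does not arise.

```python
import random

FUNCTIONS = {"+": 2, "*": 2, "SQRT": 1, "IFLTE": 4}   # name -> number of inputs
TERMINALS = ["X", "Y", 1, 2, 10]


def random_tree(rng, max_depth, p_terminal=0.3):
    """Grow a random program tree with at most max_depth levels of functions."""
    if max_depth == 0 or rng.random() < p_terminal:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    children = tuple(random_tree(rng, max_depth - 1, p_terminal)
                     for _ in range(FUNCTIONS[name]))
    return (name,) + children


def depth(tree):
    """Number of function nodes on the longest path from the root to a leaf."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(child) for child in tree[1:])


rng = random.Random(42)
population = [random_tree(rng, max_depth=3) for _ in range(5)]
for program in population:
    print(depth(program), program)
```

The trees come out in many shapes and sizes, from a single terminal up to the depth limit.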

The user has specified an evaluation function which calculates a value for each individual in the population. Different GP implementations use this in different ways, but always do so in such a way that the chance of an individual being chosen increases as the score it gets from the evaluation function increases. As with GAs, individuals are selected to go into an elite intermediate population (IP), from which individuals will be chosen to produce offspring. One approach to doing this is similar to the approach with Genetic Algorithms: the evaluation function assigns a probability to each individual in a mathematically principled way, and each one is allowed into the IP in a probabilistic fashion. So, for example, if an individual is assigned 0.8 by the fitness function (using the evaluation function), then a random number between 0 and 1 will be generated. If it is below 0.8, the individual will be allowed to reproduce; if not, it will be unlucky. An individual assigned 0.8 therefore gets into the IP about 80% of the time.
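A minimal sketch of this probabilistic admission step is given below, assuming that the fitness function has already mapped each individual to a probability between 0 and 1. An individual with probability 0.8 is admitted whenever the random draw falls below 0.8, so it gets through roughly 80% of the time. The function name and the three made-up individuals are purely illustrative.

```python
import random


def admit_to_intermediate_population(population, probability, rng):
    """Admit each individual independently with its own probability."""
    return [ind for ind in population if rng.random() < probability(ind)]


# Hypothetical individuals with the probabilities assigned to them.
scores = {"prog_a": 0.8, "prog_b": 0.5, "prog_c": 0.1}

rng = random.Random(0)
trials = 10_000
counts = {name: 0 for name in scores}
for _ in range(trials):
    admitted = admit_to_intermediate_population(list(scores), scores.get, rng)
    for name in admitted:
        counts[name] += 1

# The observed admission rates should sit close to the assigned probabilities.
for name in scores:
    print(name, counts[name] / trials)
```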

Another approach is to apply tournament selection: pairs of individuals are chosen at random and the fitter of the two is chosen for reproduction. This is meant to simulate the kind of competition that occurs in reproduction, and means that if two fairly unfit individuals are paired against each other, one of them is guaranteed to reproduce. A third way of choosing individuals is to rank them using the evaluation function and choose the ones at the top of the ranking, which means that only the best will be chosen for reproduction. As with most AI applications, it's a question of trying out different approaches to see which works for a particular problem.
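Both alternatives are short to write down. The sketch below assumes a population held in a list and a fitness function returning a number; the names tournament_select and top_ranked, and the four made-up individuals, are illustrative only.

```python
import random


def tournament_select(population, fitness, rng, size=2):
    """Pick `size` distinct individuals at random and return the fittest."""
    contestants = rng.sample(population, size)
    return max(contestants, key=fitness)


def top_ranked(population, fitness, how_many):
    """Rank the whole population and keep only the best `how_many`."""
    return sorted(population, key=fitness, reverse=True)[:how_many]


# Hypothetical individuals and their evaluation scores.
scores = {"p0": 0.1, "p1": 0.2, "p2": 0.6, "p3": 0.9}
population = list(scores)

rng = random.Random(7)
winners = [tournament_select(population, scores.get, rng) for _ in range(10)]
print(winners)                                   # p1 can win, but only against p0
print(top_ranked(population, scores.get, 2))     # ['p3', 'p2']
```

Tournament selection lets a weak individual such as p1 through whenever it happens to be paired with something weaker, whereas ranking never selects it, so the tournament keeps more diversity in the intermediate population.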

Individuals are chosen from the intermediate population and genetic operators are used to produce new individuals from old ones. Which operator is used at a particular time is chosen probabilistically, with the user specifying the probabilities. There are two types of genetic operators: ones which generate a new individual from a single parent, and ones which generate a new individual from a pair of parents. The simplest operator is called reproduction, which copies a single parent into the new generation. This means that copies of individuals from the old generation can make it into the new generation, which is why the original individuals in the old generation can be killed off entirely.
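Choosing an operator according to user-specified probabilities might look like the sketch below. The particular probabilities are invented for the example, and the operator names are just labels; the standard library call random.Random.choices does the weighted draw.

```python
import random
from collections import Counter

# Hypothetical user-specified probabilities for each genetic operator.
OPERATOR_PROBABILITIES = {"reproduction": 0.10, "mutation": 0.05, "crossover": 0.85}


def choose_operator(probabilities, rng):
    """Pick one operator name, weighted by its probability."""
    names = list(probabilities)
    weights = [probabilities[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(0)
tally = Counter(choose_operator(OPERATOR_PROBABILITIES, rng) for _ in range(1000))
print(tally)   # crossover should dominate, mutation should be rare
```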

The other genetic operator which produces offspring from a single individual is mutation. As with genetic algorithms, this operation is performed sparingly, and serves to help the population escape from local maxima. A point on the individual's tree is chosen at random, and the subtree below that point is removed, to be replaced by a randomly generated subtree, with the generation done in the same way as for the initial population. Sometimes the mutation is constrained so that functions in the tree can only be replaced by functions and terminal nodes can only be replaced by other terminal nodes. Below is an example of a random mutation on an individual program.

We see that the square root subtree has been removed and replaced, so that the program calculates root(17)*root(x) instead of root(x+y) down the left hand side of the tree.
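Subtree mutation can be sketched as follows, assuming programs are stored as nested tuples of the form (function_name, child, ...). The helpers paths, replace_at and random_tree, and the small function and terminal sets, are hypothetical names introduced for this example. The first print reproduces the kind of mutation just described by replacing the square root subtree by hand; the second applies a genuinely random mutation.

```python
import random

FUNCTIONS = {"+": 2, "*": 2, "/": 2, "SQRT": 1}   # name -> number of inputs
TERMINALS = ["X", "Y", 17]


def random_tree(rng, max_depth):
    """Grow a small random subtree (same idea as for the initial population)."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    return (name,) + tuple(random_tree(rng, max_depth - 1)
                           for _ in range(FUNCTIONS[name]))


def paths(tree, prefix=()):
    """Yield the path (a tuple of child indices) to every node of the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def replace_at(tree, path, new_subtree):
    """Return a copy of tree with the subtree at path replaced."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]


def mutate(tree, rng, max_depth=2):
    """Replace the subtree at a randomly chosen point with a random subtree."""
    point = rng.choice(list(paths(tree)))
    return replace_at(tree, point, random_tree(rng, max_depth))


parent = ("IFLTE", ("SQRT", ("+", "X", "Y")), 10, "X", ("/", "X", "Y"))
by_hand = replace_at(parent, (1,), ("*", ("SQRT", 17), ("SQRT", "X")))
print(by_hand)
print(mutate(parent, random.Random(0)))
```

Since tuples are immutable, replace_at builds a new tree and leaves the parent untouched, which is convenient when the same parent is reused by several operators.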

Another genetic operator is called crossover. This takes two parent individuals and chooses a point on the first and a point on the second at random. It then swaps the two subtrees which start at those points. The subtrees to be swapped are called the crossover fragments. This produces two offspring: the first parent with a fragment from the second, and the second parent with a fragment from the first. The following highlights this process:

We see that two children have been generated from the two parents. Note that the parents could be two copies of the same individual, in which case the operator has to make sure that the point chosen on the first copy is different to the point chosen on the second copy, otherwise the operator simply produces two copies of the original.
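A sketch of subtree crossover is given below, once more assuming the nested-tuple encoding (function_name, child, ...). The helper names paths, subtree_at, replace_at and size, and the two example parents, are made up for the illustration. For simplicity this version does not treat identical parents specially.

```python
import random


def paths(tree, prefix=()):
    """Yield the path (a tuple of child indices) to every node of the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def subtree_at(tree, path):
    """Follow a path of child indices down from the root."""
    for i in path:
        tree = tree[i]
    return tree


def replace_at(tree, path, new_subtree):
    """Return a copy of tree with the subtree at path replaced."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]


def crossover(first, second, rng):
    """Swap randomly chosen crossover fragments between two parents."""
    point1 = rng.choice(list(paths(first)))
    point2 = rng.choice(list(paths(second)))
    fragment1 = subtree_at(first, point1)
    fragment2 = subtree_at(second, point2)
    return (replace_at(first, point1, fragment2),
            replace_at(second, point2, fragment1))


def size(tree):
    """Number of nodes in the tree."""
    return sum(1 for _ in paths(tree))


parent1 = ("IFLTE", ("SQRT", ("+", "X", "Y")), 10, "X", ("/", "X", "Y"))
parent2 = ("*", ("-", "X", 3), ("SQRT", "Y"))
child1, child2 = crossover(parent1, parent2, random.Random(1))
print(child1)
print(child2)
```

A useful sanity check is that the two children together always contain exactly as many nodes as the two parents, since the fragments are only exchanged, never lost or duplicated.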

We have concentrated here on simple programs which consist of a single main routine which produces the output (the whole program is called a result producing branch). In more complicated programs, there will be more complex structures such as iterations (for-loops) and subroutines. A final set of genetic operators which operate on the more complicated structures are called architecture-altering operations. These change aspects of the subroutines, including deleting and copying them, and altering the arguments passed to them.

17.5 Applications of Genetic Programming

John Koza, a professor at Stanford and CEO of Genetic Programming Inc., is perhaps the person most responsible for making GP more acceptable in the eyes of the AI community. He and his team have successfully applied genetic programming techniques to a variety of applications ranging from bioinformatics to distributed systems. One of their most successful endeavours has been the generation of electronic circuit designs. Here, the programs are actually all about the flow of information around the circuits, so the function set contains functions which mimic the actions of transistors, resistors, etc., on the flow of electricity. According to the web site at Genetic Programming Inc:

"there are now 36 instances where genetic programming has automatically produced a result that is competitive with human performance, including 15 instances where genetic programming has created an entity that either infringes or duplicates the functionality of a previously patented 20th-century invention, 6 instances where genetic programming has done the same with respect to a 21st-century invention, and 2 instances where genetic programming has created a patentable new invention."

For more information about Koza's work, visit

One of the most exciting and creative areas in which genetic programming is being applied is evolutionary art. In contrast to most GP applications, in evolutionary art, the user often acts directly as the fitness function. That is, the GP engine generates a set of programs which can produce images (JPGs etc.), either by transforming a given image, or generating pixel data from scratch. These images are then shown to the user, who performs the selection by choosing those which they most like. The GP engine then generates a population from the chosen images and selects from it images which fairly closely resemble the ones chosen by the user, or which have some properties similar to the chosen ones, e.g., colour distribution. The user then selects those with most appeal again, and the process continues until the user is so happy with the image that they put it on their homepage. The evolutionary art community includes many artists and computing professionals, and the artworks their programs produce generate much interest (similar to how everyone was amazed by fractal images when they first came out). Such an approach was recently used to generate images for an ad-campaign by Absolut Vodka, for example.

One evolutionary art program is called Nevar, which is written and maintained by Penousal Machado of Coimbra University, Portugal (he is also the person who researched how to colour in greyscale images - as part of the Nevar project). The images given below were generated by Nevar:

©Penousal Machado ©Penousal Machado

Check out more images at the EvoArt web page: EvoArt.