|
|
"How evolution works" (Expressed as an algorithm)
Background reading:
Living things contain a diversity of DNA, a diversity of ways of persisting. Some things persist, some things don't.
The question we always ask is: "If we come back in a million years, will these creatures be long extinct, or will they actually still be here?" e.g. Now on earth. Go away and come back in 5 million years. What will be here? Go away again and come back in 50 million years. Same question. What will persist?
Even bodies cannot be taken as a given. The chicken is an egg's ludicrous, elaborate, and yet successful, way of making another egg. Death cannot be taken as a given. Nor sex. Nor even a population of individuals. Nor even a mechanism of inheritance (DNA). You have to explain why these things evolved. Why they persist. This view of evolution, exemplified by the writings of Richard Dawkins, is very much how Darwin saw it.
Background reading:
But is just seeing what persists of any interest to engineers? Surely we want to orient the machine towards some particular problem we want solved? The ALife people point out that the most impressive machines ever all arose this way. But their success in emulating this has been limited. Generally very simple things arise - although there have been plenty of insights into the evolutionary process.
The simplest, crudest way to express an evolutionary process as an algorithm is to use an explicit fitness function:
We have a population of "creatures". Their "DNA" is the instructions on how to build a creature. They are not all identical, but rather there is a diversity of DNA. Each represents a way of solving a problem or problems. Each time step: Each creature has to solve some problem or problems. The creature's "fitness" is rated according to this test. The fittest get to reproduce. Reproduction involves constructing a new creature from their DNA. Crucially, some change in the DNA must be possible, otherwise no new solutions can arise. One of the major sources of change is sex (we mix up the DNA of two successful creatures to make a new one). Another source of change is simply small mutations of the DNA. The less fit simply don't reproduce. The new population of creatures arises from the fittest old ones and so, statistically, are more like the fitter end of the previous population. The old population (fit and unfit alike) die to make way for the new. The new population is diverse itself, and is tested in its turn, and so on.
Genotype - The "DNA". The information that is inherited.
The instructions for how to build a new solution.
Phenotype - The "body". What the genotype constructs.
The machine that actually solves the problem.
e.g. The genotype is a list of weights.
The phenotype is a neural net with these weights built-in.
Genome - An instance of the genotype.
Gene - A subsection of the genotype that can be isolated
and identified to have some function.
Sometimes in nature a gene can be related 1-1 to some function
like eye colour. More often, phenotype attributes arise from
the complex interaction of multiple genes
under a developmental process itself defined by other genes.
In engineering, we normally enforce a tidy separation of genes.
The genotype consists of a list of all the parameters needed
to build the machine (e.g. with a Neural Net:
no. of nodes, links, weights, thresholds,
learning rate, weight initialisation parameter).
Each parameter is a gene.
Certain combinations of parameters/genes may be better than others.
Evolution is a search in this multi-dimensional space.
Allele - The value that a gene has.
e.g. The gene for eye colour can take values {blue,brown,green}.
Chromosome - Genes are often grouped together on structures
called chromosomes, so combinations of them can be easily
reshuffled, especially by sex.
Certain gene combinations are fitter than others, and evolution is a search in the multi-dimensional space of gene combinations to find fitter solutions.
This was recognised by evolutionary biologists long before hill-climbing computer programs (or indeed any computer programs):
You will recognise from our previous hill-climbing/descending that this a form of "blind" search on a landscape where we only get feedback on particular points that we try out. We never get to see a picture of the landscape as above. And even worse than the case with Neural Networks, here we do not even get a measure of the slope. We are only told the y-value. We do not know the direction in which we should move the genes to improve matters.
No method that works like this can guarantee to find the global maximum without infinite exploration. The global maximum could be almost completely hidden, with almost no landscape leading towards it (it must have some landscape leading towards it if the landscape is continuous - it can't be a point).
But of course how do we know in advance that the landscape is continuous? If we know nothing about it, it may not be.
The main feature of evolution by hill-climbing is that we use a population of hill-climbers. We start the experiment by "dropping" our hill-climbers at random points scattered all over the landscape (hopefully giving good coverage). Many may be starting in useless places, but if we know nothing about the landscape what else can we do except drop them randomly? They then begin to move. (Of course, as before, they move by discrete jumps, not continuously.)
For instance, let the genotype encode a real numbered value x in the interval 1-10, and the "fitness" of x is:
f(x) = sin(x) + sin(2x) + sin(5x) + cos(x)Then all we are doing is maximising this function over this interval. We start with a population of random x. Generation 1 drops them all over the interval:
Now, the fittest get to reproduce. Not the single fittest solution, but a number of the better solutions. This means that instead of following a single path up the landscape, we keep multiple solutions alive. Generation 7:
Why is keeping multiple solutions alive a good idea? Because there is no guarantee that the best solutions right now are on the slopes of the global maximum. We give many different areas of the landscape a chance to prove themselves.
It is a "noisy" search over an unknown landscape, where "noise" = step size / size of mutations that we make.
Should the noise decrease over time, as we converge on a solution? (Probability of or size of mutation should decrease over time.)
On Internet since 1987.