|
Why not have a low temperature / let only the fittest reproduce? - In extreme case, if only the single most fit individual reproduces, they do so with themselves - cloning. Crossover of the same chromosome - crossover may as well not happen. Just asexual reproduction with very slow evolution by mutation only.
- Population is very rapidly converged in 1 place. 1-individual hill-climbing.
- Diversity comes from the less fit members. Often they have some (but not all) genes to contribute to the fit.
Also, the less fit at a given point may be on the slopes of the global maximum, while the fittest at this moment are on a local maximum. The less fit have to prove this quick though.
Often have:
Probability of mutation starts high and declines.
Probability of crossover starts high and declines.
Note Pm=1 doesn't work.
But Pc=1 can still work.
State-Space Search - often have a heuristic "distance from goal".
Here - we have no idea how far away we might be from the optimal solution.
Neural Networks - we have measure of error E.
Here we don't.
Absolute fitness gives us less info than an E measure.
More on this in Comparison of Neural Net and GA.
Yes, in the machine version we actually know where the genes start and end on the chromosome (which nature doesn't know in general when doing crossover). So why not just reshuffle combinations of genes instead of cutting in the middle of a gene?
The reason is - Where does the diversity of genes come from? It would only come from mutation if we never cut in the middle of a gene. Slow evolution.
In fact, people have found that if they have crossover, they don't need mutation! "Sex drives evolution. Mutation is irrelevant."? May make sense - less than 1 in a million genes mutates, whereas crossover happens non-stop.
What if Pm=0?
And we rely entirely on crossover to give new diversity.
Imagine if by chance all initial population have a "0" in position i.
Then crossover can never construct an individual with "1" in position i.
Note in nature a crossover may break the gene at the crossover site. However, in the 2-copy system, this is only 1 of the copies. The other may have undergone its own crossover, but very unlikely to be the same point, so we still have a working copy just in case.
For many problems, it is not simply a matter of constructing the individuals out of random values of the n parameters or genes, and then testing the fitness. Say the individual will simply not run at all if gene 1 is not > 3.
If you just encode the parameters as real numbers, it may be that 99 percent of parameter combinations won't even run. Our initial random population will all have zero fitness, and so will almost any mutation or crossover of a fit individual.
We could change the genotype coding to reflect this complexity, so the bitstring in gene 1 was simply interpreted as from 3 upwards, the bitstring in gene 2 was interpreted as 0-7, etc. But how about constraints like: "Unless gene 3 < gene 7 < gene 11, the individual will simply not run"? This knocks out a huge chunk of that multi-dimensional space.
One solution is simply a loop when we are constructing new chromosomes:
given parents:
do
{
crossover, constructing child1 and child2
cout << "searching for viable crossover .. \n"
}
while ( ! ( child1.viable() && child2.viable() ) )
where viable() is defined in the application.
repeat n times:
do
{
construct new randomised individual x
cout << "searching for viable randomisation .. \n"
}
while ( ! x.viable() )
This can work,
but only if chromosome interpretation has already done
most of the job
and we just have 1 or 2 further constraints to satisfy.
If 99 percent produced still aren't viable,
these loops will take a long time to terminate.
This time may be very small compared to the time needed
to actually test the solution.
If so, loops probably still ok.
If not:
Useful for optimising n parameters, all of which influence each other.
Useful for scheduling that minimises some cost function. e.g. Timetabling of lectures/exams. Minimise no. of clashes (options). Minimise no. of exams on consecutive days, etc.
Here, we're trying to find
the n-dimensional Input that leads to the highest Output.
We don't need to build up a representation of the function.
Requirements: Given any Input, tell me the Output.
We can combine these: Do our GA search,
and, as we go along, build up a network to represent the map.
But of course, we won't be representing it over the whole range,
because we'll concentrate only on the higher peaks.
We get random coverage
at the start, but it quickly becomes non-random.
Really need a different algorithm to represent the entire function.
GAs are for when we want to find the best combination of n parameters
and then just use it.
We're not particularly interested in the Outputs
for other combinations.
Typically, we initialise our population, and make probabilistic decisions thereafter, using a pseudo-random number generator.
If we use the same seed each time, then each run of evolution will be exactly the same. Therefore, use a unique seed each time, e.g. seed based on the clock time.
On Internet since 1987.