Dep. Informatique & Réseaux
J-L. Dessalles ← Home page
Module Athens TPT-09
Please provide answers in English.
The idea underlying GAs consists in keeping not only one candidate solution, but a set of several solutions simultaneously in the computer memory. These solutions are regarded as a "population" that the algorithm will evolve like a living species. These solutions are set randomly at first. They are quite bad, but some of them are still slightly better that the others. This is where Darwinian selection operates. Only a fraction of the population, the top performing individuals, are kept for reproduction. Selected individuals are paired and procreate. Children are used to reconstitute a whole population, and the cycle goes on.
Genetics is involved at the reproduction step. Each individual is characterized by a list of parameters. This list is coded as a binary vector that we will call DNA by analogy. It is also called genome or genotype (though these names are more appropiate to designate the semantics of the DNA). When two individuals breed, the child’s DNA is formed by mixing its parents’ DNA. This mixing operation is named crossover, after the corresponding biological processus. During crossover, random positions are drawn along DNA, and segments cut at these positions in both parental DNAs are exchanged. The resulting vector is the child’s DNA.
Designing a GA requires three phases.
Click on image to see animation again
Click on image to see animation againClick to animate again --> Run Starter (in the Evolife directory). It opens the Configuration Editor. Load the configuration SumBits.evo (from the directory Expe). Click on the Run button (it automatically saves the configuration as Evolife.evo and runs Evolife).
Observe the convergence toward the maximum (= 100, as there are 100 bits in the genome).
|Set the mutation rate to zero, and rerun the simulation after having set the DNAFill parameter to -1 (to start from a random population). Explain what happens.
It may be useful to run the simulation step by step. You may magnify the genome window (shortcut ‘+’; ‘–’ to shrink).
|Conversely, set the mutation rate to a high value, such as 5 or 10% (i.e. 50 or 100 per thousand), and observe what happens. What do you see? Why?|
|Restart Evolife with a normal mutation rate, but now set the number of crossover points to zero. What happens? Why?|
|Increase the size of the population significantly. Besides the fact that the program runs slower, what is the effect on the convergence?|
|Keep a large population and split it into several groups by allowing a lower maximum size to groups, and set migration rate to zero. What do you observe when the population is segmented?
(Note that when a group disappears, another is likely to split up.)
|In the Configuration Editor, locate the DNAFill parameter in the genetics section (it defines how DNA is initialized) and set it to –1 or to 1.
Locate the S_SumBits.py file (in the directory Scenarii) and make changes so as selection rewards individuals with the maximum number of zeros instead of ones. Copy these changes in the box below. Don’t forget to set this parameter back to 0 after having observed the effect.
|Let’s make a slight temporary modification of the program.
Open DNA.py. Locate the mutate function. Change it so that a mutation sets the bit to zero 97% of the time and to one 3% of the time, regardless of its previous value (copy the modified python line in the box below).
|Observe the consequence on the convergence toward 100. Do you notice variations in the slope of the curve?|
|Change the evaluation function in S_SumBits.py to reward individuals that are far away from the middle value 50 (try to keep scores always positive). Copy the modified python line in the box below.|
|Run the simulation several times. What do you observe? Why?|
|Set PopulationSize to something like 300. Introduce two groups, by changing the value of NumberOfGroups. Set MigrationRate to 0. Make several runs. What do you observe? What happens if you wait long enough?|
|Change again the evaluation function to reward individuals depending on how close they are to the middle value 50. Copy the modified python line in the box below.|
|Observe that genomes tend to resemble each other. The vertical patterns that result from this impoverishment seem to be quite stable through time. Increase the display period and observe that it is not the case in the long term. How do you interpret this phenomenon?|
|Introduce a one-bit long gene in the function genemap in S_SumBits.py by adding a couple ('Interpretation', 1) to the list that the function is returning. You can read the value of this gene in evaluation by calling indiv.gene_value('Interpretation'). Use this value to compute the score of individuals. If 1, the score is equal to the sum of the bits of the gene sumbit (as now), and if 0, the score is 100 minus that value. Copy the modified evaluation in the box below.|
|Display the average value of the gene Interpretation in the population by adding
('red', 'interpretation') in the list returned by display_ at the end
You should observe that Interpretation always evolves to a definite value. Why?
Click on image to see animation again
Load the configuration Labyrinth_noshare.evo in the Configuration Editor. The evaluation function can be set up in the configuration: see parameters PenaltyU_turn, PenaltyWall and RewardExiting. If walls and u_turns are the numbers of times the robot hit walls or made a U-turn, then:
Click on image to see animation again
Evaluation = PenaltyWall * (MaxSteps – walls) + PenaltyU_turn * (MaxSteps – u_turns) + RewardExit * (MaxSteps – step –1)
Run Evolife and observe what happens for reasonable values of the parameters. You may display trajectories (shortcut 'T') in real time. When the robot happens to bump into a wall, it is moved in a random direction. This behaviour allows certain individuals to find an exit by chance, without being able to pass on this success to their offspring.
Evaluation = PenaltyPoison * (MaxPheromon * MaxSteps +1 - poison) + RewardExit * (MaxSteps - step - 1)
Load the configuration Zip.evo in the Configuration Editor. This scenario (open S_Zip.py to see it) uses a genetic algorithm to simplify a string, or conversely make it more complex. Here, complexity is assessed by the size of its compressed length (using compressor gzip or bzip2). The point is to understand why this problem is difficult for standard genetic algorithms.
|In theory, incompressible strings are, by definition, random. Do you think that the string you can see is random? (remember that vertical stripes just result from the similarity between individuals; we are concerned with horizontal lines exclusively.) Try the BitString mode and the not-BitString mode.|
|Run the simulation while visualizing the genome (shortcut 'g'). Remember that the display slows down execution. You may set Display Period to a value such as 30. The program attempts to simplify initially random strings. Observe that though simplification does occur, the program does not reach periodic strings. Do you have an idea why?
Does changing the compressor (from gzip to bzip2) lead to observable differences?
The following experiment will help us understand why the problem is far from trivial. Compressors like Zip look for copies in the string. The following simplistic function splits the input string into chunks of equal length, and then computes the average Hamming distance between chunks.
def CheckPeriodicity(self, StrIn):
" splits a string into chunks and computes the average distance between chunks "
def hamming(Str1, Str2):
" computes the distance between two strings "
H = 0
for c in range(len(Str1)):
H += abs(ord(Str1[c]) - ord(Str2[c]))
NbChunk = 5
Chunks = 
ChunkLength = len(StrIn)//NbChunk
for NrChunk in range(NbChunk):
Chunks.append(StrIn[NrChunk*ChunkLength : (NrChunk+1)*ChunkLength])
Distance = 0
for Ch1 in Chunks:
for Ch2 in Chunks:
if Ch1 == Ch2: continue
Distance += hamming(Ch1, Ch2)
return '.' * (Distance/NrChunk**2) # output a string, as Zip does
Use this function as compressor in S_Zip.py, with parameter BitString set to 0. Observe that the algorithm more or less duplicates the first fifth of the string (you may take a photo (shortcut 'P' as the simulation is running), then print the generated file ___Genome__000nnn.gif and use scissors).
This is a coordination game, as all chunks have to ‘agree’ on which pattern to converge to.
Unfortunately, when the number of chunks increases, coordination becomes more difficult. You may see this by making NbChunk = 20.
|Thanks to this small experiment, can you better explain why evolution with Zip is unable to lead to a periodic pattern?|