Telecom Paris
Dep. Informatique & Réseaux

J-L. Dessalles

November 2019



Module Athens TPT-09

    Emergence in Complex Systems: From nature to engineering


Please provide answers in English.

Genetic Algorithms

Basics

Genetic Algorithms (GAs) were invented by John Holland in the 1960s. His idea was to reproduce Darwinian selection in the computer to solve engineering problems. Consider a problem such as finding an optimal shape for a jet engine fan blade. Suppose that tentative solutions to the problem depend on a list of parameters. The point is to find parameter values that meet some efficiency criterion optimally.

The idea underlying GAs consists in keeping not just one candidate solution, but a set of several solutions simultaneously in the computer's memory. These solutions are regarded as a "population" that the algorithm will evolve like a living species. The solutions are set randomly at first. They are quite bad, but some of them are still slightly better than the others. This is where Darwinian selection operates. Only a fraction of the population, the top-performing individuals, are kept for reproduction. Selected individuals are paired and procreate. Their children are used to reconstitute a whole population, and the cycle goes on.

Genetics is involved at the reproduction step. Each individual is characterized by a list of parameters. This list is coded as a binary vector that we will call DNA by analogy. It is also called genome or genotype (though these names are more appropriate to designate the semantics of the DNA). When two individuals breed, the child's DNA is formed by mixing its parents' DNA. This mixing operation is named crossover, after the corresponding biological process. During crossover, random positions are drawn along the DNA, and the segments cut at these positions in both parental DNAs are exchanged. The resulting vector is the child's DNA.
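
As an illustration of crossover, here is a minimal sketch (not Evolife's actual implementation; the DNAs are assumed to be plain Python lists of 0/1 values):

import random

def crossover(parent1, parent2, points=2):
    " toy multi-point crossover: cut both parental DNAs at random positions and alternate segments "
    cuts = sorted(random.sample(range(1, len(parent1)), points))
    child, take_first, previous = [], True, 0
    for cut in cuts + [len(parent1)]:
        child.extend((parent1 if take_first else parent2)[previous:cut])
        previous, take_first = cut, not take_first
    return child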

Designing a GA therefore requires three phases: coding candidate solutions as DNA, defining an evaluation (fitness) function, and organizing selection and reproduction.

GAs are a generic optimization method that proves particularly efficient for exploring new problems. Note that additional techniques, such as phenotypic sharing, may be necessary to escape from local optima. GAs are also valuable for simulating and understanding natural evolutionary phenomena. These two aspects, optimization and modelling, will be explored in this course.

Binary sum

The following example is meant as a didactic approach to genetic algorithms (and certainly not as a realistic application, as the problem is much too simple for a GA). The problem is to maximize the number of bits set to 1 in a binary vector. Starting from a blank population, in which every individual has all the bits of its genome set to 0, evolution through natural selection should rapidly produce ‘perfect’ individuals whose genome is uniformly made of 1s.
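
For reference, the fitness here is simply the number of 1s in the genome; a stand-alone sketch (not the actual S_SumBits.py code) could read:

def bit_sum_score(genome):
    " toy fitness for the binary-sum problem: the number of bits set to 1 "
    return sum(genome)    # genome is assumed to be a list of 0/1 values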

Run Starter (in the Evolife directory). It opens the Configuration Editor. Load the configuration SumBits.evo (from the directory Expe). Click on the Run button (it automatically saves the configuration as Evolife.evo and runs Evolife).

Observe the convergence toward the maximum (= 100, as there are 100 bits in the genome).

MutationRate 0
Set the mutation rate to zero, and rerun the simulation after having set the DNAFill parameter to -1 (to start from a random population). Explain what happens.
It may be useful to run the simulation step by step. You may magnify the genome window (shortcut '+'; '-' to shrink).


    

MutationRate 10
Conversely, set the mutation rate to a high value, such as 5 or 10% (i.e. 50 or 100 per thousand), and observe what happens. What do you see? Why?


    

Crossover
Restart Evolife with a normal mutation rate, but now set the number of crossover points to zero. What happens? Why?


    

Population Size
Increase the size of the population significantly. Besides the fact that the program runs slower, what is the effect on the convergence?


    

Segmented Population
Keep a large population and split it into several groups by allowing a lower maximum size to groups, and set migration rate to zero. What do you observe when the population is segmented?
(Note that when a group disappears, another is likely to split up.)


    

Selection of zeros
In the Configuration Editor, locate the DNAFill parameter in the genetics section (it defines how DNA is initialized) and set it to -1 or to 1.
Locate the S_SumBits.py file (in the directory Scenarii) and make changes so that selection rewards individuals with the maximum number of zeros instead of ones. Copy these changes in the box below. Don’t forget to set DNAFill back to 0 after having observed the effect.
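
One possible shape for such a change, written as a stand-alone sketch (the actual evaluation in S_SumBits.py goes through Evolife's API and may look different):

GENOME_LENGTH = 100    # 100 bits in this configuration

def evaluation(genome):
    " reward the number of 0s instead of the number of 1s "
    return GENOME_LENGTH - sum(genome)    # genome is assumed to be a list of 0/1 values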


    

Mutation Bias 1
Let’s make a slight temporary modification of the program.
Open DNA.py. Locate the mutate function. Change it so that a mutation sets the bit to zero 97% of the time and to one 3% of the time, regardless of its previous value (copy the modified python line in the box below).
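
A minimal sketch of the biased draw, assuming the mutated bit's new value is assigned somewhere inside mutate (the surrounding code of DNA.py is not shown here):

import random

def biased_bit():
    " new value of a mutated bit: 0 with probability 97%, 1 with probability 3%, regardless of the previous value "
    return 0 if random.random() < 0.97 else 1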


    

Mutation Bias 2
Observe the consequence on the convergence toward 100. Do you notice variations in the slope of the curve?


    

     Don’t forget to restore the function to its initial code.
For the remainder of this section, set parameter DNAFill to -1 to start from a random population.

Away from 50 1
Change the evaluation function in S_SumBits.py to reward individuals that are far away from the middle value 50 (try to keep scores always positive). Copy the modified python line in the box below.
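
One possible scoring, as a stand-alone sketch (always non-negative; the real change goes into S_SumBits.py):

def evaluation(genome):
    " reward distance from the middle value 50 "
    return abs(sum(genome) - 50)    # 0 for 50 ones, up to 50 for all-0s or all-1s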


    

Away from 50 2
Run the simulation several times. What do you observe? Why?


    

Away from 50 3
Set PopulationSize to something like 300. Introduce two groups, by changing the value of NumberOfGroups. Set MigrationRate to 0. Make several runs. What do you observe? What happens if you wait long enough?


    

Close to 50 1
Change the evaluation function again, this time to reward individuals depending on how close they are to the middle value 50. Copy the modified Python line in the box below.
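
A corresponding stand-alone sketch (again, the real change goes into S_SumBits.py):

def evaluation(genome):
    " reward closeness to the middle value 50; scores stay between 0 and 50 "
    return 50 - abs(sum(genome) - 50)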


    

Close to 50 2
Observe that genomes tend to resemble each other. The vertical patterns that result from this impoverishment seem to be quite stable through time. Increase the display period and observe that this is not the case in the long term. How do you interpret this phenomenon?


    

Genetic interpretation 1
Introduce a one-bit-long gene in the function genemap in S_SumBits.py by adding a pair ('Interpretation', 1) to the list that the function returns. You can read the value of this gene in evaluation by calling indiv.gene_value('Interpretation'). Use this value to compute the score of individuals: if it is 1, the score is equal to the sum of the bits of the gene sumbit (as now); if it is 0, the score is 100 minus that value. Copy the modified evaluation in the box below.
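
A possible shape for the modified evaluation; indiv.gene_value is quoted from the text above, while count_ones is a hypothetical stand-in for whatever the current code uses to sum the bits of the sumbit gene:

def evaluation(self, indiv):
    " score depends on the Interpretation gene: reward 1s when it is 1, reward 0s otherwise "
    ones = count_ones(indiv)                  # hypothetical helper: number of 1s in the sumbit gene
    if indiv.gene_value('Interpretation'):
        return ones
    return 100 - ones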


    

Genetic interpretation 2
Display the average value of the gene Interpretation in the population by adding ('red', 'interpretation') to the list returned by display_ at the end of S_SumBits.py.
You should observe that Interpretation always evolves to a definite value. Why?


    

The Labyrinth

In this simple didactic application, a mobile robot enters the labyrinth at position (1,6) and looks for an exit. The simplest coding is implemented in Scenarii/S_Labyrinth.py. It consists, for each position in the labyrinth (we suppose that its maximal dimensions are known in advance), of two bits that indicate which absolute direction to take. We thus have a 200-bit genome for a 10x10 grid.
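
To make the coding concrete, here is an illustrative decoder (the direction ordering and genome layout are assumptions, not necessarily those of S_Labyrinth.py):

DIRECTIONS = ['N', 'E', 'S', 'W']    # assumed ordering of the four absolute directions

def decode_move(dna, x, y, width=10):
    " reads the two bits that code the direction to take at cell (x, y) of a 10x10 grid "
    pos = 2 * (y * width + x)            # 2 bits per cell, 200 bits in total
    return DIRECTIONS[2 * dna[pos] + dna[pos + 1]]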



Load the configuration Labyrinth_noshare.evo in the Configuration Editor. The evaluation function can be set up in the configuration: see parameters PenaltyU_turn, PenaltyWall and RewardExiting. If walls and u_turns are the numbers of times the robot hit a wall or made a U-turn, then:

Evaluation = PenaltyWall * (MaxSteps - walls) + PenaltyU_turn * (MaxSteps - u_turns) + RewardExit * (MaxSteps - step - 1)

Run Evolife and observe what happens for reasonable values of the parameters. You may display trajectories (shortcut 'T') in real time. When the robot happens to bump into a wall, it is moved in a random direction. This behaviour allows certain individuals to find an exit by chance, without being able to pass on this success to their offspring.

Labyrinth no share 1
In many trials, we observe a slow stepwise evolution. Observe (possibly by running the experiment several times) that the population may get trapped in cyclic trajectories. Why?


    

Labyrinth no share 2
Wait until an exit is eventually found (it may take long; restart the simulation if you lose patience). Observe the stepwise evolution of performance. Set MaxSteps to 40 instead of 55. It should increase the stability of cyclic traps. Why is it so?


    
An alternative evaluation function uses phenotypic sharing: Individuals that guide their robot on the same tracks as previous individuals get penalized. To achieve this, poisonous pheromone is laid down by robots. This has the nice side-effect of preventing robots from following cyclic paths.

Evaluation = PenaltyPoison * (MaxPheromon * MaxSteps +1 - poison) + RewardExit * (MaxSteps - step - 1)


Labyrinth share
Load the configuration Labyrinth_share.evo in the Configuration Editor. Choose reasonable values for the new parameters PenaltyPoison and MaxPheromon (the other parameters of the evaluation function may be set to zero). Run Evolife. What are the consequences of laying down more poison?

    Faster evolution     Slower evolution     Inconclusive    

    More exploration     Less exploration    

    More exploitation     Less exploitation    

    

Complexity

This experiment explores the (limited) ability of GAs to solve a simplification/complexification problem.

Load the configuration Zip.evo in the Configuration Editor. This scenario (open S_Zip.py to see it) uses a genetic algorithm to simplify a string, or conversely to make it more complex. Here, the complexity of a string is assessed by its compressed length (using the compressor gzip or bzip2). The point is to understand why this problem is difficult for standard genetic algorithms.
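
As a rough sketch of the kind of score involved (zlib stands in for gzip/bzip2; the genome-to-string coding actually used by S_Zip.py may differ):

import zlib    # stand-in for the gzip / bzip2 compressors used by the scenario

def complexity(genome):
    " assess the complexity of a genome by the length of its compressed form "
    data = bytes(bytearray(genome))      # one byte per bit, purely for illustration
    return len(zlib.compress(data))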

Complexifying

Set the Simplify parameter (see under Scenarios/Genetic Algorithms/Zip) to 0. The program will evaluate individuals depending on their compressed size, as calculated by a compressor such as gzip or bzip2. Set parameter DNAFill to 0 or to 1 to start from a blank population. After a while, the algorithm finds DNAs that gzip or bzip2 is unable to compress.

Complexification
In theory, incompressible strings are, by definition, random. Do you think that the string you can see is random? (remember that vertical stripes just result from the similarity between individuals; we are concerned with horizontal lines exclusively.) Try the BitString mode and the not-BitString mode.


    

Simplification

Set the Simplify parameter (see under Scenarios/Genetic Algorithms/Zip) to 1, and start from random genomes by setting parameter DNAFill to -1. The program will evaluate individuals depending on their compressed size, as calculated by a compressor such as gzip or bzip2. Starting from incompressible (because random) strings, we expect the system to evolve constant strings (all 1s or all 0s), because they are the most compressible ones, or at least periodic strings, which are quite compressible too.

Simplification
Run the simulation while visualizing the genome (shortcut 'g'). Remember that the display slows down execution. You may set Display Period to a value such as 30. The program attempts to simplify initially random strings. Observe that though simplification does occur, the program does not reach periodic strings. Do you have an idea why?

Does changing the compressor (from gzip to bzip2) lead to observable differences?



    

The following experiment will help us understand why the problem is far from trivial. Compressors like Zip look for copies in the string. The following simplistic function splits the input string into chunks of equal length, and then computes the average Hamming distance between chunks.

def CheckPeriodicity(self, StrIn):
    " splits a string into chunks and computes the average distance between chunks "

    def hamming(Str1, Str2):
        " computes the distance between two strings of equal length "
        H = 0
        for c in range(len(Str1)):
            H += abs(ord(Str1[c]) - ord(Str2[c]))
        return H

    NbChunk = 5
    Chunks = []
    ChunkLength = len(StrIn) // NbChunk
    for NrChunk in range(NbChunk):
        Chunks.append(StrIn[NrChunk*ChunkLength : (NrChunk+1)*ChunkLength])
    Distance = 0
    for Ch1 in Chunks:
        for Ch2 in Chunks:
            if Ch1 == Ch2: continue    # identical chunks contribute nothing anyway
            Distance += hamming(Ch1, Ch2)
    # average over the NbChunk**2 chunk pairs; integer division, since a string can only be repeated a whole number of times
    return '.' * (Distance // NbChunk**2)     # output a string, as Zip does

Use this function as the compressor in S_Zip.py, with parameter BitString set to 0. Observe that the algorithm more or less duplicates the first fifth of the string (you may take a photo (shortcut 'P') while the simulation is running, then print the generated file ___Genome__000nnn.gif and use scissors).

This is a coordination game, as all chunks have to ‘agree’ on which pattern to converge to.
Unfortunately, when the number of chunks increases, coordination becomes more difficult. You may check this by setting NbChunk = 20.

Chunks
Thanks to this small experiment, can you better explain why evolution with Zip is unable to lead to a periodic pattern?


    


            
