Optimization and Genetic Algorithms

Page

When we say we want to optimize some situation or a solution for a problem, we usually mean we are looking for a way to achieve a combination of better of many (or at least two) worlds. We want only better of these worlds (or problem parameters) because achieving the best in each one of them is seen to restrict the other worlds (or problem parameters) from becoming its best.

To give a quick example, to maximize his potential energy, Humpty Dumpty needs to sit (and remain) on the wall. To maximize his kinetic energy, he needs to have a great fall and be at a place just before actually hitting the ground. Obviously the best position for maximizing one parameter restricts the other from becoming maximized. To have forever a better of both worlds of potential and kinetic energy, Humpty Dumpty should have a fall and should not touch the ground forever.

Without contemplating further on how to fall upward and turning back to our optimization, these restrictions on the problem parameters are what are known as constraints in an optimization problem. An optimization without the related constraint(s) doesn’t necessarily optimize. There is an excellent introductory article by Prof. A. K. Mallik in the June 2008 issue of Resonance (see reference [1]) that explains optimization problems in elementary geometry, without much calculus – a necessary mathematical tool for several optimization methods.

So if we want the best of one particular world in our optimization problem solution, then it may not be possible to achieve simultaneously the best in other worlds. Such multi variate optimization problem are abundant in all the engineering disciplines. In fact, to engineer means to optimize something for a given situation with the given (restricted) resources. Close to our context, given the finiteness of time (per day or month or in one’s lifetime), how to do science, research, learn, teach, write, crib, sleep, blog, family, dream and vegetate, all for as much time as possible, is a multi parameter optimization problem that has as many solutions as we desire it to be. And that is where effective solution search techniques enter.

In engineering we encounter an optimization problem for which we anticipate multiple solutions based on restrictive ranges of real life parameters that affect the problem. More than one such solution nevertheless might satisfy the optimization constraints and yield a reasonably correct (or required) global solution within an accepted margin of discrepancy. To find which of these solutions is better (uniqueness based on restrictions) would not be possible through perhaps a linear solution methodology.

Genetic Algorithm [2] is a search technique, proposed by John Henry Holland [3] in the 1970s, that finds for us the better (optimized) solution by iteratively combining an initial pool of solutions, each successive iteration bettering in some way the solution from the previous iteration., in a sequence of ”genetic evolution” process. The final satisfactory solution set or ”genetically evolved population”, fits the imposed optimization constraints for the problem.

Genetic Algorithms involve an ”explorative logic” which ensures that a large number of solutions, marginally better or worse are considered, while avoiding convergence on local maxima. For instance, while seeking a globally optimized solution, if only a single constraint of ”checking whether a variable become greater than a value” is used to verify the arrival of the correct solution, it is possible that such a solution satisfies only a local maximum for that variable when it reaches a maximum value. Obviously, in a problem involving constraints for more than one variable this local maximum solution need not be the correct one.

In genetic algorithm each solution is considered as a genome so the initial solution set (randomly generated and need not contain the optimized solution) is a population of genomes differing unique enough in some minor changes in the parameter set considered to generate the solutions. These solutions are then iteratively changed to better solutions by doing iteratively two things. First the parameters governing the solution are changed slightly by an exhaustive sequence (random and systematic) covering the entire range for all the parameters involved. To facilitate this process usually the solution population is represented as a genome sequence or a computer byte with multiple bits. These binary representation of the ”individual” apportions specific number of bits for parameters involved, depending on the problem.

Each new generation of ”genome population” thus generated are checked with an ”objective or fitness function” for their validity or ”fertility”. Depending on the fitness values, pairs of individuals from this solution set called ”parents” are chosen for ”breeding” using operators borrowed form natural genetics. The level of fitness of an individual dictates its chances of reproducing and surviving through generations. Thus GA starts with a randomly chosen population and refines them over generations.

Breeding is done in two ways. Mimicking the evolution process as an algorithm, mutation [7] introduces some randomness to the solution ”generations”, say, by randomly changing some parameter over a range. Crossover [8] usually combines two parent solutions and produces two more children or offspring solutions that carry ”genetic information” or parameter combinations from the parents. In the binary representation mentioned above, crossover can be understood to involve swapping of some of the bits of the two selected parents from a specified bit position. Mutation randomly alters a bit in the representation of the offspring depending on mutation probability thus aiding the ”explorative logic”. A new population is created thus, which retains the best individuals of the previous generation and replaces the rest of the individuals by the offspring.

To perceive the analogue, a sample of chromosomal mutations out of the five possible ones and the concept of crossover is shown in the accompanying pictures. Source of these pictures are given in reference [6] and those pages carry further explanations. Since the performance of GA is dependent on the choice of various GA parameters like Crossover Probability, Mutation Probability etc., prediction of the choice of parameters for ensuring best results is seldom possible.

The solution subset that deviates away from the desired solution (deduced from their deviation from the objective function) are ”faded out” of subsequent iterations. By fading out we mean the solution subsets are not immediately discarded in the next iteration itself. They are carried over for enough future iterations to ensure their evolutionary refuse towards the required solution – and allowed a ”natural extinction” in the simulated ”survival of the fittest” competition.

The above sequence of operations is repeated for many iterations, smoking a reasonably high end computer depending on the problem parameters or the size of the binary representation of the sample solution. After a few or million such generations GA converges to the best solution. At this simulation instant almost all the individuals in the solution population represent the same solution, revealing identical values for their fitness function. Thus the population is unlikely to evolve further.

To summarize the above explanation: There are three important components in writing a successful genetic algorithm (1) objective function should be properly defined to catch the optimum you are looking for (2) genetic representation or generic solution represented as a "genome" should be properly defined and used (3) definition of the genetic operators that govern the mutation or crossover of one solution into another that is closer to the desired solution.

*****

Further Reading

1. Optimization Problems in Elementary Geometry, A. K. Mallik, Resonance – Journal of science education, June 2008, Vol. 13, Num. 6, pp. 561 – 582. [ link to pdf and the June 2008 issue ]

2. For web resources, check the wikipedia write-up for Genetic Algorithm and the external links given there.

3. John Henry Holland, Adaptation in Natural and Artificial Systems: 2nd edition. MIT Press. 1992 [Wikipedia page on John Holland]

4. Global Optimization Algorithms – Theory and Application; This link is a pdf document by Thomas Weise. A free book on genetic algorithm.

5. When Least is Best by Paul J. Nahin. A good book on optimization solutions without using calculus.

6. Mutation – [http://en.wikipedia.org/wiki/Mutation_(genetic_algorithm)]

7. Crossover – [http://en.wikipedia.org/wiki/Crossover_(genetic_algorithm)]