In this post, we’ll implement a genetic algorithm using Python and NumPy.
A genetic algorithm is a type of optimization algorithm that mimics natural selection to find the optimal point (or points) in the design space. It uses a population of solutions that evolve over several generations toward better solutions. Each individual in the population represents a potential solution to the problem encoded in a way that can be evaluated for its fitness.
Over successive generations, individuals with higher fitness are more likely to pass their traits on to the next generation. An individual’s traits are passed down to the next generation using three main genetic operators: selection, crossover, and mutation.
As its name hints, the selection operator chooses the fittest individuals in the current population. The selected individuals will produce offspring. Next, the crossover operator combines the genetic information of two (or more) parent individuals to create one or more offspring. Thus, crossover introduces variation into future populations and mixes successful traits. Finally, the new population resulting from the crossover operator is subject to mutations, or small, random changes in the individuals’ characteristics. Mutations help introduce diversity into populations and avoid premature convergence to suboptimal solutions.
The Problem
The problem we’ll try to solve can be formally stated as:
Minimize

$$f(x_1, x_2) = -\sin(x_1)\left[\sin\left(\frac{x_1^2}{\pi}\right)\right]^{20} - \sin(x_2)\left[\sin\left(\frac{2x_2^2}{\pi}\right)\right]^{20}$$

With respect to

$$\mathbf{x} = (x_1, x_2)$$

Such that

$$0 \le x_i \le \pi \quad \text{for } i = 1, 2$$
The objective function is known as the Michalewicz function, which we can define in Python like this:
```python
import numpy as np

# Set the random seed for reproducibility
np.random.seed(42)


def michalewicz_function(X: np.ndarray) -> np.ndarray:
    """
    Computes the Michalewicz function for a given 2D input array.

    The Michalewicz function is a test problem for optimization algorithms,
    particularly known for its steep valleys and ridges. It is defined for two
    input variables, x1 and x2, with the following equation:

        f(x1, x2) = -sin(x1) * (sin(x1^2 / pi))^20
                    - sin(x2) * (sin(2 * x2^2 / pi))^20

    Parameters
    ----------
    X : np.ndarray
        A 2D numpy array of shape (n_samples, 2) where each row contains the
        values of x1 and x2. X[:, 0] corresponds to x1 and X[:, 1] corresponds
        to x2.

    Returns
    -------
    np.ndarray
        A 1D numpy array of shape (n_samples,) containing the computed
        Michalewicz function values for each (x1, x2) pair.

    Example
    -------
    >>> X = np.array([[2.20, 1.57], [2.90, 2.30]])
    >>> michalewicz_function(X)
    array([-1.80114072e+00, -2.54559837e-08])
    """
    x1, x2 = X[:, 0], X[:, 1]
    term1 = -np.sin(x1) * np.sin(x1**2 / np.pi) ** 20
    term2 = -np.sin(x2) * np.sin(2 * x2**2 / np.pi) ** 20
    return term1 + term2
```
The red dot in the plot above indicates the function’s global minimum, which has a value of -1.8013 at $(x_1, x_2) \approx (2.20, 1.57)$. As you can tell, this function will likely give gradient-based optimizers a hard time because it has several local minima and flat regions that can stall their progress. Therefore, the Michalewicz function is a good candidate for trying out a different optimization approach, like the genetic algorithm we’re going to implement.
Selection
Before implementing the selection operator, let’s define how we’ll handle populations and individuals to solve our optimization problem. One straightforward way is to use NumPy arrays: individuals will be 1D NumPy arrays with one entry per design variable, and populations will be 2D arrays of shape (population_size, n_variables). Let’s make an example:
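One way to create such a population is to sample uniformly within the design bounds $[0, \pi]$ from the problem statement; here’s a minimal sketch:

```python
# Sample a population of 10 individuals, each with two design
# variables drawn uniformly from the bounds [0, pi]
population_size = 10
population = np.random.uniform(low=0, high=np.pi, size=(population_size, 2))
print(population.shape)  # (10, 2)
```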
As we can see, the population is represented as a 2D NumPy array. Each row in the array corresponds to one individual design in the population, and the number of rows is determined by the population_size variable, which we set to 10 above.
Now, we can proceed to define the selection operator. To do this, we’ll rank the population members according to the objective function values they yield and keep only the top individuals. Let’s define a function to do just that:
```python
from collections.abc import Callable
from typing import Optional


def selection_operator(
    objective: Callable[[np.ndarray], np.ndarray],
    population: np.ndarray,
    individuals_to_keep: Optional[int] = None,
) -> np.ndarray:
    """
    Selects the top individuals from a population based on their objective
    function values.

    This function applies a selection operator commonly used in evolutionary
    algorithms. It evaluates the objective function for each individual in the
    population, ranks them based on their objective values, and selects the
    top-performing individuals to keep.

    Parameters
    ----------
    objective : Callable[[np.ndarray], np.ndarray]
        A callable function that takes the population as input and returns a
        1D array of objective function values. This function is used to
        evaluate the performance of each individual in the population.
    population : np.ndarray
        A 2D numpy array where each row represents an individual in the
        population. The shape is typically (n_individuals, n_features), where
        `n_individuals` is the number of individuals in the population, and
        `n_features` is the number of features or parameters representing
        each individual.
    individuals_to_keep : Optional[int], default=None
        The number of top individuals to keep in the population after
        selection. If None, 50% of the population is retained.

    Returns
    -------
    np.ndarray
        A 2D numpy array containing the selected individuals, with the same
        number of features as the original population but fewer individuals,
        determined by `individuals_to_keep`.

    Example
    -------
    >>> def sample_objective(pop):
    ...     return np.sum(pop ** 2, axis=1)
    >>> population = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
    >>> selection_operator(sample_objective, population, individuals_to_keep=2)
    array([[1, 2],
           [3, 4]])

    Notes
    -----
    - The objective function should return lower values for better individuals
      (e.g., a minimization problem).
    - The function sorts individuals in ascending order based on their
      objective function values and selects the best ones to retain.
    """
    if individuals_to_keep is None:
        individuals_to_keep = int(0.5 * population.shape[0])
    objective_values = objective(population)
    sorted_indices = np.argsort(objective_values)[:individuals_to_keep]
    return population[sorted_indices]
```
We can check whether the selection_operator function works by verifying that it returns the population members with the lowest objective values. Let’s use the population variable we created above to make a DataFrame with its design variables and corresponding objectives, sorted in ascending order of the objective. We would expect calling the selection_operator function (with individuals_to_keep=4) to return an array with the first four rows of this DataFrame.
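Here’s a sketch of what that check could look like (building and sorting the DataFrame this way is an assumption about how the original comparison was set up):

```python
import pandas as pd

# Evaluate the objective for each individual and sort by objective value
df = pd.DataFrame(population, columns=["x1", "x2"])
df["objective"] = michalewicz_function(population)
df = df.sort_values("objective").reset_index(drop=True)

# The selection operator should return the four designs with the lowest
# objective values, i.e., the first four rows of the sorted DataFrame
selected = selection_operator(michalewicz_function, population, individuals_to_keep=4)
assert np.allclose(selected, df[["x1", "x2"]].to_numpy()[:4])
```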
As we see here, that’s the case, so the selection_operator function is working as expected.
Crossover
Next, we’ll make a crossover operator. This operator takes a population, groups its individuals in pairs of “parents,” and produces a number of “offspring” by combining traits (design variable values) from each one. Note that we are picking pairs of parents somewhat arbitrarily here; we could just as easily create offspring from a different number of parents, but we’ll stick to two for simplicity. Let’s see a possible implementation of the crossover operator!
First, we’ll define a function called make_offspring_individual to make a single offspring from two parents.
```python
def make_offspring_individual(parents: np.ndarray) -> np.ndarray:
    """
    Creates a single offspring individual through crossover from two parent
    individuals.

    Each gene (feature) is randomly selected from either parent to create a
    new offspring. The function ensures that at least one gene is taken from
    each parent.

    Parameters
    ----------
    parents : np.ndarray
        A 2D numpy array of shape (2, n_genes), where each row represents one
        parent and each column represents a gene. There must be exactly two
        parent individuals, and each must have the same number of genes
        (design variables).

    Returns
    -------
    np.ndarray
        A 1D numpy array representing the offspring individual, with the same
        number of genes as the parents. The genes are randomly selected from
        the two parents.

    Example
    -------
    >>> parents = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
    >>> make_offspring_individual(parents)
    array([1, 6, 3, 8])

    Notes
    -----
    - The crossover is controlled by a randomly generated binary mask
      (`crossover_indices`), where a value of 0 selects the gene from the
      first parent and a value of 1 selects the gene from the second parent.
    - The function ensures that the crossover mask is not all 0s or all 1s,
      which would result in the offspring being identical to one of the
      parents.
    """
    num_genes = parents.shape[1]
    crossover_indices = np.random.randint(0, 2, num_genes)
    while np.all(crossover_indices == 0) or np.all(crossover_indices == 1):
        crossover_indices = np.random.randint(0, 2, num_genes)
    return parents[crossover_indices, np.arange(num_genes)]
```
Here, parents is a population array with only two members. First, we determine the number of “genes,” or design variables, in each parent. Then, we use NumPy’s randint to determine which parent will pass on which gene to the offspring. Next, we perform a check to establish that the offspring will not contain genes from only one of the parents. Finally, we return an array representing a new individual with genes that have been “inherited” from its parents.
Now, let’s actually implement the crossover_operator function. This function will take a population NumPy array and an (optional) number of offspring to produce. The function will call make_offspring_batch to shuffle the input population, create pairs of “parents,” and then create the individual offspring using make_offspring_individual, defined above. If more offspring are required, make_offspring_batch is called again as needed.
```python
from copy import deepcopy


def make_offspring_batch(population: np.ndarray) -> np.ndarray:
    """
    Creates a batch of offspring individuals from a given population using
    pairwise crossover.

    This function performs crossover between randomly shuffled pairs of
    individuals in the population to produce a new batch of offspring. If the
    population size is odd, the first individual of the shuffled population
    is duplicated to make the number of individuals even. Each pair of
    individuals undergoes crossover using the `make_offspring_individual`
    function to generate one offspring.

    Parameters
    ----------
    population : np.ndarray
        A 2D numpy array of shape (n_individuals, n_genes) where each row
        represents an individual and each column represents a gene. The
        population is randomly shuffled before forming pairs for crossover.

    Returns
    -------
    np.ndarray
        A 2D numpy array of shape (n_pairs, n_genes) containing the offspring
        individuals. Since each pair of parents produces one offspring, the
        batch contains roughly half as many individuals as the input
        population.

    Example
    -------
    >>> population = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    >>> make_offspring_batch(population)  # random output, e.g.:
    array([[1, 8, 9],
           [4, 2, 3]])

    Notes
    -----
    - The population is shuffled randomly before forming pairs, ensuring
      genetic diversity in the offspring.
    - If the population size is odd, the first individual is duplicated to
      make an even number of individuals for pairing.
    - The function calls `make_offspring_individual` to generate one offspring
      for each pair.
    """
    pop_copy = deepcopy(population)
    np.random.shuffle(pop_copy)
    if pop_copy.shape[0] % 2 != 0:
        pop_copy = np.vstack((pop_copy, pop_copy[0]))
    pop_copy = pop_copy.reshape(-1, 2, pop_copy.shape[-1])
    return np.array([make_offspring_individual(pair) for pair in pop_copy])


def crossover_operator(
    population: np.ndarray, n_offspring: Optional[int] = None
) -> np.ndarray:
    """
    Generates a specified number of offspring through crossover from a given
    population.

    This function applies a crossover operation on a population to produce a
    specified number of offspring. The population is used to generate
    offspring through the `make_offspring_batch` function, which performs
    pairwise crossover. If the required number of offspring is greater than
    the size of a single batch, additional batches of offspring are generated
    until the desired number is reached.

    Parameters
    ----------
    population : np.ndarray
        A 2D numpy array of shape (n_individuals, n_genes) representing the
        population. Each row corresponds to an individual, and each column
        represents a gene.
    n_offspring : Optional[int], default=None
        The number of offspring to generate. If not specified, the number of
        offspring will be equal to the population size.

    Returns
    -------
    np.ndarray
        A 2D numpy array of shape (n_offspring, n_genes) containing the
        generated offspring. The number of offspring returned will be exactly
        `n_offspring`.

    Example
    -------
    >>> population = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
    >>> crossover_operator(population, n_offspring=6)  # random output, e.g.:
    array([[1, 4],
           [5, 8],
           [3, 2],
           [7, 6],
           [5, 2],
           [3, 8]])

    Notes
    -----
    - If `n_offspring` is not provided, the function will generate the same
      number of offspring as the population size.
    - If more offspring are needed than can be produced in a single batch,
      additional batches are generated and concatenated until the required
      number of offspring is reached.
    - The function uses `make_offspring_batch` to generate each batch of
      offspring via crossover between shuffled pairs of individuals from the
      population.
    """
    if n_offspring is None:
        n_offspring = population.shape[0]
    offspring = make_offspring_batch(population)
    while len(offspring) < n_offspring:
        offspring = np.vstack(
            (
                offspring,
                make_offspring_batch(population),
            ),
        )
    return offspring[:n_offspring]
```
We can test crossover_operator with our previously defined population:
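For example (the exact offspring will vary because of the random shuffling and crossover masks):

```python
# Generate six offspring from the example population
offspring = crossover_operator(population, n_offspring=6)
print(offspring.shape)  # (6, 2)
print(offspring)
```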
As shown above, the crossover_operator function produces new designs that exhibit a mixture of their parents’ traits.
Mutation
Now, we can move on to the final operator for our genetic algorithm, mutation. Mutation takes an input design vector’s values and adds small, random changes to them. In our implementation, we’ll create our mutation operator with two parameters: probability and stdev. The probability parameter determines how likely it is for a component of the design vector to be subjected to a random change, while stdev is the standard deviation of the normal distribution we’ll use to draw the new, mutated values. Both probability and stdev can be scalars or arrays; if they’re arrays, we can assign a specific probability and standard deviation to each design variable in the design vector.
The mutation_operator function shown below generates a mask to establish which indices will be mutated. Then, it samples from a normal distribution centered at the current design points to generate the new, mutated designs. Finally, these mutated designs are applied to the population using NumPy’s where.
```python
from typing import Tuple, Iterable


def mutation_operator(
    population: np.ndarray,
    probability: float | Iterable[float] = 0.1,
    stdev: float | Iterable[float] = 0.3,
    bounds: Tuple[float, float] = (0, np.pi),
) -> np.ndarray:
    """
    Applies mutation to a population of individuals with a given probability
    and standard deviation.

    This function introduces mutations into the population by adding random
    noise to individual genes based on a normal distribution. The probability
    of mutation and the standard deviation of the noise can be specified
    either as a single value or as an array specifying different values for
    each gene. The mutated values are clipped to remain within specified
    bounds.

    Parameters
    ----------
    population : np.ndarray
        A 2D numpy array of shape (n_individuals, n_genes), where each row
        represents an individual and each column represents a gene.
    probability : float or Iterable[float], default=0.1
        The mutation probability for each gene. This can be a single float
        value (applied uniformly to all genes) or an iterable specifying the
        mutation probability for each gene separately.
    stdev : float or Iterable[float], default=0.3
        The standard deviation of the normal distribution used to generate
        mutations. This can be a single float value (applied uniformly to all
        genes) or an iterable specifying different standard deviations for
        each gene.
    bounds : Tuple[float, float], default=(0, np.pi)
        A tuple specifying the lower and upper bounds for the gene values.
        Mutated values are clipped to lie within these bounds.

    Returns
    -------
    np.ndarray
        A 2D numpy array of shape (n_individuals, n_genes) representing the
        population after mutation. Each gene has a chance of being mutated
        according to the specified mutation probability and mutation size.

    Example
    -------
    >>> population = np.array([[1.0, 2.0], [2.5, 3.5]])
    >>> mutated_population = mutation_operator(population, probability=0.2, stdev=0.1)
    >>> mutated_population
    array([[1.1, 2.0],
           [2.4, 3.5]])

    Notes
    -----
    - Mutation is applied independently to each gene of each individual with a
      probability defined by `probability`. Genes that are mutated will have
      random noise added from a normal distribution centered at the original
      gene value with a standard deviation defined by `stdev`.
    - The mutation probabilities and standard deviations can be provided
      either as scalars (for uniform mutation behavior across genes) or as
      arrays of the same length as the number of genes (for different mutation
      behaviors across genes).
    - After mutation, gene values are clipped to the specified `bounds` to
      ensure they remain within valid ranges.
    """
    pop_size, n_variables = population.shape
    if isinstance(probability, (int, float)):
        probability = np.full(n_variables, probability)
    probability = np.asarray(probability)
    if isinstance(stdev, (int, float)):
        stdev = np.full(n_variables, stdev)
    stdev = np.asarray(stdev)
    mutation_mask = np.random.random((pop_size, n_variables)) < probability[None, :]
    mutations = np.random.normal(
        loc=population, scale=stdev[None, :], size=population.shape
    )
    mutations = np.clip(mutations, *bounds)
    return np.where(mutation_mask, mutations, population)
```
We can test our function by calling it on our example population:
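A quick check could look like this (with the defaults, only about 10% of the genes should change):

```python
# Mutate the example population with the default parameters and
# inspect the differences: most entries should remain zero
mutated = mutation_operator(population)
print(np.round(mutated - population, 3))
```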
The result above shows that a few points were changed, but not by much due to the low default standard deviation. We can achieve bigger changes with respect to the original population by increasing both probability and stdev:
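For instance, with a 90% per-gene mutation probability and a larger standard deviation (values picked just for illustration):

```python
# Heavier mutation: almost every gene changes, and by larger amounts
heavily_mutated = mutation_operator(population, probability=0.9, stdev=1.0)
print(np.round(heavily_mutated - population, 3))
```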
With our selection, crossover, and mutation operators defined, we’re ready to implement the entire genetic algorithm. Let’s start by defining a function optimization_step that returns a new population after applying all three genetic operators to an existing population.
```python
def optimization_step(
    population: np.ndarray,
    objective: Callable,
    individuals_to_keep: Optional[int] = None,
    n_offspring: Optional[int] = None,
    mutation_probability: float | Iterable[float] = 0.1,
    mutation_stdev: float | Iterable[float] = 0.3,
    bounds: Tuple[float, float] = (0, np.pi),
) -> np.ndarray:
    """
    Performs one step of a genetic optimization process, including selection,
    crossover, and mutation.

    This function executes a single optimization step as part of a genetic
    algorithm. It first selects the top individuals based on their objective
    values, generates offspring through crossover, and applies mutation to the
    offspring. The result is a new population ready for the next optimization
    step.

    Parameters
    ----------
    population : np.ndarray
        A 2D numpy array of shape (n_individuals, n_genes) representing the
        population. Each row corresponds to an individual, and each column
        represents a gene.
    objective : Callable
        A function that takes the population as input and returns a 1D numpy
        array of objective function values, used to evaluate and rank
        individuals.
    individuals_to_keep : Optional[int], default=None
        The number of top individuals to retain after selection. If None, 50%
        of the population is retained.
    n_offspring : Optional[int], default=None
        The number of offspring to generate after crossover. If None, the
        number of offspring will be equal to the size of the population.
    mutation_probability : float or Iterable[float], default=0.1
        The probability of mutation for each gene. This can be a single float
        (uniform probability for all genes) or an iterable specifying the
        mutation probability for each gene individually.
    mutation_stdev : float or Iterable[float], default=0.3
        The standard deviation of the normal distribution used to generate
        mutations. This can be a single float or an iterable specifying
        different mutation sizes for each gene.
    bounds : Tuple[float, float], default=(0, np.pi)
        A tuple specifying the lower and upper bounds for the gene values.
        Mutated values are clipped to lie within these bounds.

    Returns
    -------
    np.ndarray
        A 2D numpy array representing the new population after selection,
        crossover, and mutation. The shape is (n_offspring, n_genes).

    Example
    -------
    >>> population = np.array([[1.0, 2.0], [2.5, 3.5], [1.5, 2.5], [3.0, 4.0]])
    >>> def objective_fn(pop):
    ...     return np.sum(pop ** 2, axis=1)
    >>> optimization_step(population, objective_fn, individuals_to_keep=2, n_offspring=4)
    array([[1.1, 2.0],
           [2.5, 3.4],
           [1.5, 2.4],
           [3.0, 4.0]])

    Notes
    -----
    - The selection step retains the top individuals based on their objective
      function values.
    - The crossover step generates new offspring through crossover between the
      selected individuals.
    - The mutation step applies random mutations to the offspring, with
      mutation size and probability controlled by the `mutation_probability`
      and `mutation_stdev` parameters.
    - The number of offspring can be controlled via the `n_offspring`
      parameter. If not specified, it defaults to the size of the input
      population.
    """
    if n_offspring is None:
        n_offspring = len(population)
    selected_population = selection_operator(
        objective, population, individuals_to_keep=individuals_to_keep
    )
    offspring = crossover_operator(selected_population, n_offspring=n_offspring)
    return mutation_operator(
        offspring, probability=mutation_probability, stdev=mutation_stdev, bounds=bounds
    )
```
Now, before putting optimization_step in a loop and getting our results, we’ll make a function wrapper to help us keep track of the points we’ve evaluated and the corresponding objective function values. This wrapper saves the design vectors and objective function values to a Pandas DataFrame that we can then use for visualization and debugging purposes.
```python
import functools

import pandas as pd

# DataFrame to store inputs and outputs
data_log = pd.DataFrame(columns=["x1", "x2", "objective"])


def log_function_call(func):
    """
    A decorator that logs the function input and output to a DataFrame.
    """

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        global data_log
        X = args[0]
        x1, x2 = X[:, 0], X[:, 1]
        result = func(*args, **kwargs)
        new_data = pd.DataFrame({"x1": x1, "x2": x2, "objective": result}).dropna(
            how="all"
        )
        if not new_data.empty:
            data_log = (
                new_data
                if data_log.empty
                else pd.concat([data_log, new_data], ignore_index=True)
            )
        return result

    return wrapper


michalewicz_with_logger = log_function_call(michalewicz_function)
```
We’re finally ready to execute the entire optimization loop. Let’s start by creating a randomly sampled initial population with 50 members:
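Putting it all together, the setup and the main loop could look something like this (using the logging wrapper as the objective so that every evaluation is recorded, and keeping the default operator parameters):

```python
n_generations = 100
population_size = 50

# Randomly sampled initial population within the design bounds [0, pi];
# this replaces the small 10-member example population from earlier
population = np.random.uniform(low=0, high=np.pi, size=(population_size, 2))

# Run the genetic algorithm for the specified number of generations
for _ in range(n_generations):
    population = optimization_step(
        population,
        objective=michalewicz_with_logger,
        bounds=(0, np.pi),
    )

# Best design found across all logged evaluations
print(data_log.loc[data_log["objective"].idxmin()])
```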
We ran the genetic algorithm for 100 generations and a population size of 50. Let’s see how the objective function evolved as a function of the number of evaluations.
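The convergence plot can be recreated from data_log along these lines (the rolling-window size of 50 is an arbitrary choice):

```python
import matplotlib.pyplot as plt

# Objective value of every evaluation, plus a rolling mean to show the trend
fig, ax = plt.subplots()
ax.scatter(data_log.index, data_log["objective"], s=5, alpha=0.4, label="Evaluations")
ax.plot(
    data_log["objective"].rolling(window=50).mean(),
    color="tab:red",
    label="Rolling mean",
)
ax.axhline(-1.8013, color="black", linestyle="--", label="Global minimum")
ax.set_xlabel("Function evaluations")
ax.set_ylabel("Objective value")
ax.legend()
plt.show()
```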
In this graph, we see that by around 500 function evaluations, the genetic algorithm already gets quite close to the known global minimum of the Michalewicz function. We also see that by around 400 evaluations, the rolling mean of the objective stays consistently below -1.5, indicating that the newer generations are converging towards the global minimum.
One more thing we can do is visualize the population’s evolution as a scatter plot over the Michalewicz function’s filled contour plot, as shown below. In this case, the optimizer converges quickly to the global minimum, and later generations stay near it.
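Here’s a sketch of that visualization; it assumes each generation contributes population_size consecutive rows to data_log, which holds for our loop since the objective is evaluated once per generation on the full population:

```python
# Evaluate the Michalewicz function on a grid for the filled contour plot
x = np.linspace(0, np.pi, 200)
X1, X2 = np.meshgrid(x, x)
Z = michalewicz_function(np.column_stack([X1.ravel(), X2.ravel()])).reshape(X1.shape)

fig, ax = plt.subplots()
ax.contourf(X1, X2, Z, levels=50, cmap="viridis")

# Color each evaluated point by its (approximate) generation index
generation = data_log.index // population_size
sc = ax.scatter(data_log["x1"], data_log["x2"], c=generation, cmap="autumn", s=8)
fig.colorbar(sc, ax=ax, label="Generation")
ax.set_xlabel("$x_1$")
ax.set_ylabel("$x_2$")
plt.show()
```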
In this post, we implemented a genetic algorithm and showed how we can use it to optimize a function that might be hard to tackle for gradient-based optimizers like the Michalewicz function. We developed a version of each of the fundamental genetic operators, namely selection, crossover, and mutation, and put them all together to find the objective function’s global optimum in a simple for loop.
However, despite our success, there are a number of things we could address in future work. First, there are many different versions of the selection, crossover, and mutation operators, which may be more or less suitable for various problems; we implemented a simple version of each, but more sophisticated variants can be considerably more efficient. The code itself also has some inefficiencies that could be ironed out. For instance, the make_offspring_batch function we used for the crossover operator relies on Python’s copy.deepcopy, which may not be ideal for memory efficiency. Additionally, we presented a single set of results using the default parameters, but it would be interesting to see what happens when we change the population size, the number of generations, and the mutation probabilities and standard deviations. Also, even though the Michalewicz problem is challenging, other problems may be even harder for traditional optimizers. It would be worth exploring how constraints or mixed-discrete design spaces would affect the algorithm’s performance and how, with these additions, we would have to modify our operator definitions.
Finally, I have to mention that there are some very good implementations of genetic algorithms in Python out there. If you’re working on a larger project, you might want to look at mature packages that offer genetic algorithms (and other optimizers) out of the box. I’m particularly keen on Pymoo, which focuses on multi-objective optimization but also offers single-objective algorithms.