Implementing a Genetic Algorithm in Python

code
optimization
Author

Santiago Valencia

Published

November 5, 2024

In this post, we’ll implement a genetic algorithm using Python and NumPy.

A genetic algorithm is a type of optimization algorithm that mimics natural selection to find the optimal point (or points) in the design space. It uses a population of solutions that evolve over several generations toward better solutions. Each individual in the population represents a potential solution to the problem encoded in a way that can be evaluated for its fitness.

Over successive generations, individuals with higher fitness are more likely to pass their traits on to the next generation. An individual’s traits are passed down to the next generation using three main genetic operators: selection, crossover, and mutation.

As its name hints, the selection operator chooses the fittest individuals in the current population. The selected individuals will produce offspring. Next, the crossover operator combines the genetic information of two (or more) parent individuals to create one or more offspring. Thus, crossover introduces variation into future populations and mixes successful traits. Finally, the new population resulting from the crossover operator is subject to mutations, or small, random changes in the individuals’ characteristics. Mutations help introduce diversity into populations and avoid premature convergence to suboptimal solutions.

The Problem

The problem we’ll try to solve can be formally stated as:

  • Minimize \(f(x) = -\sin(x_1)\sin^{20}\left(\frac{x_1^2}{\pi}\right) -\sin(x_2)\sin^{20}\left(\frac{2x_2^2}{\pi}\right)\)
  • With respect to \(x=[x_1, x_2]^\top\)
  • Such that \(x_i \in [0, \pi]\) for \(i=1,2\)

The objective function \(f\) is known as the Michalewicz function, which we can define in Python like this:

Code
import numpy as np

# Set the random seed for reproducibility
np.random.seed(42)


def michalewicz_function(X: np.ndarray) -> np.ndarray:
    """
    Computes the Michalewicz function for a given 2D input array.

    The Michalewicz function is a test problem for optimization algorithms,
    particularly known for its steep valleys and ridges. It is defined for
    two input variables, x1 and x2, with the following equation:

    f(x1, x2) = -sin(x1) * (sin(x1^2 / pi))^20 - sin(x2) * (sin(2 * x2^2 / pi))^20

    Parameters
    ----------
    X : np.ndarray
        A 2D numpy array of shape (n_samples, 2) where each row contains the values
        of x1 and x2. X[:, 0] corresponds to x1 and X[:, 1] corresponds to x2.

    Returns
    -------
    np.ndarray
        A 1D numpy array of shape (n_samples,) containing the computed Michalewicz
        function values for each (x1, x2) pair.

    Example
    -------
    >>> X = np.array([[2.20, 1.57], [2.90, 2.30]])
    >>> michalewicz_function(X)
    array([-1.80114072e+00, -2.54559837e-08])
    """
    x1, x2 = X[:, 0], X[:, 1]
    term1 = -np.sin(x1) * np.sin(x1**2 / np.pi) ** 20
    term2 = -np.sin(x2) * np.sin(2 * x2**2 / np.pi) ** 20
    return term1 + term2

Here’s a plot of the function:

Code
import pandas as pd
import plotly.graph_objects as go

x1_vals = np.linspace(0, np.pi, 400)
x2_vals = np.linspace(0, np.pi, 400)
x1, x2 = np.meshgrid(x1_vals, x2_vals)

x1_vals_flat = x1.flatten()
x2_vals_flat = x2.flatten()

X_flat = np.column_stack((x1_vals_flat, x2_vals_flat))


z_flat = michalewicz_function(X_flat)

z = z_flat.reshape(x1.shape)


df = pd.DataFrame({"x1": x1_vals_flat, "x2": x2_vals_flat, "z": z_flat})


fig = go.Figure(
    data=[
        go.Scatter3d(
            x=df["x1"],
            y=df["x2"],
            z=df["z"],
            mode="markers",
            marker=dict(size=3, color=df["z"], colorscale="Viridis"),
            name="",
            showlegend=False,
        )
    ]
)

fig.add_trace(
    go.Scatter3d(
        x=[2.2],
        y=[1.57],
        z=[-1.8013],
        mode="markers",
        marker=dict(size=6, color="red"),
        name="Global minimum at (x1=2.20, x2=1.57)",
        showlegend=False,
    )
)

fig.update_layout(
    title="Michalewicz Function",
    scene=dict(xaxis_title="x1", yaxis_title="x2", zaxis_title="f(x1, x2)"),
)

fig.show()

The red dot in the plot above marks the function’s global minimum, which has a value of -1.8013 at \(x_1=2.20, x_2=1.57\). As you can tell, this function will likely give gradient-based optimizers a hard time: it has several local minima and large flat regions that can stall their progress. That makes the Michalewicz function a good candidate for trying out a different optimization approach, like the genetic algorithm we’re going to implement.
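
To see this concretely, here’s a quick check (not part of the original implementation, and assuming SciPy is available): we start a gradient-based optimizer in one of the flat regions and watch it stall.

Code
from scipy.optimize import minimize as scipy_minimize

# Start L-BFGS-B in a flat region of the Michalewicz function. The numerical
# gradient there is essentially zero, so the optimizer barely moves.
result = scipy_minimize(
    lambda x: michalewicz_function(x.reshape(1, -1))[0],
    x0=np.array([0.5, 0.5]),
    method="L-BFGS-B",
    bounds=[(0, np.pi), (0, np.pi)],
)
print(result.x, result.fun)  # stays near (0.5, 0.5), far from the global minimum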

Selection

Before implementing the selection operator, let’s define how we’ll handle populations and individuals to solve our optimization problem. One straightforward way is to use NumPy arrays. Individuals will be \(1 \times 2\) NumPy arrays, and populations will be \(n\times 2\) arrays. Let’s make an example:

Code
bounds = (0, np.pi)

individual = np.array([1.5, 0.2])

population_size = 10

population = np.random.uniform(*bounds, (population_size, 2))

print(population)
[[1.17665249 2.98675708]
 [2.29962679 1.8807411 ]
 [0.49014701 0.49007124]
 [0.18247505 2.72117262]
 [1.8884585  2.22447561]
 [0.0646681  3.04706167]
 [2.61519568 0.66708299]
 [0.57121998 0.57618226]
 [0.9558052  1.64857095]
 [1.3569953  0.91492333]]

As we can see, the population is represented as a \(10\times 2\) NumPy array. Each row in the array corresponds to one individual design in the population, and the number of rows is determined by the population_size variable, which we set to 10 above.

Now, we can proceed to define the selection operator. To do this, we’ll rank the population members according to the objective function values they yield and keep only the top \(n\) individuals. Let’s define a function to do just that:

Code
from collections.abc import Callable
from typing import Optional


def selection_operator(
    objective: Callable[[np.ndarray], np.ndarray],
    population: np.ndarray,
    individuals_to_keep: Optional[int] = None,
) -> np.ndarray:
    """
    Selects the top individuals from a population based on their objective function values.

    This function applies a selection operator commonly used in evolutionary algorithms.
    It evaluates the objective function for each individual in the population, ranks them
    based on their objective values, and selects the top-performing individuals to keep.

    Parameters
    ----------
    objective : Callable[[np.ndarray], np.ndarray]
        A callable function that takes the population as input and returns a 1D array
        of objective function values. This function is used to evaluate the performance
        of each individual in the population.

    population : np.ndarray
        A 2D numpy array where each row represents an individual in the population.
        The shape is typically (n_individuals, n_features), where `n_individuals` is the
        number of individuals in the population, and `n_features` is the number of features
        or parameters representing each individual.

    individuals_to_keep : Optional[int], default=None
        The number of top individuals to keep in the population after selection. If None,
        50% of the population is retained.

    Returns
    -------
    np.ndarray
        A 2D numpy array containing the selected individuals, with the same number of
        features as the original population but fewer individuals, determined by
        `individuals_to_keep`.

    Example
    -------
    >>> def sample_objective(pop):
    ...     return np.sum(pop ** 2, axis=1)
    >>> population = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
    >>> selection_operator(sample_objective, population, individuals_to_keep=2)
    array([[1, 2],
           [3, 4]])

    Notes
    -----
    - The objective function should return lower values for better individuals (e.g., a minimization problem).
    - The function sorts individuals in ascending order based on their objective function values and selects
      the best ones to retain.
    """
    if individuals_to_keep is None:
        individuals_to_keep = int(0.5 * population.shape[0])

    objective_values = objective(population)

    sorted_indices = np.argsort(objective_values)[:individuals_to_keep]

    return population[sorted_indices]

We can check whether the selection_operator function works by verifying if it returns the population members with the lowest objective values. Let’s use the population variable we created above to make a DataFrame with its design variables and corresponding objectives.

Code
df = pd.DataFrame(
    data=dict(
        x1=population[:, 0],
        x2=population[:, 1],
        objective=michalewicz_function(population),
    )
)
df.sort_values(by="objective")
x1 x2 objective
8 0.955805 1.648571 -7.724448e-01
1 2.299627 1.880741 -6.631865e-01
3 0.182475 2.721173 -4.081330e-01
4 1.888459 2.224476 -1.336976e-01
6 2.615196 0.667083 -9.921264e-03
9 1.356995 0.914923 -8.069540e-06
0 1.176652 2.986757 -1.921175e-06
5 0.064668 3.047062 -1.563872e-10
7 0.571220 0.576182 -1.484211e-14
2 0.490147 0.490071 -2.122446e-17

We would expect calling the selection_operator function (with individuals_to_keep=4) to return an array with the first four elements of this DataFrame.

Code
selection_operator(michalewicz_function, population, individuals_to_keep=4)
array([[0.9558052 , 1.64857095],
       [2.29962679, 1.8807411 ],
       [0.18247505, 2.72117262],
       [1.8884585 , 2.22447561]])

As we see here, that’s the case, so the selection_operator function is working as expected.

Crossover

Next, we’ll make a crossover operator. This operator takes a population, groups its individuals into pairs of “parents,” and produces a number of “offspring” by combining traits (design variable values) from each one. Note that we’re pairing parents somewhat arbitrarily here; we could just as easily create offspring from a different number of parents, but we’ll stick to two for simplicity. Let’s see a possible implementation of the crossover operator!

First, we’ll define a function called make_offspring_individual to make a single offspring from two parents.

Code
def make_offspring_individual(parents: np.ndarray) -> np.ndarray:
    """
    Creates a single offspring individual through crossover from two parent individuals.

    Each gene (feature) is randomly selected from either parent to create a new
    offspring. The function ensures that at least one gene is taken from each parent.

    Parameters
    ----------
    parents : np.ndarray
        A 2D numpy array of shape (2, n_genes), where each row represents one parent
        and each column represents a gene. There must be exactly two parent individuals,
        and each must have the same number of genes (design variables).

    Returns
    -------
    np.ndarray
        A 1D numpy array representing the offspring individual, with the same number of
        genes as the parents. The genes are randomly selected from the two parents.

    Example
    -------
    >>> parents = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
    >>> make_offspring_individual(parents)
    array([1, 6, 3, 8])

    Notes
    -----
    - The crossover is controlled by a randomly generated binary mask (`crossover_indices`),
      where a value of 0 selects the gene from the first parent and a value of 1 selects the
      gene from the second parent.
    - The function ensures that the crossover mask is not all 0s or all 1s, which would
      result in the offspring being identical to one of the parents.
    """
    num_genes = parents.shape[1]

    crossover_indices = np.random.randint(0, 2, num_genes)

    while np.all(crossover_indices == 0) or np.all(crossover_indices == 1):
        crossover_indices = np.random.randint(0, 2, num_genes)

    return parents[crossover_indices, np.arange(num_genes)]

Here, parents is a population array with only two members. First, we determine the number of “genes,” or design variables, in each parent. Then, we use NumPy’s randint to determine which parent will pass on which gene to the offspring. Next, we perform a check to establish that the offspring will not contain genes from only one of the parents. Finally, we return an array representing a new individual with genes that have been “inherited” from its parents.
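
To see how the fancy-indexing line works, here’s a tiny standalone example with a fixed mask (the values are made up for illustration):

Code
# parents[mask, np.arange(num_genes)] picks one row per column: with mask
# [0, 1, 0], genes 0 and 2 come from the first parent, gene 1 from the second.
parents = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
mask = np.array([0, 1, 0])
print(parents[mask, np.arange(3)])  # [ 1. 20.  3.]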

Now, let’s actually implement the crossover_operator function. This function will take a population NumPy array and an (optional) number of offspring to produce. The function will call make_offspring_batch to shuffle the input population, create pairs of “parents,” and then create the individual offspring using make_offspring_individual, defined above. If more offspring are required, make_offspring_batch is called again as needed.

Code
from copy import deepcopy


def make_offspring_batch(population: np.ndarray) -> np.ndarray:
    """
    Creates a batch of offspring individuals from a given population using pairwise crossover.

    This function performs crossover between randomly shuffled pairs of individuals
    in the population to produce a new batch of offspring. If the population size
    is odd, the first individual is duplicated to make the number of individuals even.
    Each pair of individuals undergoes crossover using the `make_offspring_individual`
    function to generate one offspring.

    Parameters
    ----------
    population : np.ndarray
        A 2D numpy array of shape (n_individuals, n_genes) where each row represents
        an individual and each column represents a gene. The population is randomly
        shuffled before forming pairs for crossover.

    Returns
    -------
    np.ndarray
        A 2D numpy array of shape (n_pairs, n_genes) containing the offspring
        individuals, where `n_pairs` is half the (padded) population size. Each
        offspring is produced by pairwise crossover between shuffled members of
        the population.

    Example
    -------
    >>> population = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    >>> make_offspring_batch(population)  # random output; one possibility:
    array([[1, 8, 3],
           [7, 5, 9]])

    Notes
    -----
    - The population is shuffled randomly before forming pairs, ensuring genetic diversity
      in the offspring.
    - If the population size is odd, the first individual is duplicated to make an even
      number of individuals for pairing.
    - The function calls `make_offspring_individual` to generate one offspring for each pair.
    """
    pop_copy = deepcopy(population)

    np.random.shuffle(pop_copy)

    if pop_copy.shape[0] % 2 != 0:
        pop_copy = np.vstack((pop_copy, pop_copy[0]))

    pop_copy = pop_copy.reshape(-1, 2, pop_copy.shape[-1])

    return np.array([make_offspring_individual(pair) for pair in pop_copy])


def crossover_operator(
    population: np.ndarray, n_offspring: Optional[int] = None
) -> np.ndarray:
    """
    Generates a specified number of offspring through crossover from a given population.

    This function applies a crossover operation on a population to produce a specified
    number of offspring. The population is used to generate offspring through the
    `make_offspring_batch` function, which performs pairwise crossover. If the required
    number of offspring is greater than the population size, additional batches of offspring
    are generated until the desired number is reached.

    Parameters
    ----------
    population : np.ndarray
        A 2D numpy array of shape (n_individuals, n_genes) representing the population.
        Each row corresponds to an individual, and each column represents a gene.

    n_offspring : Optional[int], default=None
        The number of offspring to generate. If not specified, the number of offspring
        will be equal to the population size.

    Returns
    -------
    np.ndarray
        A 2D numpy array of shape (n_offspring, n_genes) containing the generated offspring.
        The number of offspring returned will be exactly `n_offspring`.

    Example
    -------
    >>> population = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
    >>> crossover_operator(population, n_offspring=6)  # random output; one possibility:
    array([[1, 4],
           [5, 8],
           [3, 6],
           [7, 2],
           [1, 8],
           [5, 4]])

    Notes
    -----
    - If `n_offspring` is not provided, the function will generate the same number of
      offspring as the population size.
    - If more offspring are needed than can be produced in a single batch, additional batches
      are generated and concatenated until the required number of offspring is reached.
    - The function uses `make_offspring_batch` to generate each batch of offspring via
      crossover between shuffled pairs of individuals from the population.
    """
    if n_offspring is None:
        n_offspring = population.shape[0]

    offspring = make_offspring_batch(population)

    while len(offspring) < n_offspring:
        offspring = np.vstack(
            (
                offspring,
                make_offspring_batch(population),
            ),
        )

    return offspring[:n_offspring]

We can test crossover_operator with our previously defined population:

Code
crossover_operator(population, n_offspring=4)
array([[2.29962679, 3.04706167],
       [0.9558052 , 2.22447561],
       [0.57121998, 2.98675708],
       [0.18247505, 0.66708299]])

As shown above, crossover_operator produces new designs that exhibit a mixture of their parents’ traits.
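
As an extra sanity check (not in the original post), we can verify that every offspring gene appears somewhere in the parent population:

Code
offspring = crossover_operator(population, n_offspring=4)

# np.isin tests each offspring gene for membership among the population's values
print(np.isin(offspring, population).all())  # True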

Mutation

Now, we can move on to the final operator for our genetic algorithm: mutation. Mutation takes an input design vector’s values and adds small, random changes to them. In our implementation, the mutation operator has two parameters: probability and stdev. probability determines how likely it is for a component of the design vector to be subjected to a random change, while stdev is the standard deviation of the normal distribution that we’ll use to obtain the new, mutated values. Both probability and stdev can be scalars or arrays; if they’re arrays, we can assign a specific probability and standard deviation to each design variable in the design vector.

The mutation_operator function shown below generates a mask to establish which indices will be mutated. Then, it samples from a normal distribution centered at the current design points to generate the new, mutated designs. Finally, these mutated designs are applied to the population using NumPy’s where.

Code
from typing import Tuple, Iterable


def mutation_operator(
    population: np.ndarray,
    probability: float | Iterable[float] = 0.1,
    stdev: float | Iterable[float] = 0.3,
    bounds: Tuple[float, float] = (0, np.pi),
) -> np.ndarray:
    """
    Applies mutation to a population of individuals with a given probability and standard deviation.

    This function introduces mutations into the population by adding random noise to
    individual genes based on a normal distribution. The probability of mutation
    and the standard deviation of the noise can be specified either as a single value
    or as an array specifying different values for each gene. The mutated values are
    clipped to remain within specified bounds.

    Parameters
    ----------
    population : np.ndarray
        A 2D numpy array of shape (n_individuals, n_genes), where each row represents
        an individual and each column represents a gene.

    probability : float or Iterable[float], default=0.1
        The mutation probability for each gene. This can be a single float value (applied
        uniformly to all genes) or an iterable specifying the mutation probability for
        each gene separately.

    stdev : float or Iterable[float], default=0.3
        The standard deviation of the normal distribution used to generate mutations.
        This can be a single float value (applied uniformly to all genes) or an iterable
        specifying different standard deviations for each gene.

    bounds : Tuple[float, float], default=(0, np.pi)
        A tuple specifying the lower and upper bounds for the gene values. Mutated
        values are clipped to lie within these bounds.

    Returns
    -------
    np.ndarray
        A 2D numpy array of shape (n_individuals, n_genes) representing the population
        after mutation. Each gene has a chance of being mutated according to the specified
        mutation probability and mutation size.

    Example
    -------
    >>> population = np.array([[1.0, 2.0], [2.5, 3.5]])
    >>> mutated_population = mutation_operator(population, probability=0.2, stdev=0.1)
    >>> mutated_population  # random output; one possibility:
    array([[1.1, 2.0],
           [2.4, 3.5]])

    Notes
    -----
    - Mutation is applied independently to each gene of each individual with a probability
      defined by `probability`. Genes that are mutated will have random noise added from a
      normal distribution centered at the original gene value with a standard deviation
      defined by `stdev`.
    - The mutation probabilities and standard deviations can be provided either as scalars
      (for uniform mutation behavior across genes) or as arrays of the same length as the
      number of genes (for different mutation behaviors across genes).
    - After mutation, gene values are clipped to the specified `bounds` to ensure they
      remain within valid ranges.
    """
    pop_size, n_variables = population.shape

    if isinstance(probability, (int, float)):
        probability = np.full(n_variables, probability)
    probability = np.asarray(probability)

    if isinstance(stdev, (int, float)):
        stdev = np.full(n_variables, stdev)
    stdev = np.asarray(stdev)

    mutation_mask = np.random.random((pop_size, n_variables)) < probability[None, :]

    mutations = np.random.normal(
        loc=population, scale=stdev[None, :], size=population.shape
    )

    mutations = np.clip(mutations, *bounds)

    return np.where(mutation_mask, mutations, population)

We can test our function by calling it on our example population:

Code
mutation_operator(population)
array([[1.17665249, 2.98675708],
       [2.29962679, 1.8807411 ],
       [0.49014701, 0.49007124],
       [0.18247505, 2.72117262],
       [1.8884585 , 2.14188534],
       [0.0646681 , 3.04706167],
       [2.61519568, 0.66708299],
       [0.57121998, 0.57618226],
       [0.9558052 , 1.64857095],
       [1.3569953 , 0.91492333]])

Comparing this result with the original population shows that only one value was mutated in this run (the fifth individual’s x2), which is in line with the default mutation probability of 0.1, and the change itself is small because of the low default standard deviation. We can achieve bigger changes with respect to the original population by increasing both probability and stdev:

Code
mutation_operator(population, probability=0.8, stdev=1)
array([[2.97093333, 3.14159265],
       [2.59727724, 0.85262533],
       [0.        , 0.68040822],
       [0.31822888, 3.14159265],
       [1.8884585 , 2.22447561],
       [0.        , 3.14159265],
       [2.50993855, 0.        ],
       [0.15645535, 0.        ],
       [0.61172465, 2.39935684],
       [1.3569953 , 0.05332527]])
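
Since probability and stdev accept arrays, we can also treat each design variable differently. For example, this (illustrative) setting mutates \(x_1\) rarely but with large steps and \(x_2\) often but with small steps; the output is omitted since it’s random:

Code
# Per-gene settings: x1 mutates with probability 0.05 and stdev 1.0,
# x2 with probability 0.5 and stdev 0.05.
mutation_operator(population, probability=[0.05, 0.5], stdev=[1.0, 0.05])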

Putting It All Together

With our selection, crossover, and mutation operators defined, we’re ready to implement the entire genetic algorithm. Let’s start by defining a function optimization_step that returns a new population after applying all three genetic operators to an existing population.

Code
def optimization_step(
    population: np.ndarray,
    objective: Callable,
    individuals_to_keep: Optional[int] = None,
    n_offspring: Optional[int] = None,
    mutation_probability: float | Iterable[float] = 0.1,
    mutation_stdev: float | Iterable[float] = 0.3,
    bounds: Tuple[float, float] = (0, np.pi),
) -> np.ndarray:
    """
    Performs one step of a genetic optimization process, including selection, crossover, and mutation.

    This function executes a single optimization step as part of a genetic algorithm.
    It first selects the top individuals based on their objective values, generates offspring
    through crossover, and applies mutation to the offspring. The result is a new population
    ready for the next optimization step.

    Parameters
    ----------
    population : np.ndarray
        A 2D numpy array of shape (n_individuals, n_genes) representing the population.
        Each row corresponds to an individual, and each column represents a gene.

    objective : Callable
        A function that takes the population as input and returns a 1D numpy array of objective
        function values, used to evaluate and rank individuals.

    individuals_to_keep : Optional[int], default=None
        The number of top individuals to retain after selection. If None, 50% of the population
        is retained.

    n_offspring : Optional[int], default=None
        The number of offspring to generate after crossover. If None, the number of offspring will
        be equal to the size of the population.

    mutation_probability : float or Iterable[float], default=0.1
        The probability of mutation for each gene. This can be a single float (uniform probability
        for all genes) or an iterable specifying the mutation probability for each gene individually.

    mutation_stdev : float or Iterable[float], default=0.3
        The standard deviation of the normal distribution used to generate mutations. This can be a
        single float or an iterable specifying different mutation sizes for each gene.

    bounds : Tuple[float, float], default=(0, np.pi)
        A tuple specifying the lower and upper bounds for the gene values. Mutated values are
        clipped to lie within these bounds.

    Returns
    -------
    np.ndarray
        A 2D numpy array representing the new population after selection, crossover, and mutation.
        The shape is (n_offspring, n_genes).

    Example
    -------
    >>> population = np.array([[1.0, 2.0], [2.5, 3.5], [1.5, 2.5], [3.0, 4.0]])
    >>> def objective_fn(pop):
    ...     return np.sum(pop ** 2, axis=1)
    >>> optimization_step(population, objective_fn, individuals_to_keep=2, n_offspring=4)  # random output
    array([[1.  , 2.5 ],
           [1.5 , 2.  ],
           [1.12, 2.5 ],
           [1.5 , 2.38]])

    Notes
    -----
    - The selection step retains the top individuals based on their objective function values.
    - The crossover step generates new offspring through crossover between the selected individuals.
    - The mutation step applies random mutations to the offspring, with mutation size and probability
      controlled by the `mutation_probability` and `mutation_stdev` parameters.
    - The number of offspring can be controlled via the `n_offspring` parameter. If not specified,
      it defaults to the size of the input population.
    """
    if n_offspring is None:
        n_offspring = len(population)

    selected_population = selection_operator(
        objective, population, individuals_to_keep=individuals_to_keep
    )

    offspring = crossover_operator(selected_population, n_offspring=n_offspring)

    return mutation_operator(
        offspring, probability=mutation_probability, stdev=mutation_stdev, bounds=bounds
    )

We can test it on our population array:

Code
optimization_step(population, michalewicz_function)
array([[0.18247505, 1.8807411 ],
       [2.61519568, 2.22447561],
       [0.9558052 , 1.8807411 ],
       [1.8884585 , 0.66708299],
       [2.29962679, 1.64857095],
       [2.61519568, 2.72117262],
       [0.18247505, 0.66708299],
       [0.9558052 , 1.8807411 ],
       [0.21508859, 2.22447561],
       [2.61519568, 2.22447561]])

Now, before putting optimization_step in a loop and getting our results, we’ll make a function wrapper to help us keep track of the points we’ve evaluated and the corresponding objective function values. This wrapper saves the design vectors and objective function values to a Pandas DataFrame that we can then use for visualization and debugging purposes.

Code
import functools

# DataFrame to store inputs and outputs
data_log = pd.DataFrame(columns=["x1", "x2", "objective"])


def log_function_call(func):
    """
    A decorator that logs the function input and output to a DataFrame.
    """

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        global data_log

        X = args[0]
        x1, x2 = X[:, 0], X[:, 1]


        result = func(*args, **kwargs)

        new_data = pd.DataFrame({"x1": x1, "x2": x2, "objective": result}).dropna(
            how="all"
        )
        if not new_data.empty:
            data_log = (
                new_data
                if data_log.empty
                else pd.concat([data_log, new_data], ignore_index=True)
            )

        return result

    return wrapper


michalewicz_with_logger = log_function_call(michalewicz_function)

We’re finally ready to execute the entire optimization loop. Let’s start by creating a randomly sampled initial population with 50 members:

Code
population_size = 50

population = np.random.uniform(0, np.pi, size=(population_size, 2))

And now, we can simply call optimization_step repeatedly in a loop for 100 generations:

Code
generations = 100

for _ in range(generations):
    population = optimization_step(population, michalewicz_with_logger)


data_log["generation"] = data_log.index // population_size + 1

We ran the genetic algorithm for 100 generations with a population size of 50. Let’s see how the objective function evolved as a function of the number of evaluations.

Code
cumulative_min = np.minimum.accumulate(data_log["objective"])
rolling_mean = (
    data_log["objective"].rolling(window=population_size, min_periods=1).mean()
)

fig = go.Figure()

fig.add_trace(
    go.Scatter(
        y=cumulative_min,
        mode="lines",
        name="Cumulative minimum objective",
        line=dict(color="blue"),
    )
)

fig.add_trace(
    go.Scatter(
        y=rolling_mean,
        mode="lines",
        name="50-evaluation objective rolling mean",
        line=dict(color="green"),
    )
)

fig.add_trace(
    go.Scatter(
        y=[-1.8013] * len(data_log["objective"]),
        mode="lines",
        name="Global minimum",
        line=dict(dash="dash", color="grey"),
    )
)

fig.update_layout(
    title={
        "text": "Genetic Algorithm on Michalewicz Function",
        "x": 0.5,
        "xanchor": "center",
        "yanchor": "top",
    },
    xaxis_title="Number of evaluations",
    yaxis_title="Michalewicz function value",
    showlegend=True,
)

fig.show()

In this graph, we see that by around 500 function evaluations, the genetic algorithm already gets quite close to the known global minimum of the Michalewicz function. We also see that after around 400 evaluations, the rolling mean of the objective stays consistently below -1.5, indicating that the newer generations are converging towards the global minimum.
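
We can back up this reading with a quick computation on data_log (the exact count depends on the random run):

Code
# Index of the first evaluation that came within 0.01 of the global minimum (-1.8013)
objectives = data_log["objective"].to_numpy()
hits = np.flatnonzero(objectives <= -1.8013 + 0.01)
print(hits[0] + 1 if hits.size else "never reached")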

One more thing we can do is visualize the population’s evolution as a scatter plot over the Michalewicz function’s filled contour plot, as shown below. In this case, the optimizer converges quickly to the global minimum, and later generations stay near it.

Code
x1_vals = np.linspace(0, np.pi, 400)
x2_vals = np.linspace(0, np.pi, 400)
x1, x2 = np.meshgrid(x1_vals, x2_vals)


x1_vals_flat = x1.flatten()
x2_vals_flat = x2.flatten()


X_flat = np.column_stack((x1_vals_flat, x2_vals_flat))


z_flat = michalewicz_function(X_flat)


z = z_flat.reshape(x1.shape)

fig = go.Figure()

fig.add_trace(
    go.Contour(
        z=z,
        x=x1_vals,  
        y=x2_vals,  
        colorscale="Viridis",
        showscale=True,
        contours=dict(start=np.min(z), end=np.max(z), size=0.1),
        colorbar=dict(title=dict(text="Objective Function", side="right")),
    )
)


generations = sorted(data_log["generation"].unique())
for gen in generations:
    filtered_data = data_log[data_log["generation"] == gen]
    fig.add_trace(
        go.Scatter(
            x=filtered_data["x1"],
            y=filtered_data["x2"],
            mode="markers",
            marker=dict(size=10, color="orange"),
            name=f"Generation {gen}",
            visible=(gen == 1),  # only the first generation is visible initially
        )
    )


steps = []
for i, gen in enumerate(generations):
    step = dict(
        method="update",
        args=[
            {"visible": [True] + [False] * len(generations)},  # keep the contour visible
            {"title": f"Scatter Plot for Generation {gen}"},
        ],
        label=f"{gen}",
    )
    step["args"][0]["visible"][i + 1] = True  # offset by 1 to skip the contour trace
    steps.append(step)


sliders = [
    dict(active=0, currentvalue={"prefix": "Generation: "}, pad={"t": 50}, steps=steps)
]


fig.update_layout(
    sliders=sliders,
    title={
        "text": "Michalewicz Function and Genetic Algorithm Optimization Path",
        "x": 0.5,
        "xanchor": "center",
        "yanchor": "top",
    },
    xaxis_title="x1",
    yaxis_title="x2",
    xaxis=dict(range=[0, np.pi]),  
    yaxis=dict(range=[0, np.pi]),  
    height=600,
)

fig.show()

Final Remarks

In this post, we implemented a genetic algorithm and showed how we can use it to optimize a function that might be hard to tackle for gradient-based optimizers like the Michalewicz function. We developed a version of each of the fundamental genetic operators, namely selection, crossover, and mutation, and put them all together to find the objective function’s global optimum in a simple for loop.

However, despite our success, there are several things we could address in future work. First, there are many variants of the selection, crossover, and mutation operators, which may be more or less suitable for different problems; we implemented a simple version of each, but other versions can be considerably more sophisticated and efficient. Second, we presented a single set of results using the default parameters, but it would be interesting to see what happens when we vary the population size, the number of generations, and the mutation probabilities and standard deviations. And even though the Michalewicz problem is challenging, other problems may be even harder for traditional optimizers; it would be interesting to explore how constraints or mixed-discrete design spaces would affect the algorithm’s performance and how we’d have to modify our operator definitions to handle them.

The code itself also has some inefficiencies that could be ironed out. For instance, the make_offspring_batch function we used for the crossover operator relies on Python’s copy.deepcopy, which is unnecessary for a plain NumPy array.
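
A possible simplification (a sketch; for a plain ndarray the behavior is equivalent) is to let NumPy produce the shuffled copy directly:

Code
# np.random.permutation returns a row-shuffled *copy* of the array, so the
# deepcopy + in-place shuffle in make_offspring_batch could collapse into:
pop_copy = np.random.permutation(population)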

Finally, I have to mention that there are some very good implementations of genetic algorithms in Python out there. If you’re working on a larger project, you might want to look at mature packages that offer genetic algorithms (and other optimizers) out of the box. I’m particularly keen on Pymoo, which focuses on multi-objective optimization but also offers single-objective algorithms.
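
To give a flavor of what that looks like, here’s a minimal sketch of our problem in Pymoo (assuming pymoo >= 0.6; the class and parameter choices below are mine, not taken from the post):

Code
from pymoo.core.problem import Problem
from pymoo.algorithms.soo.nonconvex.ga import GA
from pymoo.optimize import minimize

class MichalewiczProblem(Problem):
    """Wraps michalewicz_function in Pymoo's vectorized Problem interface."""

    def __init__(self):
        super().__init__(n_var=2, n_obj=1, xl=0.0, xu=np.pi)

    def _evaluate(self, x, out, *args, **kwargs):
        # x has shape (pop_size, 2), matching michalewicz_function's input
        out["F"] = michalewicz_function(x)

res = minimize(MichalewiczProblem(), GA(pop_size=50), ("n_gen", 100), seed=42)
print(res.X, res.F)  # best design and objective value found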