The Pareto Front

Genetic Algorithms are a class of  global optimizers that are particularly effective in cases where the objective function is a black box or is discrete and derivatives can't easily be calculated.. They are also effective in cases where the objective function is smooth or convex. It's just that in these cases, the availability of derivatives often makes other methods such as gradient descent faster.

Sometimes, the target function might have competing objectives, for example designing a treatment plan that maximizes efficacy while minimizing adverse side-effects. In cases like this, we often want not a local or even global optimal solution, but also what are the tradeoffs among the goals. In other words, there’s no single “best” solution. Instead, you want the Pareto set: solutions where you can’t improve any objective without worsening another.

The Pareto set or Pareto front consists of the possible solutions that are not dominated by any other solution. What we mean by "dominated" is Pareto dominance: 

For two objective vectors $f(x) = ({f_1}(x)...,{f_M}(x)$ and $f(y) = ({f_1}(y)...,{f_M}(y)$, x dominates y if ${f_i}(x) \leqslant {f_i}(y)$ for all i and strictly < for at least one i. The Pareto front is the set of non-dominated solutions.

We can extend this to more that two objectives. For solutions in the Pareto set, you can't improve one objective without worsening another.

In two dimensions, the Pareto front is called the Pareto curve. In higher dimensions it is often called the Pareto surface.

NSGA-II 


NSGA-II (Non-Dominated Sorting Genetic Algorithm II) is a multi-objective genetic algorithm for approximating the Pareto front. 

Here's pseudocode for NSGA-II (courtesy of chatGPT 5)

init P (size N) uniformly in bounds; evaluate F(P)
for gen in 1..G:
    R = P ∪ offspring(Q)               # Q from crossover+mutation of P
    fronts = non_dominated_sort(R)     # F1, F2, ...
    P_next = []
    for Fk in fronts:
        if |P_next| + |Fk| ≤ N:        # take whole front
            P_next += Fk
        else:                           # take best by crowding distance
            sort Fk by descending crowding_distance
            P_next += first (N - |P_next|) of Fk
            break
    P = P_next
return P (its F1 approximates Pareto front)

Several pieces differ from the standard GA method. 

Non-dominated sort: non-dominated sorting splits the parent + offspring population into fronts $({F_1},{F_2}...{F_N})$ where ${F_1}$ are the non-dominated points, ${F_2}$ are dominated only by ${F_1}$, etc. 

Crowding distance: crowding distance is a measure of diversity. Within each front, compute a crowding distance for each solution This measures how isolated the solution is in objective space. Boundary points get $\infty$; interior points get a sum of normalized neighbor distance with other objectives.

NSGA-II in Python


I asked ChatGpt to generate some Python code for NSGA-II. I used it as starting point and modified it. Here's the code for the NSGA-II algortithm.


def nsga2(problem: Problem) -> tuple[np.ndarray, np.ndarray]:
    """
    NSGA-II Non-Dominated Sorting Genetic Algorithm II

    Parameters
    ----------
    problem : Problem
        An object of the class Problem

    Returns
    -------
    tuple[np.ndarray, np.ndarray]
        paired X and objective values
    """
    pop_size = problem.n_gen
    n_gen = problem.n_gen
    pop_size = problem.pop_size
    pc = problem.pc
    eta_c = problem.pc
    eta_m = problem.eta_m
    rng = random.Random(problem.seed)
    
    # init
    X = np.array([rng.random() for _ in range(problem.n_var * pop_size)]).reshape(pop_size, problem.n_var)
    X = problem.lower + X * (problem.upper - problem.lower)
    F = np.array([problem.eval(ind) for ind in X])

    for _ in range(n_gen):
        fronts = non_dominated_sort(F)
        ranks = np.full(pop_size, np.inf)
        cdists = np.zeros(pop_size)  # crowding distances
        
        # get the rank, ie. which front it belongs to, for each solution
        for rank, front in enumerate(fronts):
            ranks[np.array(front)] = rank
            cd = crowding_distance(F, front)
            if cd.size: cdists[np.array(front)] = cd

        # GA
        mating_idx = [binary_tournament(X, ranks, cdists, rng) for _ in range(pop_size)]
        mating_pool = X[mating_idx]

        # offspring
        off = []
        for i in range(0, pop_size, 2):
            p1, p2 = mating_pool[i], mating_pool[(i+1) % pop_size]
            if rng.random() < pc:
                c1, c2 = sbx_crossover(p1, p2, problem.lower, problem.upper, eta_c, rng)
            else:
                c1, c2 = p1.copy(), p2.copy()
            pm = 1.0 / problem.n_var
            c1 = polynomial_mutation(c1, problem.lower, problem.upper, eta_m, pm, rng)
            c2 = polynomial_mutation(c2, problem.lower, problem.upper, eta_m, pm, rng)
            off.append(c1); off.append(c2)
        off = np.array(off[:pop_size])
        F_off = np.array([problem.eval(ind) for ind in off])

        # environmental selection
        # take by fronts.
        Xc = np.vstack([X, off]); 
        Fc = np.vstack([F, F_off])
        fronts = non_dominated_sort(Fc)
        new_X, new_F = [], []
        for front in fronts:
            if len(new_X) + len(front) <= pop_size:
                new_X.extend(list(Xc[front])); 
                new_F.extend(list(Fc[front]))
            else:
                # take best by crowding distance
                cd = crowding_distance(Fc, front)
                order = np.argsort(-cd)
                remain = pop_size - len(new_X)
                chosen = [front[i] for i in order[:remain]]
                new_X.extend(list(Xc[chosen])); new_F.extend(list(Fc[chosen]))
                break
        X, F = np.array(new_X), np.array(new_F)

    return X, F

Problem is an object that holds the GA parameters. Except for the non_dominated_sort and the crowding_distance routines, it's a standard GA.

Here is the code for non_dominated_sort and crowding_distance.

# ---------- NSGA-II utilities ----------
def non_dominated_sort(F: np.ndarray) -> list[list[int]]:
    """
    Sort objective values into dominated frons, i.e. collections of solotions not dominated by
    any thing with a lower front index 

    Parameters
    ----------
    F : np.ndarray
        an array of objective values

    Returns
    -------
    list[list[int]]
        array of fronts. fronts[0] are non-dominated, i.e. the Pareto set.
    """
    N = F.shape[0]
    S = [[] for _ in range(N)]      # S[p]: the set of solutions that p dominates.
    n = np.zeros(N, dtype=int)      # n[p]: the number of solutions that dominate p (its “domination count”).
    rank = np.zeros(N, dtype=int)   # which front p belongs to (0 = first/non-dominated).
    fronts = [[]]                   # list of fronts

    for p in range(N):
        for q in range(N):
            if np.all(F[p] <= F[q]) and np.any(F[p] < F[q]):
                # p dominates q
                S[p].append(q)
            elif np.all(F[q] <= F[p]) and np.any(F[q] < F[p]):
                # q dominate p, increment domination count
                n[p] += 1
        if n[p] == 0:
            # nobody dominate p
            rank[p] = 0
            fronts[0].append(p)

    # build the rest of the fronts
    i = 0
    while fronts[i]:
        Q = []
        for p in fronts[i]:
            for q in S[p]:
                n[q] -= 1
                if n[q] == 0:
                    # eveything that dominated q is in a previous front, put q in next front
                    rank[q] = i + 1
                    Q.append(q)
        i += 1
        fronts.append(Q)

    fronts.pop()    # there's an extra [] at the end, remove it
    return fronts

def crowding_distance(F: np.ndarray, front: list[int]) -> np.ndarray:
    """
    Calculate crowding distance

    Parameters
    ----------
    F : np.ndarray
        solution values
    front : list[int]
        fronts

    Returns
    -------
    np.ndarray
        the crowding distances for each elemnt
    """
    if not front:
        return np.array([])
    m = F.shape[1]
    l = len(front)
    dist = np.zeros(l)
    if l == 1: dist[0] = np.inf; return dist
    if l == 2: return np.array([np.inf, np.inf])

    for j in range(m):
        idx = np.argsort(F[front, j])
        f_sorted = F[front, j][idx]
        min_f, max_f = f_sorted[0], f_sorted[-1]
        dist[idx[0]] = np.inf
        dist[idx[-1]] = np.inf
        denom = max_f - min_f if max_f > min_f else 1.0
        for k in range(1, l - 1):
            dist[idx[k]] += (f_sorted[k+1] - f_sorted[k-1]) / denom
    return dist

The example code minimizes the function $f(x) = {x^2} + {\left( {x - 2} \right)^2}$

Here's a plot of the Pareto front.


The entire code can be downloaded from GitHub.

No comments:

Post a Comment