Exercise session 2

by Mathis Finckh,
Number of replies: 1

Hi, I am struggling to implement the ES algorithm.

I don't understand how sigma can vary if we use only one sigma per generation. Won't the best and worst individuals have the same sigma, leaving us no information on how to evolve it? I would expect to create a specific sigma for each individual so that the "best" sigmas are evolved the same way as the population: keep the average sigma (weighted by the best individuals) and evolve that further. But the code seems to expect only one sigma...

Also, what is the sigma_decay_rate?

Then, am I understanding correctly that we should create the whole population perturbation at once for the mutated generation? So here, a 100x6 perturbation applied to a single mean to create the next mutated generation?

Finally, with what I think is a correct implementation given all this, the fitness does not improve, and the final video won't open. 

Would appreciate any help. 

Thanks in advance


Here is my code:



def initialise_x0(self):
    """Initialises the first population."""
    # TODO: generate the initial population mean vector (current_mean)
    # mean_vector = [self.current_mean] * self.n_pop
    mean_vector = self.generate_mutated_offspring(self.n_pop)
    return mean_vector

def update_sigma(self):
    """Update the perturbation strength (sigma)."""
    # TODO: implement a decay of the sigma value over generations, ensuring it does not go below min_sigma
    tau = self.sigma_decay_rate / np.sqrt(self.n_params) * np.random.normal(0, 1)
    # co-evolution rule for mutation size in ES: sigma' = sigma * exp(tau * N(0, 1))
    current_sigma = self.current_sigma * np.exp(tau)
    # boundary rule: do not let sigma fall below min_sigma
    if current_sigma < self.min_sigma:
        current_sigma = self.min_sigma
    self.current_sigma = current_sigma

def sort_and_select_parents(self, population, fitness, num_parents):
    """Sorts the population based on fitness and selects the top individuals as parents."""
    # TODO: sort the population and fitness based on fitness values,
    # and select the top num_parents individuals as parents
    ind_sorted_fitness = np.argsort(fitness)[::-1]
    parent_population = population[ind_sorted_fitness][:num_parents]
    parent_fitness = fitness[ind_sorted_fitness][:num_parents]
    return parent_population, parent_fitness

def update_population_mean(self, parent_population, parent_fitness):
    # TODO: compute the new population mean as a weighted average of the parent population,
    # where the weights are based on the parent fitness (rank or raw fitness values)
    num_parents = len(parent_fitness)
    # 1. Create linear ranks. Since parent_fitness is sorted descending,
    #    the ranks are: [num_parents, num_parents - 1, ..., 1]
    ranks = np.arange(num_parents, 0, -1)
    # 2. Normalise the ranks so they sum to 1
    rank_weights = ranks / np.sum(ranks)
    # 3. Reshape to (num_parents, 1) for broadcasting across the n_params axis
    rank_weights = rank_weights[:, np.newaxis]
    # Weight the parent population by the normalised ranks
    weighted_parent_population = np.array(parent_population) * rank_weights
    # Sum the weighted parents to obtain the new mean
    updated_mean_vector = np.sum(weighted_parent_population, axis=0)
    self.current_mean = updated_mean_vector
    return updated_mean_vector

def generate_mutated_offspring(self, population_size):
    """Generates a new population by adding Gaussian noise to the current mean."""
    # TODO: generate a new population by adding Gaussian noise to the current mean,
    # where the noise is scaled by the current sigma value
    # create a unique perturbation for each parameter of each individual
    perturbation = self.current_sigma * np.random.normal(0, 1, [population_size, self.n_params])
    mutated_population = np.array([self.current_mean] * population_size) + perturbation
    return mutated_population




In reply to Mathis Finckh

Re: Exercise session 2

by Fuda van Diggelen,
Hi Mathis,

There are indeed several ways to adapt the mutation step size (sigma) in Evolution Strategies. What you are describing, assigning a specific sigma to each individual and evolving it alongside the genome, is called adaptive co-evolution. However, for this specific exercise, the TODO expects a simpler method: a deterministic scheduled decay of a single, global step size.
In this case, every genome in the population shares the same mutation step size.
As a consequence, the entire population is sampled around a single current mean vector:
x ~ N(mean_{t+1}, sigma_{t+1}^2), i.e. x = mean_{t+1} + sigma_{t+1} * N(0, 1).
Thus your understanding of the new population generation is correct.
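A minimal NumPy sketch of this sampling step (the array names and sizes are illustrative, not taken from the exercise skeleton):

```python
import numpy as np

rng = np.random.default_rng(0)
n_pop, n_params = 100, 6          # e.g. the 100x6 case from the question
current_mean = np.zeros(n_params)  # single mean vector shared by all
current_sigma = 0.1                # single global step size

# One (n_pop, n_params) noise matrix, scaled by the shared sigma and
# broadcast-added to the single mean vector in one shot.
population = current_mean + current_sigma * rng.standard_normal((n_pop, n_params))
print(population.shape)  # (100, 6)
```

So yes: one perturbation matrix for the whole generation, applied to one mean.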

The sigma_decay_rate is a constant multiplier (usually slightly less than 1, like 0.99) used to deterministically shrink your step size each generation, e.g.: sigma_t+1 = sigma_t * 0.99
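A sketch of that scheduled decay with the min_sigma clamp (the constants here are only illustrative):

```python
# Deterministic sigma decay: multiply by a constant factor each
# generation and never let sigma drop below min_sigma.
sigma = 1.0
sigma_decay_rate = 0.99
min_sigma = 0.05

def update_sigma(sigma):
    # sigma_{t+1} = max(sigma_t * decay, min_sigma)
    return max(sigma * sigma_decay_rate, min_sigma)

for _ in range(500):
    sigma = update_sigma(sigma)
print(sigma)  # 0.05 -- clamped at min_sigma after enough generations
```

Note there is no randomness here, unlike the tau * N(0,1) rule in your update_sigma.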

If you want, you can follow the directions of the "Co-evolution of mutation size" lecture slide to implement adaptive co-evolution of step size.
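If you do go that route, a sketch of per-individual log-normal self-adaptation (sigma' = sigma * exp(tau * N(0,1)), with tau on the order of 1/sqrt(n_params)) could look like this; variable names are illustrative, not from the exercise skeleton:

```python
import numpy as np

rng = np.random.default_rng(1)
n_pop, n_params = 100, 6
tau = 1.0 / np.sqrt(n_params)  # common heuristic learning rate
min_sigma = 1e-3

sigmas = np.full(n_pop, 0.1)                    # one sigma per individual
genomes = rng.standard_normal((n_pop, n_params))

# Mutate each individual's sigma first (log-normal rule), clamped below,
# then perturb each genome with its own freshly mutated sigma.
sigmas = np.maximum(sigmas * np.exp(tau * rng.standard_normal(n_pop)), min_sigma)
offspring = genomes + sigmas[:, None] * rng.standard_normal((n_pop, n_params))
print(offspring.shape)  # (100, 6)
```

The sigmas would then be selected and recombined together with their genomes.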

Apart from the sigma update, your implementation looks good.
Try a simple decay schedule and see if you are still having problems.

Best,
Fuda