The Travelling Salesman Problem is a classic NP-hard problem in Computer Science, for which there are many viable solutions. The task is; given a number of cities, we must find the shortest route (in time and/or distance) that visits each city exactly once. Traditionally, the best problems have the quickest worst-case running time; in TSP, any algorithm to solve the problem increases superpolynomially (but no more than exponentially), as the number of cities increases. [1]
Let’s start by describing our evolution inspired heuristic for this particular TSP, where we will be finding the shortest route between a number of cities in the UK — in alphabetical order; Brighton, Bristol, Cambridge, Glasgow, Liverpool, London, Manchester and Oxford. We will be attempting to find the shortest distance with these approximate distances (I presume in miles);
Brighton | Bristol | Cambridge | Glasgow | Liverpool | London | Manchester | Oxford | |
Brighton | 0 | 172 | 145 | 607 | 329 | 72 | 312 | 120 |
Bristol | 0 | 192 | 494 | 209 | 158 | 216 | 92 | |
Cambridge | 0 | 490 | 237 | 75 | 205 | 100 | ||
Glasgow | 0 | 286 | 545 | 296 | 489 | |||
Liverpool | 0 | 421 | 49 | 208 | ||||
London | 0 | 249 | 75 | |||||
Manchester | 0 | 194 | ||||||
Oxford | 0 |
Now on to the evolution inspired Genetic Algorithm (or GA), for which we will be using to solve the problem. The heuristic goes like this;
- Initialisation – Create an initial population of solutions, usually created randomly and with an arbitrary size, from just a few to thousands. In our case this would be a population of tours; routes to each and every city.
- Evaluation – Each member of the population is evaluated; the fitness of the individual is calculated by how well it fits with the desired requirements. In our case this would be the total distance, which we are trying to minimise.
- Selection – Our goal is to constantly improve our populations fitness. Selection helps us to do this by discarding the bad solutions and keeping the best ones (similar to natural selection in Darwinian theory). There are a few different methods, but we aim to make it more likely that fitter individuals will be selected for the next generation. Here we will use a roulette approach. (Explained below)
- Crossover – Make new individuals by combining genes of our selected individuals, (Similar to how sex works in nature). The hope is that by combining certain genomes from two or more individuals we will create an offspring with better fitness for our problem, which will inherit the best traits from each of it’s parents.
- Mutation – Make new individuals by slightly altering, typically by random, the existing genome of an individual.
- Repeat! Now we have the next generation, we can start again from step two until we find the fittest individual. [2]
In this particular problem we need to employ a roulette wheel to bias the algorithm to generate fitter solutions for each generation. (The selection process):
This is essentially a normalisation of the fitnesses of the current generation, so that we can select the best solutions for our next generation.
In pseudo-code [3]:
for all members of population overallF += fitness of this individual end for for all members of population probability = sum of probabilities + (fitness / overallF) sum of probabilities += probability end for loop until new population is full do this twice number = Random between 0 and 1 for all members of population if number > probability but less than next probability then you have been selected end for end create offspring end loop
Solution and results from the GA (see the code on github)
Thus, the best is distance is 890 miles, and one of the best genotypes is (Glasgow, Liverpool, Manchester, Bristol, Oxford, Cambridge, London, Brighton), which makes sense!
On the topic of Natural Computing, Ant Colony Optimisation has also been used to solve this problem. Whereby their behaviour in finding short paths between food sources and their nest (an emergent behaviour, resulting from each ant’s preference to follow trail pheromones deposited by other ants) is modelled algorithmically. Simulated annealing has also been used and compared to GA’s here.
Lots of interesting applications for all this. In particular, I was envisioning a system for music exploration; whereby a collection of music releases could represent nodes in a graph, and the edges represent the most similar features between the items, solved by the Travelling Salesman Problem. Moreover, the Genetic Algorithm approach may be very useful. There is very likely to be another post including this topic due to it’s potential relevance to my final project on Music Exploration and Recommendation.
Thanks for reading! This is the fifth part in a series of blogs for the Natural Computing module at Goldsmiths, University of London.
Header image from https://xkcd.com/399/