Using Dispersive Flies Optimisation for Personalised Music Recommendation and A Combinatorial Card Game

The Feature Weighting Problem

Generating recommendations that a user is most likely to appreciate is a well-researched problem. In music recommendation, a large part of the problem is tuning the weights of the features that attempt to define a listener’s profile. Recently, Music Information Retrieval techniques (acoustic features) have further increased the dimensionality of the solution space, offering the opportunity for more personalised recommendations, while simultaneously increasing the need for better algorithms to ‘tune’ the feature weights to model a user’s musical personality.

Applying DFO to the optimisation problem

The fitness function for this task could be how well the acoustic features map to the user access data, i.e. whether the generated recommendation has a high average rating among other members of the community, or whether it appears in another playlist by similar users. The dataset I will be implementing the algorithm for, and evaluating the results on, will be subsets of the well-researched Million Song Dataset, for example The Echo Nest Taste Profile Subset, with a neighbourhood set of item-based similarity to draw the fitness from. This is partly inspired by the Particle Swarm Optimisation recommender system (for movies and user profiles).

Evaluation: I will evaluate the results against other item-based similarity methods for the Million Song Dataset from the Million Song Dataset Challenge.

The Goal: To utilise the collective intelligence of virtual flies, and thus allow users to effortlessly discover new music that they are very likely to enjoy!

 

The Combinatorial Card Game

The premise of this combinatorial problem is as follows: there are 10 cards, uniquely labelled 1–10. The aim of the game is to split the cards into two subsets such that one subset sums to 36 and the remaining subset’s product equals 360. We can treat this as a 10-dimensional problem.
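As a sketch of how this can be encoded (the representation and names here are my own illustration, not necessarily the exact implementation), each of the 10 dimensions can be thresholded into a flag that assigns a card to one subset or the other, with a fitness of zero meaning the game is solved:

// Illustrative encoding: flag d decides which subset card (d + 1) joins.
// Fitness combines the deviation from both targets; 0 means solved.
static double fitness(boolean[] inSubset1) {
    int sum = 0;      // subset 1 target: a sum of 36
    long product = 1; // subset 2 target: a product of 360
    for (int card = 1; card <= 10; card++) {
        if (inSubset1[card - 1]) sum += card;
        else product *= card;
    }
    return Math.abs(sum - 36) + Math.abs(product - 360);
}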

The generated solution:

Params: 20 flies, 100 iterations, dt = 0.01

Subset 1: [ 9 10 2 7 8 ]

Subset 2: [ 3 6 5 4 1 ]

Function evaluations (calls to get fitness): 50

The solution can of course be any permutation of these two subsets.

My curiosity about permutations, and my concern for how efficient the algorithm is (measured in function evaluations) compared with simply iterating through candidate solution vectors (a brute-force approach), led me to evaluate solving the problem with DFO. The worst case of the brute-force approach (treating each ordering of the 10 cards as a distinct candidate and finding the answer at the very end) is approximately 10! = 3,628,800 FEs, and DFO solved it in 50 FEs, which is 72,576 times better than the brute-force approach!

See my code on GitHub.

Thanks for reading! This is the sixth part in a series of blogs for the Natural Computing module at Goldsmiths, University of London.


Applying a Genetic Algorithm to The Travelling Salesman Problem

The Travelling Salesman Problem is a classic NP-hard problem in Computer Science, for which there are many viable solutions. The task is: given a number of cities, find the shortest route (in time and/or distance) that visits each city exactly once. Traditionally, the best algorithms have the quickest worst-case running time; for TSP, the running time of any known exact algorithm increases superpolynomially (but no more than exponentially) as the number of cities increases. [1]

Let’s start by describing our evolution-inspired heuristic for this particular TSP, where we will be finding the shortest route between a number of cities in the UK, in alphabetical order: Brighton, Bristol, Cambridge, Glasgow, Liverpool, London, Manchester and Oxford. We will be attempting to find the shortest route given these approximate distances (I presume in miles):

            Brighton  Bristol  Cambridge  Glasgow  Liverpool  London  Manchester  Oxford
Brighton           0      172        145      607        329      72         312     120
Bristol                     0        192      494        209     158         216      92
Cambridge                              0      490        237      75         205     100
Glasgow                                         0        286     545         296     489
Liverpool                                                   0    421          49     208
London                                                             0         249      75
Manchester                                                                     0     194
Oxford                                                                                 0

Now on to the evolution-inspired Genetic Algorithm (or GA), which we will use to solve the problem. The heuristic goes like this:

  1. Initialisation – Create an initial population of solutions, usually generated randomly and with an arbitrary size, from just a few to thousands. In our case this would be a population of tours: routes visiting each and every city.
  2. Evaluation – Each member of the population is evaluated; the fitness of the individual is calculated by how well it fits the desired requirements. In our case this is the total distance of the tour, which we are trying to minimise (sketched in code after this list).
  3. Selection – Our goal is to constantly improve our population’s fitness. Selection helps us do this by discarding the bad solutions and keeping the best ones (similar to natural selection in Darwinian theory). There are a few different methods, but we aim to make it more likely that fitter individuals will be selected for the next generation. Here we will use a roulette approach (explained below).
  4. Crossover – Make new individuals by combining genes of our selected individuals (similar to how sexual reproduction works in nature). The hope is that by combining certain genomes from two or more individuals we will create an offspring with better fitness for our problem, which will inherit the best traits from each of its parents.
  5. Mutation – Make new individuals by slightly altering, typically at random, the existing genome of an individual.
  6. Repeat! Now we have the next generation, we can start again from step two until we find the fittest individual. [2]
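To make steps 1 and 2 concrete for our TSP, here is a minimal sketch (my own illustration, not necessarily the code linked below): the genotype is a permutation of city indices, and the fitness is the total tour distance read from the table above, mirrored into a full matrix.

// City indices follow the alphabetical order of the distance table.
static final int[][] DIST = {
  //  Bri  Bst  Cam  Gla  Liv  Lon  Man  Oxf
    {   0, 172, 145, 607, 329,  72, 312, 120 }, // Brighton
    { 172,   0, 192, 494, 209, 158, 216,  92 }, // Bristol
    { 145, 192,   0, 490, 237,  75, 205, 100 }, // Cambridge
    { 607, 494, 490,   0, 286, 545, 296, 489 }, // Glasgow
    { 329, 209, 237, 286,   0, 421,  49, 208 }, // Liverpool
    {  72, 158,  75, 545, 421,   0, 249,  75 }, // London
    { 312, 216, 205, 296,  49, 249,   0, 194 }, // Manchester
    { 120,  92, 100, 489, 208,  75, 194,   0 }  // Oxford
};

// Lower total distance = fitter tour (an open tour, as in the result below).
static int tourDistance(int[] tour) {
    int total = 0;
    for (int i = 0; i < tour.length - 1; i++)
        total += DIST[tour[i]][tour[i + 1]];
    return total;
}

Note that crossover on this representation needs an order-preserving operator, so that offspring remain valid permutations visiting each city exactly once.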

In this particular problem we employ a roulette wheel to bias selection toward fitter solutions in each generation (the selection process).


This is essentially a normalisation of the fitnesses of the current generation, so that we can select the best solutions for our next generation.

In pseudo-code (adapted from [3]):

total = 0
for all members of population
    total += fitness of this individual
end for

cumulative = 0
for all members of population
    cumulative += fitness of this individual / total
    probability of this individual = cumulative
end for

loop until new population is full
    do this twice
        number = random between 0 and 1
        for all members of population
            if number <= probability of this individual
                then you have been selected; stop searching
        end for
    end
    create offspring from the two selected parents
end loop
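As a hedged sketch of how that pseudocode might look in Java (the names are my own; for a minimisation problem like TSP, fitness could be the reciprocal of tour distance so shorter tours get bigger slices):

import java.util.Random;

// Roulette-wheel selection: each individual gets a slice of the wheel
// proportional to its fitness; spin once, return the selected index.
static int rouletteSelect(double[] fitness, Random rng) {
    double total = 0;
    for (double f : fitness) total += f;

    double spin = rng.nextDouble() * total;
    double cumulative = 0;
    for (int i = 0; i < fitness.length; i++) {
        cumulative += fitness[i];
        if (spin <= cumulative) return i; // landed on individual i's slice
    }
    return fitness.length - 1; // fallback for floating-point rounding
}

Calling this twice gives the two parents for crossover.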

Solution and results from the GA (see the code on GitHub)

Thus, the best distance is 890 miles, and one of the best genotypes is (Glasgow, Liverpool, Manchester, Bristol, Oxford, Cambridge, London, Brighton), which makes sense!

On the topic of Natural Computing, Ant Colony Optimisation has also been used to solve this problem: the ants’ behaviour in finding short paths between food sources and their nest (an emergent behaviour, resulting from each ant’s preference to follow trail pheromones deposited by other ants) is modelled algorithmically. Simulated annealing has also been used and compared to GAs here.

Lots of interesting applications for all this. In particular, I was envisioning a system for music exploration, whereby a collection of music releases could represent nodes in a graph and the edges represent the most similar features between the items, solved as a Travelling Salesman Problem. Moreover, the Genetic Algorithm approach may be very useful here. There is very likely to be another post on this topic due to its potential relevance to my final project on Music Exploration and Recommendation.

Thanks for reading! This is the fifth part in a series of blogs for the Natural Computing module at Goldsmiths, University of London.

Header image from https://xkcd.com/399/

Dispersive Flies Optimisation in Symmetry Analysis

In this post I’ll be exploring and evaluating DFO in finding points and lines of n-fold symmetry to aid Computer Vision. See my previous blog post explaining DFO here.

To define our problem of finding points of n-fold symmetry, we are looking for specific areas (in this case a single pixel) where the fitness function of symmetry is at its optimum (in this case a minimum). Let’s define our fitness function: we need two blocks of comparison, each with the same radius (the distance from the centre to any of the block’s four vertices); the sums of the pixels in each block are then evaluated around our proposed axis of symmetry, such that our symmetry metric = |sum1 − sum2|. In the case of a vertical axis, for example:

[Figure: two summation blocks mirrored about the proposed vertical axis of symmetry]
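In code, the metric might look something like this (a sketch assuming a plain 2D brightness array rather than any particular image library, with bounds checks omitted):

// Two blocks mirrored about the vertical axis at column `axis`,
// centred on `row`, each of half-width `radius`.
static double symmetryFitness(double[][] brightness, int axis, int row, int radius) {
    double sum1 = 0, sum2 = 0;
    for (int dy = -radius; dy <= radius; dy++) {
        for (int dx = 1; dx <= radius; dx++) {
            sum1 += brightness[row + dy][axis - dx]; // block left of the axis
            sum2 += brightness[row + dy][axis + dx]; // mirrored block to the right
        }
    }
    return Math.abs(sum1 - sum2); // 0 = locally symmetric; DFO minimises this
}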

We can see this in action on black-and-white images.

Then, to improve, we change our communication scheme so that the flies navigate around the best fly in the swarm (as opposed to our previously democratic, neighbour-based approach):

for (int d = 0; d < Global.dim; d++) {
  // previously: follow the best local neighbour
  // temp[d] = Global.fly[chosen].getPos(d)
  //   + random(1) * (Global.fly[chosen].getPos(d) - Global.fly[i].getPos(d));
  temp[d] = Global.fly[chosen].getPos(d)
    + random(1) * (Global.fly[Global.bestIndex].getPos(d) - Global.fly[i].getPos(d));
}

and re-initialise our agents when loading a new image.

Here we see a more stable point of symmetry and, most importantly, a solid observation: the more symmetrical the image by human perception, the more stable the position of the best fly in the swarm.

The next problem, which would lead to evolving our algorithm and making it more useful, is to find symmetry in nature, e.g. for detecting animals. Let’s see how our most recent algorithm behaves on an image of a zebra (with an updated radius that is proportional to our new image size).

Here we see some interesting behaviour straight away: around 0:13 and 0:52 there is a clear indication of the central, most obvious line of symmetry, albeit a little left of centre! This may be because the right side of our image is a little brighter than the left side, causing some confusion in our fitness function evaluation. Here might be a good place to start evolving our algorithm, then. It is worth noting that because our image is black and white, we don’t have to worry too much about our colour space, and utilising brightness values is sufficient for the next step in developing our algorithm. However, when we start to analyse more colourful animals, we’ll need to consider the best colour space to use, e.g. L*a*b* (where Euclidean distance is relative to perceptual difference in colour), but more on this later.

For now let’s investigate how to deal with the varying-brightness issue. Whereas before we simply had all-black and all-white pixels, we now need to account for pixels that lie on the grayscale. One way to solve this could be to use a more balanced approach, i.e. to use four summation boxes to compare instead of two, and use a reversed comparison for our second set. More to come!

See the code on Github.

Thanks for reading! This is the fourth part in a series of blogs for the Natural Computing module at Goldsmiths, University of London.

P.S. A fitting soundtrack for the task.

Stochastic Diffusion Search (and where we can use it)

Stochastic Diffusion Search is one of the first Swarm Intelligence algorithms, characterised as a ‘population based best-fit pattern matching algorithm’ [1] and first proposed by John Bishop in 1989 [2]. It reminds me most of ants foraging for food, communicating at a distance via pheromones to reveal promising locations; this results in a truly incredible ‘collective intelligence’ that allows the ants to efficiently locate and utilise food sources.

As a heuristic, the basic idea is that a swarm of agents is randomly initialised in a search space. Each agent then applies some kind of fitness function that determines whether it is satisfied or not satisfied, based on its position in the search space — this is the Test phase. Then follows the Diffusion phase, which is essentially a phase of communication between the agents: in active diffusion, the successful, active agents let an unsuccessful, inactive agent (chosen at random) know where in the search space they were successful, and the notified agent then goes there to look for neighbouring success locations. The Test and Diffusion phases are repeated until the system has converged to an optimum location or set of locations (dependent on the problem).
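Here is a toy sketch of the Test and Diffusion phases for the string-search example I describe later, using the simpler passive variant of recruitment (where inactive agents poll others, discussed in the next paragraph); the names and parameters are my own illustration:

import java.util.Random;

/** A minimal SDS sketch: agents hypothesise start positions of a word. */
public class SDSSketch {
    public static void main(String[] args) {
        String text = "denmark is a prison then is the world one denmark";
        String target = "denmark";
        int maxStart = text.length() - target.length();

        Random rng = new Random();
        int agents = 30;
        int[] hypothesis = new int[agents];     // candidate start positions
        boolean[] active = new boolean[agents];
        for (int i = 0; i < agents; i++) hypothesis[i] = rng.nextInt(maxStart + 1);

        for (int iter = 0; iter < 100; iter++) {
            // Test phase: check one randomly chosen letter of the model
            for (int i = 0; i < agents; i++) {
                int k = rng.nextInt(target.length());
                active[i] = text.charAt(hypothesis[i] + k) == target.charAt(k);
            }
            // Diffusion phase (passive): inactive agents poll a random agent
            for (int i = 0; i < agents; i++) {
                if (active[i]) continue;
                int j = rng.nextInt(agents);
                hypothesis[i] = active[j] ? hypothesis[j]
                                          : rng.nextInt(maxStart + 1);
            }
        }
        for (int i = 0; i < agents; i++)
            if (active[i]) System.out.println("agent " + i + " at index " + hypothesis[i]);
    }
}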

When we start to solve problems based on this heuristic, an important characteristic of the algorithm arises: the ‘Recruitment Mode’ contained within the Diffusion, or communication, phase, which concerns the direction and condition of the communication. Namely, there is active recruitment (explained above), passive recruitment and dual recruitment. There are also context-sensitive and context-free mechanisms that can be used within the recruitment modes.

In the dual recruitment strategy, both successful (active) and unsuccessful (inactive) agents randomly select other agents. When an agent is successful and the randomly selected agent is not, the hypothesis of the successful agent is shared with the unsuccessful one, and that agent is flagged as active. Also, if there is an agent which is inactive and disengaged from communication, and the agent it selects is active, there will be a flow of information from the successful agent to the unsuccessful one, and the unsuccessful agent is then flagged as active. Nevertheless, if there remains an agent that is neither active nor engaged, a new random hypothesis is chosen for it. [3]

I have explored two separate implementations of SDS. One is a mining simulation (image in header) that is very similar to the heuristic I have described above (just replace agents with miners and success with gold!). The other finds a word in a text document; an example run found all the ‘Denmarks’ in a short passage, making use of the context-sensitive recruitment mechanism, whereby if an active agent randomly chooses another active agent that maintains the same hypothesis, the selecting agent is set inactive and adopts a new random hypothesis (therefore achieving a wider exploration).


In the context-free mechanism, the performance is similar to the context-sensitive mechanism: each active agent randomly chooses another agent. However, if the selected agent is active (irrespective of having the same hypothesis or not), the selecting agent becomes inactive and picks a new random hypothesis. ‘By the same token, this mechanism ensures that even if one or more good solutions exist, about half of the agents explore the problem space and investigate other possible solutions.’ [3]

On the subject of improving education, SDS has been utilised to predict whether students enrolled on MOOC courses will pass or fail [4]. One problem for which I thought SDS would be useful is the Cold Start Problem in Recommender Systems/Collaborative Filtering: when we want to find suitable recommendations for a user with little user information, we could populate a feature space with popular items and then, based on the user’s feature characteristics, engineer the swarm so that it chooses the best possible items (using a fitness function, e.g. rating). Evidence shows that Swarm Intelligence is ‘likely to be particularly useful when applied to problems with huge dimensionality’ [4], which is usually the case for recommendation systems!

It could also be useful for achieving the main goal of CF. To use the mining analogy, we represent each user as a ‘hill’; they may have ‘gold’ (i.e. good items based on rating, or a ‘goodness’ rating based on the user’s feature weights) within their collections/purchase history. Employing SDS could be a good way of finding similar users, as an alternative to using metrics like Euclidean distance.

There is much more to say on SDS and its applications. One of the things that fascinates me is the algorithm’s conception: it was proposed as an alternative to more traditional Artificial Neural Networks for recognising patterns [1]. Neural Networks are, by design, non-stochastic, deterministic models, for which an alternative (i.e. swarm intelligence) is often more accurate and efficient. It is worth noting that ANNs are also nature-inspired models: emulations of a simplified model, or heuristic, of the brain. Let’s keep looking to nature, then, for more amazing problem-solving heuristics!

Thanks for reading! This is the third part in a series of blogs for the Natural Computing module at Goldsmiths, University of London.

What is Dispersive Flies Optimisation, and where can we use it?

DFO is a swarm intelligence based ‘meta-heuristic’ (a problem-independent computational method), similar to Particle Swarm Optimisation, but particularly inspired by and based upon flies’ behaviour and their tendency to swarm around a marker or point of interest. What differentiates this particular heuristic from its swarm intelligence parents is its characteristic disturbance: the swarm disperses at certain points in time, before re-swarming either around the same marker or, if a better one is found during the displacement, around a new point of interest (or optimum value).

As a result, it’s a more specific heuristic. I would estimate that the set of search/optimisation problems it could be useful for is smaller than for a more general swarm optimisation (see wiki), but on this set of problems the performance will noticeably increase most of the time, a la the No Free Lunch theorem (see my previous blog post here). A particular example where this may be useful is in Neural Network weight optimisation (see my peer/friend Leon’s excellent post here for a practical example): when we use multi-layer perceptrons, the search space becomes complex (resulting in local and global optima).

In computational ‘search’ problems this is particularly useful, as it allows complex search spaces with many ‘optimal’ solutions to be fully explored for both local and global optima. To clarify, let’s investigate the term search space.

The simplest 3-dimensional example of a search (or feature/data) space would be our parabola, or sphere function (search spaces can exist in any number of dimensions). Suppose in this instance x1 and x2 are our inputs, and the surface of the shape represents our output (the set of all possible solutions); here the local minimum (optimum) is the same as our global minimum. It can also be represented quite simply mathematically:

f(\boldsymbol{x}) = \sum_{i=1}^{n} x_i^2

A nice example of a more complex search space is the ‘Hölder Table’ (which looks like a waffle-shaped table):

f(x, y) = -\left| \sin x \cos y \exp\left( \left| 1 - \frac{\sqrt{x^2 + y^2}}{\pi} \right| \right) \right|

For more great-looking (and also very practical ‘benchmark’) test functions, see the wiki here and more info with Matlab implementations here. (It is worth noting that it is very difficult to visualise anything above 3 dimensions!)

In DFO, the population (NP) of flies is first initialised at random points in the feature space. The positions are then updated taking three things into account: the current position, the position of the best neighbouring fly (the neighbour, to the left or right, with the best fitness), and the position of the best fly in the swarm (the overall best fitness). To determine fitness we have a ‘fitness function’, which is run once every iteration of the algorithm; the position of each fly in the feature space is evaluated and the ‘best’ fly is found according to its proximity to the optimum (whether that is a minimum or maximum value, as defined by the problem space). The fitness function could be any mathematical function; for 2 dimensions we could simply use each fly’s distance to a central point (0, 0) if we wanted to create a simple visualisation.

Now, on to what makes this algorithm special: the dispersion threshold. To begin with we specify a value for the threshold, usually a small one! A random number between 0 and 1 is then generated (uniformly) and, if it falls below the dispersion threshold, the selected fly is assigned a new random position, thus influencing its neighbours. This is a stochastic approach to making sure the swarm doesn’t fixate on a local optimum, forcing it to keep searching the space until the algorithm terminates (when the total number of evaluations is met); hopefully by then the global optimum will have been found! This heuristic provides a balance between the key exploration and exploitation characteristics of swarm intelligence algorithms: our search is more diverse due to the dispersion, so we have more exploration, and we stop the swarm from exploiting specific areas too much.
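Pulling the whole heuristic together, here is a minimal, self-contained sketch of DFO minimising the sphere function from earlier (the parameter values and class names are illustrative choices of mine, not canonical ones):

import java.util.Random;

/** A minimal DFO sketch minimising the sphere function. */
public class DFOSketch {
    static final int NP = 20, DIM = 2, ITERATIONS = 1000;
    static final double DT = 0.001;                  // dispersion threshold
    static final double LOWER = -5.12, UPPER = 5.12; // search bounds
    static final Random rng = new Random();

    static double fitness(double[] x) {              // sphere: minimum 0 at origin
        double sum = 0;
        for (double v : x) sum += v * v;
        return sum;
    }

    public static void main(String[] args) {
        double[][] flies = new double[NP][DIM];
        for (double[] fly : flies)
            for (int d = 0; d < DIM; d++)
                fly[d] = LOWER + rng.nextDouble() * (UPPER - LOWER);

        int best = 0;
        for (int iter = 0; iter < ITERATIONS; iter++) {
            double[] fit = new double[NP];
            best = 0;
            for (int i = 0; i < NP; i++) {           // evaluate the swarm
                fit[i] = fitness(flies[i]);
                if (fit[i] < fit[best]) best = i;
            }
            for (int i = 0; i < NP; i++) {
                if (i == best) continue;             // the best fly keeps its position
                int left = (i - 1 + NP) % NP, right = (i + 1) % NP;
                int bn = fit[left] < fit[right] ? left : right; // best ring neighbour
                for (int d = 0; d < DIM; d++) {
                    if (rng.nextDouble() < DT)       // dispersion: restart this component
                        flies[i][d] = LOWER + rng.nextDouble() * (UPPER - LOWER);
                    else                             // move towards neighbour and swarm best
                        flies[i][d] = flies[bn][d]
                                + rng.nextDouble() * (flies[best][d] - flies[i][d]);
                }
            }
        }
        System.out.println("best fitness: " + fitness(flies[best]));
    }
}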

I was wondering whether it would be an interesting approach for a recommendation system, due to the larger diversity of ‘best’ solutions (the fitness function being highest rating) that would be found within a feature space. Particle Swarm Optimisation, which is similar to DFO (minus the dispersion factor), has certainly been applied in this area successfully, notably here. Another interesting application is its use in medical imaging: utilising the intelligent agents (flies) to extend our cognitive abilities, in particular enhancing the visual perception of radiologists when detecting possible precancerous tissue.

To conclude my initial investigation, it seems there are many things to learn from nature and bringing an element of stochasticity can help us solve complex problems effectively. Swarm intelligence is a fascinating subset of AI; many minds are better than one when we are exploring a new field, and it’s always useful to visit new areas to avoid over exploiting just one solution to a complex problem.

Find the original DFO paper here.

Applying It
I will now attempt to apply DFO to a well-defined problem, and something which has been of great interest to me recently: Recommendation Systems, and in particular Collaborative Filtering for music. Collaborative Filtering is a very popular technique used within RS; the task is to find users similar to the current user based on their profiles, and to make predictions that are appealing to the current user. ‘In the more general sense, collaborative filtering is the process of filtering for information or patterns using techniques involving collaboration among multiple agents, viewpoints, data sources, etc.’ [1]

I have been greatly inspired by work which applies Particle Swarm Optimisation to the task, most notably these papers on using PSO within CF: PSO RS, PSO for CF based on Fuzzy Features (from which my algorithm is mainly derived), and, for Swarm Intelligence in general applied to RS, Web Usage Mining.

Let’s define our model:

There are three phases needed to accomplish the recommendation task using DFO:

• User profile formation
• Neighbourhood set generation
• Predictions and recommendations

The user profile could be formed from a number of features; for this music application, I shall use the example of Discogs to inform the best features. For each release we have the following: how many people have the item, how many people want the item, the average rating for the album (and each rating per user!), and genres/styles. There are numerous other features, but I would argue that these are the most useful. We can also develop new features, like a genre and style ‘Interestingness Measure’, which can be represented more naturally by fuzzy sets (to fit better with average human perception), e.g. very bad (VB), bad (B), average (AV), good (G), very good (VG) and excellent (E). When we then compute the similarities between users, we supposedly get more accurate results.
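As a crude sketch of the idea (the thresholds are my own invention, and this is a crisp simplification; genuine fuzzy sets would assign each score a graded membership in several labels at once):

// Illustrative mapping from a normalised 'interestingness' score in [0, 1]
// to the linguistic labels above (hypothetical cut-off points).
static String fuzzyLabel(double score) {
    if (score < 0.17) return "VB"; // very bad
    if (score < 0.33) return "B";  // bad
    if (score < 0.50) return "AV"; // average
    if (score < 0.67) return "G";  // good
    if (score < 0.83) return "VG"; // very good
    return "E";                    // excellent
}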

For the neighbourhood set generation, we need to find the optimum weight for every feature of our user profile — this takes into account the fact that different users place different priorities on features; for example, user 1 might be mostly concerned with a genre while user 2 is mostly concerned with how highly an album is rated. This leads to the optimum feature space, where we have the most similar users. We could determine this by the RMSE of all the users’ Euclidean distances (a quantitative measure of feature differences) to our current user, and perform DFO until we have the optimum solution. This would then lead to the most accurate recommendations (depending also on our process for selecting items from those users, e.g. most commonly owned, highest rated, etc.).

Now, for the Dispersive Flies Optimisation! We have four main areas of concern: fly representation and initial population, the fitness function, fly behaviour, and the termination condition.

Fly representation and initial population: we are mostly concerned with creating a vector (size dependent on population) of flies whose parameters represent their positions, and thus the feature weights, in our n-dimensional feature space.

The fitness function: this is very important. We need to evaluate the fitness score of each fly according to how well its weights map to our ideal ‘user space’ (the least RMSE distance of all users to our current user).
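A sketch of what that fitness could look like (the names are mine, and the weights would need normalising, e.g. to sum to 1, since otherwise all-zero weights would trivially minimise the distances):

// Fitness of one fly: the RMSE of weighted Euclidean distances from the
// current user to every other user, under the fly's candidate weights.
static double fitness(double[] weights, double[][] users, double[] currentUser) {
    double sumSquaredDist = 0;
    for (double[] user : users) {
        double dist2 = 0;
        for (int f = 0; f < weights.length; f++) {
            double diff = user[f] - currentUser[f];
            dist2 += weights[f] * diff * diff; // weighted feature difference
        }
        sumSquaredDist += dist2;               // squared distance accumulates directly
    }
    return Math.sqrt(sumSquaredDist / users.length);
}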

Fly behaviour: each fly updates its position based on the best fly in the swarm and its best neighbour. The dispersion threshold will be set to an initially small value, but this could be crucial to finding the best possible outcome in a finite amount of time when our data space is big (we could have thousands of users and millions of releases in those collections, and a high number of dimensions to explore!).

 

t-SNE is a particularly useful way of visualising high-dimensional data. Each point here could represent a different user.

The success of this approach will come down to a number of performance measures that compare the accuracy of the algorithm and the time it takes to reach the best user space. This will likely become clearer to evaluate once we have run the algorithm a number of times, allowing adaptation of the parameters (dispersion threshold, number of flies). In our particular application, a little variance between different user spaces after each run of the algorithm could be useful in finding different recommendations.

This is not a completely detailed application, and I will return to it after I have explored the problem further. There are a few other problems within the field of recommendation for which swarm intelligence could be particularly useful, notably the cold start problem: when we have a new user without many releases in their collection. Can we take more advantage of swarm optimisation for feature-based recommendation?

Exploring new music can be a lot like finding new, unfamiliar territory, where the reward of finding something you like in a cluster of songs you don’t know can be extremely gratifying! Perhaps we can take more inspiration from nature; ants are some of the best explorers on our planet: could their meta-heuristics aid our search for new music? Tune in next time to find out.

Thanks for reading!

 

This is the second part in a series of blogs for the Natural Computing module at Goldsmiths, University of London.

References:

[1] Terveen, Loren; Hill, Will (2001). “Beyond Recommender Systems: Helping People Help Each Other” (PDF). Addison-Wesley. p. 6.

The ‘No Free Lunch’ Theorem & Natural Computing

The No Free Lunch (NFL) theorem was first proposed in Computer Science by David H. Wolpert and William G. Macready in a series of papers that they published between 1995 and 1997. I found it to be most easily understood in the context of Supervised Machine Learning: “no learning algorithm has an inherent superiority over other learning algorithms for all problems.” [1]

Or to put it simply: “An algorithmicist looks at no free lunch.” [2]

Supervised Machine Learning is the task of inferring a function from labelled training data [3]. A trained model is then used to perform classical machine learning tasks on new input data such as classification or regression.

An example of an appropriate situation in which supervised learning would be useful is a simple medical diagnostic system or ‘virtual doctor’: given a set of previously diagnosed (classified) instances where patients have said either ‘yes’ or ‘no’ to various symptoms, we could train a simple model to learn from the examples and diagnose/classify any new patient from their symptoms. A simple and logical approach would be to use the popular ‘k nearest neighbours’ algorithm: we quantify the difference between the new patient’s symptoms and the existing examples, find our ‘nearest neighbour’ patients, then select the most frequently occurring diagnosis among those neighbours; this becomes the most likely diagnosis for our new patient. This approach, however, could run into issues: what if we have more than one most frequently occurring diagnosis? How would we select between them? Our algorithmic approach could now be seen as too simplistic, considering also that we have no more training data available. Therefore we must refine our algorithm to the problem at hand… there goes our free lunch!
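A toy sketch of that ‘virtual doctor’ (the data layout is entirely hypothetical: symptoms as yes/no flags, distance as a simple mismatch count):

import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

/** Toy k-nearest-neighbour 'virtual doctor'. */
public class KnnDoctor {
    // Hamming distance: how many symptom answers differ between two patients
    static int distance(int[] a, int[] b) {
        int d = 0;
        for (int i = 0; i < a.length; i++) if (a[i] != b[i]) d++;
        return d;
    }

    static String diagnose(int[][] patients, String[] diagnoses, int[] newPatient, int k) {
        Integer[] order = new Integer[patients.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingInt(i -> distance(patients[i], newPatient)));

        // Majority vote among the k nearest neighbours; a tie between
        // diagnoses is exactly the unresolved issue described above.
        Map<String, Integer> votes = new HashMap<>();
        for (int i = 0; i < k; i++)
            votes.merge(diagnoses[order[i]], 1, Integer::sum);
        return votes.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .get().getKey();
    }
}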

Some history of the premise, and then a more formal definition, may help us to understand why there is no free lunch.

The Scottish philosopher Hume (1739–1740) pointed out that ‘even after the observation of the frequent or constant conjunction of objects, we have no reason to draw any inference concerning any object beyond those of which we have had experience’. [4]

“Wolpert (1996) shows that in a noise-free scenario where the loss function is the misclassification rate, if one is interested in off-training-set error, then there are no a priori (known by deduction) distinctions between learning algorithms.” [5]

More formally, where

d = training set;
m = number of elements in the training set;
f = ‘target’ input-output relationships;
h = hypothesis (the algorithm’s guess for f made in response to d); and
C = off-training-set ‘loss’ associated with f and h (‘generalization error’),

all algorithms are equivalent, on average, by any of the following measures of risk: E(C|d), E(C|m), E(C|f,d) or E(C|f,m).

How well you do is determined by how ‘aligned’ your learning algorithm P(h|d) is with the actual posterior, P(f|d).

“Wolpert’s result, in essence, formalises Hume, extends him and calls the whole of science into question.” [5]

Personally (and others seem to agree), I think Wolpert and Macready work through a colossal mathematical proof for something which seems like common sense, something every computer scientist should possess: the ability to investigate problems thoroughly and develop specialised, analytically derived algorithms based on a subset of existing ‘tried and proved’ methods. We must also accept that well-defined problems are a simplification of reality, and we must not assume that our algorithms can be generalised to solve them; in fact, the No Free Lunch theorem states (informally) that unless we know details about the problem, no heuristic or algorithm performs any better than another. [6]

An interesting outcome of the complex proof that these two masterminds lay out, for something that seems quite intuitive, is the notion that two (or more) algorithms that search for an extremum of a cost function perform exactly the same when averaged over all possible cost functions. “In particular, if algorithm A outperforms algorithm B on some cost functions, then loosely speaking there must exist exactly as many other functions where B outperforms A.” [7] This suggests the need for hybridised approaches to search and optimisation.

Towards the end of ‘No Free Lunch Theorems for Search’, Wolpert and Macready begin to discuss biologically inspired algorithms in relation to the NFL theorem they have proposed. In the case of genetic algorithms and evolutionary computation in particular, where “natural selection is viewed as optimisation over a cost or ‘fitness’ function” (albeit an extremely simplified view) [7], a performance measure over subsequent generations will not comply with the NFL theorem, as the same initial algorithm will see improved performance on the same data over time. This perhaps predicted the surge in nature-inspired computational techniques applied to various fields of art and engineering; the adaptive ‘nature’ of the algorithms themselves provides much flexibility when applied to specific problems.

“Nature Inspired Algorithms is a very active research area. This is because problems with which we are normally familiar are getting more and more complex due to size and other aspects, but also to new problems cropping up all the time on which existing methods are not effective. Nature seems to have been there and done it, so to speak.” – Abdel Salhi 

 

 

Bibliography

Images (by order of appearance)

http://dazuo.blogspot.co.uk/2006_02_01_archive.html

https://www.ericpetersautos.com/2013/10/24/todays-thoughts-oct-19-2013/free-lunch-1/

https://daily.jstor.org/how-do-fish-schools-work/

Text

[1] Wolpert, D. H. (1996). “The Lack of A Priori Distinctions Between Learning Algorithms”. Neural Computation.

[2] Culberson, J. C. (1996). “On the Futility of Blind Search”.

[3] https://en.wikipedia.org/wiki/Supervised_learning

[4] Hume, D. (1739). A Treatise of Human Nature, Book I, Part III, Section XII (cont.).

[5] http://www.no-free-lunch.org/

[6] https://www.youtube.com/watch?v=sCePNVXSMD0&t=315s

[7] Wolpert, D. H.; Macready, W. G. (1995). “No Free Lunch Theorems for Search”. Technical Report SFI-TR-95-02-010, Santa Fe Institute.