The Feature Weighting Problem

Generating recommendations that a user is most likely to appreciate is a well-researched problem. In music recommendation, a big part of the problem is adjusting the weights of a number of features that together attempt to define a listener’s profile. Recently, Music Information Retrieval techniques (acoustic features) have further increased the dimensionality of the solution space. This offers the opportunity for more personalised recommendations, whilst simultaneously increasing the need for better algorithms to ‘tune’ the feature weights so that they model a user’s musical personality.

Applying DFO to the optimisation problem

The fitness function for this task could be how well the acoustic features map to the user access data, i.e. whether a generated recommendation has a high average rating from other members of the community, or whether it appears in a playlist made by similar users. I will implement the algorithm for, and evaluate the results on, subsets of the well-researched Million Song Dataset (for example, The Echo Nest Taste Profile Subset), using a neighbourhood set of item-based similarity to draw the fitness from. The idea is partly inspired by the Particle swarm optimization recommender system (for movies and user profiles).
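To make the idea of such a fitness function concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the toy `SONGS` table, the three made-up acoustic features, and the helper names are hypothetical and do not come from the Million Song Dataset. A fly's position would be the `weights` vector, and the fitness counts how many of the top recommendations were not played by similar users, so lower is better.

```python
# Hypothetical toy data: each song has three acoustic features scaled to [0, 1].
SONGS = {
    "a": [0.9, 0.1, 0.5],
    "b": [0.8, 0.2, 0.9],
    "c": [0.1, 0.9, 0.5],
    "d": [0.2, 0.8, 0.1],
}


def weighted_distance(x, y, weights):
    """Weighted squared distance between two feature vectors."""
    return sum(w * (p - q) ** 2 for w, p, q in zip(weights, x, y))


def recommend(weights, profile, k):
    """Return the ids of the k songs closest to the listener's profile."""
    ranked = sorted(SONGS, key=lambda s: weighted_distance(SONGS[s], profile, weights))
    return ranked[:k]


def fitness(weights, profile, neighbour_songs, k=2):
    """Count top-k recommendations that similar users did NOT play (lower is better)."""
    recommendations = recommend(weights, profile, k)
    return sum(1 for song in recommendations if song not in neighbour_songs)
```

With a profile of `[0.9, 0.1, 0.5]` and neighbours who played songs `a` and `b`, weights that emphasise the first two features score better than weights that only look at the third. That is the kind of signal an optimiser could use to tune the weights.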

**Evaluation:** I will evaluate my results against those of other item-based similarity methods for the Million Song Dataset, taken from the Million Song Dataset Challenge.

**The Goal:** To utilise the collective intelligence of virtual flies, and thus allow users to effortlessly discover new music that they are very likely to enjoy!

The Combinatorial Card Game

The premise of this combinatorial problem is as follows: there are 10 cards, each uniquely labelled with a number from 1 to 10. The aim of the game is to split the cards into two subsets so that the cards in one subset sum to 36 and the cards in the remaining subset multiply to 360. We can treat this as a 10-dimensional problem.
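One way the puzzle could be turned into a fitness function is sketched below. The encoding is an assumption made for illustration and not necessarily the one used in my code: each of the 10 dimensions holds a value in [0, 1], and a card goes into the sum subset when its value is above 0.5. The names `split` and `card_fitness` are hypothetical.

```python
from math import prod

CARDS = range(1, 11)                     # the ten cards, labelled 1 to 10
TARGET_SUM, TARGET_PRODUCT = 36, 360


def split(position):
    """Decode a 10-dimensional position: values above 0.5 go to the sum subset."""
    sum_set = [card for card, x in zip(CARDS, position) if x > 0.5]
    product_set = [card for card, x in zip(CARDS, position) if x <= 0.5]
    return sum_set, product_set


def card_fitness(position):
    """Total distance from both targets; 0 means the puzzle is solved."""
    sum_set, product_set = split(position)
    return abs(sum(sum_set) - TARGET_SUM) + abs(prod(product_set) - TARGET_PRODUCT)
```

Any position that scores 0 is a valid solution, so an optimiser can stop as soon as it sees one.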

**The generated solution:**

Params: 20 flies, 100 iterations, dt = 0.01

Subset 1: [ 9 10 2 7 8 ]

Subset 2: [ 3 6 5 4 1 ]

Function evaluations (calls to get fitness): 50

Any permutation of the cards within these 2 subsets is of course an equally valid solution.
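For readers who have not seen DFO before, the parameters listed above (flies, iterations, dt) plug into a loop roughly like the one below. This is a minimal sketch of the standard DFO update with a ring topology and elitism, written from the general description of the algorithm. It is not the code that produced the result above, and the function name `dfo`, its arguments, and the early stop at a target fitness are assumptions. It also counts one function evaluation per fly per iteration, which may differ from how the figure above was counted.

```python
import random


def dfo(fitness, dim, n_flies=20, iterations=100, dt=0.01,
        lo=0.0, hi=1.0, target=0.0, seed=None):
    """Minimise `fitness` over [lo, hi]^dim.

    Returns (best_position, best_fitness, function_evaluations).
    """
    rng = random.Random(seed)
    flies = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_flies)]
    best_pos, best_fit, evaluations = None, float("inf"), 0

    for _ in range(iterations):
        scores = [fitness(fly) for fly in flies]
        evaluations += n_flies
        sb = min(range(n_flies), key=scores.__getitem__)   # swarm's best fly
        if scores[sb] < best_fit:
            best_pos, best_fit = flies[sb][:], scores[sb]
        if best_fit <= target:                             # assumed stopping rule
            break

        new_flies = []
        for i, fly in enumerate(flies):
            if i == sb:                                    # elitism: keep the best fly
                new_flies.append(fly[:])
                continue
            left, right = (i - 1) % n_flies, (i + 1) % n_flies
            nb = left if scores[left] < scores[right] else right   # best ring neighbour
            updated = []
            for d in range(dim):
                if rng.random() < dt:
                    x = rng.uniform(lo, hi)                # disturbance: restart this dimension
                else:
                    x = flies[nb][d] + rng.random() * (flies[sb][d] - fly[d])
                    x = min(max(x, lo), hi)                # clamp to the search bounds
                updated.append(x)
            new_flies.append(updated)
        flies = new_flies

    return best_pos, best_fit, evaluations
```

Calling `dfo(some_fitness, dim=10)` with a fitness that decodes a 10-dimensional position into two subsets and measures the distance from 36 and 360 would mirror the setup of this post. Being stochastic, the number of evaluations it needs will vary from run to run.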

My curiosity about permutations, and my concern for how efficient the algorithm is (measured in function evaluations, or FEs) compared with a brute-force approach of simply iterating through candidate solution vectors, led me to solve the problem with DFO. If brute force enumerates every ordering of the 10 cards, the worst case (finding the answer at the very end) is approx. 10! = 3,628,800 FEs. I solved it in 50 FEs, which is 72,576 times fewer than that worst case!
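The comparison is simple arithmetic and can be checked in a few lines. The second figure is an addition for context and is not part of the original comparison: if brute force enumerated only the ways of assigning each card to one of the two subsets, rather than every ordering of the cards, its worst case would be 2^10 = 1,024 evaluations, and the gap would be much smaller.

```python
from math import factorial

orderings = factorial(10)     # 3,628,800 orderings of the ten cards
splits = 2 ** 10              # 1,024 ways to assign each card to one of two subsets
dfo_evaluations = 50          # figure reported in this post

print(orderings // dfo_evaluations)   # 72576
print(splits / dfo_evaluations)       # 20.48
```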

See my code on GitHub.

*Thanks for reading! This is the sixth part in a series of blogs for the Natural Computing module at Goldsmiths, University of London.*