Major Project Chapter 5 – A Vector Space Comparison

This week I’ve been working on using Doc2Vec with CritiqueBrainz reviews to try to get a good measure of semantic similarity, from which to serve recommendations for a given track. So far it seems to be working, sort of. For a good introduction to the workings of Doc2Vec, see this post.

The Process

I’ve been working with the CritiqueBrainz JSON dumps from late 2016. This is a large corpus of reviews on which I can train my Doc2Vec model, then compute the cosine similarity between two review vectors and thus between two releases / pieces of music.
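To make the comparison step concrete, here is a minimal sketch of cosine similarity in plain Python. It is not the project code: the three short vectors are hypothetical stand-ins for the much longer vectors a trained Doc2Vec model would infer for each review, and returning 0.0 for a zero vector is a convention chosen here for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional "review vectors"; real ones have many more dimensions.
review_a = [1.0, 2.0, 0.0]
review_b = [2.0, 4.0, 0.0]  # same direction as review_a
review_c = [0.0, 0.0, 3.0]  # orthogonal to review_a

sim_ab = cosine_similarity(review_a, review_b)
sim_ac = cosine_similarity(review_a, review_c)
```

Because cosine similarity only looks at direction, review_a and review_b score as near-identical even though one vector is twice as long. That is useful here, since a long review and a short review of similar content should still end up close together.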

Improving it

Mick has suggested using text summarisation to strip superfluous information from each review before training, which should give a more focused, and possibly more successful/efficient, vector for each review.
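As a rough idea of what such a preprocessing step could look like, below is a naive frequency-based extractive summariser. This is an assumption on my part, not the specific technique Mick had in mind: the function name, the tiny stopword list and the scoring rule are all made up for illustration. It keeps the sentences whose non-stopword terms are most frequent across the whole review.

```python
import re
from collections import Counter

# Deliberately tiny, hypothetical stopword list; a real one would be much larger.
STOPWORDS = {"the", "is", "i", "this", "on", "a", "and", "of", "to", "it", "in", "was"}


def _tokens(text):
    """Lowercased word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarise(text, max_sentences=2):
    """Keep the max_sentences highest-scoring sentences, in their original order."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(_tokens(text))

    def score(sentence):
        tokens = _tokens(sentence)
        if not tokens:
            return 0.0
        # Average term frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in tokens) / len(tokens)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


review = (
    "The guitar tone is warm. "
    "I bought this album on a Tuesday. "
    "The guitar solos carry the album."
)
summary = summarise(review, max_sentences=1)
```

The summarised text would then be fed to Doc2Vec in place of the full review. Whether that actually improves the recommendations is something to test rather than assume.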


