There are lots of music recommenders out there, each one vying for
our attention, trying to connect us to our next favorite band. But not all
recommenders are created equal. Some have the insight and good taste of a
Bill Goldsmith or a John Peel, while others seem to know as much
about what I might like as an iPod on shuffle play. But which
recommenders are the best, and which are the worst? And how do these music
recommendations compare to what a professional would generate? Can machines
compete with humans in what is clearly a question of taste? A few months
ago, I decided to try to answer these questions - to find out which music
recommender is the best, and whether machines can compete with humans
when it comes to recommending music. Over the next few days, I shall be
blogging what I've discovered.

The tough question

The hardest question to answer is 'How do you evaluate a recommendation?'.
A typical recommendation starts with a seed artist, so for instance
a recommendation may be of the form "If you like The Beatles you may
like..." followed by a list of recommended artists. Now, when I look at a
music recommendation based on an artist that I really know, I can get
a feel for the quality of recommendation. In the list of recommended
artists, I expect to see some artists that I already am familiar with -
probably bands that I like. Even though these recommendations may
be obvious, they help me gain some trust in the recommender. If I
ask for a recommendation based on my affinity for Miles Davis and
a recommender suggests John Coltrane, I get a warm feeling that the
recommender is on the right track. It is a relevant (albeit somewhat
obvious) recommendation. I also expect to see some artists that I've
heard of but am not too familiar with, and I also expect to see some
artists that I've never heard of. So for me a good recommendation
contains a mix of the familiar (to help me gain trust), and the novel
(to help me find new music). Now of course, if the recommended artists
are off track ('if you like Miles Davis you might like Paris Hilton'),
no amount of familiarity or novelty is going to turn the recommendation
into a good recommendation. The recommendations need to be relevant.
To summarize, a good recommendation has three aspects:

  • familiarity - to help us gain trust in the recommender
  • novelty - without new music, the recommendation is pointless
  • relevance - the recommended music has to match my taste
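
To make these three aspects a bit more concrete, here's a small Python
sketch - purely illustrative, with a made-up listener library and
recommendation list - of how one might split a recommendation list into
the familiar and the novel for a given listener. Relevance is deliberately
left out here: it needs some notion of similarity or taste that a simple
lookup can't provide.

    # Purely illustrative sketch: partition a recommendation list into
    # 'familiar' and 'novel' artists for one listener. The listener's
    # library and the recommendations below are made-up examples.
    listener_library = {"The Beatles", "The Rolling Stones", "The Who",
                        "John Lennon", "Queen"}
    recommendations = ["The Rolling Stones", "John Lennon", "Badfinger",
                       "The Raspberries", "The Millennium"]

    familiar = [a for a in recommendations if a in listener_library]
    novel = [a for a in recommendations if a not in listener_library]

    print("familiar:", familiar)  # builds trust in the recommender
    print("novel:", novel)        # where the discovery happens
    # Relevance is the hard part: it needs a notion of similarity or
    # taste matching that a simple set lookup can't capture.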

Let's look at a few sets of recommendations as examples. Here's a set of
similar artist recommendations for The Beatles from last.fm. According to
last.fm, if you like the Beatles you might like:

  • The Rolling Stones
  • The Who
  • John Lennon
  • Led Zeppelin
  • Queen
  • Beach Boys
  • Doors
  • David Bowie
  • Kinks

Well, last.fm certainly gets the 'familiar' and the 'relevant', but
there's nothing novel here. Any fan of the Beatles has no doubt heard of
all of these bands; no great discoveries will be found in this selection.

Compare those to this set from Dominique, a professional music critic.
According to Dominique, if you like the Beatles you might like:

  • Paul McCartney/Wings
  • John Lennon
  • Harry Nilsson
  • Queen
  • George Harrison
  • ELO
  • Raspberries
  • Badfinger
  • XTC
  • The Millennium

This list has the familiar and relevant McCartney, Lennon, Harrison
and Queen, as well as some artists that a typical Beatles fan may not
have heard of: The Raspberries and The Millennium. Dominique seems to have
restricted his recommendations to bands from the late sixties and early
seventies. Not an unreasonable choice, but it does perhaps reduce the
novelty aspect of the recommendations.

Compare those to this set from Chris, another professional music critic.
According to Chris, if you like the Beatles you might like:

  • Chuck Berry
  • Harry Nilsson
  • XTC
  • Marshall Crenshaw
  • Super Furry Animals
  • Badfinger
  • The Raspberries
  • The Flaming Lips
  • Jason Falkner
  • Michael Penn

So there's the familiar with Chuck Berry and Harry Nilsson, but I have to
pause and think for a bit about relevance - there's no Rolling Stones or
The Who to give me a warm fuzzy feeling. Chris gets to the novel artists
right away - high on the novelty scale but not so high on the familiarity
scale (at least for me).

And finally, here's a set of similar artist recommendations from the All
Music Guide. According to All Music, if you like the Beatles you might like:

  • Hollies
  • Searchers
  • Peter and Gordon
  • Monkees
  • Gerry and the Pacemakers
  • Bee Gees
  • Zombies
  • Dave Clark Five
  • Remains
  • Sorrows

This list seems fairly balanced. There's the familiar (the Monkees, the Bee
Gees, the Dave Clark Five, and the Hollies), they've avoided the cliches (no
Rolling Stones, Queen or the Who), and there are some bands that I'm not too
familiar with, including a band called 'The Sorrows', described as "one of
the most overlooked bands of the British Invasion" - which certainly sounds
like something that'd be fun to listen to if I like the Beatles.

With these four examples, I've tried to show the three elements
that I look for in a good music recommendation: familiarity, relevance
and novelty. The difficulty is, of course, that no two people will agree
exactly on what is familiar, relevant or novel. What I find novel may be
familiar to a professional music critic, while a musician may find musical
relevance where I don't. The difficult job for a recommender, whether
it is human or machine, is to find the proper level of familiarity,
novelty and relevance for each person.

Finally, if we really want to evaluate a number of recommender systems,
to compare the quality of recommendations, we have to figure out a good
way to turn the very subjective measures that we've looked at here into
a set of objective measures that we can use to score recommendations.
Part 2 will look at some of the objective measures we will use to evaluate
the various music recommenders.
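
As a preview, here's one very rough sketch of how those subjective aspects
could be turned into numbers. This is my own illustration, not the scoring
method used later in this series: it just measures the fraction of
recommended artists the listener already knows (familiarity), the fraction
they don't (novelty), and the fraction that fall into some independently
obtained set of 'relevant' artists.

    # Rough, hypothetical scoring sketch - not the evaluation used in
    # Part 2, just one way subjective notions could become numbers.
    def score_recommendation(recs, known_artists, relevant_artists):
        """Return (familiarity, novelty, relevance) as fractions in [0, 1]."""
        if not recs:
            return 0.0, 0.0, 0.0
        known = sum(1 for a in recs if a in known_artists)
        relevant = sum(1 for a in recs if a in relevant_artists)
        familiarity = known / len(recs)
        novelty = 1.0 - familiarity
        relevance = relevant / len(recs)
        return familiarity, novelty, relevance

    # Example with made-up data; 'relevant' stands in for whatever ground
    # truth (say, expert-compiled lists) a real evaluation would use.
    recs = ["The Rolling Stones", "Badfinger", "The Millennium"]
    known = {"The Rolling Stones", "The Who", "Queen"}
    relevant = {"The Rolling Stones", "Badfinger", "The Raspberries"}
    print(score_recommendation(recs, known, relevant))  # (0.33, 0.67, 0.67)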

Comments:

I have to wonder about the wisdom of using the Beatles as the canonical example for the sanity checks. They present the same sort of problem as Miles Davis in that over the course of their career, they made definitive statements in so many sub-genres that it's hard to say what is a *bad* recommendation.

Posted by Adam on October 05, 2007 at 04:57 AM EDT #

@adam: fair enough criticism - I chose the Beatles for this example since everyone knows the Beatles - and common recommendations given for the Beatles such as the Rolling Stones or the Who are good examples of recommendations that fall into the 'good but not useful' category.

Posted by Paul on October 05, 2007 at 05:34 AM EDT #

Hey Paul,
Thanks for throwing AMG into the mix.

I would like to propose a new term to replace "Novelty" in our general lexicon. When I hear the term "Novelty" in regards to music, I automatically think of Spike Jones, Alvin & the Chipmunks and Weird Al.

Is there another term that has the same meaning that doesn't already have a music-related connotation? Freshness? Newness? Unfamiliarity? Discovery? Introduciness???

Posted by Zac on October 05, 2007 at 09:52 AM EDT #

Thanks Paul, it's a very interesting split of the aspects that should make a good recommendation list, or at least, a blind list (no explanation given to justify any of the choices). However I see two objections to this split.

First, each of the aspects must be weighted toward the use of the recommendation. If the sole use is uninterested recommendations of artists that you would like given you like seed A, it's a good separation, but there are more usages to "recommendations" than the list themselves, for human consumers...

How you define "relevance", especially, can affect the other two aspects. Some users might want the obvious results, because they don't want the list for their own discovery, but for a broader, more conservative goal, where others could only want to discover new things and have no interest in the common ground.

You also use "familiarity" to establish a trust in the "system", but these "no new information" recommendations are wasted slots once a confidence in the system is established.

I think the three aspects you've found are very good, but balance between them is essential.

Also, I know the inevitability of metrics, but creating one is almost an invitation to game it, SEO-style. How about an automated system that would game the three aspects you describe? 4-5 common recommendations (familiarity), 2-3 obscure/original choices (novelty), and some in-between, filtered by what you already know/listen to (personal relevance). Sounds a bit cynical, like the top 5 lists of High Fidelity, but perhaps, in the end, not totally bad...

Also, I think "personal relevance" would be a better expression than "relevance" alone, since this word could be used to qualify the whole quality of the recommendation.

Posted by Marc-O on October 05, 2007 at 02:34 PM EDT #

If you are going to use last.fm as a music recommendation, what about using pandora.com as well?

The challenge with using the Beatles as a test group is that their style of music varies. If you like one song, it may match these bands, but another song may be more like these bands over here.

Posted by Sean on October 19, 2007 at 07:25 PM EDT #
