Hit Song Science - not yet a science
Last year I posted about my skepticism of the "hit predictors" that claim to use machine learning algorithms to predict which songs will be hits. So when the proceedings for ISMIR 2008 went online, the first paper I downloaded was "Hit Song Science is Not Yet a Science" by Francois Pachet and Pierre Roy. In this paper, Pachet and Roy set out to validate the hypothesis that the popularity of music titles can be predicted from acoustic or human features. It is an interesting paper that concludes:
... the popularity of a song cannot be learnt by using state-of-the-art machine learning techniques. This large-scale evaluation, using the best machine-learning techniques available to our knowledge, contradicts the claims of "Hit Song Science", i.e. that the popularity of a music title can be learned effectively from known features of music titles, either acoustic or human.
The authors do not close the door on 'hit song science'. They limit their conclusion to saying that hits cannot be predicted with the current state-of-the-art features, but that with better features it may be possible. They conclude: "Hit song science is not yet a science, but a wide open field."
If I had to analyze one million new songs a year, or 3,680,000 minutes of audio, I'd take the semi-accurate, still-to-be-polished hit song science tool over the human solution any day.
Posted by Bruce Warila on August 31, 2008 at 10:29 PM EDT #
Reading this, I had to chuckle. I can still write a very short program that predicts with extremely high accuracy whether or not a song will be a hit:
public class HitPredictor {
    public static void main(String[] args) {
        // Song and isActualSong() are hypothetical helpers, not a real API
        Song s = new Song(args[0]);
        if (s.isActualSong()) {
            System.out.println("Not a hit");
        }
    }
}
I'll bet my accuracy is well above 99%. Not quite 100%, but close enough.
;-)
Posted by jeremy on September 01, 2008 at 12:15 AM EDT #
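The joke above works because hits are rare: a predictor that always answers "not a hit" is right almost every time, yet never finds a single hit. A minimal sketch of that arithmetic follows. The class and method names are made up for illustration, the one million songs figure is borrowed from the first comment, and the 0.1% hit rate is purely an assumption, not a number from the paper.

```java
public class MajorityBaseline {
    /** Fraction of songs labeled correctly when every song is predicted "not a hit". */
    public static double accuracy(long totalSongs, long actualHits) {
        // Every non-hit is labeled correctly; every hit is labeled wrongly.
        return (double) (totalSongs - actualHits) / totalSongs;
    }

    /** Fraction of actual hits the baseline identifies. */
    public static double recall(long actualHits) {
        long hitsFound = 0; // the baseline never outputs "hit"
        return actualHits == 0 ? 0.0 : (double) hitsFound / actualHits;
    }

    public static void main(String[] args) {
        long totalSongs = 1_000_000; // figure borrowed from the first comment
        long actualHits = 1_000;     // assumed 0.1% hit rate, illustrative only
        System.out.printf("accuracy = %.4f%n", accuracy(totalSongs, actualHits));
        System.out.printf("recall   = %.4f%n", recall(actualHits));
    }
}
```

Under these assumed numbers the baseline is 99.9% accurate while its recall is zero, which is why raw accuracy alone says little about whether a hit predictor is useful.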
Bruce:
The point of the paper is that these systems are not 'semi-accurate' at all: popularity simply cannot be derived from current state-of-the-art features. If someone tells you they have an algorithm that can predict a hit directly from the audio, they are selling you snake oil.
Posted by Paul on September 01, 2008 at 06:50 AM EDT #
There might be some features derivable directly from the audio signal that correlate with at least some hit-song characteristics (of course, we probably don't know them yet). But things like the hype around an artist, when the song was released, the budget the record company put into marketing it, and whether the song was recorded by a new casting-show band are not encoded in the audio signal.
Posted by Daniel on September 01, 2008 at 05:23 PM EDT #