Over the last few months, we've been working hard to improve our algorithms for determining the acoustic and perceptual similarity of music. One aspect of this task is choosing a good set of features to extract from the music. We've run a number of experiments with various feature sets to help us understand which ones work best for which tasks.

One major difficulty with these types of experiments is that feature extraction can be extremely CPU intensive. Lots of time-consuming DSP algorithms (MP3 decoding, FFTs, filtering, windowing, convolutions, DCTs) are used during feature extraction. A fast feature extractor can run at 0.1x realtime; that is, it can process ten seconds of audio in one second. That seems pretty fast, but it still takes nearly three days to process our modest-sized test collection of 10,000 songs. And we don't want to extract just one feature set; we want to extract and experiment with twenty different kinds of feature sets. Plus, sooner or later we are going to want to scale this up to industrial-sized music collections. Extracting a single feature set for 2,000,000 songs could take 18 months of continuous processing.
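For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. The average song length of four minutes is my assumption for illustration (the real collection will vary), and the function names are made up for this example.

```python
# Back-of-the-envelope estimate of serial feature-extraction time.
# ASSUMPTION: average song length of 4 minutes (illustrative, not measured).
AVG_SONG_SECONDS = 4 * 60
REALTIME_FACTOR = 0.1  # 0.1x realtime: 10 s of audio per 1 s of CPU time


def extraction_seconds(num_songs,
                       avg_song_seconds=AVG_SONG_SECONDS,
                       realtime_factor=REALTIME_FACTOR):
    """CPU seconds needed to extract one feature set from num_songs songs."""
    return num_songs * avg_song_seconds * realtime_factor


def to_days(seconds):
    """Convert seconds to days of continuous processing."""
    return seconds / 86400


test_days = to_days(extraction_seconds(10_000))
big_days = to_days(extraction_seconds(2_000_000))

print(f"10,000 songs:    {test_days:.1f} days")
print(f"2,000,000 songs: {big_days:.0f} days (~{big_days / 30.4:.0f} months)")
```

Under that assumption the test collection comes out a bit under three days and the two-million-song collection at roughly a year and a half, which matches the figures above.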

Luckily, I work for Sun, a company that has some pretty good computing resources. Sun is rolling out its Sun Grid, the $1/cpu-hr compute utility. With the grid, I should be able to take my feature extractor and distribute it over a collection of hundreds of CPUs to yield a 100x speedup. My three-day feature extraction will run in about 45 minutes. The 18 months of processing for 2,000,000 songs will take less than a week. This has the potential to really change how I work. I'll be able to try all sorts of experiments that would just take too long otherwise.
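Here is a rough sketch of where those grid numbers come from. It assumes ideal linear scaling across 100 CPUs with no scheduling or data-transfer overhead, which a real grid run is unlikely to achieve exactly, and the `grid_estimate` helper is hypothetical, not part of any Sun Grid API.

```python
# Rough estimate of wall-clock time and cost on a compute grid.
# ASSUMPTIONS: perfect linear speedup, no queueing or data-transfer overhead.
CPUS = 100                 # effective parallelism (the 100x figure)
PRICE_PER_CPU_HOUR = 1.00  # dollars, the quoted $1/cpu-hr rate


def grid_estimate(serial_hours, cpus=CPUS, price=PRICE_PER_CPU_HOUR):
    """Return (wall_clock_hours, cost_in_dollars) for a job of serial_hours."""
    wall_hours = serial_hours / cpus
    cost = serial_hours * price  # total CPU-hours are unchanged by splitting
    return wall_hours, cost


# The three-day job on the 10,000-song test collection.
small_wall, small_cost = grid_estimate(3 * 24)
print(f"3-day job:    {small_wall * 60:.0f} minutes, about ${small_cost:,.0f}")

# The 18-month job on 2,000,000 songs (taking a month as 30.4 days).
big_wall, big_cost = grid_estimate(18 * 30.4 * 24)
print(f"18-month job: {big_wall / 24:.1f} days, about ${big_cost:,.0f}")
```

Since the total number of CPU-hours stays the same however the work is split, the bill depends only on the size of the job, not on how many CPUs it is spread across.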

I'm really excited about this. With the grid, I'll be able to do all sorts of things that would otherwise be nearly impossible.

This blog copyright 2010 by plamere