If you have a large music collection, you probably know all about messy metadata. Misspelled artist, song and album names are common. Missing or incomplete data is par for the course. Inconsistent numbering and formatting, improper internationalization, duplicates, partial albums, multiple encodings, compilations - all make this a very sticky problem. Now, imagine you are a music researcher with 100,000 tracks, or even imagine you are Last.fm with millions of tracks, all with messy metadata. The messy metadata gets in the way all of the time - it ruins recommendations and playlists, it confuses, and it just makes you look clueless when you can't tell that "ELP", "Emerson, Lake and Palmer", "Emerson, Lake & Palmer" and "EL&P" are all the same band.

Last.fm has decided to take on this problem and solve it, not just for themselves but for the world. They are distributing an audio fingerprinter that will collect data on common misspellings for tracks, artists and albums. Soon they will be offering web services that you can use to clean up the data. This will be a boon to all of mankind. RJ describes the project on the Last.fm blog: Audio Fingerprinting for Clean Metadata
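
To make the idea concrete, here is a minimal sketch of how spellings collected per fingerprint could be turned into a canonical name. It is an illustration only, not Last.fm's actual method: the fingerprint IDs, the `canonical_names` function and the simple majority-vote rule are all assumptions made up for this example.

```python
from collections import Counter, defaultdict


def canonical_names(submissions):
    """Map each (hypothetical) fingerprint ID to its most frequently submitted artist string."""
    by_fingerprint = defaultdict(Counter)
    for fingerprint_id, artist in submissions:
        # Count every spelling that listeners submitted for the same audio.
        by_fingerprint[fingerprint_id][artist.strip()] += 1
    # Assumption: the most common spelling is treated as the correct one.
    return {
        fingerprint_id: counts.most_common(1)[0][0]
        for fingerprint_id, counts in by_fingerprint.items()
    }


# Made-up submissions: the same two recordings tagged in different ways.
submissions = [
    ("fp-001", "Emerson, Lake & Palmer"),
    ("fp-001", "ELP"),
    ("fp-001", "Emerson, Lake & Palmer"),
    ("fp-001", "Emerson, Lake and Palmer"),
    ("fp-002", "EL&P"),
    ("fp-002", "Emerson, Lake & Palmer"),
    ("fp-002", "Emerson, Lake & Palmer"),
]

canonical = canonical_names(submissions)
```

Because the audio itself is the key, variants such as "ELP" and "EL&P" end up in the same bucket even though no string comparison would ever match them. A real service would presumably need something more robust than a plain majority vote.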

I really hope this will tie in with the MusicBrainz database.

Update: Elias points to this post on the MusicBrainz forum where Russ (of Last.fm) clears things up:

We have no intention of dropping MusicBrainz (especially since it's 
taken so long to get the license in the first place!). MB is a lot more
than just the fingerprinting: we think MB is a great source of metadata
- and the more metadata sources we have, the better. Our fingerprinting
services will definitely return MBIDs in the future.

That's just super! Now hopefully 'the future' is 'real soon'.

Comments:

And FINALLY, I'd say! ;)

Posted by Norman on August 29, 2007 at 05:40 PM EDT #

"I really hope this will tie in with the MusicBrainz database."

Indeed. It seems to me they're reinventing the wheel.

Posted by mll on August 30, 2007 at 04:30 AM EDT #

mll, you might have already read the comment Russ wrote about how MusicBrainz and Last.fm will fit together.

http://lists.musicbrainz.org/pipermail/musicbrainz-users/2007-August/016640.html

He writes "Our fingerprinting services will definitely return MBIDs in the future."

Posted by elias on August 30, 2007 at 02:26 PM EDT #

Will the fingerprint data be used for audio based similarity mapping I wonder? Elias? That could make for a nice little data set.

Posted by Ian on August 30, 2007 at 10:28 PM EDT #

Ian, fingerprints can be used to map audio contents to any type of information. For Last.fm it's definitely a big step towards anything related to audio analysis on the client side as it's the first time Last.fm touches the files directly (instead of only using information supplied by the audio players). However, (I'm not sure if this is part of the question) the fingerprints themselves cannot be used to compute similarity (two almost identical songs might have two completely different fingerprints).

Posted by elias on August 31, 2007 at 07:14 AM EDT #


This blog copyright 2010 by plamere