Saturday May 21, 2005

Russell Beattie writes about his experience with Yahoo Music Unlimited. Russ really liked the interface and the ease of finding music, including the similar-artists feature. He certainly doesn't like having to use a non-iPod music player (a Creative Zen Micro), which he called a "user-hostile frustrating piece of junk". Russell's most telling comment is "Say what you will about me not owning this music, but I'll tell you right now it sure *feels* like I own it."

Friday May 20, 2005

There are a number of aspects to music similarity. Melody, instrumentation, tempo, rhythm, spectral shape and acoustic density all factor into our concept of similarity. One aspect of similarity that is very hard to extract directly from the audio is the lyrical content. Lyrics can play a large role in similarity, but for the near future at least, any system that uses lyrics to determine similarity will have to get the lyrics from hand-edited databases. It is just too hard for a speech recognizer to recognize lyrics (recognizers have enough of a problem with clean speech in a noise-free environment). Song lyrics are hard enough for people to understand. The Misheard Lyrics Hall of Fame documents some of the more frequently misheard lyrics. Some of my favorites:

Wrong lyric: The ants are my friend, they're blowin' in the wind
Right lyric: The answer, my friend, is blowin' in the wind

Wrong lyric: I'll never leave your pizza burning
Right lyric: I'll never be your beast of burden

There are lots more in the Misheard Lyrics Hall of Fame.

Thursday May 19, 2005

Reader Geoffrey Peters sent me a link to his recent project, Song Search by Tapping. This is a Java applet that allows you to tap out the rhythm of a song on your keyboard; the applet then tries to identify the song based solely upon its rhythm. It's a neat little applet that worked well for me. It was even able to distinguish between "London Bridge Is Falling Down" and "Mary Had a Little Lamb", which are practically identical. Right now the song database that is searched is very small (only 30 tunes), but Geoffrey says that they are interested in seeing how their system will scale up to much larger databases. There's a paper, Song Search and Retrieval by Tapping, that describes how the system works.
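
For the curious, here is a rough sketch of how a rhythm-only match might work. This is an illustration under my own assumptions, not the method used by Geoffrey's applet or described in his paper. The class name TapMatcher and both method names are made up for this example. The idea: turn tap timestamps into inter-onset intervals, scale them so they sum to 1 (which makes the comparison independent of tapping speed), and compare two rhythms by summing the absolute differences.

```java
public class TapMatcher {

    // Convert tap timestamps (in milliseconds) into inter-onset intervals,
    // scaled so that they sum to 1. Tapping the same rhythm faster or
    // slower then gives the same vector.
    public static double[] normalizedIntervals(long[] tapTimesMs) {
        if (tapTimesMs.length < 2) {
            throw new IllegalArgumentException("need at least two taps");
        }
        double total = tapTimesMs[tapTimesMs.length - 1] - tapTimesMs[0];
        if (total <= 0) {
            throw new IllegalArgumentException("taps must span a positive duration");
        }
        double[] intervals = new double[tapTimesMs.length - 1];
        for (int i = 0; i < intervals.length; i++) {
            intervals[i] = (tapTimesMs[i + 1] - tapTimesMs[i]) / total;
        }
        return intervals;
    }

    // Sum of absolute differences between two normalized rhythms.
    // Assumption for this sketch: rhythms with a different number of
    // intervals are treated as infinitely far apart.
    public static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            return Double.POSITIVE_INFINITY;
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] reference = {0, 500, 1000, 2000};   // short, short, long
        long[] faster    = {0, 250, 500, 1000};    // same rhythm, twice as fast
        long[] different = {0, 1000, 1500, 2000};  // long, short, short

        double[] ref = normalizedIntervals(reference);
        System.out.println(distance(ref, normalizedIntervals(faster)));
        System.out.println(distance(ref, normalizedIntervals(different)));
    }
}
```

A real system would also need to cope with extra or missing taps and with sloppy timing, which this sketch ignores entirely.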

I've tried a number of music content query systems, including systems that use query by humming and query by Parsons code. The query-by-humming systems are often troublesome because getting the microphone configuration just right is hard. Parsons codes work well but are not very intuitive and can be difficult for the musically untrained to generate. I find the query-by-tapping interface to be the easiest way to query for music content.
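
If you haven't run into Parsons code before: it describes a melody only by its contour. The first note is written as "*", and every following note is U (up), D (down) or R (repeat) relative to the note before it. Here is a minimal sketch that derives the code from a list of pitches. The class and method names are hypothetical, and representing pitches as MIDI note numbers is my own assumption for the example.

```java
public class ParsonsCode {

    // Build the Parsons code for a melody given as MIDI note numbers.
    // The first note is "*"; each later note is U, D or R depending on
    // whether it is higher than, lower than, or the same as the previous one.
    public static String encode(int[] midiPitches) {
        if (midiPitches.length == 0) {
            return "";
        }
        StringBuilder code = new StringBuilder("*");
        for (int i = 1; i < midiPitches.length; i++) {
            int diff = midiPitches[i] - midiPitches[i - 1];
            if (diff > 0) {
                code.append('U');
            } else if (diff < 0) {
                code.append('D');
            } else {
                code.append('R');
            }
        }
        return code.toString();
    }

    public static void main(String[] args) {
        // Opening of "Twinkle, Twinkle, Little Star": C C G G A A G
        int[] twinkle = {60, 60, 67, 67, 69, 69, 67};
        System.out.println(encode(twinkle)); // prints *RURURD
    }
}
```

Going from pitches to a code is the easy direction. The hard part for a musically untrained user is producing the U/D/R string by ear in the first place.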

There are some other folks who are looking at query-by-tapping systems. The BeatBank system described in this paper can identify the correct song about two-thirds of the time with a small (56-song) database. The Super MBox system shows a 15% accuracy with an 11,000-song database. I'm looking forward to the day when I can use a query-by-tapping system to find that song I heard on the radio on my way to work, but given the current state of the art there's quite a ways to go before that happens.

Wednesday May 18, 2005

We've all heard The Story about Ping ... but how about the story about Pong? Now, with all of the latest buzz around the PS3, Xbox 360 and Nintendo Revolution, it's good to look back at the roots of home video games. Today's Nashua Telegraph has a good article about Ralph Baer, the inventor of the video game. There's lots more about early video game history too.

Monday May 16, 2005

We've all heard of Extreme Sports and Extreme Programming and even Extreme Ironing, but I'm guessing that most people haven't heard of Extreme Music Notation. Music Information Retrieval researcher Don Byrd maintains the Extremes of Conventional Music Notation webpage, where he records the extremes found in written music. Some interesting excerpted tidbits:
  • softest pppppppp (8 p's) in Ligeti's Etudes for Piano, 1st Book
  • loudest ffffffff (8 f's) in Ligeti's Etudes for Piano, 2nd Book (the 1812 Overture only reaches ffff)
  • Instruments to be played by one performer in a piece - Mahler: Symphony no. 5 calls for one clarinetist playing six different instruments.
  • Most repeated notes in a melody - 32 in Prokofieff: Toccata, Op. 11 (1912)

There are many other interesting entries.

Friday May 13, 2005

Today I used Google to search for "music information retrieval". The usual and expected links came up. Interestingly, however, the first 'sponsored link' was Work at Google. Perhaps Google is finally starting to think about MIR.

Music Information Retrieval is a fairly new area of research. In fact, this year will be only the second year in which a community-wide algorithm and systems evaluation contest is held. The Music Information Retrieval Evaluation eXchange (MIREX) will take place during the 6th ISMIR Conference in London, UK, September 11-15, 2005. The goal of this contest is to compare state-of-the-art algorithms and systems relevant to Music Information Retrieval.

Nine evaluation tasks are planned:

  • Audio Artist Identification
  • Audio Drum Detection
  • Audio Genre Classification
  • Audio Melody Extraction
  • Audio Onset Detection
  • Audio Tempo Extraction
  • Audio and Symbolic Key Finding
  • Symbolic Genre Classification
  • Symbolic Melodic Similarity

I find the Artist and the Genre classification tasks to be particularly interesting.

This year, MIREX will be using the new M2K (Music to Knowledge) toolset as the framework for the evaluation. The deadlines for MIREX are:

  • Participation Statement: June 12, 2005
  • Algorithms and 1-page Abstracts: June 26, 2005
  • Extended Abstracts: September 2, 2005

Thursday May 12, 2005

Here's another way to have fun on your computer. JFugue is a Java API for music programming. It has a rich set of classes for creating phrases, chords, transforms and so on. Here's a bit of JFugue code that plays a scale:

		import org.jfugue.*;

		public class MyMusicApp {

			public static void main(String[] args) {
				// Create a player and a pattern holding the notes of the scale
				Player player = new Player();
				Pattern pattern = new Pattern("C D E F G A B");
				// Play the pattern
				player.play(pattern);
			}
		}

JFugue includes lots of documentation and is released under a Creative Commons license.

Wednesday May 11, 2005

Yahoo just announced its own music subscription to go: Yahoo Music Unlimited. Here's the good:

  • $4.99 per month for access to one million songs ($10 cheaper than Napster-to-go)
  • Downloads for $0.79
  • Load songs into portable devices (just like Napster-to-go)
  • Supports XSPF: XML Shareable Playlist Format

Here's the bad:

  • $4.99 is not going to last for long. Expect the price to rise after you are locked in.
  • DRM won't let you get at the bits
  • Can't load subscription downloads onto the iPod
  • Only runs on Windows - no Mac, Linux or Solaris support.

prefuse is an open-source, Java-based interface toolkit for building interactive visualizations of data. Using this toolkit, developers can create responsive, animated graphical interfaces for visualizing, exploring, and manipulating various forms of data. Here's a cool demo showing a force-directed layout of a social network. According to the paper "prefuse: a toolkit for interactive information visualization", prefuse is designed to work with very large datasets. One example in the paper demonstrates using prefuse for browsing through a 600,000-node web directory.

This package looks like it will be very useful for visualizing artist and music similarity networks. Some folks over at Berkeley are doing just this with a project called Orpheus. Orpheus is an exploration and visualization tool for discovering the universe of new and independent recording artists. Check out the demo of Orpheus. (Thanks to Brooke Maury for the prefuse tip.)

Tuesday May 10, 2005

ScanSoft and Nuance announced yesterday that they plan to merge. This continues ScanSoft's acquisition of just about every small speech engine business (L&H, SpeechWorks, Rhetorical). The merged company will be called Nuance.

Monday May 09, 2005

Looks like Napster is getting into the ringtone business. It's a funny business where you charge $0.99 for a high-quality, full-length song that you can listen to anytime on your PC or iPod, and at the same time sell a 30-second instrumental rendition of the same song for $1.99. Oh yeah, you can also purchase an AudioTone (which is the actual song audio, degraded for your phone) for $2.99. So why can't I just make my own AudioTone from one of the songs in my MP3 collection? Crazy business model.

I've just updated the Tools We Use page hosted at the music-ir website. The "Tools We Use" page describes the various tools used by researchers in the Music Information Retrieval community. The current version of the page contains descriptions of around 75 tools used by MIR researchers. If you know of a good MIR tool that is not listed on this page, please let me know. Email me at "paul DOT lamere AT sun DOT com". Thanks to the many MIR-men and MIR-maids who have contributed.

There has been much talk about Motorola's deal with Apple to produce an iPod-style phone (called the H1) that can play songs downloaded from iTunes. This device was announced quite some time ago, but we've yet to see one. Now Nokia is entering the market of music-enabled phones. They've announced the N91, a multimedia phone with a 4 GB hard disk (roughly the size of an iPod mini).

Right now, it looks like the only way to get music onto the N91 is either by hooking it up to your computer (via USB) and copying it there, or by recording FM. It doesn't look like there's any support for hooking it up to iTunes or one of the subscription services like Napster-to-go. I think the music market will get really interesting when people can start to browse and buy music from their phone/MP3 player. It will be here very soon!

Friday May 06, 2005

Each scientific discipline has its tough problems. Physicists search for the grand unified theory. Mathematicians try to prove Fermat's Last Theorem or solve the Poincaré conjecture. Neuroscientists struggle to explain consciousness.

Music Information Retrieval researchers have been similarly vexed by a single tough problem. Like Fermat's last theorem, this problem is simply stated, and many have tried to solve it, yet its answer remains elusive. For many MIR researchers, this problem is what brought them to the field of MIR, while for some, the intractable nature of the problem ultimately drives them away.

In 1965 the US Government tasked several federal agencies with trying to solve this problem. Some promising progress was made, but ultimately these efforts ground to a halt and the programs were abandoned.

For the 40 years or so since then, a series of solutions has been proposed by successive generations of researchers, musicologists and performers. I am hoping to make my mark in the MIR field by helping to find the ultimate solution to this problem. I think we are getting close, and soon we may all be able to celebrate a great advance when MIR researchers can finally answer the question "What are the lyrics to 'Louie Louie'?"


This blog copyright 2010 by plamere