Friday Nov 21, 2008

The JavaOne call for papers is now open. Speaking at JavaOne is fun, but also a lot of work. Here are some images from last year (courtesy of Flickr):

Monday Nov 17, 2008

Most music recommenders use some form of collaborative filtering (CF) to connect listeners with new music. That's not a surprise: CF works really well for popular music (which, of course, is what most people are listening to). But if you are interested in long tail music, music that is unpopular (or perhaps music that is brand new and doesn't have any listeners), then you are out of luck. A collaborative filtering algorithm cannot help you - CF recommenders rely on the wisdom of the crowds, but that leaves them impotent when there are no crowds. One approach to the problem of dealing with new (or unpopular) content is to use a content-based recommender - instead of making recommendations based upon who is listening to a song, the recommendations are made based on what the song sounds like.
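To make the cold-start problem concrete, here is a minimal sketch of an item-to-item CF recommender. The listening data, the track names, and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration; real CF systems are considerably more elaborate.

```javascript
// Invented listening data: track name -> set of listeners (illustrative only).
const listeners = {
  "popular hit": new Set(["ann", "bob", "cat", "dan"]),
  "another hit": new Set(["ann", "bob", "cat"]),
  "cult favorite": new Set(["dan"]),
  "brand new track": new Set(), // nobody has listened yet
};

// Jaccard overlap between two listener sets: |A ∩ B| / |A ∪ B|.
function jaccard(a, b) {
  let shared = 0;
  for (const listener of a) {
    if (b.has(listener)) shared++;
  }
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Rank other tracks by how much their audience overlaps with the seed's.
function cfSimilar(seed) {
  return Object.keys(listeners)
    .filter((name) => name !== seed)
    .map((name) => ({ name, score: jaccard(listeners[seed], listeners[name]) }))
    .filter((result) => result.score > 0)
    .sort((x, y) => y.score - x.score);
}

console.log(cfSimilar("popular hit"));
console.log(cfSimilar("brand new track"));
```

With no listeners, the brand new track shares an audience with nothing, so the recommender has nothing to say about it, no matter what it sounds like.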

We've seen a few commercial music recommenders that rely on content-based techniques - MusicIP, Echo Nest, Ghanni, One Llama, Audiobaba, Owl, SoundFlavor and Pandora have all had some aspect of content-based recommendation at the core of their systems. Now there's another content-based recommender to add to the list of contenders - mufin.

Mufin (which stands for MusicFinder) is a content-based recommender that (according to Techcrunch) is an offshoot of Fraunhofer (the folks who invented the MP3). Given this pedigree, I had very high expectations for Mufin.

The Technology

From the Mufin Technology page:

    mufin uses mathematical extraction of specific details from a music title to create an objective description of the title’s characteristics which is independent of human influences. Using this description, songs which are similar to one another can be filtered out of a large database. Rhythmical characteristics (e.g. intensity of the rhythm, tempo, percussion in the piece), sound colour, harmonic and melodic qualities are weighted differently for the result.
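As a rough illustration of what "weighted differently" could mean, here is a minimal sketch of a weighted distance over per-track feature vectors. The feature names, values, and weights are invented for this example; Mufin has not published its actual features or weighting, so this is not their method.

```javascript
// Hypothetical per-track features, each scaled to the range [0, 1].
const tracks = {
  "track A": { tempo: 0.80, rhythmIntensity: 0.90, timbre: 0.30, harmony: 0.40 },
  "track B": { tempo: 0.78, rhythmIntensity: 0.85, timbre: 0.35, harmony: 0.45 },
  "track C": { tempo: 0.20, rhythmIntensity: 0.10, timbre: 0.90, harmony: 0.80 },
};

// Assumed weights: a larger weight makes that feature matter more.
const weights = { tempo: 1.0, rhythmIntensity: 2.0, timbre: 1.5, harmony: 0.5 };

// Weighted Euclidean distance between two feature vectors.
function weightedDistance(a, b, w) {
  let sum = 0;
  for (const key of Object.keys(w)) {
    const diff = a[key] - b[key];
    sum += w[key] * diff * diff;
  }
  return Math.sqrt(sum);
}

// Rank every other track in the catalog by distance to the seed, closest first.
function mostSimilar(seedName, catalog, w) {
  const seed = catalog[seedName];
  return Object.keys(catalog)
    .filter((name) => name !== seedName)
    .map((name) => ({ name, distance: weightedDistance(seed, catalog[name], w) }))
    .sort((x, y) => x.distance - y.distance);
}

console.log(mostSimilar("track A", tracks, weights));
```

Note that nothing here depends on who listens to a track, which is why this style of recommender can handle music that has no audience yet.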

My Impressions

I was lucky enough to get an invitation to the Mufin private beta, so I took it for a spin. Here are my impressions.


The Mufin site is very simple to understand. It has a classic search box where you can search for an artist, album or track. Since there are nearly 5,000,000 tracks in the catalog you are bound to find what you are looking for. I searched for bands like Deerhoof, ELP, The Beatles, Hannah Montana, The Feelies, and the Rheostatics. Mufin had them all. Unfortunately, Mufin won't let you play back all of the songs in the catalog - some songs can be played all the way through, some you can play 30 second excerpts, and some are not playable at all. This is, no doubt, due to the crazy licensing issues that plague digital music. Mufin also lets you purchase tracks from iTunes (allowing Mufin to make some money via referrals).

The Recommendations

Of course, searching for music is no big deal - what we are interested in are the content-based recommendations. So let's check them out.

Whenever I try out a new content-based recommender, the first track I usually try is Revolution 9 by the Beatles. This is a good way to find out if a recommender is really a CF recommender trying to pose as a content-based recommender. Revolution 9 is so unlike any other Beatles song (or any other rock or pop song recorded in the 60s) that recommendations from a content-based recommender will yield very different results than what you'd get from a CF recommender. Using Mufin to find music similar to Revolution 9 yielded a rather strange grab bag of music: some Wagner, Engelbert Humperdinck, UFO, Gershwin, the Disney Aladdin soundtrack, Frank Sinatra, and Ella Fitzgerald. No 60s pop, which is a good sign, but the results seemed to be strongly under the influence of a random number generator. Not a very good start - but since Revolution 9 is not your typical track, perhaps these results were an aberration.


So I moved on to something a bit more conventional - some power pop from Weezer. I tried the track 'My Name is Jonas'.


These results looked better, with tracks by Foreigner, Cell Division, Guano Apes, and Dream Theater. However, there was also quite a bit of heavy metal - Rhapsody, Kamelot, Dio - that didn't really seem to fit too well.

Next up was some cool jazz - Take Five by Dave Brubeck. For similar tracks I did get some cool jazz like McCoy Tyner, but also some salsa, some Crash Test Dummies, some blues by McCracklin and Louisiana Red (good stuff, but not anywhere near the cool jazz and syncopated rhythms of Brubeck), some female vocalist/country music by Frances Black, and gospel music by the Jackson Southernaires. Again it seemed like the random number generator was working overtime on this set of recommendations.


Next up was Led Zeppelin's venerable classic Stairway to Heaven:


The recommendations included soul music by Aaron Neville, minimalist electronica (Tubular Bells), classical piano (Mussorgsky), and easy listening (Engelbert Humperdinck, Paul Anka). There was no need to 'get the led out' of this list. One has to wonder about a recommender that puts Engelbert Humperdinck in the same list as Mussorgsky's Pictures at an Exhibition.

Finally I tried some classical music, with Beethoven's Fifth Symphony.


These recommendations seem to be much better - the recommendations include some more Beethoven, along with Haydn, Handel, Tchaikovsky, Mozart, Chopin, Dvorak and a number of classical film scores.


The Mufin folks have done a good job putting together the Mufin site. They've indexed an incredible amount of music. The search engine and the recommender engine are fast. The site design is clean and easy to use. However, the content-based recommendations provided by Mufin don't fare well when compared to the type of recommendations one can get from a CF recommender. There is a very high rate of 'clunker' tracks that no human would ever recommend based upon the seed track. For some seed tracks, the recommendations seemed no better than shuffle play, while for others, especially classical, the recommendations seemed to be pretty good.

The Mufin team are working hard to improve their recommendations - I hope they get the kinks out - it would be nice to see long tail, content-based recommendation become a reality.

PhD student Employee Nepusz Tamás has created some nifty and rather intriguing plots of the artist similarity space at his site: Reconstructing the structure of the world-wide music scene.

Update: There's a super-zoomable version on the seadragon gallery page. I can't figure out how to directly link to the image ... It is the 5th one from the right.

The plot is created by crawling the similar-artist graph (starting with Nightwish, of all places) using the audioscrobbler webservices. The artists are arranged on the graph using the DrL graph layout algorithm (DrL is a force-directed layout algorithm that works with very large data sets; more info about this algorithm can be found in this paper). The nodes in the graph are colored based upon the most frequent tags, while the edges are colored based upon their 'betweenness centrality score'. The area of a node in the graph is approximately proportional to the popularity of the artist.
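For readers who want a feel for the crawling step, here is a minimal sketch of a breadth-first crawl over a similar-artist graph, plus the sizing rule that makes node area proportional to popularity. The `getSimilarArtists` function and its toy data are hypothetical stand-ins for a real web service call, and nothing here is taken from Nepusz's actual code.

```javascript
// Toy similarity table standing in for a web service (invented data).
const toySimilar = {
  "Nightwish": ["Within Temptation", "Epica"],
  "Within Temptation": ["Nightwish", "Evanescence"],
  "Epica": ["Nightwish", "Within Temptation"],
  "Evanescence": ["Within Temptation"],
};

// Hypothetical lookup; a real crawler would make a network request here.
function getSimilarArtists(name) {
  return toySimilar[name] || [];
}

// Breadth-first crawl from a seed artist, visiting at most maxArtists nodes.
// Returns the visited artists and the directed "is similar to" edges among them.
function crawl(seed, maxArtists) {
  const visited = new Set([seed]);
  const queue = [seed];
  const edges = [];
  while (queue.length > 0) {
    const artist = queue.shift();
    for (const similar of getSimilarArtists(artist)) {
      if (!visited.has(similar)) {
        if (visited.size >= maxArtists) continue; // over budget: skip node and edge
        visited.add(similar);
        queue.push(similar);
      }
      edges.push([artist, similar]);
    }
  }
  return { nodes: [...visited], edges };
}

// Radius of a circle whose area equals the given popularity value.
function nodeRadius(popularity) {
  return Math.sqrt(popularity / Math.PI);
}

console.log(crawl("Nightwish", 10));
console.log(nodeRadius(100));
```

The resulting node and edge lists are what a layout algorithm such as DrL would then take as input; the layout and the betweenness coloring are left out of this sketch.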

Nepusz has also created an interactive map that allows you to type in the name of a few artists to see where they live on the map - or you can just enter your user name and it will show you where all your favorite artists are in the world of music.

The layout algorithm does a pretty good job of showing the large scale structure of the artist space. Artists with similar genre tags are well clustered. The ability to see where a particular artist is located on the map is very nice and the user integration is particularly sweet. Interesting, too, is how the popular artists seem to be clustered in the graph. The larger vertices in the red-rock area form a very tight line. This may be an effect of using the artist similarity, which has a popularity bias (the top 10 artists similar to the Beatles are all very popular artists).

What I really wish I could do is to use these plots for music discovery - I'd like to be able to mouse over a vertex to see what the band is (and even be able to listen to the band). It would be really interesting, for instance, to explore the point where the electronic and the rock worlds meet (like in the subsection of the graph shown here on the left - what artist is represented by the large orange node?). I'd be interested to see what the outliers are (I wonder what this reggae/ska artist is doing near the jazz, as seen in the subgraph on the right). I'd like to be able to zoom in and see some of the finer structure (if I zoom in on the Nightwish neighborhood, do I find more Finnish gothic metal?).

I'm a sucker for such visualizations. I think they can be a powerful tool for helping people to understand and explore a music space, and they can reveal relations and structure that are not evident in simple lists of similar artists. But creating these visualizations is not easy. Without special care they can easily turn into meaningless blobs. Nepusz has done an excellent job finding the right embedding algorithm, color and sizing strategy for the data. I hope he continues to add interactivity to his plots. Well done.

Saturday Nov 15, 2008

There's a nifty feature on Apple's site about Ge Wang and his ChucK programming language for making music.


Ge Wang is a really smart, energetic guy. While at the ISMIR conference this year, I sat next to Ge at dinner - he showed me his first iPhone app, Smule's Sonic Lighter:

Since then, Ge has created another very popular iPhone app called the Ocarina:

Ge has been very busy.

Thursday Nov 13, 2008

Steve and I are taking a road trip tomorrow, driving to the great white north to attend the Montreal Music and Machine Learning Workshop 2008 being hosted at the Université de Montréal. The organizers have posted the schedule that includes eight talks and 10 posters.

I'll be giving a short talk called "Stairway to Muskrat Love - If you like Led Zeppelin, you might like Captain & Tennille" that describes some of the work we've done here in SunLabs. (My talk title seems to be considerably less formal than the other talks on the schedule, sigh.)

I'm particularly looking forward to the poster session (and not just for the wine and cheese) - there are posters being presented by students from UdeM, Columbia, and McGill. It should be a fun day - and hopefully worth the 4AM departure time (3:30AM for Steve!) and the 10 hours of driving.

Sunday Nov 09, 2008

iLike now has a developer platform that lets you embed the iLike player in a web page and play just about any song on demand. There are three main functions:
  • iLikeDisplaySong - displays/plays a single song
  • iLikeSongChooser - search and select a song
  • iLikeDisplayPlaylist - displays/plays a playlist
The terms-of-use is refreshingly simple: We're starting with the hope that it simply breaks even. If it turns out to be profitable, we may find a way to share revenue with our best developers. If a few developers use the system a ton and it costs us too much to support them, we may ask them to help us cover the costs. But for now, the service is free, please use it and build cool stuff, and if you build something that's really popular, we're sure we can all make something good come of it.

The code for displaying a song is straightforward - just include some iLike javascript and away you go. For instance, to play a song, use the code:

  iLikeDisplaySong({elId: "song1", songName: "roundabout", artistName: "yes"});
APIs that make it easy for 3rd parties to embed music, like this iLike API, are going to really help new music exploration and discovery companies connect people with music. Via Doubtful Sound.

Friday Nov 07, 2008

Now that the election is over, the entire country is focused on what kind of dog Malia and Sasha will be getting. Since we build recommenders, it is only natural that we try to solve this pressing dog-recommendation problem. After extensive analysis, and an incredible amount of computation, we have narrowed the selection to one of the dogs in this live streaming feed.

Puppycam via Anthony, the ruffomender neologism by Steve.


Georgia Tech today launches the new Center for Music Technology with more than 20 researchers from the arts, sciences and engineering. Several interdisciplinary projects already in progress were demonstrated today at a launch event for potential collaborators.

There are lots of interesting things to see at the GTCMT website, including ZooZBeat.


Flou - (pronounced "flew") is not exactly a game; you do fly a ship through space, but you cannot shoot anything, score points, or win or lose. The focus, rather, is on the soundtrack: as you navigate through a 3D world and zoom through objects in space, you add loops and apply effects to an ever-evolving musical mix.


They are doing some interesting work around music and emotion: Statistical Learning and Expectation Evoked Emotion, Emotion in Raag, and Basic Auditory Cues for Emotion

A number of researchers from GTCMT have been active in the MIR community, including Parag Chordia, Mark Godfrey and Alex Rae. I particularly enjoyed their work presented at ISMIR this year around Content-based Recommendation - Hubs and Domain-specificity.

Nice. The President-Elect also has a Flickr account. There are some great behind-the-scenes shots.

Wednesday Nov 05, 2008

It's Friday night, and you are headed to the video store to get a DVD for the evening's entertainment, but you've already seen "From Justin to Kelly" and "Gigli" - so what do you do? Use the Shawshankr of course! The Shawshankr is a next-generation movie recommender that is extremely easy to use. Just type in the name of a movie that you like and the Shawshankr will give you a top notch, personalized recommendation. It couldn't be easier. Our user studies tell us that the Shawshankr recommendations are of high quality, and perhaps more importantly, extremely consistent (and everyone knows how important consistency is in recommendation!). So if you are looking for a good movie, be sure to give the Shawshankr a try.


The Shawshankr was created by my colleague, Jeff - who deserves all the credit and praise for this wonderful system.

If you are a Spotify user, here's a playlist for the day:
    1. Marvin Gaye, "Ain't No Mountain High Enough"
    2. John Parr, "St. Elmo's Fire (Man in Motion)"
    3. Tina Turner, "The Best"
    4. The Doobie Brothers, "Takin' It To The Streets"
    5. Earth, Wind & Fire, "Shining Star"
    6. O'Jays, "Give The People What They Want"
    7. Sam and Dave, "Hold On I'm Coming"
    8. Kool & the Gang, "Celebration"
    9. Natasha Bedingfield, "Unwritten"
    10. The Isley Brothers, "Shout"
    11. The Temptations, "Get Ready"
    12. India.Arie, "There's Hope"
    13. McFadden and Whitehead, "Ain't No Stoppin' Us Now"
    14. The Staple Singers, "I'll Take You There"
    15. Orleans, "Still The One"
    16. Sly and the Family Stone, "Everyday People"
    17. The Doobie Brothers, "Long Train Runnin'"
    18. Stevie Wonder, "Sir Duke"
    19. John Fogerty, "Centerfield"

Tuesday Nov 04, 2008

This is what it looked like at 6AM this morning at the Amherst St. School in Nashua. I arrived at 5:58. The line was long, but moved quickly. By 6:20, I was done - the 121st voter in the ward.


The line at 6:20 was just as long, but a little brighter.


Monday Nov 03, 2008

We made a video of our Music Explaura - a web application that gives you transparent, steerable recommendations. This demo shows how we can start from an artist that we like (Jimi Hendrix), and steer the recommender toward the aspects of Jimi Hendrix that we like (his guitar playing) and away from aspects that we don't like as much (60s psychedelia). The resulting recommendations are much closer to our taste than the typical recommendations. Watch the video:
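To give a flavor of what steering could look like in code, here is a minimal sketch in which each artist is a bag of weighted tags and the listener scales the seed artist's tags up or down before looking for neighbors. The tag weights, the artist list, the multiplier scheme, and the use of cosine similarity are all assumptions made for this illustration; this is not the Music Explaura implementation.

```javascript
// Invented tag weights for a few artists (illustrative only).
const artists = {
  "Jimi Hendrix": { guitar: 0.7, psychedelic: 0.9, blues: 0.4 },
  "Stevie Ray Vaughan": { guitar: 0.9, blues: 0.9 },
  "Jefferson Airplane": { psychedelic: 0.9, "60s": 0.8, guitar: 0.4 },
};

// Cosine similarity between two sparse tag-weight maps.
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const tag of Object.keys(a)) {
    normA += a[tag] * a[tag];
    if (tag in b) dot += a[tag] * b[tag];
  }
  for (const tag of Object.keys(b)) {
    normB += b[tag] * b[tag];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Scale the seed's tags: a factor above 1 steers toward a tag, below 1 steers away.
function steer(profile, adjustments) {
  const steered = {};
  for (const tag of Object.keys(profile)) {
    const factor = tag in adjustments ? adjustments[tag] : 1;
    steered[tag] = Math.max(0, profile[tag] * factor);
  }
  return steered;
}

// Rank the other artists against the (optionally steered) seed profile.
function recommend(seedName, adjustments = {}) {
  const seed = steer(artists[seedName], adjustments);
  return Object.keys(artists)
    .filter((name) => name !== seedName)
    .map((name) => ({ name, score: cosine(seed, artists[name]) }))
    .sort((x, y) => y.score - x.score);
}

console.log(recommend("Jimi Hendrix"));
console.log(recommend("Jimi Hendrix", { guitar: 2, psychedelic: 0.2 }));
```

Because the recommendation is driven by visible tags, the listener can see why an artist was suggested, which is the "transparent" half of the idea.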

You can read more about transparent recommendations in this two pager: Creating Transparent, Steerable Recommendations as well as this blog post.

Friday Oct 31, 2008

They've updated the list of accepted talks/panels for SXSW 2009 and despite being very, very late, my submission "I'm So Sad, My iPod Thinks I'm Emo" was accepted. I feel a little like Charlie Bucket now that I have my very own golden ticket to South By Southwest. This will be fun.

Thursday Oct 30, 2008

Recommender startup Matchmine has shut its doors. CEO Mike Troiano says on the Matchmine blog:
    Today - more suddenly than anyone would have liked - matchmine came to an end. I got word of the decision on Friday, and told the team here this morning. We are shutting the company down immediately, though a few of us will stick around to try and support our partners through a transition, and notify others affected by the closing of our doors.

This article fills in the details.

    “The nature of our financing meant that the financial market crisis overtook us more abruptly than most,” Troiano writes. “To my team and their families, our vendors, network partners and prospects, I can only say that I am deeply sorry for the way this comes to a close. And I don’t mean ‘press release sorry;’ I mean really, personally sorry.”
We've had lots of interesting discussions with the Matchmine folks over the last year - they are a bunch of smart people - I hope they all find good homes.

This blog copyright 2010 by plamere