Tuesday Apr 15, 2008

In one of the podcasts I listen to, I heard an interview with the guy who writes MarsEdit - a blog publishing tool for the Mac. I figured I would try it out. This post is written in MarsEdit. The hookup to blogs.sun.com was pretty easy - MarsEdit wasn't able to auto-detect the blogger interface, but when I indicated manually that the blog was MetaWeblog compatible - it did all the right stuff. I can insert images using MarsEdit as well. Here's a picture I took of an offline recommender system at a bookstore. IMG_0138.jpg

(By the way, this picture shows that even human-based recommender systems can get really wonky - this bookstore suggests that if you like 'Collapse' you may also like "I am Legend" - sure they are books about similar topics ... but they seem to me to be about as similar as Bill Bryson's "A Walk in the Woods" and "The Hobbit".)

I shall try to use MarsEdit for a few blog posts over the next month or so during the trial period to see if it is worth the $30.

Well - after posting here are a few impressions - I was hoping that MarsEdit would scale the image properly. Also, I'm surprised that the editing is not WYSIWYG - it is back to inserting HTML tags in the post, how last decade. Perhaps there are some settings that I can change to make this work better for me. So far, I'm not excited enough to part with $30.

Next month, a couple of students will arrive at Sun Labs to intern over the summer working on aspects of Project Aura and Search Inside the Music.  The interns always bring energy, new ideas, and new music to the labs.  It is a fun time for us old stodgy researchers and hopefully a fun time for the interns as well.  One of our summer interns, Will, has already been thinking quite a bit about issues around recommendation and has been writing about them in his blog.  Some good posts:

Back in February, when we were interviewing intern candidates (and there were many excellent candidates to choose from), my colleague Doug Eck (the brains behind the machine learning on Search Inside the Music) recommended one of his students for the internship.  During the hiring process, I mentioned to Doug that the competition was pretty stiff - I pointed him to a couple of Will's posts as an example of the competition.  Well, ever the diligent professor, working hard to get his students the best opportunities, 20 minutes later, Doug sent me a link to his student's "blog".  Here's a screen capture (with bits obscured to protect the innocent).

Clearly a highly relevant and insightful blog, covering many fascinating topics.  A must read - immediately added to my set of RSS subscriptions.

Sunday Apr 13, 2008


Steve and I are giving a talk about Project Aura at JavaOne.   The talk is at 12:10 PM on Tuesday, May 6.   It is a good timeslot (not too late and on the first day), but there are lots of excellent talks that occur at the same time.  If you come to the talk, be sure to say hi. 

Here's the talk abstract: 

Project Aura is an open-source recommendation engine, written in the Java™ programming language, being developed by researchers at Sun Labs.

Recommendation technology is a key enabler for the next-generation web. Recommenders will be essential to helping us wade through the huge volume of content such as news, music, video, blogs, and podcasts. However, current recommendation technology that relies on the wisdom of crowds has several drawbacks that can ultimately lead to poor recommendations.

Project Aura takes a novel approach to recommendation, avoiding many of the problems inherent in traditional recommender systems. This session presents the technology behind Project Aura, along with an analysis of how well Project Aura performs. It shows how you can use the Project Aura API to generate high-quality recommendations in your Java technology-based applications. Finally, it shows the capabilities of Project Aura by demonstrating Aardvark, a blog recommender built with Project Aura.

Thursday Apr 10, 2008

I've been a very infrequent blogger lately - I've been spending every free moment coding - getting ready for the Sun Labs open house - our annual labs event where we highlight the work we've done in the last year.  This week, Steve, Jeff and I get to show what we've been working on for the last 6 months or so - Project Aura.

Project Aura is a web-scale, open, hybrid recommendation system that uses social data (the wisdom of the crowds) combined with the 'aura' of information extracted directly from content or mined from the web to make recommendations.  By combining content-based methods with social methods, Project Aura can avoid many of the 'cold start' problems that plague traditional collaborative filtering recommenders, while providing a way to offer explainable recommendations.
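To make the hybrid idea concrete, here is a minimal sketch of how a social score and a content score might be blended. This is not Project Aura's algorithm or API: the function names, the toy data, and the blending weight are all hypothetical, and Jaccard overlap is just a convenient stand-in for whatever similarity measures a real system would use.

```python
# Illustrative sketch only: NOT Project Aura's algorithm or API.
# All names, data, and the blending weight are hypothetical.

def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def hybrid_similarity(item_a, item_b, listeners, tags, social_weight=0.5):
    """Blend social similarity (shared listeners) with content similarity (shared tags)."""
    social = jaccard(listeners.get(item_a, set()), listeners.get(item_b, set()))
    content = jaccard(tags.get(item_a, set()), tags.get(item_b, set()))
    return social_weight * social + (1.0 - social_weight) * content

def recommend(seed, listeners, tags, social_weight=0.5):
    """Rank every other item by hybrid similarity to the seed, dropping zero scores."""
    candidates = (set(listeners) | set(tags)) - {seed}
    scored = [(hybrid_similarity(seed, c, listeners, tags, social_weight), c)
              for c in candidates]
    return [c for score, c in sorted(scored, reverse=True) if score > 0]

listeners = {
    "established_band": {"ann", "bob", "cat"},
    "popular_band": {"ann", "bob", "dan"},
    "new_band": set(),  # cold start: nobody has played it yet
}
tags = {
    "established_band": {"indie", "guitar", "lo-fi"},
    "popular_band": {"pop", "dance"},
    "new_band": {"indie", "guitar", "lo-fi"},
}

hybrid = recommend("established_band", listeners, tags)
social_only = recommend("established_band", listeners, tags, social_weight=1.0)
```

With these toy numbers, the social-only ranking never surfaces new_band because it has no listeners, while the blended ranking does, since its tags match the seed. The tag overlap also doubles as a simple explanation for the recommendation.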

 Project Aura is built on top of Project Caroline - a platform for development and delivery of dynamically scalable Internet-based services.

Here's a photo of our demo setup ... you can see a bit of our visualization of the recommender, and a poster showing our distributed datastore (and Steve too!)


Tuesday Apr 01, 2008

Last.fm is sponsoring a new music category in Mozilla’s Extend Firefox 3 contest - the author of the best music-related extension entry will be flown to London to meet  the Last.fm team and will  attend a Last.fm/Presents live event as a guest of Last.fm; and will also receive a Logitech Squeezebox network music player.  Details are here.

Friday Mar 28, 2008

After years of development, the Echo Nest is online and doing business. Yesterday, they announced that www.thisismyjam.com  is using the Echo Nest to create beat-matched mixes for sharing online.   Here's a 'this is my jam' mix that I made with some of the artists I've been listening to recently:


There's lots of neat stuff going on under the hood. They are adjusting song tempos to get a match (I can hear the shift in the song transition from Bjork to Rodrigo Y Gabriela).  When I have a bit more time, I'll be taking a much closer look.  Congrats Brian and Tristan!

Wednesday Mar 26, 2008

Anyone who has worked with a large music collection knows that there are all sorts of difficulties in dealing with the metadata that is associated with the music. The data is often wrong, misspelled, or missing.  Getting the metadata right is a hard problem to solve even with common songs  (should "Hey Jude!"  have an exclamation point or not?) or artists (Is it 'Prodigy' or 'The Prodigy'?). Some artists however seem to go out of their way to make things difficult.  Take our friend Aphex Twin - on his album Windowlicker, track number two is called: 

$M_i^{-1} = -\alpha \sum_{n=1}^{N} D_i[n] \left[ \sum_{j \in C\{i\}} F_{ji}[n-1] + F^{\mathrm{ext}}_i[n-1] \right]$

I wondered how some of the various Music 2.0 sites were able to handle this track - here are my findings.

First, here's the song in the iTunes store - they don't even try to render it


Amazon.com does a bit better, they don't try to render the mathematical characters, but it is readable:


iLike falls back on the streetname for the song


Google skips the track completely:

As does All Music:

There are some that get it right - MusicBrainz for one:

Our Search Inside the Music research database (curated by Doug Eck) gets it right too, no surprise since we resolve our music against MusicBrainz:

As does Last.fm (I think they resolve against MusicBrainz too):

Spotify gets it right too:


So it looks like about 1/3 of the sites got it right, and those are the ones that are using MusicBrainz to clean up their metadata.   This is one of the many reasons why I really like MusicBrainz.

And by the way, this track is also noted for the fact that Aphex Twin has embedded his image in the audio.  An FFT of the track reveals his mug.
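Strictly speaking, an image like this shows up in a spectrogram, which is a sequence of FFTs taken over short, consecutive slices of the audio, rather than in one FFT of the whole track. Here is a minimal sketch of that idea using only the Python standard library. The signal is a synthetic two-tone stand-in (not the actual track), the function names are my own, and the naive DFT with no windowing is chosen for readability, not speed.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitude of each frequency bin up to Nyquist (naive O(n^2) DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(total))
    return mags

def spectrogram(samples, frame_size=64, hop=64):
    """One row of bin magnitudes per frame: rows are time, columns are frequency."""
    return [dft_magnitudes(samples[start:start + frame_size])
            for start in range(0, len(samples) - frame_size + 1, hop)]

# Synthetic stand-in for audio: a tone centered on bin 4, then a jump to bin 12.
frame_size = 64
low = [math.sin(2 * math.pi * 4 * t / frame_size) for t in range(frame_size)]
high = [math.sin(2 * math.pi * 12 * t / frame_size) for t in range(frame_size)]
samples = low * 2 + high * 2

rows = spectrogram(samples, frame_size=frame_size, hop=frame_size)
peaks = [row.index(max(row)) for row in rows]  # strongest bin in each frame
```

Plotting the rows as pixel columns (time across, frequency up, magnitude as brightness) gives the picture. An embedded image is just audio synthesized so that those pixels form a recognizable shape.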

Tuesday Mar 25, 2008

Steve and I are giving a talk at JavaOne about our recommendation project.  We submitted our slides a couple of weeks ago and we've received some feedback.  According to the J1 organizers,  our Slide 18 must go!  It is too controversial!   

Our slide 18  shows how social recommenders can be vulnerable to hacking and manipulation. It shows a screenshot of the Amazon recommender linking Pat Robertson's "Six Steps to Spiritual Revival" to a gay sex manual.   The story behind the recommendation prank is here.   Apparently that's too racy for JavaOne.  


We offered this alternative, which shows the Last.fm similar artists for Hillary Clinton - which have apparently been manipulated by Last.fm listeners to tie her to politicians with a less than stellar reputation. Once again, this was a no go with the JavaOne crew - politics can be a bit of a touchy subject too.  

So that leaves us looking for some less controversial examples of recommendation hacking - no sex and no politics allowed.  Our current best candidate is the boring Vote For the Worst, which is a site that encourages people to vote for the worst American Idol contestant - surely there's something better than that. Any suggestions?

Monday Mar 24, 2008

xkcd points out yet another problem with shuffle play.  iPod whiplash indeed.

Sunday Mar 16, 2008

On her blog, Anita Lillie (master's candidate at MIT's Media Lab) has asked for help finding projects and papers about spatially-based organization of digital music collections. I've posted a few comments pointing to ones that I know about - but I'm sure there are more.  If you know of other interesting spatial interfaces to music, add a comment here or over at Anita's research blog.  And as an extra credit assignment, use the Apple SDK to port some of these to the iPhone.

 Here are some of my favorites:

Hannes Jentche's interface:


Justin Donaldson's visualization of MyStrands Data

Fidg't Visualizer:







 Music Rainbow:

Electronic Boom





Search Inside the Music

Monday Mar 10, 2008

I visited the iTunes music store this weekend.   I was surprised to see that Jeff Buckley's Hallelujah was the number one song.  That certainly was curious since the song has been around for over 24 years and Jeff has been gone for over 10.  This morning I visited Amazonmp3 - and there I saw Jeff Buckley's Hallelujah at #3.  Wtf?  Why was a 14-year-old recording suddenly at the top of the charts?  A quick blog search gave the answer.  American Idol contestant (and next teen heartthrob, dreads and all) Jason Castro sang "Hallelujah" last week.  Jason almost seemed to be channeling Buckley with his ethereal singing voice.  American Idol judges Randy and Simon both stated that the Jeff Buckley version of that song was one of their all-time favorite songs.  No doubt these comments drove thousands of Idol viewers to iTunes and Amazon to check out Jeff Buckley's version of the song.  I am amazed that this little television moment had such an effect on the charts.  It is a good song ... and Jason does a pretty good 2 minute version of the song.

Here's the moment, captured on YouTube.

Tuesday Mar 04, 2008

This week  I zip up to Toronto to take part in a Canadian Music Week panel called 'Recommendation Engines: The ubersolution to Managing Music Ultra Surplus'.  They've put together a great set of panelists:

The only panelist that I've not met is Dr. Gjerdingen - but I am looking forward to meeting him.  He's the co-author of one of the most cited studies in the Music Information Retrieval academic community that looked at how well humans can classify music into music genres.  Just about every paper on machine classification of music cites this study.

 Here's the panel abstract:

New music discovery will be unlike anything any generation has ever experienced. As more music becomes available online and the long tail gets longer, recommendation engines with improved search filters are going to play a more important role in online commerce. With an aggregate of over one billion song versions now available somewhere online (according to Cache Logic), implementing recommendation engineering navigation for this functionally infinite database seems absolutely necessary. This panel features experts who'll bring you up to date on the latest discovery and recommendation news on what would otherwise prove an ungraspable chaos of music overload.

The panel is on Thursday at 2:40 in the Manitoba room of the  Royal York hotel in Toronto.  Be sure to stop by and say hello if you attend.

Monday Mar 03, 2008

It is hard to evaluate music recommendation systems. Current evaluation techniques will often use web mining for artist co-occurrence on web pages or playlists as a way to infer artist similarity to compare against a recommender, or will try to predict a set of ratings (1 star or 4 star) such as we see with the Netflix Prize.  However, these types of evaluations generally don't measure several aspects that are associated with a good recommendation.   For instance, evaluations that measure how well a recommender system can predict how a user will rate songs are tested against songs that the user has already rated.  This penalizes recommenders that generate good but novel recommendations.  Since the user hasn't rated these novel recommendations yet (since they are novel), the recommender doesn't receive any credit for the recommendations.  A good recommender should recommend novel items, but most recommender evaluations don't evaluate this aspect of recommendation at all.  A recommender that tells you that if you like 'The Beatles' you might like 'The Rolling Stones' may be accurate, and may be evaluated highly, but it is not a great recommender if all it tells you about are artists that you already know about.
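A toy calculation shows the effect. This is only a sketch under made-up assumptions: the artist lists, the held-out "likes", and the two metric functions are hypothetical and not taken from any particular evaluation framework.

```python
def offline_precision(recommended, held_out_likes):
    """Fraction of recommendations found among the user's held-out liked items."""
    if not recommended:
        return 0.0
    hits = sum(1 for item in recommended if item in held_out_likes)
    return hits / len(recommended)

def novelty(recommended, known_items):
    """Fraction of recommendations the user has no recorded history with."""
    if not recommended:
        return 0.0
    unseen = sum(1 for item in recommended if item not in known_items)
    return unseen / len(recommended)

# Everything the user has ever rated, and the liked subset held out for testing.
known_items = {"The Beatles", "The Rolling Stones", "The Kinks", "The Who"}
held_out_likes = {"The Rolling Stones", "The Who"}

safe = ["The Rolling Stones", "The Who", "The Kinks"]
adventurous = ["The Rolling Stones", "Unrated Artist X", "Unrated Artist Y"]

safe_scores = (offline_precision(safe, held_out_likes), novelty(safe, known_items))
adventurous_scores = (offline_precision(adventurous, held_out_likes),
                      novelty(adventurous, known_items))
```

The safe list wins on offline precision while introducing nothing new. The adventurous list is scored lower even if the user would love both unrated artists, because the test data simply has no way to know that.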

Probably the best way to evaluate recommenders is with user studies.  Simply ask people to rate the recommendations.  However, this can skew the results as well.  People will tend to rate recommendations that include many familiar relevant items as better than recommendations that contain a number of unfamiliar items.  Since it can take a good deal of time to evaluate a recommended item (such as a song, artist, movie or book), it is hard to get accurate evaluations for recommendations that contain large numbers of unfamiliar items.

My co-tutorist, Oscar Celma, has created a personalized survey for evaluating a number of different types of music recommendation.  Unlike previous evaluations, Oscar's survey recognizes the importance of novel recommendations.  The survey will offer you a number of music recommendations (based upon your last.fm listening behavior) and ask you questions about the recommendations, including whether or not you've heard the artist or the track before, and to what degree you like the music.   With this evaluation Oscar can learn which recommenders tend to recommend familiar music, which recommend novel music - as well as which recommenders are recommending relevant music (that is, music that the user will like to listen to).
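Here is one way answers like these could be rolled up per recommender. It is a guess at the shape of the analysis, not Oscar's actual survey schema or method: the field names, the 1-to-5 rating scale, and the "liked" threshold are all assumptions of mine.

```python
from collections import defaultdict

def summarize(responses, like_threshold=4):
    """Per recommender: share of novel items, liked items, and items both novel and liked."""
    buckets = defaultdict(list)
    for response in responses:
        buckets[response["recommender"]].append(response)

    summary = {}
    for name, rows in buckets.items():
        novel = [r for r in rows if not r["heard_before"]]
        liked = [r for r in rows if r["rating"] >= like_threshold]
        liked_novel = [r for r in novel if r["rating"] >= like_threshold]
        summary[name] = {
            "novel_fraction": len(novel) / len(rows),
            "relevant_fraction": len(liked) / len(rows),
            "liked_novel_fraction": len(liked_novel) / len(rows),
        }
    return summary

# Hypothetical survey answers for two made-up recommenders, A and B.
responses = [
    {"recommender": "A", "heard_before": True, "rating": 5},
    {"recommender": "A", "heard_before": True, "rating": 4},
    {"recommender": "A", "heard_before": True, "rating": 5},
    {"recommender": "A", "heard_before": False, "rating": 2},
    {"recommender": "B", "heard_before": False, "rating": 5},
    {"recommender": "B", "heard_before": False, "rating": 4},
    {"recommender": "B", "heard_before": False, "rating": 2},
    {"recommender": "B", "heard_before": True, "rating": 4},
]

summary = summarize(responses)
```

In this made-up data both recommenders look equally relevant, but only B is finding music that is both new to the listener and liked, which is exactly the distinction a ratings-only evaluation would miss.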

Oscar has done a good job designing the survey - you can evaluate as many or as few recommendations as you'd like.  The more participants in the survey, the better the results, so I encourage all of my readers to take the survey.  As a reward, at the end of it all, you may get a few novel recommendations from a state-of-the-art music recommender to expand your music horizons.

Take Oscar Celma's Music Recommendation Survey

Tuesday Feb 26, 2008

Yesterday was the first (hopefully of many) SanFran Music Tech Summit.  This was a gathering of musicians, producers, lawyers, radio heads, and technologists.  The summit was held at the Kabuki Hotel in Japan Town.  Kudos to Brian Zisk and the rest of the organizers for putting this all together. 

I was lucky enough to get to moderate a panel on recommendation and discovery.  The panelists included two technologists that build automated recommendation systems (Michael Troiano - matchmine, CEO and Benjamin Masse - Double V3) and two human recommender systems:  Bill Goldsmith - Radio Paradise, Founder/Coder and Balance - Main Urban Buyer, Rasputin Music.  I thought this was a great mix of panelists.  The human recommenders have a real understanding of what it takes to engage their audience.  Those of us who are trying to build automated recommenders can learn a great deal from these guys.  The panel talked about the characteristics of a good music recommendation.  Some of the observations:

Building trust is extremely important - for human-based recommenders, this trust is built up over a long period of time.  Bill talked about how his listeners over a period of days and weeks grow to trust his taste in music - they know Bill won't lead them too far astray. Similarly, Balance interacts with his customers on a weekly basis, building a relationship over many weeks and months.  Contrast that with automated systems: Benjamin suggests that they only have 30 seconds or so to gain some level of trust with a web user before the user is ready to click onto another page. 

Bill talked about the "Tyranny of the Bored" - where the opinions of people who have nothing to do all day but browse the web, digg stories, tag music, and write reviews have an inordinate amount of influence on our taste.  The tastes and opinions of busy people, those who don't have time to spend on the social web, are not counted.

They've recorded the panels and have put them online.  The recommendation  panel is here:



Colin took some good shots of the panelists. 





Wednesday Feb 20, 2008

The west coast Sun Labs offers a weekly speaker series called 'Intellectual Desserts' where internal and external speakers present talks on an eclectic series of topics  - mostly technical.   Since I am headed out to the west coast lab next week I figured it would be a good time to give a talk about our new project, so I signed up for the Intellectual Desserts.

It was not until after I signed up for the talk that I looked closely at the speaker schedule. Last week's Intellectual Desserts speaker was Don Knuth.  This week the speaker is Ivan Sutherland. Both Don and Ivan are computer luminaries and Turing award winners - both have had an incredible impact on computer science - they are living legends.  Yes, I am intimidated.  But such is life in Sun Labs - it is great to be surrounded by so many smart people.

This blog copyright 2010 by plamere