Thursday Jul 31, 2008

Jennie has written up a blog review of, in particular how uses tags to help make recommendations - she shows the tag cloud for the Jonas Brothers and even points out that the Jonas Brothers is getting tag-spammed with metal.

I think she's getting ready to do a few slides about hacking and social tags at the tutorial that Elias and I are giving at ISMIR this year. That takes some of the pressure off me for sure.

I also noticed that in her post she cleaned up the language in the tags. Pop artists tend to attract the haters that enjoy tagging vandalism. The Jonas Brothers tag cloud is filled with words that a typical Jonas Brothers fan should not be exposed to. Perhaps, needs a safe-tags setting to filter out the profanities so my 13-year-old daughter doesn't have to.

My colleague Steve is always bragging about how he's a second-generation search guy (his Dad did IR on punch cards). But now I can top him with my second-generation music folksonomist progeny. In your face, Steve!

Friday Jul 25, 2008

The Program Committee for ISMIR has just posted the set of accepted papers for ISMIR 2008. There are lots of really interesting looking papers. Some that look especially interesting to me:
  • Music Thumbnailer: Visualizing Musical Pieces In Thumbnail Images Based On Acoustic Features
  • Oh Oh Oh Whoah! Towards Automatic Topic Detection In Song Lyrics
  • Accessing Music Collections Via Representative Cluster Prototypes In A Hierarchical Organization Scheme
  • Collective Annotation Of Music From Multiple Semantic Categories
  • Combining Features Extracted From Audio, Symbolic and Cultural Sources
  • Content-Based Musical Similarity Computation Using the Hierarchical Dirichlet Process
  • Development Of A Music Organizer For Children
  • Five Approaches to Collecting Tags For Music
  • Hit Song Science Is Not Yet A Science
  • Learning A Metric For Music Similarity
  • Playlist Generation Using Start and End Songs
  • Social Playlists and Bottleneck Measurements : Exploiting Musician Social Graphs Using Content-Based Dissimilarity and Pairwise Maximum Flow Values
  • Support For MIR Prototyping and Real-Time Applications In the ChucK Programming Language
  • Ternary Semantic Analysis Of Social Tags For Personalized Music
  • The Quest For Musical Genres: Do the Experts and the Wisdom Of Crowds Agree?
  • Uncovering Affinity Of Artists to Multiple Genres From Social Behaviour Data
  • Using Audio Analysis and Network Structure to Identify Communities In On-Line Social Networks Of Artists

Thursday Jul 24, 2008

One of the nifty new features of the recently revamped is that every tag that ever has been applied by users to music now has its own page that contains everything knows about the tag. The page shows the top artist for the tag, a wiki-style description of the tag, and a shoutbox where listeners can have a running conversation about the tag. The shoutbox seems like it will be a lot of fun - it's the place to go if you want to argue the finer points of technical death metal vs. melodic death metal. The current dialog in the emo shoutbox is particularly entertaining as fans try to protect their listening turf. Some selections:
  • And what is it with everyone with the same goddamn haircut!
  • Haha, it says Iron Maiden is emo.
  • 'Flex94 tagged My Chemical Romance as ‘emo’.'What the hell is that.?
  • James Blunt and Iron Maiden emo?! O.o
  • OMG!! some people really don't know what is emo and how it sounds.... AFI, FOB, Blink182, Bullet for my valentine, Good Charlotte, Sum41, Placebo, Him, 3DG, Nirvana, Slipknot, 50cent... - EMO??? WOW... X_x some listeners are really stupid!
This is all getting a bit meta - people commenting on tags that other people apply to music. Maybe we can take this a bit further and let people moderate the comments that people are making about the tags that others are applying to the music. And then we can have meta-moderators to moderate the moderators ... ow - my head hurts.

Wednesday Jul 23, 2008

2008 is looking to be the year that researchers start paying attention to social tags in a big way. At the recent AAAI Workshop on Intelligent Techniques for Web Personalization and Recommender Systems, one third of the talks were devoted to social tagging topics. Oscar tells me that 5 out of the 30 papers at the upcoming RecSys 08 are devoted to social tagging topics.

This year's ISMIR will be no different. There's lots of interesting research around tags and music information retrieval - researchers are looking at filling out the tag cloud (autotagging and tagging games), extracting information from tags (latent semantic methods), cleaning up tags, using tags to assist in training of classifiers, using tags for recommendation, discovery and exploration. There seems to be no end to the interesting things we can do with social tags.

At this year's ISMIR, Elias Pampalk and I will be presenting a tutorial on social tags and music. In the tutorial we will be looking at the state-of-the-art in commercial and research systems that use social tags. I'm rather excited about the tutorial - the topic is just so fascinating to me - and it is great to work with Elias, especially with his deep knowledge of the tagging data (and his newfound statistical expertise).

If you are attending ISMIR this year and have an interest in social tags, consider signing up for the tutorial.

Friday Jul 18, 2008

I got this from Brooke Maury. This is an experiment in how ideas and code spread over the Internet. If you feel like participating, click the 'spread it' button and add your blog/website to the graph.

One of the problems that plagues recommender systems is the New User problem - how do you give recommendations to someone who has just come to your site for the first time? Well, you could ask them lots of questions about themselves, but that can be off-putting. Most folks don't want to wait even 3 seconds for a web page, so how can you expect them to fill out an extensive questionnaire describing their likes and dislikes? You could just ignore the problem and generate recommendations based upon the slimmest of data, like a single favorite artist. Or you could just recommend what is popular. But you run the risk of alienating your new visitor with these generic, non-personalized recommendations.

It would be nice if people carried around some representation of their taste so when they visited a new site, they could get good recommendations right away. This certainly would be good for the individual - they would be spared from having to re-enter all of their preference data every time they visit a new site - and it would be great for every web startup that wants to make sure they deliver the most appropriate content to their visitors, even for first time visitors.

There are a number of attempts to create a representation for portable taste data. There's the Attention Profile Markup Language (APML), there's the Matchkey. Now there's one more - OpenTaste. OpenTaste is an open standard for making your online persona portable. OpenTaste profiles allow you to capture your preferences, store them online, and share them --- or not --- with OpenTaste-enabled websites. Technically speaking, OpenTaste is an open protocol standard based on OAuth and OpenID that enables the markup of preference semantics in RDF/XML, XML, and HTML with microformats. In some ways you can look at OpenTaste as APML with semantics. The XML version of OpenTaste even uses APML syntax to represent explicit and implicit attention, but then layers analysis models for attention and preferences, along with semantic relationships to ensure that the concepts are clear and unambiguous.

OpenTaste is a new effort being spearheaded by Strands - the site has just gone live, with specifications, examples and schemas. More info is promised in the OpenTaste blog. Given that one of the categories on the blog is 'W3C' - it looks like OpenTaste may be going the formal standard route. Soon, perhaps there will be a standard way of specifying portable taste data - solving the New User problem once and for all.

I'm sitting here in my office minding my own business, but all I can hear is loud laughing from Steve's office next door. He tells me he's reading Sten's latest post about JavaOne.

Wednesday Jul 16, 2008

At this year's International Conference on Music Information Retrieval, they've added a session where researchers can highlight late breaking or preliminary results or show technology demonstrations. These submissions are not reviewed; any submitted abstract will be accepted. If there are more submissions than are allowed, then "presentations that contribute to a balanced session will be given priority". The conference website has had a server problem and has been down for a week or so, and since the deadline is approaching, I thought it would be good to remind folks of the deadline.

Submission deadline: July 24, 2008

Submission details:

Late-breaking / demo submissions: A special session during the ISMIR-08 schedule will be dedicated to:

  1. the presentation of preliminary results and ideas which are not yet fully formed nor systematically evaluated; and
  2. the demonstration of applications that are of interest to the MIR community.
These “late breaking / demo” submissions will not be reviewed and will not have associated papers on the proceedings. They have been assigned a late deadline. Submissions should be emailed to: [email protected] as a pdf file containing the title, author(s) and affiliation(s), and a 100-200 word abstract outlining the work to be presented. Abstracts will be published online. Should there be more submissions than allowed by the available presentation space, presentations that contribute to a balanced session will be given priority. Limitations on the number of papers per author apply.

Tuesday Jul 15, 2008

Over the weekend I traveled to Chicago to attend the AAAI workshop on Intelligent Techniques for Web Personalization and Recommender Systems. At the workshop 10 research groups presented their work, including 3 papers that were related to using social tags for recommendation. I found the two talks presented by the DePaul researchers to be particularly interesting and relevant to the work we are doing here in Sun Labs.
  • Personalization in Folksonomies Based on Tag Clustering by Jonathan Gemmell, Andriy Shepitsen, Bamshad Mobasher, Robin Burke - describes using unsupervised clustering of social tags as intermediaries between a query and a set of items. Terms in the query are weighted based upon their affinities to particular clusters to help disambiguate queries. For example, if I am a delicious user and I tag lots of resources with tags related to programming, and I issue a search query with 'java' in it, I'm more likely to be given search results related to the Java Programming Language instead of results about Starbucks or South Pacific Islands.
  • A Framework for the Analysis of Attacks Against Social Tagging Systems by JJ Sandvig, Runa Bhaumik, Maryam Ramezani, Robin Burke, Bamshad Mobasher - in this talk JJ presented their framework for analyzing attacks against social tagging systems and showed how they could use the framework to evaluate the impact different types of attacks could have on retrieval algorithms built on top of tagging systems.
At the end of the talks we had an interesting discussion session where we explored some of the challenges in recommendation. It was fascinating to hear the different perspectives.
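
To make the tag-clustering idea from the first talk a bit more concrete, here is a minimal Python sketch of the 'java' example. It is not the DePaul algorithm: the clusters are hard-coded instead of learned by unsupervised clustering, and the names CLUSTERS, cluster_affinity and personalized_search, the scoring formula and all of the sample data are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical tag clusters. The paper derives clusters by unsupervised
# clustering; here they are hard-coded purely for illustration.
CLUSTERS = {
    "programming": {"java", "code", "compiler", "jvm"},
    "coffee": {"java", "espresso", "starbucks", "cafe"},
    "travel": {"java", "indonesia", "island", "beach"},
}


def cluster_affinity(tag_counts):
    """Share of a tag profile's total weight that falls inside each cluster."""
    total = sum(tag_counts.values())
    if total == 0:
        return {name: 0.0 for name in CLUSTERS}
    return {
        name: sum(n for tag, n in tag_counts.items() if tag in tags) / total
        for name, tags in CLUSTERS.items()
    }


def personalized_search(query_tag, user_tags, resources):
    """Rank resources carrying query_tag, boosted by cluster overlap with the user."""
    user_aff = cluster_affinity(user_tags)
    scored = []
    for name, tags in resources.items():
        hits = tags.get(query_tag, 0)
        if hits == 0:
            continue  # resource was never tagged with the query term
        res_aff = cluster_affinity(tags)
        # Assumed scoring: raw tag count times user/resource cluster agreement.
        boost = sum(user_aff[c] * res_aff[c] for c in CLUSTERS)
        scored.append((hits * boost, name))
    return [name for _, name in sorted(scored, reverse=True)]


# A user who mostly tags programming resources (made-up data).
user = Counter({"code": 12, "compiler": 5, "jvm": 3, "espresso": 1})
resources = {
    "Java language tutorial": Counter({"java": 4, "code": 6, "jvm": 2}),
    "Starbucks store locator": Counter({"java": 5, "starbucks": 7, "cafe": 2}),
    "Visiting the island of Java": Counter({"java": 6, "indonesia": 4, "beach": 3}),
}
ranking = personalized_search("java", user, resources)
```

With this toy data the programming tutorial ranks first for the programmer, even though it carries the fewest raw 'java' tags. Swap in a profile full of coffee tags and the Starbucks page comes out on top instead.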

I really enjoyed the workshop and was glad to be invited. It was really well run - and the papers were top notch. Apparently, they only had room to accept about 30% of the submitted papers. They should be posting the papers on the workshop site soon, along with the slides from my talk.

If you have an iPhone and you like music just drop whatever you are doing and download and install the free Shazam iPhone app. Shazam is a song-id application. If you hear a song and would like to know the name and the artist, just fire up the Shazam app, wait a dozen seconds while Shazam listens, and Shazam will tell you the artist, album and track. It uses audio fingerprinting technology to make it all happen.


Now I've seen lots of audio fingerprinters in my day (we've even created our own here in the labs), but I've never had the chance to use one in the wild like this, and it is a lot of fun to use - and I get a kick every time it gets it right (which is almost always). The app really feels like magic. It also is quite robust to noise and listening conditions. I tried it with my little laptop speakers while sitting in the kitchen with the dishwasher running and my lovely wife talking to me. I've tried it in the car, on the highway with the windows open. It seemed that as long as I can hear the music, so can Shazam.


In my experience, Shazam is also really accurate - it has almost always given me the correct answer. Sometimes, it doesn't know the answer, and it will tell me so - but if it does give me a result, I can count on it being correct. The Shazam index of music seems to be pretty big. It was able to identify every song on the local head-banger radio station with no trouble, along with most things I tried in my personal collection, including recent releases like the latest Weezer album. Unsurprisingly, it was not able to identify songs that were pretty deep into the long tail - it didn't recognize "Harry and the Potters", or any of my Magnatune music.

Shazam keeps track of your 'tags' - the songs you've ID'd with the app. This lets you keep a log of all of the tracks that you've encountered in a day so you can follow up later on. It creates links to iTunes so you can buy the track, as well as links to YouTube videos of the track or artist. You can even take and attach a photo to a tag to help you remember the context of the tagging.


The application info page for Shazam says that it is only going to be made available for free for a limited time, so grab it for free while you can. It really is a cool and useful app.

Update - For the curious, Shazam presented a paper describing the technology behind their fingerprinter at ISMIR 2003. The paper is a good read: An Industrial-Strength Audio Search Algorithm (Note that the main ISMIR site is down, so this is a pointer to a cached copy at Columbia).

Monday Jul 14, 2008

Today I tried out the new iPhone client. It is pretty cool. It captures the most relevant bits of All of my radio stations are there: Personal Radio, Recommended Radio, Loved Tracks Radio, etc. I can see upcoming events, my personal music charts.


The Now playing screen is especially slick - you can 'love' or 'ban' a track, read an artist bio, view the artist tags, find events related to the 'now playing' artist. It will scrobble the tracks it plays.

Probably the best part of the player is the sound quality. It sounds really good when compared to some of the other iPhone internet radio clients. (From what I can tell, I don't think Pandora's client even has stereo output (for shame!)).

I do have a few quibbles .. but since this is a 1.0 client, I'm sure they will be addressed right away.

  • First of all,'s corporate color is red - which on the iPhone is typically reserved for really important things like "End the call". The client however uses red everywhere - so when I look at the client I always think there's something wrong.
  • I wish I could tag artists or tracks with the iPhone. Right now, I can only look at the tags that have been applied to the tracks.
  • The volume control only lets me turn the volume up. Sometimes, I like to turn my music down too.
  • You pay a price for the good sound quality. It took a couple of minutes to buffer a song over the EDGE network, and it was rather unreliable, with pauses mid-song while more of the song was buffered. I wouldn't recommend using the client unless you are connected to a WiFi network. When connected to WiFi it would take 10 seconds or so to buffer a song, and even on WiFi, once in a while the playback of a song would pause while more of the song was streamed into memory. Not a good listening experience.
  • Probably the biggest problem is that as soon as you want to do anything else with your phone, the client quits and the music stops. Of course, this is not's fault, but it does take away from the overall enjoyment of the music if you have to start a new listening session after every time you need to check your mail.
I hope to see more fu in the next version of this app. There are lots of interesting things that could be added - geotagging of music, finding people nearby that have similar music tastes.

The iPhone client is nicely done - the price is right - so if you have an iPhone ... you should get this app.

Friday Jul 11, 2008

From the Radio Paradise home page:

Fire update: (7/9/08) As you might have heard, the town of Paradise is once again threatened by a fire. This time, our house/office is within the precautionary evacuation zone, but we're safely out of the area. As long as the air-conditioner in our server room holds out (it's expected to be 106 degrees F today) there shouldn't be any interruptions in our service.

Radio Paradise is my favorite Internet Radio station. I really hope that the fires bypass Bill's server room (and house/office). We don't want to see Radio Paradise turn into Radio Inferno.

Monday Jul 07, 2008

On a very rare occasion, I'll see a book in a bookstore that will scream to me 'buy me'. This book is one of those .. edited by DJ Spooky, with a foreword by Cory Doctorow, and from MIT Press. I knew I couldn't pass this book up.

Sound Unbound: Sampling Digital Music and Culture

The back cover blurb: If Rhythm Science was about the flow of things, Sound Unbound is about the remix--how music, art, and literature have blurred the lines between what an artist can do and what a composer can create. In Sound Unbound, Rhythm Science author Paul Miller aka DJ Spooky that Subliminal Kid asks artists to describe their work and compositional strategies in their own words. These are reports from the front lines on the role of sound and digital media in an information-based society. The topics are as diverse as the contributors: composer Steve Reich offers a memoir of his life with technology, from tape loops to video opera; Miller himself considers sampling and civilization; novelist Jonathan Lethem writes about appropriation and plagiarism; science fiction writer Bruce Sterling looks at dead media; Ron Eglash examines racial signifiers in electrical engineering; media activist Naeem Mohaiemen explores the influence of Islam on hip hop; rapper Chuck D contributes "Three Pieces"; musician Brian Eno explores the sound and history of bells; Hans Ulrich Obrist and Philippe Parreno interview composer-conductor Pierre Boulez; and much more. "Press 'play,'" Miller writes, "and this anthology says 'here goes.'"

Saturday Jul 05, 2008

It seems like there's always something to do ... I ended up spending a good part of my Independence Day working on a couple of journal papers (mostly on one, the co-authors are doing the good work on the other). I'm also preparing a talk for next weekend's AAAI Workshop on Intelligent Techniques for Web Personalization and Recommender Systems. With the two journal articles due this week along with a talk, I'll be pretty busy ... but I really want to take a break for a day and take #4 daughter on a hike in the White Mountains - here's one of my favorite spots:

Monday Jun 30, 2008

Instead of relying on machines to recommend music, John Scalzi does something radical ... he asks his readers to recommend some music to 're-hipify' him. He's received lots of interesting recommendations - funny thing, no one asked John what kind of music John likes ... so apparently human recommenders are not stymied by the 'new-user' cold-start problem. They'll just recommend things that they like. John does hint that he doesn't like Emerson, Lake and Palmer - what's with that? Well ... no recommendations from me!

This blog copyright 2010 by plamere