Duke Listens!

Wednesday Jun 04, 2008

MIR Technology goes mainstream

Here's a nifty video that shows how some technology developed at BMAT is being used by Operación Triunfo (a Spanish reality/talent show similar to American Idol) to improve the singing of the contestants.

BMAT at Operacion Triunfo 2008 from bmat on Vimeo.

Posted on: Jun 04, 2008

Posted by: plamere

Category: music


Tuesday Jun 03, 2008

Open Research - Background Reading

The story so far - this is an experiment in 'open research' - I'm going to blog my research on a particular topic. In this post, I'm outlining some of the background reading for this paper. Suggestions are welcome.

Updates

6/10/2008 - Added references suggested by Elias

Music Seeking Behaviors

Note that many User Oriented papers from the ISMIR community focus on specific tasks such as finding a known song or organizing a personal music collection. I'm looking for papers that focus on user aspects of music exploration and discovery, so any suggestions will be appreciated. Current resources are:

Net, Blogs and Rock 'n' Roll - a book that profiles the 4 listener types (savants, engaged, casual and indifferent)
Fabbri, F. (1999). Browsing Music Spaces: Categories And The Musical Mind. Paper presented at the 3rd Triennial British Musicological Societies’ Conference.
Kim, J. & Belkin, N. J. (2002). Categories of music description and search terms and phrases used by non-music experts. Proceedings of the Third International Symposium on Music Information Retrieval, 209-214.
Laplante, A. & Downie, J. S. (2006). Everyday life music information-seeking behaviour of young adults. Proceedings of the 7th International Conference on Music Information Retrieval: ISMIR 2006.
Lee, J. & Downie, J. S. (2004). Survey Of Music Information Needs, Uses, And Seeking Behaviours: Preliminary Findings. Proceedings of the 5th International Conference on Music Information Retrieval: ISMIR 2004, 441-446.
Vignoli, F. (2004). Digital music interaction concepts: A user study. Proceedings of the 5th International Conference on Music Information Retrieval.
Sally Jo Cunningham, David Bainbridge, Annette Falconer: 'More of an Art than a Science': Supporting the Creation of Playlists and Mixes. ISMIR 2006: 240-245
Sally Jo Cunningham, David Bainbridge: "Finding new music: a diary study of everyday encounters with novel songs".
Hypemachine Survey:

Thanks to Jin Ha Lee for her excellent list of resources on Users Aspects in MIR

Genre

Aucouturier, J.J. & Pachet, F. (2003) Representing Musical Genre: A State of the Art. Journal of New Music Research.
McKay, C. & Fujinaga, I. (2006) Musical genre classification: Is it worth pursuing and how can it be improved? Proceedings of the 7th International Conference on Music Information Retrieval.
Jeremy Reed and Chin-Hui Lee: "A Study on Attribute-Based Taxonomy for Music Information Retrieval"

Mood

Stephen Downie: Exploring mood metadata: relationships with genre, artists and usage metadata.

Text Techniques

Coming soon (Stephen can point the way)

Tagging Papers

Coming soon

Evaluation

Coming soon

Visualization

Coming soon (Brooke can point the way)

Posted on: Jun 03, 2008

Posted by: plamere

Category: tagging-research


Monday Jun 02, 2008

Open Research - Research Goals

The story so far - this is an experiment in 'open research' - I'm going to blog my research on a particular topic. In this post, I'm outlining the high level goals for the project.

Executive Summary

Find better ways to exploit the information that is contained in social tags, especially social tags that are applied to music, in order to provide tools to allow an individual to explore a complex space.

Details

Most music recommendations are of the form 'here's a set of artists or tracks that you might like' or 'if you like "weezer" you might like "cake"'. They don't offer any reasoning behind the recommendation. These types of recommendations may be appropriate if you are shopping for music at AmazonMp3 or iTunes, but they are a horrible way to go about exploring for new music.

Music is a very rich space. There are hundreds or thousands of overlapping genres and subgenres. The meaning of a genre changes over time ("Pop" of 2000 is very different from "Pop" of 1970). Some artists define their own genre, other artists span many genres. There's no single "correct" taxonomy of genres - experts don't even agree on the most basic questions about genre, such as where 'pop' ends and 'rock' begins. Genre is complex - and yet it is only one axis of this complex music space. There is also mood, era, lyrics, artist influence, popularity and on and on. Ultimately, I'd like to build a discovery tool that will allow a user to easily explore a rich space such as music. Unlike a traditional recommender that is tailored for a music shopper, this discovery tool would be tailored to someone who is exploring for the enjoyment of discovery. Ishkur's Guide to Electronic Music is an excellent example of the type of interface I am interested in building.

This one interface captures a very rich view of the world of electronic music. There is decade information, high level genre, subgenres, genre dependencies, with examples from multiple artists for each genre along with reviews of the artist or genre. I'd like to be able to build an interface similar to Ishkur's automatically using data that is available from sites like Musicbrainz, Last.fm and All Music. A first step toward this goal is to be able to find information about how various genres and sub-genres overlap. Professionally curated genre hierarchies such as those you'll find at All Music, Mp3.com, Amazon or iTunes tend to be small and flat and don't show any kind of overlap. The genre hierarchy embedded in Wikipedia is richer, but still doesn't show how the various genres overlap. For this first step I want to build a rich representation of the genre space automatically from data mined from the web. This rich representation includes:

Identifying the set of genres and subgenres that people actually use (i.e. is 'Brutal Death Metal' really a genre?)
Distinguishing genres from other types of music descriptions such as mood and locale
Identifying overlaps in genre (How much, if at all, does 'emo' overlap with 'punk'? Is 'alternative' just a new name for 'rock'?)
Creating a hierarchy of genre. Hierarchies are a natural way to explore a space - a hierarchy allows an explorer to start with the general and move to the specific. The hierarchy may be a mono- or a poly-hierarchy.
Identifying synonyms (Are 'hip hop', 'hip-hop' and 'rap' identical?)
Disambiguating terms (Does 'progressive' mean the same when it is applied to 'progressive rock', 'progressive jazz', and 'progressive metal'?)
Identifying changes in genre meaning over time (Is 1960s 'Ska' the same as 1980s 'Ska'?)
Finding good exemplar artists/tracks for each genre (Are tracks from popular artists better exemplars?)
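One simple way to put a number on genre overlap is the Jaccard index over the sets of artists that carry each tag. The sketch below is only an illustration of that idea, not a method I've settled on: the class and method names are made up, and the artist lists are toy placeholders rather than real Last.fm data.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class TagOverlap {
    // Jaccard index: size of the intersection divided by size of the union.
    // Ranges from 0.0 (no shared artists) to 1.0 (identical artist sets).
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0; // nothing to compare
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // Hypothetical toy data: tag -> artists that carry the tag.
        Map<String, Set<String>> artistsByTag = Map.of(
            "emo", Set.of("artist1", "artist2", "artist3"),
            "punk", Set.of("artist2", "artist3", "artist4", "artist5"));
        double overlap = jaccard(artistsByTag.get("emo"), artistsByTag.get("punk"));
        System.out.printf("emo/punk overlap: %.2f%n", overlap);
    }
}
```

A score near 1.0 would suggest two tags are close to synonyms, while a middling score suggests genuinely overlapping but distinct genres.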

I want to build this rich representation of genre from data mined from the web, without any human intervention, so that as the genre space evolves over time, this genre map will automatically evolve as well. I've done some early experiments generating genre mono-hierarchies from Last.fm artist tags with promising results. (See this post). However, there is much work to be done to turn that experiment into a real system that can take a big bucket of social tags, separate the wheat from the chaff, and turn them into a representation that can guide visualization and exploration of the music space.
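To make the idea of a tag-derived mono-hierarchy concrete, here is a minimal sketch of one possible approach. It is not the method from the earlier experiment; it is a simple subsumption rule that I'm assuming for illustration: tag A becomes the parent of tag B when A has more artists than B and at least some threshold fraction of B's artists also carry A, with the smallest such A preferred. The names, the threshold and the data are all hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class TagHierarchy {
    // Fraction of the child's artists that also carry the candidate parent tag.
    static double coverage(Set<String> parentArtists, Set<String> childArtists) {
        if (childArtists.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(childArtists);
        shared.retainAll(parentArtists);
        return (double) shared.size() / childArtists.size();
    }

    // For each tag, pick the smallest broader tag that covers it well enough.
    // Tags with no qualifying parent are roots and do not appear in the result.
    static Map<String, String> parents(Map<String, Set<String>> artistsByTag, double threshold) {
        Map<String, String> parentOf = new TreeMap<>();
        for (Map.Entry<String, Set<String>> child : artistsByTag.entrySet()) {
            String best = null;
            int bestSize = Integer.MAX_VALUE;
            for (Map.Entry<String, Set<String>> candidate : artistsByTag.entrySet()) {
                int size = candidate.getValue().size();
                boolean broader = size > child.getValue().size();
                if (broader && size < bestSize
                        && coverage(candidate.getValue(), child.getValue()) >= threshold) {
                    best = candidate.getKey();
                    bestSize = size;
                }
            }
            if (best != null) {
                parentOf.put(child.getKey(), best);
            }
        }
        return parentOf;
    }

    public static void main(String[] args) {
        // Hypothetical toy data: tag -> artists that carry the tag.
        Map<String, Set<String>> artistsByTag = Map.of(
            "rock", Set.of("m1", "m2", "m3", "m4", "r1", "r2"),
            "metal", Set.of("m1", "m2", "m3", "m4"),
            "death metal", Set.of("m1", "m2"));
        System.out.println(parents(artistsByTag, 0.8));
    }
}
```

Because each tag gets at most one parent, this yields a mono-hierarchy; keeping every qualifying parent instead would give a poly-hierarchy.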

Next steps - we need to do some background reading, in particular we need to explore a number of areas:

Music seeking behaviors - to understand better how having a rich representation of genre might be used to enhance music discovery (transparency, play)
Genre Hierarchy - explore the issues involved in trying to represent genre in hierarchies or overlapping graphs
Text techniques - we will need lots of text mojo here (disambiguation, clustering, latent semantic analysis, etc.)
Evaluation - we need some way of evaluating our results
Visualization - how do we visualize a complex, overlapping, polyhierarchy?

I am interested in all kinds of feedback. A big goal for me with this 'open research' is to get early feedback. Likewise, suggestions for background reading on the aforementioned topics (or any others that seem relevant) will be helpful. In a future post, I'll put together the reading list for anyone that wants to follow along.

Posted on: Jun 02, 2008

Posted by: plamere

Category: tagging-research


Saturday May 31, 2008

Open Research

It is paper review time. I'm on the program committee for a few conferences so I've been reading and reviewing a number of papers. Similarly, I have a couple of submissions that are out for review. The paper review process is really quite interesting. I send the fruits of my hard labor off to some anonymous judges, who spend some unknown amount of time reviewing, commenting on and evaluating it, and ultimately recommending that my work be accepted or rejected. This peer review process is a critical part of the research process - reviewers can find flaws in thinking or in methodology. They can point out overlapping previous work. They can suggest areas for improvement or even ways to extend the research to new areas. The downside to the peer review process is that all of this critical input comes at the end of the process, when you are finally ready to publish. It's never a good day when a reviewer points out a serious flaw in your paper ("paper should cite work by smith that performed identical experiments with better results in 1968").

One way to mitigate this problem is to get early feedback from colleagues and peers. These colleagues can act as earlier reviewers who will (hopefully) point out flaws in the research early on. Sometimes though, it is hard to find colleagues that are motivated and versed enough in the subject matter.

Recognizing that early review of research is good, I'm going to try an experiment. I'm going to try to blog my research on a particular topic with the hope that researchers who are interested in the same topic will be able to offer continuous feedback or even contribute directly to the research. My blog will become my research notebook for this topic. I've always had great feedback/comments for my research-oriented blog posts - so I'm hopeful that I'll get a continuous review that will keep the research on the proper track. If a paper results from this line of research, constructive critics will be acknowledged, and significant contributors to the line of research will be considered to be co-authors.

To avoid any entanglements about intellectual property and ownership, I'll stick to an area of research that Sun is probably not going to want to patent. Obviously, you shouldn't contribute any ideas that you or your organization may want to protect.

In the next post, I'll outline the goal for the research project.

Inspiration for the research experiment comes from Elias Pampalk, Daniel Lemire and conversations with Steve Heller.

Posted on: May 31, 2008

Posted by: plamere

Category: tagging-research


Friday May 30, 2008

Autoflexing on the Caroline Web Server

There's a nifty new blog, Head in the Cloud, that's all about Project Caroline. The latest post, Carrying the Load: Enabling the CWS Auto-flexing feature, is particularly interesting.

Posted on: May 30, 2008

Posted by: plamere

Category: General


Thursday May 29, 2008

A web parasite

Here's a business model:

Web mine all of the data you can from Last.fm
Serve their content on your web page surrounded by Google ads
Profit

That's what MusicSRC is doing. They've wrapped a subset of Last.fm's functionality and data with some Google ads, but they've done nothing new or interesting with the data. Not only are they leeching off of Last.fm's tag data, they are also leeching Last.fm's bandwidth. The hundred or so images that are displayed on every MusicSRC artist page are pulled directly from Last.fm's servers. MusicSRC doesn't even bother acknowledging Last.fm.

Maybe MusicSRC has an arrangement with Last.fm that lets them ignore the Last.fm terms of service, but if that's the case they should be doing something a little bit more compelling with the data than just wrapping it in Google ads. Last.fm has always been extremely generous with their data, and there's a large research community that is depending upon this data. It would be a sad day if Last.fm decided to shut down its web services because of leeches.

Posted on: May 29, 2008

Posted by: plamere

Category: music


Wednesday May 28, 2008

thrash, death, grind, speed and hair

If you ever have trouble trying to parse the difference between all of the many different varieties of heavy metal, be sure to check out this week's Pandora Presents. In this 12 minute podcast, you get a hardcore tour of the various shades of metal.

Here's a bit of a genre hierarchy that we've built from analyzing Last.fm tags.

Cool metal photo by Mithrandir3

Posted on: May 28, 2008

Posted by: plamere

Category: music


Tuesday May 27, 2008

If you like Harry Potter, you might like Harry Potter

Borders has opened their own online bookstore. Previously, they had relied on Amazon to serve as their online presence. Borders apparently decided that letting another bookstore act as their storefront was not in their best interest. One thing that Borders didn't take with them when they left Amazon was all of their user data - at least it doesn't look like they did. They have no collaborative filtering recommendations on their site at all.

The only recommendations they offer, as far as I can tell, are recommendations for books that have similar metadata. Instead of the Amazonian 'people who bought X also bought Y', Borders offers "you may also like" recommendations that consist of books that have the same author or are in the same genre/category. So if you like Harry Potter, Borders suggests five other Harry Potter books. If you like the Da Vinci Code, Borders suggests five other books by DVC author Dan Brown. For a book like Six Degrees: The science of a connected age by Duncan Watts, Borders offers no recommendations whatsoever. Malcolm Gladwell's Blink is classified as a 'self improvement' book - Borders freakomends A New Earth: Awakening to Your Life's Purpose. (It is a bit telling how sparse the Borders user data is when Borders tells me that I can be the first to rate 'Blink').
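A metadata-only recommender of this kind can be approximated in a few lines, which also shows why it keeps suggesting more of the same. This is my guess at the general approach, not Borders' actual system; the `Book` class, the `youMayAlsoLike` method and the category labels are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class MetadataRecommender {
    // Minimal, hypothetical catalog entry.
    static final class Book {
        final String title;
        final String author;
        final String category;

        Book(String title, String author, String category) {
            this.title = title;
            this.author = author;
            this.category = category;
        }
    }

    // Suggest up to 'limit' other titles that share an author or a category with the seed.
    static List<String> youMayAlsoLike(Book seed, List<Book> catalog, int limit) {
        List<String> titles = new ArrayList<>();
        for (Book b : catalog) {
            boolean sameBook = b.title.equals(seed.title);
            boolean sharesMetadata = b.author.equals(seed.author)
                    || b.category.equals(seed.category);
            if (!sameBook && sharesMetadata && titles.size() < limit) {
                titles.add(b.title);
            }
        }
        return titles;
    }

    public static void main(String[] args) {
        // Category labels here are assumptions, not Borders' real categories.
        Book potter1 = new Book("Harry Potter and the Sorcerer's Stone", "J. K. Rowling", "fantasy");
        Book potter2 = new Book("Harry Potter and the Chamber of Secrets", "J. K. Rowling", "fantasy");
        Book sixDegrees = new Book("Six Degrees", "Duncan Watts", "network science");
        List<Book> catalog = List.of(potter1, potter2, sixDegrees);

        System.out.println(youMayAlsoLike(potter1, catalog, 5));
        System.out.println(youMayAlsoLike(sixDegrees, catalog, 5));
    }
}
```

With no behavioral data in the loop, a book whose author and category are unique in the catalog gets no suggestions at all, which matches what I saw for Six Degrees.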

In the Borders music store, things seem to be a little better. They are showing relevant similar artists. It may be that they are getting this data from All Music since they are already using AMG for artist bios.

I find the lack of relevant, novel recommendations at the Borders bookstore to be quite puzzling. In its brick and mortar stores, Borders has, no doubt, collected terabytes of data about who likes what books. And yet, Borders doesn't seem to be using any of this data in its online store to help connect people with books. Amazon has reported that 35% of its sales are a direct result of their recommendations - so it seems crazy that Borders is not taking advantage of their data to recommend relevant books. With a good recommender, Borders could be selling a whole lot more books.

Posted on: May 27, 2008

Posted by: plamere

Category: freakomendations


Thursday May 22, 2008

Hacking Recommenders

Here's a rather unusual recommendation. If you are interested in this book by Pat Robertson, the TV Evangelist and notorious homophobe, you may be interested in this gay sex manual. This recommendation is not only unusual but it created a bit of a stir in the press and was an embarrassment for Amazon.

This freakomendation is a result of a small group of folks purposely manipulating the Amazon recommender. These folks merely had to visit Pat Robertson's book and then visit another book. By coordinating their actions and visiting the same book (the gay sex manual), they were able to manipulate Amazon into making the recommendation that would make Pat Robertson blush.
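To see why this kind of attack is cheap, consider a toy 'people who viewed X also viewed Y' recommender that simply counts co-visits within a session. This is an assumption made for illustration (how Amazon's recommender really works is not something I know), and the item names are placeholders.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CoVisitRecommender {
    // coVisits.get(x).get(y) = number of sessions in which both x and y were viewed.
    private final Map<String, Map<String, Integer>> coVisits = new HashMap<>();

    // Record one browsing session; items in a session are assumed to be distinct.
    void recordSession(List<String> viewedItems) {
        for (String x : viewedItems) {
            for (String y : viewedItems) {
                if (!x.equals(y)) {
                    coVisits.computeIfAbsent(x, k -> new HashMap<>()).merge(y, 1, Integer::sum);
                }
            }
        }
    }

    // Returns the item most often co-visited with the given item, or null if there is none.
    String topRecommendation(String item) {
        Map<String, Integer> counts = coVisits.getOrDefault(item, Map.of());
        return counts.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(null);
    }

    public static void main(String[] args) {
        CoVisitRecommender recommender = new CoVisitRecommender();
        // A few ordinary sessions pair the target book with a related book.
        for (int i = 0; i < 3; i++) {
            recommender.recordSession(List.of("targetBook", "relatedBook"));
        }
        System.out.println("before: " + recommender.topRecommendation("targetBook"));
        // A small coordinated group pairs the target book with a planted book.
        for (int i = 0; i < 5; i++) {
            recommender.recordSession(List.of("targetBook", "plantedBook"));
        }
        System.out.println("after: " + recommender.topRecommendation("targetBook"));
    }
}
```

For an item with little organic traffic, a handful of coordinated sessions is enough to outvote the real signal.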

Credit to Bamshad Mobasher for this freakomendation and the screenshot.

Posted on: May 22, 2008

Posted by: plamere

Category: freakomendations


Wednesday May 21, 2008

Online music recommendation - a dramatic reenactment.

Our buddies over at matchmine created this video that gives you an idea of what online music recommendation is really like.

How Online Music Recommendations Work Right Now from Nathan Burke on Vimeo via the Matchmine blog

Posted on: May 21, 2008

Posted by: plamere

Category: freakomendations


The 800 year download

Well, tonight is the season finale of American Idol, where we get to find out which of the two remaining Davids is our American Idol. It will take Fox two hours to reveal that single bit of information. That is a pretty low data transmission rate - 0.5 bits per hour. If we downloaded an MP3 of a 3 minute song at the rate of 0.5 bits per hour we'd be waiting about 800 years for the download to finish.
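The arithmetic is easy to play with, and the answer is sensitive to how big you assume the file is. The sketch below is a back-of-the-envelope calculation; the 128 kbit/s encoding rate and the 365.25-day year are assumptions of mine, and the class and method names are made up.

```java
class DownloadTime {
    static final double HOURS_PER_YEAR = 24 * 365.25;

    // Years needed to transfer the given number of bits at the given rate.
    static double yearsToDownload(double sizeInBits, double bitsPerHour) {
        return sizeInBits / bitsPerHour / HOURS_PER_YEAR;
    }

    public static void main(String[] args) {
        double bitsPerHour = 1.0 / 2.0;    // one bit revealed over a two-hour show
        double songBits = 180 * 128_000.0; // assumed: 3 minutes at 128 kbit/s
        System.out.printf("128 kbit/s song: %.0f years%n",
                yearsToDownload(songBits, bitsPerHour));
        System.out.printf("3.5 million bits: %.0f years%n",
                yearsToDownload(3_500_000, bitsPerHour));
    }
}
```

A payload of about 3.5 million bits works out to roughly 800 years. A 3 minute song at the assumed 128 kbit/s is about 23 million bits, which at this rate would take several thousand years, so Fox is even slower than it looks.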

Posted on: May 21, 2008

Posted by: plamere

Category: General


Tuesday May 20, 2008

That's what she said.

I stumbled upon this fascinating Flickr photo set of the World's Largest Organ. The Guinness Book of World Records lists this organ as "largest pipe organ", "largest musical instrument ever constructed", and "loudest musical instrument ever constructed". The organ has more than 32,000 pipes, more than 1,200 stops, and 7 manuals (2 with 7 octaves). The low C pipe stands 64′9″ tall, and weighs 3,350 pounds. It produces a frequency of 8 Hz (the sound of the vibrating pallet is described as "a helicopter hovering over the building").

The pictures are fascinating. Check out the whole set.

This video gives a little of the story about the organ:

Posted on: May 20, 2008

Posted by: plamere

Category: music


Findbugs to the rescue

I'm continuously amazed at the level of detail that FindBugs will go to in weeding bugs out of a codebase. Here's one that I came across today in our codebase:

        
     String[] segments = path.split(File.separator);

File.separator used for regular expression: The code here uses File.separator where a regular expression is required. This will fail on Windows platforms, where the File.separator is a backslash, which is interpreted in a regular expression as an escape character.

I don't live in the Windows world, so it would have been months or years before I would have encountered this problem in the wild. Thanks FindBugs team!
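One common fix (a sketch, not necessarily what ended up in our codebase) is to quote the separator so that the regex engine treats it as a literal. `String.split` and `Pattern.quote` are standard JDK methods; the `splitPath` helper and its `separator` parameter are hypothetical, added so that the Windows case can be exercised on any platform.

```java
import java.io.File;
import java.util.Arrays;
import java.util.regex.Pattern;

class PathSplitter {
    // Split a path on a literal separator. Pattern.quote escapes any regex
    // metacharacters, so a backslash separator is matched as a plain backslash.
    static String[] splitPath(String path, String separator) {
        return path.split(Pattern.quote(separator));
    }

    public static void main(String[] args) {
        // Uses whatever separator the current platform defines.
        String path = "usr" + File.separator + "local" + File.separator + "lib";
        System.out.println(Arrays.toString(splitPath(path, File.separator)));

        // Simulates the Windows separator regardless of the current platform.
        System.out.println(Arrays.toString(splitPath("C:\\temp\\music", "\\")));
    }
}
```

Passing a bare backslash to `split` instead throws a `PatternSyntaxException`, because a lone backslash is not a complete regular expression.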

Posted on: May 20, 2008

Posted by: plamere

Category: Java


Monday May 19, 2008

iPhone heaven

I've had my iPhone for about 5 months now. I really like having the web in my pocket all of the time. But when I take my nightly walk, I often leave my iPhone behind and bring along my dusty old 3rd generation iPod. My iPhone, with its smaller memory, never seems to have the right songs on it. I've tried to make smart playlists that freshen the iPhone with new tracks while keeping many of the old favorites - but it always seems that when I'm a couple of miles from home I get a hankering for a song that just isn't on my iPhone. And so I bring along the big old iPod. With its 9,000 tracks, it almost always has what I'm looking for.

I make my living thinking about how to use technology for organizing and discovering music. If I can't keep my iPhone fresh with the music that I want to listen to, then it is probably hard for most other people too. It is just too hard to manage a device that can't hold all my music. So I'm hoping that the next generation iPhone will have much more memory. I'd really like to see the iPod classic combined with the iPhone - 160 GB of music would be iPhone heaven.

Posted on: May 19, 2008

Posted by: plamere

Category: General


Freakomendations - Netflix

Because you enjoyed F**k, you will need to see In The Womb (or it may just sit on top of your DVD player for 9 months).

I'm really not making these up.

Posted on: May 19, 2008

Posted by: plamere

Category: freakomendations


Duke Listens!: Visit my main blog at MusicMachinery.com
