Thursday May 05, 2005

One of the big problems faced by Music Information Retrieval researchers is how to get good data for MIR experiments. Even a small-scale music classification experiment may require 10,000 songs. At $1 per song, that's $10,000 just for the data for a small experiment. Experiments with larger collections (100,000 to 1,000,000 songs) become prohibitively expensive.

The situation is about to get even worse. Soon, we may lose all programmatic access to our music. As more digital music is sold (iTunes) or rented (Napster), more of our music is wrapped up in a DRM container. The only thing we can do with such music is play it with an authorized player. Doing anything else with the bits is forbidden. Even trying to get at the bits is forbidden, thanks to the DMCA.

There are lots of things music consumers could do with the bits: music similarity classification, beat and tempo detection, cross fading from one song to another. There are many more things that MIR researchers can do with the bits. All will be lost if all of the bits are taken away from us. Our music will become 'listen-only'.
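Cross fading is a good illustration of why access to the bits matters: it needs the decoded samples of both songs. Here is a minimal sketch, assuming both songs have already been decoded to mono floating-point PCM at the same sample rate. The class and method names are made up for illustration, and the linear ramp is just the simplest choice; a real player might prefer an equal-power curve.

```java
import java.util.Arrays;

/** Hypothetical helper: linear crossfade between two mono PCM buffers. */
final class CrossfadeSketch {

    /**
     * Returns song a followed by song b, with the last 'overlap' samples of a
     * mixed into the first 'overlap' samples of b.
     */
    static float[] crossfade(float[] a, float[] b, int overlap) {
        if (overlap < 0 || overlap > a.length || overlap > b.length) {
            throw new IllegalArgumentException("overlap must fit inside both buffers");
        }
        float[] out = new float[a.length + b.length - overlap];
        int fadeStart = a.length - overlap;

        // Head of the outgoing song is copied unchanged.
        System.arraycopy(a, 0, out, 0, fadeStart);

        // Overlap region: a ramps down while b ramps up.
        for (int i = 0; i < overlap; i++) {
            float gainIn = (i + 1) / (float) (overlap + 1);
            out[fadeStart + i] = a[fadeStart + i] * (1f - gainIn) + b[i] * gainIn;
        }

        // Tail of the incoming song is copied unchanged.
        System.arraycopy(b, overlap, out, a.length, b.length - overlap);
        return out;
    }

    public static void main(String[] args) {
        float[] a = {1f, 1f, 1f, 1f};   // stand-in for the end of the first song
        float[] b = {0f, 0f, 0f, 0f};   // stand-in for the start of the second song
        System.out.println(Arrays.toString(crossfade(a, b, 3)));
    }
}
```

None of this is possible when the only operation a DRM container allows is "play".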

The fight for the bits is not over yet. Some are pushing for a sane DRM policy that protects IP but also promotes innovation and ensures access. Sun's Susan Landau is on the front lines of the DRM policy battle. Susan gave a talk at the recent Sun Labs Open House called "Rocky Shoals and Bright Lights: DRM Directions", which describes a "DRM policy direction that is a win/win/win for consumers, for technology developers, for content producers -- and for the Internet and society".

Wednesday May 04, 2005

I was starting to look around for a client-side blog editor. I was getting tired of typing HTML tags all of the time. It seemed that all of the really cool client-side blog editors required a Mac. Well, that wouldn't do for me. But today I found out that BlogEd has received a major facelift. I'm using it right now to create this entry. I can do all the formatting tricks such as:
  1. Create a numbered list
  2. Put things in bold
  3. Put things in italics and color
  4. I can easily put in things like the Java™ platform with superscripts

bblfish is responsible for getting it to work with Roller. Thanks much!


For the Search Inside the Music demo, I wanted to be able to visualize the music space by plotting it. I looked at a number of charting packages for Java and settled on JFreeChart. JFreeChart is a free (LGPL) chart library that supports a large number of chart types, including pies, bars, lines and scatter plots. The JFreeChart API is quite flexible. I was able to easily wrap my code around a JFreeChart to make the charts interactive. Here's an example of a visualization of the music space that shows a musical journey from one style of music to another. All using JFreeChart:

Tuesday May 03, 2005

Rhapsody joined Napster this week in offering a "to go" service that will allow subscribers to move their songs onto Janus-compatible devices. For $15 per month, Rhapsody customers can load up their portable players with as many songs as they want (and will fit) in a month. As long as they pay their bill they have access to the entire Rhapsody catalog on their device ... but when they stop paying the monthly fee, all the songs go away. Currently there are only three Janus-compatible devices listed on the Rhapsody site: the iRiver H10, the Creative Zen Micro and the Dell DJ.

Monday May 02, 2005

I'm a regular reader of The Register, so I was quite pleased to see its extensive coverage of the Sun Labs Open House. Correspondent Andrew Orlowski took a tour of the labs and offers highlights in the article Inside Sun Labs. Mr. Orlowski offers this observation about the Search Inside the Music project:

    "The psychological insight necessary to produce a divine playlist doesn't come from an algorithm, or even an encyclopedic knowledge of music, but from experience and understanding."

    I must say that I agree with Mr. Orlowski wholeheartedly. A divine playlist (or even a not-that-bad playlist) can have the subtlety and nuance of a poem. Musical themes, lyrical themes, mood, instrumentation, rhythm and tempo can interact as one song leads into another. A good playlist is its own art form. A visit to The Art of the Mix demonstrates that the art form is alive and well.

    But just because a computer can't generate a divine playlist doesn't mean that it can't greatly improve upon "iTunes' lamentable Party Shuffle". The problem that I, and many others in the MIR community, are trying to solve is how to deal with a million-song iPod. Shuffle play won't scale, and there aren't any DJs waiting to program my tastes into the proper playlist poetry. I'd much rather have a million-song iPod that could generate a playlist that at least contains songs that I like, even if they aren't in the divine order.

    Sunday May 01, 2005

    One of the interesting things about giving a technology demo such as "search inside the music" is that you are exposing a whole lot of folks to technology that they haven't seen (or heard) before. Folks sometimes assume that what they are seeing is all new technology.

    While there is lots of new and interesting technology in the 'search inside the music' demo, there are also some technologies that have been explored previously by a number of researchers in the Music Information Retrieval (MIR) community. Some of the most interesting and directly related work is the work by Logan, Ellis and Berenzweig summarized on Dan Ellis's Music Similarity page, and the work by George Tzanetakis on Manipulation, Analysis and Retrieval Systems for Audio Signals.

    The International Symposium on Music Information Retrieval (ISMIR) has a good collection of papers (although their site seems to be down as I write this) about music classification based upon similarity. Likewise, the site provided by Dr. Stephen Downie is an excellent starting point for those who are interested in learning more about music information retrieval.

    One of the downsides of the Sun Labs Open House is that I'm so busy giving demos and talks that I can't attend any of the other talks. One talk in particular that I was hoping to see was Guy Steele's talk on Fortress. Fortress promises to do for Fortran what Java did for C. In particular:

    • A growable, open language
    • Components: management of large projects
    • Distributed data and control models
    • Type system organized as objects and "traits"
    • Advances in syntax

    Fortunately, all of the public talks are now online. The audio and the slides for Guy's talk are available at the Sun Labs open house documentation kiosk. Worth checking out.

    Friday Apr 29, 2005

    I've spent the last two days at the Sun Labs Open House giving demonstrations of our current project: Search Inside the Music. Lots of people came by to see what we were doing. It was non-stop demoing for two days and I've never talked so much in my life. It was great fun to see so many people interested in this technology. There's a little write-up on the project at Vnunet: Sun unveils all-knowing music library

    Wednesday Apr 20, 2005

    When I'm working closely with a Java API, I will usually find myself, sooner or later, digging into the API source code (if it is available) to see how a feature really works. This usually means finding the separate source download (or even configuring CVS to pull the code from some repository someplace). A pain in the neck to say the least, but the code provides the definitive description of what is going on, so sometimes it's worth it. However, all the pain of finding, downloading, unpacking and browsing the source can be eliminated.

    Since version 1.4, Javadoc has included a '-linksource' option that will automatically create an HTML version of each source file and link it into the normal documentation. With the -linksource option, the class and method names in the docs will contain links to the source code. When browsing docs built with the -linksource option, I don't have to break away from the docs to find the corresponding source file; it is all right there in the docs. For an example of how this works, check out the JFreeChart API docs.
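As a rough sketch, this is the kind of javadoc invocation that produces source-linked docs, wrapped in Java so it could be called from a build helper. The -linksource, -sourcepath and -d options are standard javadoc options; the class name, the paths and the package name are placeholders I made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical build helper that assembles a javadoc command line. */
final class LinkSourceDocs {

    /** Builds the command; sourcePath, outputDir and packageName are caller-supplied placeholders. */
    static List<String> javadocCommand(String sourcePath, String outputDir, String packageName) {
        List<String> cmd = new ArrayList<>();
        cmd.add("javadoc");
        cmd.add("-linksource");   // generate HTML copies of the sources and link to them
        cmd.add("-sourcepath");   // where the .java files live
        cmd.add(sourcePath);
        cmd.add("-d");            // where the generated docs go
        cmd.add(outputDir);
        cmd.add(packageName);     // package to document
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = javadocCommand("src", "docs/api", "com.example.music");
        System.out.println(String.join(" ", cmd));
        // To actually run it (assumes javadoc is on the PATH and the sources exist):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Running main only prints the command, so it is safe to try without any sources on hand.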

    Tuesday Apr 19, 2005

    For the past 3 years or so I've been a member of the JSR 113 Expert Group defining the next generation of the Java Speech API. The specification is now in public review. If you have an interest in speech on the Java platform, take a look at the specification and send your comments to [email protected]

    Major Changes from JSAPI 1

    • Compatibility with JSAPI 1.0 maintained where possible and desirable.
    • Primarily aimed at J2ME, runs on J2SE as well.
    • Designed for CLDC 1.0 and MIDP 1.0 and above.
    • All use of floating point has been removed.
    • No dependence on AWT.
    • Dictation has been removed, but can be added in the future.
    • Supports W3C SRGS recognition markup rather than JSGF.
    • Supports W3C SSML synthesis markup rather than JSML.
    • Event masks added to filter events.
    • Includes a full-fledged AudioManager compatible with JSR-135.
    • AudioSegment class introduced to contain audio data.
    • The Word class now supports specification of audio for use with recognition and synthesis.
    • Clearer delineation between engine implementations and instances.
    • Support for priorities.
    • Support for trusted vs. untrusted applications.
    • More complete support for "built-in" grammars.
    • Recognition result confidence added.
    • Support for Synthesizer focus

    The public review period ends on April 23.

    Thursday Apr 14, 2005

    I'm currently reading the excellent book Freedom of Expression: Overzealous Copyright Bozos and Other Enemies of Creativity by Kembrew McLeod. The overarching theme is how current copyright law has been taken over by corporate types and, instead of promoting the creation of new work, stifles it. Here's an excerpt about what happened to the Verve when they sampled 5 notes from a Rolling Stones song:

    The Verve, a popular British band that scored a major worldwide hit in 1997 with “Bittersweet Symphony.” The Verve negotiated a license to use a five-note sample from an orchestral version of one of the Rolling Stones’ lesser hits, “The Last Time,” and received clearance from Decca Records. After “Bittersweet Symphony” became a hit single, the group was sued by former Stones manager Allen Klein (who owns the copyrights to the band’s pre-1970 songs because of aggressive business practices). He claimed the Verve broke the agreement when they supposedly used a larger portion than was covered in the license, something the group vehemently disputed.

    The Verve layered nearly fifty tracks of instrumentation, including novel string arrangements, to create a distinctly new song. In fact, the song’s signature swirling orchestral melody was recorded and arranged by the Verve; the sample from the instrumental record is largely buried under other tracks in the chorus. The band eventually settled out of court and handed over 100 percent of their songwriting royalties because it seemed cheaper than fighting for a legal ruling that might not end in their favor. As if things couldn’t have gotten worse, they were then sued by another old Rolling Stones manager, Andrew Loog Oldham. Klein went after the Verve for infringing on the songwriting copyright, which he owned, but Oldham possessed the copyright on the sampled sound recording. They totally lost everything.

    Not only couldn’t the Verve earn money from their biggest hit, they were stripped of control of their song. For instance, after the group refused Nike’s request to use “Bittersweet Symphony” in an ad, the shoe manufacturer aired the song after it purchased a license from Allen Klein. “The last thing in the world I wanted was for one of my songs to be used in a commercial,” the despondent lead vocalist Richard Ashcroft said. “I’m still sick about it.” In one final kick in the groin, “Bittersweet Symphony” was nominated for a Grammy in the Best Song category, which honors songwriters. Because the unfavorable settlement transferred the Verve’s copyright and songwriting credit to Klein and the Rolling Stones, the Grammy nomination went to “Mick Jagger and Keith Richards.” Ashcroft quipped that it was “the best song Jagger and Richards have written in twenty years.” He then suffered from a nervous breakdown and the group broke up.

    By the way, Freedom of Expression is licensed under a Creative Commons License and is available online.

    Monday Apr 11, 2005

    Starting in 1989, when my first child was born, I stopped listening to new music. It was replaced by the likes of Raffi and Steve Blunt. I missed about 15 years of music (which gets a bit embarrassing, such as when someone mentions 'Beck' and I start talking about Jeff Beck, not even considering the notion that they might be talking about another 'Beck'). Now I'm trying to make up for this lost decade and a half. I've been trying to expand my musical horizons.

    A little while ago, Simon Phipps pointed me toward Radio Paradise, a streaming internet radio station. They have a great mix of new and old music. RP is like old-time radio in that they have DJs who think about how songs fit together and flow smoothly between genres. Also, listeners can rate songs, which affects the playlists. Here's the playlist for the last hour or so:

    4:09 am - Yo La Tengo - Autumn Sweater
    4:06 am - Blood Oranges - Bridges
    4:02 am - Malcolm Holcombe - Goin' Home
    3:59 am - Tom Waits - Long Way Home
    3:54 am - The BoDeans - Good Things
    3:51 am - Mazzy Star - Halah
    3:46 am - Cranberries - When You're Gone
    3:41 am - Zero 7 - Give It Away
    3:37 am - REM - Try Not To Breathe
    3:30 am - King Crimson - The Power To Believe II
    3:25 am - Mark Knopfler - What It Is
    3:20 am - Calexico - Quattro (World drifts in)
    3:16 am - Santana - El Farol

    It's all good stuff. I'm listening to artists that I've never heard before. If I hear something I really like I can click on the 'now playing' link and find out more about the artist (or buy the CD from Amazon). I'm starting to get an idea of what happened musically in the 90s, and what's going on today too.

    Saturday Apr 09, 2005

    I was ripping all my CDs into OGG format - they sounded good, and the file size was small. However, when I got a portable MP3 player that didn't support OGG format I had to convert them all to MP3s. Instead of re-ripping, I converted the OGGs to MP3s (I know, audio purists shudder at such a thought). Only later, after I had deleted all of the OGGs, did I realize that the tags identifying artist, album and title didn't get added to the MP3s. I had thousands of MP3s with no ID3 tags. Arghh!
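A quick check like the following could flag untagged files before the originals are deleted. It is a minimal sketch that only reports whether a file carries an ID3 tag at all, assuming the usual layouts: an ID3v2 tag starts with the bytes "ID3" at the very beginning of the file, and an ID3v1 tag is a 128-byte block at the very end of the file that starts with "TAG". The class and method names are hypothetical, and nothing here parses the tag contents.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical checker: does this MP3 carry any ID3 tag? */
final class Id3Check {

    /** True if the bytes start with an ID3v2 header or end with an ID3v1 block. */
    static boolean hasId3Tag(byte[] data) {
        return hasMarkerAt(data, 0, "ID3")
            || hasMarkerAt(data, data.length - 128, "TAG");
    }

    /** True if the ASCII marker appears at the given offset (false if it would not fit). */
    private static boolean hasMarkerAt(byte[] data, int offset, String marker) {
        byte[] expected = marker.getBytes(StandardCharsets.US_ASCII);
        if (offset < 0 || offset + expected.length > data.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (data[offset + i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Reads each file fully for simplicity; fine for a sketch, wasteful for a big library.
        for (String name : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(name));
            System.out.println((hasId3Tag(bytes) ? "tagged    " : "UNTAGGED  ") + name);
        }
    }
}
```

Pass the MP3 file names as arguments; any file reported as UNTAGGED needs a tagger.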

    Faced with the prospect of re-ripping hundreds of CDs, I set out in search of an automatic MP3 tagger. After looking at a number of them I settled on Entagged - the Musical Box. I like Entagged for a number of reasons:

    • Written in Java, so it runs on Linux and Solaris

    • Interacts with FreeDB to grab album/artist and genre info

    • It is open source

    If you are looking for an MP3 tagger, I'd recommend Entagged.

    Friday Apr 08, 2005

    One of the difficulties with building a Music Information Retrieval system is acquiring enough music data for training and testing of the system. Music publishers hold the IP of the music very tightly, which makes it difficult for MIR researchers to use and share their music. If I have published some MIR research results using a body of 5,000 songs and you want to repeat the experiment and duplicate the results, you will have to build up the identical 5,000-song collection on your own. This is unlike other disciplines: text retrieval folks can use Text Retrieval Conference (TREC) data, and speech researchers can use the extensive data provided by the Linguistic Data Consortium. There is currently no equivalent resource for music.

    The International Music Information Retrieval Systems Evaluation Laboratory (IMIRSEL) Project is an attempt to rectify this problem by providing a standard set of evaluations and a common set of music data for MIR researchers to use. Dr. Downie and his team are working hard to establish this as the 'TREC' of Music Information Retrieval. They have just released M2K, which will serve as the MIR evaluation framework for this year's ISMIR (the International Conference on Music Information Retrieval). The IMIRSEL looks to be a great resource for MIR researchers.

    Unfortunately, the IMIRSEL doesn't offer any music data collections as of yet, so researchers have to look elsewhere for music data. One excellent source of music data is Magnatune, the open source record label. Magnatune makes available for download over 350 albums by nearly 200 artists, for a total of nearly 5,000 songs spanning all genres. The music is licensed under a Creative Commons license, which means that non-commercial use of their music is free. The 128 kbps MP3 files are consistently tagged with genre information, making them great for a number of MIR tasks such as artist classification, genre classification and music similarity (and the music sounds good too!). Until the IMIRSEL is fully established, Magnatune may be the best place to find good data for MIR research.

    Wednesday Apr 06, 2005

    What is the worst song in the world? ... Researchers at the University of Waikato intend to find out with their Worst Song in the World Survey. They are looking for participants, so head on over and take the survey. If you need some inspiration, be sure to read Dave Barry's Bad Song Survey. Now I've got to try to get that horrid Pina Colada song out of my head.

    This blog copyright 2010 by plamere