Sunday May 18, 2008

Recently, Michael Mandel updated the MajorMinor site (an ESP-style game for collecting social tags for music) so that visitors can browse the tags generated by humans as well as examine the tags generated by their autotagging algorithms. The machine taggers are trained on the human tags to 'automatically find relevant clips based on their sounds'. They have about 50 tags trained, including 80s, instrumental, indie, metal, strings, slow, fast, repetitive, sample and rap.

Some of the machine tags are really good: jazz, drum and bass, and female. Others are not so good: violin, punk and keyboard.

It's great that Michael is making this data available for everyone to see and explore. We have some similar data from our autotagger. Doug has put together a nifty web tool for exploring this data - hopefully in the next couple of months we can do the same as Michael and make it available for all to use and explore. Doug keeps threatening to start a blog, so perhaps he'll write about it sometime in the near future.

Friday May 16, 2008

This is one of my favorite freakomendations. iTunes suggests that if you like "Baby One More Time" by Britney Spears, you might like the "Report on Pre-War Intelligence on Iraq".

This recommendation just doesn't seem to make any sense in any context. iTunes has lots of users so presumably they have lots of data and Britney Spears is a very popular artist - so this can't be a cold start problem. Something went awry somewhere.

Or perhaps there's some reason - there is a Britney Spears / Bob Dole connection - or maybe the lyrics are some sort of subtle commentary on America's attitude toward Iraq. Hit me baby one more time indeed.

Thursday May 15, 2008

Synthèse points to an article describing a study that suggests that music can influence the way wine tastes. For instance, subjects in the study listening to heavy rock music rated a cabernet sauvignon as being 60 per cent more powerful and heavy than those who drank in silence.

Some wine/music recommendations:
  • CABERNET SAUVIGNON: All Along the Watchtower by Jimi Hendrix; Honky-Tonk Woman by The Rolling Stones; Live and Let Die by Paul McCartney; Won't Get Fooled Again by The Who.
  • CHARDONNAY: Atomic by Blondie; Rock DJ by Robbie Williams; What's Love Got to do With It by Tina Turner; Spinning Around by Kylie Minogue.
  • SHIRAZ: Puccini's Nessun Dorma as sung by Luciano Pavarotti; Orinoco Flow by Enya; Chariots of Fire by Vangelis; Canon by Johann Pachelbel.
  • MERLOT: Sitting On The Dock Of The Bay by Otis Redding; Easy by Lionel Richie; Over The Rainbow by Eva Cassidy; Heartbeats by Jose Gonzalez.

A couple of weeks ago, I made up the word 'freakomendations' as a way to describe the strange recommendations that are often made by recommender systems. When I first used the word, I did a Google search and found exactly zero usages of it on the internet. This really was a new word according to Google.

Now, two weeks later, Google tells me that there are about 1,800 instances of 'freakomendations' on the web. Since I've used the word in exactly 10 blog posts, that means that there are a whole lot of other people using the word now. Or does it? I took a detailed look through the Google results and this is what I found.

Of the 1,780 results, Google thinks that only 72 are worth presenting; the rest are duplicates. Of these 72, only one mention is from another actual blog post. All of the rest are aggregators of one type or another that are just republishing my words on another feed. There are feeds that are devoted to serving up any post that mentions 'Emerson, Lake and Palmer', for example.

In about two weeks, each of my 10 freakomendation posts seems to have spawned more than 100 copies in the various aggregators, republishers and splogs. This propagation will likely continue. This suggests that 99% of the RSS feeds out there are just re-broadcasting content. Only 1% of content is original. This makes me want to cry.
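
Here is a rough check of those proportions as a small Python sketch. It assumes, for simplicity, that every one of the 1,780 results traces back to one of the 10 original posts (one of them is actually someone else's post), and the variable names are just for illustration.

```python
# Back-of-the-envelope check of the aggregator numbers quoted above.
total_results = 1780   # Google hits for 'freakomendations'
original_posts = 10    # posts in which the word was actually used

# Assumption: every other hit is a republished copy of one of those posts.
copies = total_results - original_posts
copies_per_post = copies / original_posts
original_share = original_posts / total_results

print(f"copies per post: {copies_per_post:.0f}")
print(f"original share of results: {original_share:.2%}")
```

Under that assumption each post has been copied about 177 times, and original posts make up well under 1% of the results, which is in line with the 99% / 1% split above.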

Wednesday May 14, 2008

If you like Norah Jones ... You might like Ravi Shankar.

At first blush, this looks like a bad recommendation - the Starbucks queen seems quite removed from the Indian Master - it is hard to imagine any kind of connection between these two artists, but the connection is actually quite close. Ravi Shankar is the father of Norah Jones. This little tidbit - the paternity of Norah Jones - turns what seems to be a bad recommendation into a credible recommendation.

Often we get recommendations like this - where they seem to make no sense, but with a little information the recommendations can become good, or at least reasonable.

Recently added a 'connections' tab to the set of artist tabs, so now it is easier to find these types of musical connections.


Musicbrainz has this data too.


Tag a Tune, an ESP-style game for labeling music, has gone live. It is not the first game to collect tags for music, but I think it is perhaps the most fun. The game is well polished, easy to learn and potentially addictive to play. My current high score is 6800. Check it out.


Tuesday May 13, 2008

If you ever need to explain the Microsoft PlaysForSure DRM fiasco to your Mom - this guy can help: The day the music died (via Colin)

Here's a Freakomendation from iLike. I was listening to the progressive rock masterpiece Karn Evil 9 by Emerson Lake and Palmer, when iLike suggested that I might like the song Grandma Got Run Over by a Reindeer performed by Elmo and Patsy. The rest of the recommendations were other Christmas schlock.

iLike apparently thought that ELP's only distinctive song was the Christmas chestnut I Believe in Father Christmas and connected it up with other well-worn Christmas music.

Monday May 12, 2008

There's a fascinating and informative slide deck from Nielsen SoundScan called the State of the Industry that was presented at NARM last week. Some highlights:

Music is really long tail - in 2007, 450,344 of the 570,000 albums sold were purchased fewer than 100 times. 1,000 albums accounted for 50% of all album sales.

The music industry had its biggest sales week since they started keeping records, with 58 million units sold in the last week of 2007. The previous record was 47 million during the last week of 2006.

13% of all album sales come from American Idol and the Disney franchises.

CD sales are down 31% since 2004, but digital music sales are up 490%.

Surprisingly, vinyl sales are coming back - they grew 15% in 2007 and are up 70% in the first 3 months of this year, mostly in indie vinyl.

1 out of 4 albums are purchased in a non-traditional retail store (i.e. internet, or at a concert).

80,000 albums were released in 2007.

844 million digital tracks were sold in 2007. 1% of all digital tracks accounted for 80% of all track sales.

There's lots more data in there. Definitely worth a good look. - Thanks Oscar
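
To put a few of those highlights into proportions, here is a minimal Python sketch of the arithmetic. The numbers are the ones quoted above; the variable names are mine, and it assumes the album figures count distinct titles.

```python
# Proportions implied by the SoundScan highlights quoted above.
albums_sold = 570_000             # album titles that sold in 2007
albums_under_100 = 450_344        # titles purchased fewer than 100 times
record_week_2007 = 58_000_000     # units sold, last week of 2007
record_week_2006 = 47_000_000     # units sold, last week of 2006
digital_tracks_2007 = 844_000_000 # digital tracks sold in 2007

# Share of titles sitting in the long tail.
tail_share = albums_under_100 / albums_sold

# Growth of the record sales week, year over year.
week_growth = (record_week_2007 - record_week_2006) / record_week_2006

# Track sales attributed to the top 1% of tracks (80% of all sales).
head_track_sales = digital_tracks_2007 * 80 // 100

print(f"long-tail share of titles: {tail_share:.1%}")
print(f"record week growth: {week_growth:.1%}")
print(f"sales from the top 1% of tracks: {head_track_sales:,}")
```

So roughly 79% of the titles that sold at all sold fewer than 100 copies, and the record week grew by about 23% over the previous record.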

Sunday May 11, 2008

The Echo Nest has released their new recommendation API. The API will give you recommendations based upon a single artist or a set of recommended artists. Echo Nest is one of the few commercial recommenders that doesn't rely on collaborative filtering for recommendations. Instead, the Echo Nest combines information from "online cultural data, text analysis, audio analysis and user activity". The API is very easy to use. You make queries like this to look up artist IDs:
Once you get an artist ID you can get recommendations with this request:

Some recommendations using the new Echo Nest Recommendation API.

Seed artist of Jimi Hendrix:
  • Cream
  • HendriX
  • Stevie Ray Vaughan
  • Jeff Beck
  • The Jimi Hendrix Experience
  • Eric Clapton
  • Funkadelic
  • Buddy Miles
  • Michael Hill
  • Doyle Bramhall II
Seed artist of Led Zeppelin:
  • LED
  • Deep Purple
  • Bonham
  • Page and Plant
  • Black Sabbath
  • Aerosmith
  • The Who
  • Bad Company
  • AC/DC
  • Cream
Seed artist of Miles Davis:
  • Miles Davis and John Coltrane
  • Miles Davis Sextet
  • Miles Davis Quintet
  • John Coltrane
  • Charlie Parker
  • Kenny Dorham
  • Wayne Shorter
  • Wynton Marsalis
  • Bill Evans
  • Thelonious Monk
Seed artist of the Beatles:
  • Badfinger
  • George Harrison
  • The Hollies
  • The Animals
  • John Lennon and Paul McCartney
  • Paul McCartney
  • The Beau Brummels
  • The Fourmost
  • Peter and Gordon
  • George Martin
Seed artist of Emerson Lake and Palmer:
  • Rick Wakeman
  • Emerson, Lake & Palmer
  • Patrick Moraz
  • Yes
  • Gentle Giant
  • Steve Hackett
  • King Crimson
  • Colosseum
  • Tony Banks
  • Graham Bond

The recommendations, for the most part, look pretty good. There's less popularity bias than you'd typically see with a collaborative filtering algorithm. Still, there are some funny bits. If you like Miles Davis, there are 3 other types of Miles Davis that you might like (but I bet you could have figured that out without a fancy recommender). Also, cover and tribute bands seem to feature prominently.
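
A client could hide the most obvious of these itself. The Python sketch below is only an illustration of that idea, not anything the Echo Nest API provides: `drop_seed_variants` and `normalize` are hypothetical helpers, and the heuristic is deliberately crude (it drops any recommendation whose name contains the seed artist's name).

```python
def normalize(name: str) -> str:
    """Lowercase a name and drop a leading 'the ' for comparison."""
    name = name.casefold().strip()
    if name.startswith("the "):
        name = name[4:]
    return name


def drop_seed_variants(seed: str, recommendations: list[str]) -> list[str]:
    """Keep only recommendations whose name does not contain the seed name."""
    seed_key = normalize(seed)
    return [r for r in recommendations if seed_key not in normalize(r)]


miles = [
    "Miles Davis and John Coltrane", "Miles Davis Sextet",
    "Miles Davis Quintet", "John Coltrane", "Charlie Parker",
    "Kenny Dorham", "Wayne Shorter", "Wynton Marsalis",
    "Bill Evans", "Thelonious Monk",
]
filtered = drop_seed_variants("Miles Davis", miles)
print(filtered)
```

This removes the three Miles Davis variants and leaves the other seven artists. It would not catch less literal cases such as 'John Lennon and Paul McCartney' for the Beatles.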

There were a few little glitches with the web services. The returned XML was not always well formed ('&' characters were not properly turned into entities), and some artist searches failed for no apparent reason (I couldn't find 'the beatles' or 'weezer' with the 'suggest' call).

I'm not sure if these recommendations would pass the recommender Turing test. Not too many people would offer a recommendation of Lennon and McCartney for Beatles fans - it is too obvious. Same with Jimi Hendrix and the Jimi Hendrix Experience. However, the Graham Bond recommendation for Emerson Lake and Palmer is brilliant and one that I've never seen made by any other recommender (human or machine).

The Echo Nest gives these recommendations for free (for up to a certain amount per day). If you want to make many requests, you can buy a license from the nest. Also, the Echo Nest is offering free unlimited recommendations to the first 100 small music sites that register for the API. You can't beat free, so sign up quick!

I'm really pleased to see the Echo Nest releasing these services. This is just the type of service that the next-generation web companies that are trying new ways to deliver music really need.

The first thing I'll do (well, second after filing the expense report) is to download and install Hudson. Hudson is a tool for building and testing software projects continuously. It makes it easier for developers to integrate, test and obtain the freshest, tested build. Hudson manages distributed building and testing, integrates with all of the major source code control systems, works with JUnit and FindBugs, and makes RSS feeds for build status. I was really impressed with the demo that I saw at JavaOne. Installing it is dead simple: just download the war file and type
java -jar hudson.war

If Hudson works as well as it seemed to in the JavaOne talk, then I think we'll be using this as our build and integration tool for Project Aura.

Thursday May 08, 2008

Notes from the Tech Talk at the SanFran MusicTech Summit. This panel was a discussion about the technology behind some of the most popular music sites. The moderator is Colin Brumelle.




  • Colin Brumelle - Moderator
  • Tom Conrad - CTO Pandora
  • Marc Urbaitel - CTO In-Ticketing
  • Shaun Haber - Warner Bros. Records - Director of operations, using Drupal to build an artist platform.
  • Jeremy Riney - Project Playlist, CTO, Founder
  • Jack Moffitt - Xiph, Chesspark - IM, gaming

Why did you choose a particular type of technology?

Marc - uses PHP - quicker turnaround time, lets them be much more nimble. Open source is good.

Shaun - open source CMS - chose Drupal: a big reason is active developer community.

Jeremy - also uses Drupal - Project Playlist is the largest Drupal user with 25 million active users.

Tom - they were on Java, Oracle and Jetty servlets for legacy reasons. Oracle was a disaster, so they ported it all to Postgres and re-implemented the Oracle procedures in Java. Some core routines in C; huge memcached; 200 servers, 2000 interactions per second; 64-bit Linux, Intel CPUs. The shiny frontend is Flash. They didn't have anyone who knew Flash, so they used OpenLaszlo to build the application using JavaScript and its framework and compile it down to Flash. Tom says Laszlo is a great piece of software.

Jack - Perl, then Python with webware frameworks, MySQL, Postgres, now Django (a Rails-like Python framework). They run everything on Amazon EC2 and S3. Wrote lots of JavaScript - they use script.aculo.us, Prototype and others.

Colin: Is EC2 the future?

Jack - Went through CO-LO hell. Was hard to provision new hardware. On Amazon, they can type one command and get 10 more machines. Jack is very happy with EC2, S3. Jeremy was concerned with complexity but Jack says it was not too hard.

Tom: If they were starting today, they would be considering cloud computing like EC2. The hardest part to scale horizontally is the database. The risk becomes predicting the future: how do you provision just the right number of servers? This would become a guessing game.

Marc uses cloud computing to do scaling testing (buying lots of tickets at once).

Tom - the cloud is also useful for data recovery - use the cloud to serve as the failover. Pandora decided to do their own CDN. They save a lot of money by doing it themselves.

Tom says don't buy Foundry load balancers.

Question from Derek of CD Baby - he tells the story of how he rewrote CD Baby for PHP and Ruby on Rails. After 2 years of frustration, he threw it all away. It had nothing to do with Rails - but keeping the two systems (PHP and Rails) alive was hard. Derek also lauds EC2. Tom does say that you are still paying a margin to Amazon for this, so it could cost you more than doing it yourself.

Tom talks about "test-driven development". They can rip their system apart and put it back together and be confident that it will work because of their tests.

Digital Thought Leaders Panel
Originally uploaded by PaulLamere.
The digital thought leaders panel was moderated by Brian Zisk, with Tim Westergren of Pandora, Aza Raskin of Songza, Michael Petricone of the Consumer Electronics Association and Ty Roberts of Gracenote/Sony.

The first topic is the well-worn one of how we navigate the intellectual property minefield of music. How can companies make money while still compensating artists? I wish the panel would focus a bit more on technology and less on rights and IP. There's a separate legal track. As K7lim says, "this is a kindergarten discussion of IP policy."

Tim Westergren says that better, simpler design is necessary to engage with listeners, especially new listeners. "Simplicity brings people in," says Aza.

Brian asks: "What is the Future?" - Ty Roberts talks about music product. He's interested in 'music packaging' - augmenting the simple MP3 with all of the ancillary metadata (album art, reviews and bios). Aza suggests 'continuity of experience': eliminate the Facebook/iTunes silos and get rid of having to worry about where your music is coming from. Michael says it is 'simplicity'.

Tim points out that radio has always been popular. He says that people don't want to spend a lot of time administering their listening experience. He suggests that music will be everywhere, supported by advertising. Once this is in place, there will be lots of ways that people can use and interact with the music.

Aza points to the Kindle as a good example of where things should go with music. "Feels like free" is key - whether it is ad supported or some other model.

Discussion about the "metadata problem".

Tim offers advice to artists - add a new member of the band - a non-musician - to be the marketing person to get the band exposure on the 'nets.

Brian opens the summit
Brian Zisk has just launched the SanFran MusicTech Summit. There are hundreds of music tech folks gathered in Japantown. It looks to be a fun event.

Monday May 05, 2008

JavaOne week has started. We are waiting for the CommunityOne keynote to start. Lots of fun.

This blog copyright 2010 by plamere