Saturday Oct 27, 2007

I'm starting to work on a new project (still related to music discovery and recommendation).  Starting a brand new project is really fun.  There are not too many times in a decade that I get to start with a completely blank slate.  It's a great time for me to refactor my development process, learn about new tools and learn about new ways to do things.  Still, sometimes all the newness can be a tad overwhelming ... here are all the new things just in the last couple of weeks:

  • New source code control (mercurial)
  • New version of netbeans
  • New version of GWT
  • New team mates
  • New coding standards (what are those '/**' and '*/' for?)
  • New project hosting platform
  • New organizational chart
And of course there are all sorts of new project-centric design  challenges as well.   It is all new, and it is a lot of fun.

 

Friday Oct 26, 2007

I just received this in my inbox:


Dear Paul,

The 2008 JavaOne Conference Call for Papers is Open!

JavaOne, Sun's 2008 Worldwide Developer Conference, is seeking proposals for technical sessions and Birds-of-a-Feather (BOFs) sessions for this year's Conference.

Attracting over 15,000 developers and leaders in the developer community. From Industry leaders, to experienced developers to developers starting out - this conference is one that brings together some of the industry's best and brightest.

The JavaOne conference is your opportunity to reach this specialized community by educating and sharing your experience and expertise with the developer community. Please go to http://www.cplan.com/sun/javaone08/cfp to review the guidelines and instructions to submit your proposal(s).

If you have questions regarding the Call for Papers process, please contact [email protected] .


There's a new Sun Labs Communications page that has all sorts of links to information about the various projects in the lab, including project spotlights, blogs, podcasts, news articles  and events related to activities in the research lab. Neat Stuff.

Saturday Oct 20, 2007


Justin's Poster
Originally uploaded by PaulLamere.

120 or so folks who are keen on figuring out how to make recommendations have descended onto the campus of the University of Minnesota for the two-day RecSys'07 conference. It is a good mix of attendees: 47 participants have come from outside of the U.S., and 50 of the participants are from industry. There are 16 long papers (out of 35 submitted) and 14 short papers (out of 23 submitted).

The keynote, given by Krishna Bharat, was an excellent presentation on Google News, placing it into the wider context of news and journalism.   Greg Linden has an excellent description of the talk.

 The first paper session on Privacy and Trust had me thinking quite a bit more about how people can get good recommendations without revealing too much about themselves. 

 The afternoon panel session included members of industry who opined about what issues were important to the commercial world.  Themes emerged about search vs. discovery vs. recommendation (just some semantic problems), APIs, portability of attention data, cross content recommendation, and the difficulty of evaluation.

I particularly enjoyed the poster session - I like being able to talk to folks about their research, and a poster session is the best way to do it.

Day 2 of the conference is about to begin - and I am the session chair for the first session, so I now need to read through a few papers so I can avoid a repeat of my worst conference moment.

Thursday Oct 18, 2007

In a few hours I hop on a plane to head to Minneapolis to attend ACM Recommenders 2007.  I'm really looking forward to this two-day event: for the talks, for meeting up with old friends, and for finally meeting f2f the many folks that I've communicated with via email and Facebook over the last year.

Tuesday Oct 16, 2007

Everyone's favorite Mp3blog aggregator, the hype machine, has launched a new version.   Lots of new features including:

  • Favorite anything you like on the site: tracks, blogs, searches, people. Share them with friends via your personal URL.
  • Get a personalized music feed, created from your blogs,searches and people you find interesting. Share that too.
  • Automatically update Twitter when you favorite anything on the site to share it with your twitter community.
  • Hype Spy: watch in real-time what everyone is searching for and listening to on the site.

Monday Oct 15, 2007

I just read the book High Performance Web Sites, an excellent O'Reilly book about how to improve the performance of your website by focusing on the frontend. The book gives 14 weighted rules to follow that will improve the load time of your web page. Reducing the number of HTTP requests, gzipping components, and adding far-future Expires headers all lead to pages that load much faster.
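Two of those rules are easy to illustrate. The sketch below is not from the book; it uses only Python's standard library, with a made-up script payload and a hypothetical `far_future_headers` helper, to show how much a repetitive text component can shrink under gzip and what far-future caching headers might look like. The one-year lifetime is my assumption, not a number from the book.

```python
import gzip
import time
from email.utils import formatdate

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # assumed cache lifetime, not a value from the book


def far_future_headers(now=None, lifetime=ONE_YEAR_SECONDS):
    """Build response headers that let a browser cache a static component."""
    now = time.time() if now is None else now
    return {
        # HTTP dates are expressed in GMT
        "Expires": formatdate(now + lifetime, usegmt=True),
        "Cache-Control": f"max-age={lifetime}",
    }


def gzip_component(body: bytes):
    """Compress a text component and return it with the matching header."""
    return gzip.compress(body), {"Content-Encoding": "gzip"}


# A made-up, repetitive script standing in for a real JavaScript file.
script = b"function play(track) { return player.start(track); }\n" * 200
compressed, encoding_header = gzip_component(script)
headers = {**far_future_headers(), **encoding_header}

print(len(script), "bytes before gzip,", len(compressed), "bytes after")
print(headers)
```

In a real deployment these headers would normally be set in the web server configuration rather than in application code; the point here is only what the browser ends up seeing.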

The author also describes a tool called YSlow that measures a number of performance aspects of your page. It reports the total load time, the page weight, a breakdown of component loading times, as well as an overall score that indicates how well optimized a page is. A score of A means that the website is doing all it can to eke out performance, while a score of F means that there is plenty of room for improvement.

I applied this tool to 50 well-known Music 2.0 sites and recorded the front page load time, the YSlow score, the page weight (the amount of data downloaded) and the total number of HTTP requests needed to download the page.  As you can see from the table, Music 2.0 sites have much room for improvement. The average load time for a Music 2.0 page is 6.6 seconds; Ruckus was the worst, with a load time of over 30 seconds. Ruckus also has one of the lowest YSlow scores (30; only Napster, at 27, scored lower), showing that there are lots of things Ruckus can do to improve its page loading time. According to YSlow, for starters, Ruckus could combine its 23 JavaScript files into one, saving 22 very expensive HTTP requests.

Note that Flash-heavy sites like Musicovery and MusicLens appear to do well here, but they've just shifted the loading times into a Flash app that is not measured by YSlow.

Name                 Load time (secs)  YSlow score  Page weight (K)  HTTP requests
Amazonmp3            4.722             D (65)       325.3            57
Aol music            7.6               F (36)       306.9            91
All music guide      12.013            F (37)       479.2            113
Amie street          3.291             F (39)       576.3            51
Artistserver         5.537             F (44)       173.3            39
blogs.sun.com        2.4               F (58)       103.3            22
Cruxy                3.491             F (53)       707.3            43
Facebook             6.45              F (43)       197              110
Finetune             10.6              F (53)       129.6            22
Goombah              5.269             F (49)       185.2            33
Grabb.it             3.398             D (67)       77.4             32
Grooveshark          2.303             D (60)       254.4            30
Haystack             5.796             F (50)       495.7            71
iLike                4.71              D (62)       190.8            24
Lala                 5.548             B (83)       33.8             19
Last.fm              3.544             C (71)       408.3            59
MP3 Realm            0.802             B (83)       18.3             7
Midomi               29.872            D (63)       238              34
Mog                  14.712            B (80)       293.3            53
Mp3tunes             10.886            F (56)       439.9            45
MusicBrainz          7.8               F (44)       369              35
MusicIP              1.517             C (72)       80.3             11
MusicLens            0.965             A (96)       5                2
MusicMobs            2.372             D (62)       161.4            31
music.of.interest    2.239             C (79)       105.2            7
Musicovery           1.315             A (92)       144.6            5
MyStrands            5.463             F (55)       222.5            33
Napster              3.876             F (27)       374.9            58
OWL Multimedia       3.963             F (36)       469.2            49
OneLLama             11.777            F (40)       341.4            59
Pandora              5.732             D (63)       957.5            17
QLoud                1.336             F (52)       206.7            38
Radio Paradise       2.445             F (55)       254              60
Rate Your Music      7.951             C (71)       634.9            71
Rhapsody             7.183             F (31)       782.1            121
Ruckus               34.039            F (30)       766.2            73
Shoutcast            2.452             D (61)       96.4             22
Shazam               10.442            F (56)       507.8            46
Slacker              16.803            F (48)       373.5            25
Snapp Radio          11.747            C (78)       38.6             8
Snapp Radio2         0.986             B (85)       68.2             5
Songbird             5.636             D (61)       621.2            37
Soundflavor          7.879             F (44)       559.1            60
Spiral Frog          3.535             F (46)       407.6            90
The Filter           8.229             F (37)       710.3            61
The Hype Machine     6.395             F (48)       159.2            32
Hype Machine(beta)   3.92              F (56)       233.2            49
Tune core            5.945             F (47)       222.3            56
Yahoo Music          5.211             F (55)       481.6            35
Youtube.com          4.955             D (65)       204.2            65
YottaMusic           2.127             C (76)       51.3             42
ZuKool Music         4.046             D (67)       244.3            20
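For anyone who wants to slice measurements like these differently, here is a minimal sketch of how they could be summarized. It uses a handful of rows copied from the table above, not the full data set, and the `summarize` helper is a name I made up for illustration.

```python
from statistics import mean

# (name, load time in seconds, YSlow score, page weight in K, HTTP requests)
# A handful of rows copied from the table above, not the full data set.
ROWS = [
    ("Last.fm", 3.544, 71, 408.3, 59),
    ("MusicLens", 0.965, 96, 5.0, 2),
    ("Napster", 3.876, 27, 374.9, 58),
    ("Pandora", 5.732, 63, 957.5, 17),
    ("Ruckus", 34.039, 30, 766.2, 73),
    ("Snapp Radio", 11.747, 78, 38.6, 8),
    ("Snapp Radio2", 0.986, 85, 68.2, 5),
]


def summarize(rows):
    """Return simple summary statistics for a list of measurement rows."""
    return {
        "mean_load_secs": mean(r[1] for r in rows),
        "slowest": max(rows, key=lambda r: r[1])[0],
        "lowest_score": min(rows, key=lambda r: r[2])[0],
        "most_requests": max(rows, key=lambda r: r[4])[0],
    }


summary = summarize(ROWS)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Because this is only a subset, the mean it prints will not match the average over all the sites.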

 I've also included SnappRadio (a mashup that I wrote a while back) along with a rewrite (SnappRadio2) that uses the Google Web Toolkit. One of the primary goals of GWT is to make sure that your web app performs well. This is evident, as the load time for my app went from a laggy 12 seconds to a snappy one second.

Looking at this data, it looks like just about all of the Music 2.0 sites could cut their page load times in half with just a few simple techniques.  Combining JavaScript code into a single file and adding far-future expiration headers to JavaScript and images take little time to implement but can have a surprisingly large positive impact on performance.
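As a rough illustration of the first technique, here is a minimal sketch of a build step that concatenates script files into one bundle. The `combine_scripts` helper and the file names are invented for this example; a real site would more likely use an existing build or minification tool.

```python
import tempfile
from pathlib import Path


def combine_scripts(sources, bundle_path):
    """Concatenate script files in order and return the number of requests saved."""
    parts = []
    for src in sources:
        # A trailing semicolon and newline guard against files that omit a final ';'
        parts.append(Path(src).read_text(encoding="utf-8").rstrip() + ";\n")
    Path(bundle_path).write_text("".join(parts), encoding="utf-8")
    # N separate files become one, so N - 1 requests disappear
    return max(len(sources) - 1, 0)


# Demo with throwaway files; the file names are invented for illustration.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    sources = []
    for i in range(23):
        path = tmp_dir / f"widget{i}.js"
        path.write_text(f"var widget{i} = {i}", encoding="utf-8")
        sources.append(path)
    bundle = tmp_dir / "bundle.js"
    saved = combine_scripts(sources, bundle)
    bundle_text = bundle.read_text(encoding="utf-8")

print("requests saved:", saved)
```

With 23 input files, as in the Ruckus case, the page would fetch one script instead of 23.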

I highly recommend the book along with the website of Yahoo's Exceptional Performance team. Both are filled with techniques and tools for improving web site performance.
 

Friday Oct 12, 2007

Inspired by me*dia*or (an aggregation of music technology blogs), I've created The Taste Blog.  The Taste Blog is an aggregation of my favorite technical blogs focusing on recommender systems. 

If you have a favorite tech blog about recommender systems, feel free to suggest it.

Thursday Oct 11, 2007

There's a short but sweet interview with Ali Partovi, the CEO of iLike,  in USA Today.  Ali distinguishes iLike from Pandora by saying:

Most people I know miss concerts by their favorite artists, because they didn't know they were coming to town. That's a failure for the artist, fan and venue. Services like Pandora do a great job of matching music to your tastes, if you have hours to spend there listening. Our approach was to create a service not for music consumption, but a place to communicate, discover and learn.

Read the interview: iLike steers customers to tunes they're likely to like

Wednesday Oct 10, 2007

Martin Hardee reminds me that there is an online virtual tour of Sun Labs that demonstrates some of the technology being developed here in the labs.  There's even a room devoted to the Search Inside the Music project.  Now, I find that I cannot bear to watch myself on video, so I've not actually watched the Search Inside the Music segment, but some others said that it was worth watching.  I can't vouch for that.

 

Tuesday Oct 09, 2007

I have a bunch of invites to MeeMix. If you are interested, leave a comment here (along with your email address in the appropriate field) and I'll send you one.    MeeMix is a music streaming service.  They describe what they do as such:

We take song parameters, the information you provide about yourself, and your song ratings in MeeMix and apply a smart algorithm developed by us. The algorithm takes all these elements into account and provides a customized play-list for every member with their own unique taste for each of their internet radio stations.

What we do in MeeMix is different from other music discovery sites because we focus on understanding your personal taste and providing a fitting playlist just for you, rather than providing you with a list of songs that are similar to the ones you loved. It's content classification for your exclusive taste, not an abstract one.

 


IMG_2823
Originally uploaded by jjdonald.
In the new ACM Recommenders 07 Facebook group, Oscar points out that this picture of Justin's latrines is still the 3rd most 'interesting' photo from the nearly 300 photos taken at Recommenders 06 in Bilbao. To find out why, or just to connect with others going to the ACM Recommenders conference next week, consider joining the ACM Recommenders group on Facebook.

Monday Oct 08, 2007

Over lunch, after the Pop and Policy Music Recommendation panel session on Saturday, one of the panelists (I don't recall who) raised the question: "Is music recommendation different than other kinds of recommendation?  Is music recommendation special?"  Since all of us were in the business of music recommendation, we all quickly agreed that it was special... but is it really?  What distinguishes music recommendation from book or movie recommendation? Is it a harder problem? Is it any more or less relevant than any other recommender?  Certainly there are some obvious differences between music and other types of media.  Where we may read a book or watch a movie just once (or maybe twice), music is re-used many times - we can listen to the same song over and over again (ask the parent of any 12 year old about this).  There is much more music content than other types of media - there are millions and millions of music tracks, while Netflix offers fewer than 100K movies.  Music is used in many different contexts: we may have playlists for exercise, playlists for a romantic dinner, playlists for work or studying.  We don't go to Amazon to find a book to read while jogging.  Song order can be very important, while the order for movies or books is less important (unless you are watching a series of course).

Music is different - but does that mean that music recommenders are different?  Does a good recommender for music need to do anything special because it is recommending music?  

Sunday Oct 07, 2007

Mike Arrington caused a bit of a stir with his article The Inevitable March  of Recorded Music Towards Free.  Mike says "The economics of recorded music are fairly simple. Marginal production costs are zero: Like software, it doesn’t cost anything to produce another digital copy that is just as good as the original as soon as the first copy exists, and anyone can create those copies (meaning there is perfect competition and zero barriers to entry)."
Mike does acknowledge that monopolistic control over copyrights and political lobbying may keep the price of music inflated ... nevertheless, I think he's right.  Simple supply and demand economics state that when the supply is infinite, the price will be zero. Now, I am no economist, but I did learn a bit about economics from the wonderful and extremely readable book "The Undercover Economist". In this book, economist Tim Harford explains the idea of marginal costs and why things tend to operate "at the margins": when the marginal cost of something is zero, its price will tend to zero.  It is time for the next generation of musicians to realize that the best way to make money is to sell the thing that is in short supply - the live performance.

Yesterday, I sat on a panel at the Pop and Policy summit in Montreal. The topic was music recommendation.  I was joined by Brian Whitman (founder of Echo Nest),  Doug Eck (Machine Learning and music at the University of Montreal), Diane Sammer (CEO of Goombah), and Sandy Pearlman (MoodLogic,  producer of The Clash, Blue Oyster Cult).  The panel was moderated by journalist Karla Starr (who wrote an excellent piece on music recommendation for the Seattle Weekly).

It was fun sitting on the panel, and hearing about what everyone else thought were the issues in recommendation.  Some common themes emerged:  collaborative filtering has problems with coldstart and popularity biases, recommenders based solely on content tend to suck, evaluation of recommenders is a very hard problem.  I had the audience take the Music Recommendation Turing Test - results were similar to what we saw in this blog; most people thought that the machine was the human and vice versa.

Although the panel was fun, I thought the most interesting conversation was afterwards at lunch, where all but one of the panelists spent an hour or more in a relaxed setting talking about music and music recommendation.  At the end we realized that we all were in the business of music recommendation because it really is a great way to get access to a whole lot of music.

This blog copyright 2010 by plamere