Here's a screenshot of a recommendation made by Amazon's wonderful book recommender. This shows the classic collaborative filtering (CF) algorithm at work: "Customers who bought The Da Vinci Code also bought these items ..." It is a great way to get recommendations, but sometimes it can go awry.

Here we see three recommendations. The first two seem quite reasonable: Digital Fortress is a book by the same author as The Da Vinci Code, and Holy Blood, Holy Grail covers the same subject matter. The third recommendation, the sixth Harry Potter book, seems rather strange. It is written for a completely different demographic (middle schoolers and young adults), it is a fantasy about school kids, and it bears very little similarity to an art-history thriller like The Da Vinci Code.

So what gives? Is the Amazon recommender broken? Nah ... it is doing its job just fine. The Da Vinci Code and Harry Potter and the Half-Blood Prince do have one thing in common: they are both very popular books, and they happened to be very popular at the same time. When mom went to Amazon to order a copy of The Da Vinci Code, she would also pick up a copy of Harry Potter for the kids. This was repeated thousands of times, and the Amazon recommender duly noted the correlation.

From a shopper's perspective, it is probably a pretty good recommendation. Amazon knows, based on its data, that if you buy this book there's a 5% chance you'll pick up the latest Harry Potter as well. From a reader's perspective, however, it is probably a poor recommendation: the books have little in common.
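The "customers who bought X also bought Y" logic can be sketched in a few lines. This is a minimal, hypothetical illustration (the baskets and titles below are invented, not Amazon's data or algorithm): count how often each pair of items appears in the same purchase basket, then recommend the partners with the highest counts.

```python
from collections import defaultdict
from itertools import combinations

# Invented purchase baskets, for illustration only.
baskets = [
    {"The Da Vinci Code", "Harry Potter 6"},
    {"The Da Vinci Code", "Digital Fortress"},
    {"The Da Vinci Code", "Holy Blood Holy Grail"},
    {"The Da Vinci Code", "Harry Potter 6"},
    {"Harry Potter 6", "Harry Potter 5"},
]

def cooccurrence_counts(baskets):
    """Count how often each pair of items is bought together."""
    counts = defaultdict(lambda: defaultdict(int))
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts

def also_bought(item, counts, n=3):
    """'Customers who bought X also bought...' -- rank by raw co-purchase count."""
    ranked = sorted(counts[item].items(), key=lambda kv: -kv[1])
    return [title for title, _ in ranked[:n]]

counts = cooccurrence_counts(baskets)
print(also_bought("The Da Vinci Code", counts))
# "Harry Potter 6" ranks first: co-purchased twice, the others once.
```

Because it ranks by raw counts, a title that everyone happens to be buying at the same time floats to the top of every list, which is exactly the Harry Potter effect described above.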

A recommender for shopping is not the same as a recommender for discovery. If you are shopping for books, Amazon is the place to go; but if you are looking for a good book to read, you may want to use a recommender like LibraryThing, which recommends books based on who has read a book rather than who has purchased it.
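One common way to push a recommender toward discovery rather than sheer popularity is to normalize co-purchase counts by each item's overall popularity, for example with lift: P(a,b) / (P(a) x P(b)). A hypothetical sketch (invented data; this is not LibraryThing's or Amazon's actual algorithm):

```python
from collections import Counter, defaultdict
from itertools import combinations

# Invented baskets: "Harry Potter" shows up everywhere, while
# "Lolita" appears only alongside "Pale Fire".
baskets = [
    {"Pale Fire", "Lolita", "Harry Potter"},
    {"Pale Fire", "Lolita"},
    {"Pale Fire", "Harry Potter"},
    {"Harry Potter", "Cookbook"},
    {"Harry Potter", "Atlas"},
]

def lift_scores(baskets):
    """Score item pairs by lift = P(a,b) / (P(a) * P(b)).

    Lift > 1 means two items co-occur more often than their
    individual popularity predicts; ubiquitous items get discounted.
    """
    n = len(baskets)
    item_count = Counter()
    pair_count = defaultdict(int)
    for basket in baskets:
        item_count.update(basket)
        for a, b in combinations(sorted(basket), 2):
            pair_count[(a, b)] += 1
    lifts = defaultdict(dict)
    for (a, b), c in pair_count.items():
        score = (c / n) / ((item_count[a] / n) * (item_count[b] / n))
        lifts[a][b] = score
        lifts[b][a] = score
    return lifts

lifts = lift_scores(baskets)
# Raw counts tie (each pair co-occurs twice), but lift ranks Lolita
# above the ubiquitous Harry Potter for Pale Fire readers.
```

Here both pairs have the same raw co-purchase count, yet lift favors the niche match, which is the kind of "interesting rather than obvious" behavior a discovery recommender is after.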

 

Comments:

http://www.amazon.com/Paul-Mooney-History-Jesus-Cleopatra/dp/B000KGGZV2

Posted by TigerFist on May 25, 2007 at 10:22 AM EDT #

Thanks for the favorable mention of LibraryThing, but I want to press on the distinction between "shopper" and "reader." Books are for reading as food is for eating. If I buy pasta, olive oil, and pine nuts, suggesting basil is a good recommendation. Milk is not, no matter the raw statistical correlations involved. Statistical correlations are the start, not the end, of a good recommendation system, even when all you care about is the commercial side.

When it comes to books--and I'm speaking with a lot of experience--raw statistical correlations give you Harry Potter almost every time. Around LibraryThing we call that the "Harry Potter problem." Have Nabokov's _Pale Fire_? What are you most likely to have as well? _Ada_? _Lolita_? No, Harry Potter. I'm surprised their recommendations for bananas--yes, Amazon now sells bananas--aren't Harry Potter books too.

LibraryThing's system is, as you write, based on who reads what, not who buys what. But I don't think that accounts for most of the difference. Fundamentally, we tune the algorithm toward interestingness, not obviousness. LibraryThing wants to entertain, certainly, but we're also selling our data now. We want the data to "work." I think that, even in a purely commercial context, interesting wins out.

For example, Harry Potter on Amazon almost always recommends the *other* five Harry Potters. The simple statistical overlap between purchase histories is undeniable. And even if, as I suspect, they've done A/B tests on what actually gets clicked, it probably makes sense. This is the logic of the online ad: what clicks, works. But Amazon recommendations are not AdSense links. They're editorial content too. To measure them properly you need to look at the whole buying pattern. If Harry Potter recommends Philip Pullman books, my click-through might be lower, but I'll probably buy the next Harry Potter anyway, and now you've hooked me on another series.
That is, there are surely secondary effects, even if they're hard to track. (Speaking of series, don't get me started on "backward" recommendations--enjoyed Book 6? How about reading Book 5?) And recommending Philip Pullman or, say, Susan Cooper keeps me interested, keeps me from thinking of the links as "just" ads. Keeps me from becoming "recommendation blind," as I have become ad blind.

In my opinion, the interestingness of Amazon recommendations has been eroding for a while now. I certainly look at them less. I don't envy them here. Long-term effects are difficult if not impossible to measure, particularly compared to a straight A/B test. But there must be an effect. Ultimately, recommendations are like any other long-term content that can be manipulated for short-term gains at the expense of long-term value. You can get a ratings bump if your sitcom characters ski-jump sharks and have babies. But it erodes the franchise. In the end, people turn away.

Posted by Tim Spalding on May 27, 2007 at 05:45 AM EDT #

Post a Comment:
Comments are closed for this entry.

This blog copyright 2010 by plamere