Thursday, November 17, 2016

Intro to recommender systems - When and why did we start talking about RS?

We covered the "definition" in the first post, and we now know that this domain is very broad. In this post I want to elaborate on when we actually started being more interested in computer-based RS, and why.

From the previous post we know that receiving recommendations has always been a part of people's lives. We started talking much more about RS when people started consuming much more content online, be it books, videos, items, anything. There are two reasons why RS became so popular in online consumption. The obvious one, although less important, is that in the online world you are in most cases facing computers, not people, but you would still like to get advice when you are looking for something. The other, more important reason is the breadth of the offering that online stores have. I will illustrate it with an example.


Example



I like this example that I heard during the Coursera course. When you own a real bookstore, what books will you put on your bookshelves? In most cases you would stock the most popular books because these have the highest chance of being sold, because most people on average like them. So, to simplify, you start with the most popular book and go down the popularity scale until you run out of space. An online bookstore, in comparison, can sell pretty much every book on the planet because its shelf space is "unlimited".

Now look at it from the customer's view. When I come into a real bookstore with a hunch of what I feel like reading, but nothing exact, I can start looking through the shelves. I will finish at some point, as there is only a certain number of shelves. When I find myself in a similar mood in an online bookstore, I can keep looking for ages, and I may get so overwhelmed by the number of options that I end up leaving without getting anything. This is called information overload. When the online store uses an RS, it can start surfacing the books that match my taste, and the probability that I will spot something I am interested in and end up borrowing or buying it increases. Then both sides are happy: I as the customer because I got something, and the store because I am likely to come back.

An extreme case is a reader who is interested in niche genres. It is extreme because it sits at the other end of the popularity spectrum, so these books have the fewest readers. It is slightly different from the case above: there I might have found many somewhat interesting options, but here there is only a very narrow selection of books that I would find interesting. I might not find a book I like in a real store at all, because it is so niche. In an online store without an RS, the homepage is very unlikely to feature niche books, because why would you put a book that interests only a few people in a highly exposed spot? So I would have to start digging through menus and so on, and I might experience information overload again or never find the book. With an RS, even the home page itself can be personalized for me, and the niche genres can be right there because the RS knows my interests from my history.
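To make the contrast between the two shelves concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the toy catalog, the sales numbers, and the helper names `popularity_shelf` and `personalized_shelf` are made up, and counting genres in the reader's history is only a stand-in for what a real RS would do.

```python
from collections import Counter

# Hypothetical toy catalog: (title, genre, copies_sold). All values are made up.
CATALOG = [
    ("Bestselling Thriller", "thriller", 9000),
    ("Popular Romance", "romance", 7000),
    ("Celebrity Memoir", "biography", 5000),
    ("Another Thriller", "thriller", 3000),
    ("Obscure Solarpunk Novel", "solarpunk", 40),
    ("Solarpunk Anthology", "solarpunk", 25),
]


def popularity_shelf(catalog, shelf_size):
    """Physical-store strategy: stock the best sellers until space runs out."""
    ranked = sorted(catalog, key=lambda book: book[2], reverse=True)
    return [title for title, _, _ in ranked[:shelf_size]]


def personalized_shelf(catalog, history_genres, shelf_size):
    """Toy personalization: rank by how often the reader picked each genre,
    and break ties by overall popularity."""
    genre_counts = Counter(history_genres)  # missing genres count as 0
    ranked = sorted(
        catalog,
        key=lambda book: (genre_counts[book[1]], book[2]),
        reverse=True,
    )
    return [title for title, _, _ in ranked[:shelf_size]]


if __name__ == "__main__":
    print(popularity_shelf(CATALOG, 3))
    # A reader whose (hypothetical) history is mostly a niche genre.
    print(personalized_shelf(CATALOG, ["solarpunk", "solarpunk", "thriller"], 3))
```

With a shelf of three, the popularity shelf never shows the niche titles, while the personalized one puts them first for the reader whose history is mostly in that genre.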

Netflix prize

I also want to mention that quite a big push in RS and the research around them was caused by the Netflix Prize. The Netflix Prize was a competition that ran from 2006 to 2009 with a $1M prize. The condition to win was to decrease the error of Netflix's recommender system by 10%, measured as RMSE (root mean squared error). Netflix released an anonymized dataset for the competition. A fact that I find interesting is that it took three years for a team to win, but in the end only 20 minutes separated the first and the second place. There was supposed to be a sequel to this competition, but it was apparently canceled due to privacy concerns (you can google some articles about why, I don't remember the details).
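For readers who have not met RMSE before, here is a small sketch of how the metric and the "10% better" condition can be computed. The function names and the ratings below are made up for illustration and are not taken from the competition data.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which new_rmse is lower than baseline_rmse (0.10 means 10%)."""
    return (baseline_rmse - new_rmse) / baseline_rmse


if __name__ == "__main__":
    # Hypothetical star ratings, not real data.
    actual = [4, 3, 5, 2, 4]
    baseline_predictions = [3.0, 3.5, 4.0, 3.0, 3.0]
    new_predictions = [3.5, 3.0, 4.5, 2.5, 3.5]

    baseline = rmse(baseline_predictions, actual)
    improved = rmse(new_predictions, actual)
    print(f"baseline RMSE: {baseline:.4f}")
    print(f"new RMSE:      {improved:.4f}")
    print(f"improvement:   {relative_improvement(baseline, improved):.1%}")
```

Because the errors are squared before averaging, a few badly wrong predictions hurt the score more than many slightly wrong ones.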


RecSys

The first conference on recommender systems took place in 2007, and it was held in Minnesota. It lasted two days and had one track. The 2017 RecSys had 3 conference days with two tracks, plus 2 workshop days adjacent to the conference dates. Here is the link to the conference websites.

You can see that RS are important already and will become more and more important as we get more and more online services with big catalogues.

Other resources:
Interesting webinar video on YouTube (Joe Konstan presenting)
