Why this series?
Two years ago I discovered the domain of recommender systems. I found the beginning quite challenging, firstly because I come from a software engineering background rather than a data science or pure math background, and secondly because I struggled to find any "overview" material. I ran into a lot of scientific papers that are, of course, awesome, but they usually focus on one specific part. I could not find a high-level overview that starts with the general picture and then goes deeper. After two years in this field I still have tons to learn, but I think I have gathered enough information to give some starting hints to those of you who might be in the same spot I was in two years ago.
This will be a series of posts in which I give a 360-degree overview of recommender systems to get you started. I want to make this series a one-stop shop for starters, full of information I have found out myself so far, heard at conferences, learnt from papers and books, etc.
Here is what I will be covering in the series:
- high-level principles that recommender systems are based on
- math
- evaluation
- software architecture
- other connected areas, such as explanations and user experience
- industry cases I have read or heard about, mixed in throughout
Sources
I want to first mention my sources. Everything I know and will be covering here comes mostly from the following:
- Introduction to Recommender Systems on Coursera - this course no longer appears there, but if you google for it you will find other RS courses on Coursera organized by the University of Minnesota
- Recommender Systems Handbook from Springer - a 2nd edition is out at the time I am writing this
- RecSys conference - great annual conference. Everybody who is leading this field is there.
- Netflix tech blog - I like a lot what they publish on their blog
General overview
Wiki definition
Let's dive in. Naturally, we should start with a definition. Here is the first paragraph of the Wikipedia article on recommender systems:
Recommender systems or recommendation systems (sometimes replacing "system" with a synonym such as platform or engine) are a subclass of information filtering system that seek to predict the "rating" or "preference" that a user would give to an item.
The definition above makes it sound as if recommender systems are, for example, about predicting how many stars you would give to a movie you have not watched. That is not incorrect, but we have to be careful not to limit ourselves to that mindset. Let me follow up with a simple use case. Imagine we have an RS that predicts movie ratings, and our evaluation shows 100% prediction accuracy, meaning the system predicts exactly the number of stars a user ends up giving a movie after watching it. Now the question is: what shall we do with it? What should we recommend to users? The obvious idea is to give them a list of movies they have not seen, sorted by the number of predicted stars. That sounds reasonable, right?
OK, let's now pivot and look at it from the user's perspective. Assume that I, as the user, like Oscar-winning movies and give them 5 stars. I have just come home on a weekday and I am tired from work, like every weekday. What is the chance that I will want to watch a 5-star Oscar-winning movie? Not very high, I would say. It is more likely that I will end up watching some TV show that I might give only 3 or 4 stars, but that does not require me to think deeply because I have had enough brain exercise during the day. So recommending me a list full of Oscar-winning movies is not a good recommendation on that evening. On the other hand, recommending the same list at the weekend, when I am not tired from work, can be a good thing.
What we just came across is called context in recommender systems. In this case the context was the day of the week.
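To make the weekday example a bit more concrete, here is a minimal Python sketch that compares ranking purely by predicted stars with a ranking that also looks at the context. Everything in it is an assumption made up for illustration: the `Candidate` class, the `effort` score (how demanding a title is to watch), the `effort_penalty` weight and the sample titles. It is not how any real system does it, just one simple way to picture the idea.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    predicted_rating: float  # stars we predict the user would give (1-5)
    effort: float            # hypothetical 0-1 score: how demanding it is to watch


def rank_by_rating(candidates):
    """Naive ranking: highest predicted rating first, context ignored."""
    return sorted(candidates, key=lambda c: c.predicted_rating, reverse=True)


def rank_with_context(candidates, is_weekday, effort_penalty=3.0):
    """Context-aware ranking: on weekdays, demanding titles are pushed down."""
    def score(c):
        # The penalty only applies when the context says the user is likely tired.
        penalty = effort_penalty * c.effort if is_weekday else 0.0
        return c.predicted_rating - penalty
    return sorted(candidates, key=score, reverse=True)


# Made-up sample data, loosely following the story above.
catalog = [
    Candidate("Oscar-winning drama", 5.0, 0.9),
    Candidate("Light TV show", 3.5, 0.1),
    Candidate("Documentary", 4.0, 0.6),
]

rating_only = [c.title for c in rank_by_rating(catalog)]
weekday_list = [c.title for c in rank_with_context(catalog, is_weekday=True)]
weekend_list = [c.title for c in rank_with_context(catalog, is_weekday=False)]

print("By rating only:", rating_only)
print("Weekday:", weekday_list)
print("Weekend:", weekend_list)
```

With this sample data the light TV show moves to the top of the weekday list even though its predicted rating is the lowest, while the weekend list is the same as the plain rating order. The predicted ratings did not change at all, only the context did.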
I don't want to jump ahead; I just wanted to demonstrate that RS are a much broader topic than, for example, predicting ratings, and that we have to be careful not to focus too much on the math alone. The Netflix tech blog has a great post explaining that, for Netflix, there is much more to recommendations than star ratings.