Wednesday, July 5, 2017

Books on recommender systems

Hello fellow enthusiasts,

I just came across a Quora post about what books are out there on recommender systems. It seems we are getting luckier and the amount of literature is nicely increasing. The answers are from well-known names who, for example, present at RecSys, so I recommend looking into the books they suggest.

https://www.quora.com/Do-you-know-a-great-book-about-building-recommendation-systems

Monday, May 29, 2017

Stuff from the internet on evaluating RS

Hello recommender engine enthusiasts,

Just by complete accident I came across this great answer from Xavier on Quora on the topic of evaluating recommender systems and the importance of A/B testing. Here is the link https://www.quora.com/How-do-you-measure-and-evaluate-the-quality-of-recommendation-engines

Inside his response he mentions other very interesting articles, his own paper, and talks by other people; I recommend reading it.

Tuesday, May 16, 2017

YouTube recommender system article

Hello All,
I have been browsing around the Internet and found this interesting article on the YouTube recommender system. It is not one of the newest, but it is interesting, so I am posting it in case someone finds it useful.

Link

Friday, November 25, 2016

Intro to Recommender systems - Users input to RS

Hello everybody. Before we proceed to the actual methods (content based, collaborative filtering) I want to briefly talk about users' input, aka feedback.

Feedback is what a user provides to the system over time. The system consumes it, has to interpret it, and can then give you recommendations based on that. There are two types of feedback: explicit and implicit. This coursera course has a very good video called "preferences and ratings".

Explicit feedback

Explicit feedback is explicitly given by the user, usually in the form of some rating, like 1 - 5 or stars, or simply thumbs up/down (which, for example, Spotify uses). At RecSys 2016 I heard during a discussion that explicit feedback is not considered as important as it used to be. The reasons mentioned were that it is rare and might not be correct. The rarity comes from the fact that explicitly rating things puts a higher cognitive load on a user. Even just thinking "is it 3 stars or 4 stars?" needs the user's attention. The finer the rating scale, the higher the cognitive load. Thumbs up/down is simpler than a 5-star rating scale, although for you as the RS designer it has weaker explanatory value. There is also an issue, which this talk mentions, called "a revealed preference". This means that people sometimes actually do something other than what they say they do. So your ratings might say that you don't really like a certain TV show that much, but your behavior may reveal that this show is the only one you watch every single night.

With regards to explicit feedback you also have to pay attention to the rating scale of each user. Naturally, not all of us will use the rating scale the same way. On a 5-star scale, someone may consider 3 stars to be average while someone else may consider 3 stars to be already pretty bad. The same applies to how big a range of the scale each user uses: some may only use the range from 3 to 5 stars, someone else may use 2 to 4. Hence, you have to do some normalization in order to be able to compare ratings of one user to ratings of another. One simple thing you can do is calculate a user's mean rating and then subtract it from all their ratings. Each record would then hold how many stars above/below their mean rating the user gave to this movie. It is good to do the same for the range of each user's scale, to put them on the same scale range. For more information take a look at the video I mentioned earlier, "preferences and ratings", from here and the connected articles and homework.
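The mean-centering step can be sketched in a few lines of Python. This is a minimal sketch; the data layout and names are my own, not from any particular library:

```python
def mean_center(ratings):
    """Subtract each user's mean rating from all of their ratings."""
    normalized = {}
    for user, user_ratings in ratings.items():
        mean = sum(user_ratings.values()) / len(user_ratings)
        normalized[user] = {item: r - mean for item, r in user_ratings.items()}
    return normalized

# Two users with different personal scales rating the same movies.
ratings = {
    "alice": {"movie1": 5, "movie2": 3, "movie3": 4},  # mean 4.0
    "bob":   {"movie1": 3, "movie2": 1, "movie3": 2},  # mean 2.0
}
centered = mean_center(ratings)
# After centering, both users express movie1 as "+1 above my average",
# so their ratings become comparable.
```

Note that this handles only the mean; to normalize each user's scale range as well, you would additionally divide by something like the user's rating standard deviation.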

Implicit feedback

Implicit feedback means interpreting user actions in your system. A typical example would be buying/viewing an item. When a user buys an item it is very likely that the user likes the item. If the user viewed the item but did not buy it, the probability that the user likes the item is lower than if they had bought it, but higher than if they had not even viewed it. As, for example, the Netflix study in the Recommender systems handbook describes, implicit feedback is much more abundant than explicit feedback, less noisy, and it does not require a user to do anything extra. A user can just be browsing through your system and you can be interpreting their actions along the way.

A special feature of implicit feedback compared to explicit feedback is that you don't really have clear negative feedback. When a user does not view an item you cannot tell if the user did not view it because they did not see the link to it, or because they were just particularly focused on finding something else at that time. On the other hand, there was a discussion at RecSys 2016 that if you show a user an item, for example, 10 times and you are pretty certain they saw it and did not interact with it, it is then likely they are really not interested in this item. To summarize, with implicit feedback we work with the likelihood that a certain action means the user is interested in an item. There is a general section in this paper talking about implicit feedback.
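To make the likelihood idea concrete, here is a toy sketch that maps implicit actions to confidence weights. The action names and weight values are invented for illustration; in a real system they would be tuned and likely learned:

```python
# Hypothetical confidence weights for implicit actions (made-up values).
ACTION_CONFIDENCE = {
    "purchased": 1.0,                  # strong signal of interest
    "viewed": 0.3,                     # weaker signal
    "shown_10x_no_interaction": -0.5,  # repeated exposure, no click: likely not interested
}

def preference_score(events):
    """Aggregate one user's events on one item into a single confidence score."""
    return sum(ACTION_CONFIDENCE.get(action, 0.0) for action in events)

score = preference_score(["viewed", "viewed", "purchased"])  # 0.3 + 0.3 + 1.0
```

The point is only that implicit signals come in different strengths and can even lean negative, as with the repeated-exposure case discussed above.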

With regards to negative user feedback you can ask: what if a user dismisses a recommendation, for example by clicking an "X" button or a thumbs-down button? I don't remember any particular discussion of whether this is still implicit feedback or explicit, because the user is actually taking an action to tell you "I don't like this one". Personally, I would put it into the implicit feedback bucket. Talking about dismissing, here is one interesting fact. In the online coursera course they mentioned a user study which found that some users were actually using the "X" dismiss button in the fashion of "show me more recommendations", because the system removed the recommendation immediately and filled its spot with a new one. Another interesting example I want to mention is around "swiping sideways" to dismiss a recommendation. During this presentation at RecSys 2016 the presenter mentioned that it is actually tricky to distinguish between "I read it and I am finished with it" (sometimes just reading the title is enough) and "I don't like it, don't show it to me again".

Monday, November 21, 2016

Intro to Recommender systems - are they used anywhere?

Hello All, happy to see you back. In this post I want to cover a question somebody might have: "I have not seen it, is it used anywhere?"

A simple answer would be yes, everywhere :) but let's dwell on it a little. From non-technical people I usually get the reaction "yeah, recommendations are all those ads that pop up around the place with offers for the stuff I was recently looking at or searching for". There is certainly some recommender system powering those, but recommender systems are also heavily used elsewhere in online services, such as online stores, online music or video sites, social networks, and others. Here are some of the players who heavily utilize a RS: Netflix, Amazon, YouTube, Google Now, Spotify. The Netflix case study in the Recommender systems handbook, chapter 11.2, contains more details on this and other examples of places using recommender systems.

Netflix is a video streaming company, and they say that about 80% of all viewership comes from what they recommended to users, not from what a user actively searched for. Their whole home page is produced by a RS: which rows appear on the landing page, their order, and the order of movies within each row. Even such a thing as a movie's thumbnail is chosen by a recommender system. Netflix even values their RS at $1 billion per year (link). Amazon, an online store, is very heavy on recommending the right items to the right audience in the right spots all around their pages. I have not worked for Amazon and don't know it for sure, but I believe it is pretty close to say that most of the Amazon landing page is put together with the help of a recommender system. The items in the FB feed are picked by a recommender system. And there would be many more examples...

If you were doubting whether RS are used anywhere nowadays, now you can see they are, and with more and more data they will become more and more important.

Other references:
https://www.cs.umd.edu/~samir/498/Amazon-Recommendations.pdf Description of Amazon's recommender system
https://code.facebook.com/posts/861999383875667/recommending-items-to-more-than-a-billion-people/ Scaling RS at Facebook
http://hpac.rwth-aachen.de/teaching/sem-mus-16/presentations/Pruefer.pdf Spotify's slides about their RS

Saturday, November 19, 2016

Intro to Recommender systems - Personalized & non-personalized

Hello everybody, happy to see you back. In this post I will focus on the most high level distinction of recommendations: Personalized and non-personalized

Non-personalized

Usually people think a recommendation always has to be tailored to each single individual, but that is not entirely true. Coming back to the first post and the example of a librarian recommending you a book to read, the librarian may say "this book has been super popular among readers within the last month". This is naturally a recommendation, because the librarian is advising (recommending) you what to read. But this recommendation is not specific to you as a person. Everybody walking into that library can hear the same sentence.

Non-personalized recommendations, as the name says, are the same for everybody and are usually put together as some form of aggregated statistics. This can be: most bought, most viewed, most shared, most talked about, etc. As you can see, there are particular areas where these make sense. If you are going on vacation you might welcome a recommendation that says "this is the most bought location for people from your city". Or when you are looking at hotels you may welcome seeing the hotels with the best ratings at the top. If you want more details on non-personalized recommendations, here is a coursera course on it.

Personalized

On the other hand, as the name says, this group of recommendations is tailored to each individual. So every person sees different recommendations. Technically, this is usually done by looking at each person's history of behavior. We do that in normal life as well: when you know that your friend has bought their last 10 shirts from a particular brand, you can easily recommend them a new t-shirt from this same brand, because you know the history of what they bought.

Because we look at a person's history, there is one assumption that has to hold: people's past has to be able to predict their future. In other words, if you like a t-shirt today there is a very high chance you will like it tomorrow. Of course, people's taste evolves over time, but it does not jump from positive to negative back and forth over the course of days. If this does not hold, you cannot use people's history to give them recommendations. And that makes sense even in normal life: if I buy t-shirts completely at random (just because I find it enjoyable), you will not be able to repeatedly recommend me a t-shirt that I would end up liking and buying.

How to do it?

Now that we know the basic assumption stated above holds, what approaches can we actually take? Let's take t-shirts as an example. The first approach is to look at the features of the t-shirts that I bought in the past. The features could be, for example, color, size, brand, style, material, quality, etc. So if you observe from my buying history that I like blue sport shirts made of good-quality cotton, you can then easily decide whether or not to recommend to me any other t-shirt you come across. One important thing I have to mention is that we have to decide and pick which features are actually important to people when buying t-shirts, and then we have to track those features. If we happen to track the wrong features, such as only tracking the color of the t-shirt's label, we would never be able to make a good recommendation.
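The t-shirt feature idea can be sketched as a tiny content-based scorer. The items, features, and the simple counting score below are all invented for illustration:

```python
from collections import Counter

def build_profile(purchase_history):
    """Count how often each feature appears in the items a user bought."""
    profile = Counter()
    for item in purchase_history:
        profile.update(item["features"])
    return profile

def score(profile, candidate):
    """Score a candidate item by how well its features match the profile."""
    return sum(profile[f] for f in candidate["features"])

history = [
    {"name": "shirt A", "features": {"blue", "cotton", "sport"}},
    {"name": "shirt B", "features": {"blue", "cotton", "casual"}},
]
profile = build_profile(history)
candidate = {"name": "shirt C", "features": {"blue", "cotton", "sport"}}
# "blue" appears twice in history, "cotton" twice, "sport" once -> score 5
match = score(profile, candidate)
```

A real system would weight features and normalize scores, but the principle is the same: describe items by features and match new items against the user's feature profile.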

The other approach you can take is to look purely at similar behavior of people. Staying with the t-shirt example: if you know that my brother and I bought the same t-shirts in 90% of cases in the past, then you can safely recommend to me a t-shirt that my brother just bought and I have not come across yet. We need to realize that this approach is based purely on people's similar history of behavior. And because the assumption holds, there is a very good likelihood that if we have bought the same t-shirts quite often in the past, we will continue doing so in the near future.
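The brother example can be sketched as a toy user-based filter using Jaccard similarity over purchase sets. All names and data here are made up, and real systems use far more refined similarity measures:

```python
def jaccard(a, b):
    """Similarity of two purchase sets: overlap divided by union."""
    return len(a & b) / len(a | b) if a | b else 0.0

purchases = {
    "me":       {"tee1", "tee2", "tee3"},
    "brother":  {"tee1", "tee2", "tee3", "tee4"},  # almost identical history
    "stranger": {"tee9"},
}

def recommend(user, purchases):
    """Recommend items the most similar other user bought that `user` has not."""
    others = [u for u in purchases if u != user]
    most_similar = max(others, key=lambda u: jaccard(purchases[user], purchases[u]))
    return purchases[most_similar] - purchases[user]

recs = recommend("me", purchases)  # "brother" is most similar -> {"tee4"}
```

Notice that no item features appear anywhere; the recommendation comes purely from overlapping behavior.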

In the field of recommender systems these two approaches are called:
  1. Content based (the one looking at features of the t-shirt)
    1. This method is based on anything related to the actual item. Be it metadata like genre, main actor, or the actual content like written text for books.
  2. Collaborative filtering (the other one looking at behavior of people)

I will talk about each of them in more details in the coming posts.


Conclusion

The non-personalized approach is based on aggregated statistics like most popular, most watched, most bought, etc. In the non-personalized category of recommendations everybody sees the same thing. The personalized approach tailors recommendations to each individual, and it can be based on features of items or on the behavior of people in connection with the items.

Thursday, November 17, 2016

Intro to recommender systems - When and why did we start talking about RS?

We covered the "definition" in the first post, and we now know that this domain is very broad. In this post I want to elaborate on when we actually started being more interested in computer-based RS, and why.

From the previous post we know that the aspect of receiving a recommendation has always been part of people's lives. We started talking much more about RS when people started consuming much more content online, be it books, videos, items, anything. There are two reasons why RS became so popular in online consumption. One obvious, though less important, reason is that in the online world you are in most cases facing computers, not people, but you would still like to get advice when you are looking for something. The other reason has much more to do with the breadth of the offerings that online stores have. I will illustrate it with an example.