Monday, December 4, 2017

Article on A/B testing

Hello folks!

Just last week I came across an interesting article from Netflix about A/B testing. It drew my attention because I have also been thinking recently about how to shorten the time and the number of users needed for A/B testing. I encourage all of you to read it.

One thought I immediately had (and posted there) is that as you go through courses and conferences, you keep hearing and reading that the impression made by the list as a whole matters. So in this approach, what is the limit to how much the outputs of the two tested algorithms can differ? If they are too different, would the interleaved list as a whole appear to the end user as a "random" draw of items?
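To make the question concrete, here is a minimal sketch of team-draft interleaving, one common variant of the technique; I'm assuming this flavor just for illustration, and the function names and parameters below are mine, not from the article:

```python
import random

def team_draft_interleave(ranking_a, ranking_b, k=10):
    """Team-draft interleaving: each round, a coin flip decides which
    ranker drafts first; each drafts its top item not yet shown, and
    every shown item is credited to the ranker that drafted it."""
    interleaved, credit, used = [], [], set()
    while len(interleaved) < k:
        # Random draft order for this round (the coin flip).
        order = random.sample([("A", ranking_a), ("B", ranking_b)], 2)
        progressed = False
        for team, ranking in order:
            # Highest-ranked item from this ranker not already shown.
            pick = next((item for item in ranking if item not in used), None)
            if pick is not None and len(interleaved) < k:
                interleaved.append(pick)
                used.add(pick)
                credit.append(team)
                progressed = True
        if not progressed:  # both rankings exhausted before reaching k
            break
    return interleaved, credit

# Clicks on items credited to "A" vs "B" give the preference signal.
shown, credit = team_draft_interleave(list("abcdef"), list("uvwxyz"), k=6)
```

Note how every item the user sees comes straight from one of the two rankings, which is exactly why I wonder what happens when the two rankings have little in common.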

I don't have answers, just a feeling that if the two tested outputs differ too much, the result might seem too "random" to the user. Does anybody have thoughts? I am curious to hear your opinions.
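As a rough way to quantify "too different", one could measure how much the two top-k lists overlap before deciding to interleave them. This little helper is purely my own illustration, not anything from the article:

```python
def topk_jaccard(ranking_a, ranking_b, k=10):
    """Jaccard similarity of the two top-k sets: 1.0 means identical
    membership (ordering aside), 0.0 means completely disjoint lists."""
    top_a, top_b = set(ranking_a[:k]), set(ranking_b[:k])
    return len(top_a & top_b) / len(top_a | top_b)

# E.g., interleave only when topk_jaccard(a, b) exceeds some threshold;
# what that threshold should be is exactly the open question above.
```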