Tuesday, 4 December 2012

Machine learning at Facebook

Last week Max Gubin gave a talk on how Facebook exploits machine learning. The talk was more technical than one might have expected, so I am sharing some interesting facts in this post. I hope it doesn’t disclose any secret information. :)

Max started with a screenshot of the Facebook interface in which every element that benefits from machine learning was highlighted. To name just a few, they use learning to predict the order of stories in the newsfeed, to choose which groups, ads and chat contacts to show you, and even to detect when your account has supposedly been hacked, so that you need to verify your identity. For most of these tasks the learning is probably trivial, but at least two of them involve complicated algorithms (see below).

Facebook engineers face a number of difficulties. A user expects a page to load almost instantly, although the network infrastructure already imposes some lag. To avoid delaying it further, prediction should be done in tens of microseconds. Moreover, half a billion daily active users send a lot of queries, so massively parallel implementations would be too expensive. They have no choice other than sticking to linear models. For example, they train a linear fitness function to rank the stories for the newsfeed (using e.g. hinge loss or logistic loss). It should be trained to satisfy multiple criteria, which often contradict each other. For example, maximizing the personal user experience (showing the most interesting stories) might hurt the experience of other users (if one has few friends, they are the only users who can read his/her posts) or degrade the system as a whole (certain types of news might not be really interesting to anyone, yet are necessary to improve the connectivity of the social network). Those criteria should be balanced in the learning objective, and the coefficients change over time.

Even the personal user experience cannot be measured easily. The obvious thing to try is to ask users to label interesting stories (or to use their Likes). However, such tests are always biased: Facebook tried to use this subjective labelling three times, and all three attempts were unsuccessful. Users just don’t tell what they really like.
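To make the idea of a linear ranking function concrete, here is a minimal sketch of one trained with logistic loss. It is not Facebook's implementation: the feature layout (bias, "from a close friend", "has a photo"), the toy labels, the learning rate and the function names are all assumptions made up for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, x):
    # Linear fitness function: a plain dot product, cheap at serving time.
    return sum(wi * xi for wi, xi in zip(w, x))


def train_logistic(examples, n_features, lr=0.1, epochs=200):
    """Fit weights by batch gradient descent on the mean logistic loss.

    examples: list of (features, label) pairs with label in {0, 1}.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for x, y in examples:
            err = sigmoid(score(w, x)) - y  # derivative of the loss w.r.t. the score
            for i, xi in enumerate(x):
                grad[i] += err * xi
        w = [wi - lr * g / len(examples) for wi, g in zip(w, grad)]
    return w


def rank(stories, w):
    """Return (story_id, features) pairs sorted by score, best first."""
    return sorted(stories, key=lambda s: score(w, s[1]), reverse=True)


# Hypothetical features: [bias, from_close_friend, has_photo].
# Hypothetical labels: 1 if the user interacted with the story, else 0.
training = [([1, 1, 0], 1), ([1, 1, 1], 1), ([1, 0, 0], 0), ([1, 0, 1], 0)]
weights = train_logistic(training, n_features=3)
feed = rank([("a", [1, 0, 1]), ("b", [1, 1, 0])], weights)
print([story_id for story_id, _ in feed])
```

Balancing several criteria could, under the same assumptions, be expressed as a weighted sum of such losses, with the weights re-tuned as priorities change.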

Another challenge is the quickly changing environment. For example, interest in specific ads may be seasonal. In advertising, one of the strategies is to maximize the click-through rate (CTR). The model for personalized ads should be able to learn online so that it adapts to changes efficiently. They use probit regression, where, unlike in logistic regression, online updates can be written in closed form (note the linear model again!). It is based on Microsoft’s TrueSkill™ method for learning the ranks of players to find good matches, and seems similar to what Bing uses for CTR maximization [Graepel et al., 2010].
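The closed-form online update can be illustrated with a small sketch in the style of the Bayesian probit model published by Graepel et al. (2010). Each weight carries a Gaussian belief (mean and variance), and every observed click or non-click updates the beliefs of the active binary features. The class name, the prior variance and the noise scale `beta` are assumptions for illustration; this is not the model Facebook actually runs.

```python
import math


def normal_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def normal_cdf(t):
    # erfc keeps precision in the lower tail.
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


class OnlineProbit:
    """Linear probit model with an independent Gaussian belief per weight."""

    def __init__(self, n_features, prior_var=1.0, beta=1.0):
        self.mu = [0.0] * n_features         # belief means
        self.var = [prior_var] * n_features  # belief variances
        self.beta = beta                     # assumed noise scale

    def _stats(self, active):
        total_var = self.beta ** 2 + sum(self.var[i] for i in active)
        mean = sum(self.mu[i] for i in active)
        return mean, total_var

    def predict(self, active):
        """Predicted click probability for the indices of active binary features."""
        mean, total_var = self._stats(active)
        return normal_cdf(mean / math.sqrt(total_var))

    def update(self, active, y):
        """Closed-form update for one impression; y is +1 (click) or -1 (no click)."""
        mean, total_var = self._stats(active)
        s = math.sqrt(total_var)
        t = y * mean / s
        c = normal_cdf(t)
        # v and w are the usual truncated-Gaussian correction terms;
        # fall back to the asymptote v ~ -t if the CDF underflows to zero.
        v = normal_pdf(t) / c if c > 0.0 else -t
        w = v * (v + t)
        for i in active:
            var_i = self.var[i]
            self.mu[i] += y * (var_i / s) * v
            self.var[i] = var_i * (1.0 - (var_i / total_var) * w)


model = OnlineProbit(n_features=3)
before = model.predict([0, 1])
model.update([0, 1], +1)  # one observed click on an ad with features 0 and 1
after = model.predict([0, 1])
print(before, after)
```

Each update touches only the active features and needs no iteration, which is what makes this kind of model attractive when the data distribution drifts.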

Finally, Max mentioned the problem of evaluating new features. The common practice in the industry is A/B testing, where a group of users is randomly selected to test some feature, and the rest of the users are treated as the control group. Then they compare the indicators for those two groups (e.g. average time spent on the website, or clicks made on the newsfeed stories) and apply statistical tests. In practice, the relevant samples are typically small. For example, if they want to test a feature for search in Chinese, they take a small group of 10 million users and hope that some of them will query in Chinese (recall that Facebook is unavailable in China). It is typically hard to prove a statistically significant improvement.
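A minimal sketch of such a comparison is a two-proportion z-test on a click indicator. The talk did not say which tests Facebook applies, so the choice of test, the function name and the counts below are all assumptions.

```python
import math


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Two-sided z-test for a difference in click rates between groups A and B.

    Returns (z, p_value), using the pooled-variance normal approximation.
    """
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical counts: the same 10% vs 11.5% click rates at two sample sizes.
z_small, p_small = two_proportion_z_test(200, 2000, 230, 2000)
z_large, p_large = two_proportion_z_test(2000, 20000, 2300, 20000)
print(p_small, p_large)
```

With these made-up numbers, the difference is not significant at the 0.05 level with 2,000 users per group, but it is with 20,000. This is why a feature that only a small fraction of the test group ever uses is so hard to evaluate.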

It was partly a hiring event. If you are looking for an internship or a full-time job, you may contact their HR specialist in Eastern Europe, Marina. Facebook also keeps in touch with universities, e.g. it invites professors to give talks in its office or to develop joint courses. Professors may apply, but I don’t know a contact for that.
