Machine learning at Facebook

Last week Max Gubin gave a talk on how Facebook exploits machine learning. The talk was more technical than one might have expected, so I am sharing some interesting facts in this post. I hope it doesn’t disclose any secret information. :)

Max started with a screenshot of the Facebook interface in which every element that benefits from machine learning was highlighted. To name a few, they use learning to predict the order of stories in the newsfeed, to choose which groups, ads, and chat contacts to show you, and even to detect occasions when your account has supposedly been hacked, so that you need to verify your identity. For most of these tasks the learning is probably trivial, but at least two of them involve complicated algorithms (see below).

Facebook engineers face a number of difficulties. A user expects a page to load almost instantly, though the network infrastructure already imposes some lag. To avoid delaying it further, prediction should be done in tens of microseconds. Moreover, half a billion daily active users send a lot of queries, so massively parallel implementations would be too expensive. They have no choice other than to stick to linear models. For example, they train a linear fitness function to rank the stories for the newsfeed (using e.g. hinge loss or logistic loss).

The fitness function should be trained to satisfy multiple, often contradictory, criteria. For example, maximizing the personal user experience (showing the most interesting stories) might hurt the experience of other users (if someone has few friends, those friends are the only users who can read his/her posts) or degrade the system as a whole (certain types of news might not be really interesting to anyone, yet are necessary to improve the connectivity of the social network). Those criteria should be balanced in the learning objective, and the coefficients change over time.

Even the personal user experience cannot be measured easily. The obvious thing to try is to ask users to label interesting stories (or to use their Likes). However, such tests are always biased: Facebook tried to use this subjective labelling three times, and all three attempts were unsuccessful. Users just don’t tell what they really like.
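To make the linear fitness function a bit more concrete, here is a minimal sketch of a linear scorer trained with the logistic loss by stochastic gradient descent and then used to rank stories. The talk did not show any code, so everything here is my own assumption: the three feature names, the toy training data, and the hyperparameters are all hypothetical and only illustrate why prediction is cheap (one dot product per story).

```python
import math
import random

def score(weights, features):
    """Linear fitness: a single dot product per story."""
    return sum(w * x for w, x in zip(weights, features))

def train_logistic(examples, n_features, epochs=200, lr=0.1, seed=0):
    """Fit weights by SGD on the logistic loss log(1 + exp(-y * score)).

    examples: list of (features, label) pairs with label in {-1, +1}.
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)
        for features, label in data:
            margin = label * score(weights, features)
            # Gradient of log(1 + exp(-margin)) w.r.t. the score.
            grad = -label / (1.0 + math.exp(margin))
            for i, x in enumerate(features):
                weights[i] -= lr * grad * x
    return weights

def rank(weights, stories):
    """Return story ids sorted from highest to lowest score."""
    return sorted(stories, key=lambda sid: score(weights, stories[sid]),
                  reverse=True)

# Hypothetical binary features: [from_close_friend, has_photo, is_old]
train = [
    ([1.0, 1.0, 0.0], +1),
    ([1.0, 0.0, 0.0], +1),
    ([0.0, 1.0, 1.0], -1),
    ([0.0, 0.0, 1.0], -1),
]
weights = train_logistic(train, n_features=3)
stories = {"a": [0.0, 0.0, 1.0], "b": [1.0, 1.0, 0.0], "c": [0.0, 1.0, 0.0]}
ranking = rank(weights, stories)
```

Balancing several criteria would then amount to training on a weighted sum of such losses, one per criterion, with the weights being the coefficients that change over time.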

Another challenge is the quickly changing environment. For example, interest in specific ads may be seasonal. In advertising, one of the strategies is to maximize the click-through rate (CTR). The model for personalized ads should be able to learn online so that it adapts to changes efficiently. They use probit regression, where online updates can be written in closed form, unlike logistic regression (note the linear model again!). It is based on Microsoft’s TrueSkill™ method for learning ranks of players to find good matches, and seems similar to what Bing uses for CTR maximization [Graepel et al., 2010].
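For the curious, here is a rough sketch of what such a closed-form online update can look like. It follows the update equations of the Bing paper as I understand them (independent Gaussian beliefs over the weights of binary features), not anything Max showed, so treat it as an illustration under that assumption. The class name, the prior variance, and the noise parameter beta are my own choices.

```python
import math

def normal_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

class OnlineProbit:
    """Keeps an independent Gaussian belief N(mean[i], var[i]) per weight."""

    def __init__(self, n_features, prior_var=1.0, beta=1.0):
        self.mean = [0.0] * n_features
        self.var = [prior_var] * n_features
        self.beta = beta  # assumed noise scale of the probit link

    def _total_std(self, active):
        return math.sqrt(self.beta ** 2 + sum(self.var[i] for i in active))

    def predict(self, active):
        """Click probability given the indices of the active binary features."""
        s = sum(self.mean[i] for i in active)
        return normal_cdf(s / self._total_std(active))

    def update(self, active, clicked):
        """One closed-form update from a single impression (no iterations)."""
        y = 1.0 if clicked else -1.0
        total_std = self._total_std(active)
        t = y * sum(self.mean[i] for i in active) / total_std
        v = normal_pdf(t) / normal_cdf(t)  # correction to the means
        w = v * (v + t)                    # shrink factor for the variances
        for i in active:
            self.mean[i] += y * (self.var[i] / total_std) * v
            self.var[i] *= 1.0 - (self.var[i] / total_std ** 2) * w

model = OnlineProbit(n_features=3)
p_before = model.predict([0, 1])
for _ in range(20):
    model.update([0, 1], clicked=True)   # features 0 and 1 keep getting clicks
    model.update([2], clicked=False)     # feature 2 never does
p_clicked = model.predict([0, 1])
p_ignored = model.predict([2])
```

Each impression changes the means and shrinks the variances in a single pass, which is what makes the model suitable for online learning. A production version would need extra care, for example guarding the ratio v against underflow for very surprising outcomes.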

Finally, Max mentioned the problem of evaluating new features. The common practice in the industry is A/B testing, where a group of users is randomly selected to test some feature, and the rest of the users are treated as the control group. Then they compare indicators for those two groups (e.g. average time spent on the website, or clicks made on the newsfeed stories) and apply statistical tests. The samples are typically small. For example, if they want to test a feature for search in Chinese, they take a small group of 10 million users and hope that some of them will query in Chinese (recall that Facebook is unavailable in China). It is typically hard to prove a statistically significant improvement.
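To illustrate why small samples make significance hard to prove, here is a minimal sketch of one common statistical test for such comparisons, a two-sided two-proportion z-test. I don't know which tests Facebook actually applies; the choice of test and all the numbers below are invented for illustration.

```python
import math

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value), where z > 0 means group B has the higher rate.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Made-up numbers: the same lift from 10% to 11% at two sample sizes.
z_small, p_small = two_proportion_z_test(100, 1000, 110, 1000)
z_large, p_large = two_proportion_z_test(10000, 100000, 11000, 100000)
```

With a thousand users per group the lift is indistinguishable from noise, while with a hundred thousand per group the same lift is clearly significant. If only a tiny fraction of the 10 million users ever queries in Chinese, the effective sample is closer to the first case.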

It was partially a hiring event. If you are looking for an internship or a full-time job, you may contact Marina, their HR specialist in Eastern Europe. Facebook also keeps in touch with universities, e.g. invites professors to give talks in their office or to develop joint courses. Professors may apply, but I don’t know a contact for that.
