Facebook Data Scientist Interview - Measure Marketplace

Поделиться
HTML-код
  • Опубликовано: 27 ноя 2024

Комментарии • 20

  • @nanfengbb
    @nanfengbb 3 года назад +12

    Great mock interview! Thanks for uploading it. The candidate did a great job on the market place case study. If I have to nitpick, when the candidate followed the AARRR framework and picked # of downloads and # of login as metrics for acquisition/activation, I don't think those general-purpose metrics apply to market place. To my knowledge, MP is not a stand-alone website/app nor requires separate login; you access it through FB website/app using your FB account. Correct me if I am wrong.

  • @Quantum_Quester
    @Quantum_Quester 2 года назад +4

    Thanks for the mock video.
    It is good to know the kinds of notes the interviewers put down.
    I have learnt here that the interview is really about asking intelligent questions (avoid assumptions), being logical in your approach (rather than guess) and 'showing' a good understanding of the problem (having understanding is not enough, you must talk about it in every step).
    However, I am not sure the real interviewers are usually this patient from experience because there is an expectation to answer certain number of questions before the interview is over.

  • @mindfuel-ness
    @mindfuel-ness 3 года назад +10

    I feel the second ask needs more data points to build a query for desired output. You can filter user who published in last 7 days from user_activity but you can not tie these published post to post_reaction without a post_id in user_activity.
    At best you can only work with post_reaction table by filtering the date for last 7 days, and grouping by the post_id without knowing what user published this what day.
    Let me know if someone felt the same way..

    • @pss1030
      @pss1030 3 года назад +1

      why do you need post_id in user_activity?? It does not matter what post it is, interviewer wants you to find number of posts that received more emojis than likes from any users who published in last 7 days. For example, if you published anything (does not matter how many times) in last 7 days, I want you to find number of posts you posted that received more emojis than likes.

    • @kanchansatpute8112
      @kanchansatpute8112 3 года назад +5

      I have the same concern! The second table has UID of users who reacted to the posts, not of the ones who have posted, so it does not make sense to do a join on that field. There should be a post_id field in the first table!

    • @mindfuel-ness
      @mindfuel-ness 3 года назад

      @@pss1030 how do you plan to find out what user posted what posts without user_id in the other table?

    • @66baller
      @66baller 2 года назад

      @@pss1030 The corner case here is that if the publisher likes his own post today that he published two months ago then you are now including this post but shouldnt. Post_id could be added as a foreign key to user_activity to address this but not sure if this messes up some of the other questions.

  • @杰奇
    @杰奇 3 года назад +3

    for the second query, if one user has published 2 posts on the same day, it will left join both posts with each other and thus have 4 rows which only should have 2.

    • @pss1030
      @pss1030 3 года назад

      It does not matter in this case because date is not considered. Although left join was not necessary as I'd assume there wouldn't be any posts without user activity - so inner join should be fine.

  • @ryanblackmon13
    @ryanblackmon13 2 года назад

    im on question 2 right now - feeling like this would be easier with a pivot of the data :)
    --q1
    SELECT Date
    ,Publish_Cnt
    ,Publish_Cnt / (Publish_Cnt+Comment_Cnt+Upload_Cnt) AS Publish_Pct
    ,Comment_Cnt
    ,Comment_Cnt / (Publish_Cnt+Comment_Cnt+Upload_Cnt) AS Comment_Pct
    ,Upload_Cnt
    ,Upload_Cnt / (Publish_Cnt+Comment_Cnt+Upload_Cnt) AS Upload_Pct
    FROM
    (
    SELECT Date
    ,MAX(CASE WHEN Activity = 'Publish' THEN TTL END) AS Publish_Cnt
    ,MAX(CASE WHEN Activity = 'Comment' THEN TTL END) AS Comment_Cnt
    ,MAX(CASE WHEN Activity = 'Upload' THEN TTL END) AS Upload_Cnt
    FROM
    (
    SELECT Date
    ,Activity
    ,COUNT(*) AS TTL
    FROM user_activity
    GROUP BY 1,2
    )SQ1
    )SQ2
    --q2
    COUNT(*)
    FROM
    (
    SELECT Post_Id
    FROM
    (
    SELECT Post_Id
    ,MAX(CASE WHEN Reactions = 'Emoji' THEN TTL END) AS Emoji_Cnt
    ,MAX(CASE WHEN Reactions = 'Like' THEN TTL END) AS Like_Cnt
    FROM
    (
    SELECT Post_Id
    ,Reactions
    ,COUNT(*) AS TTL
    FROM post_reactions PR
    INNER JOIN user_activity UA ON (UA.UID = PR.UID AND UA.Date > today()-7)
    GROUP BY 1,2
    )SQ1
    )SQ2
    WHERE Emoji_Cnt > Like_Cnt
    )SQ3

    • @nj6553
      @nj6553 2 года назад

      Use CTEs for everyone's sakes.

  • @Dipesh_Sol
    @Dipesh_Sol Год назад

    Second query (Replace IF with CASE for PSQL)
    SELECT
    Post_ID,
    SUM(IF(Reactions='Like', 1, 0)) AS like_count,
    SUM(IF(Reactions='Emoji', 1, 0)) AS emoji_count
    FROM post_reactions
    WHERE UID IN (
    SELECT ua.UID
    FROM user_activity
    WHERE Date BETWEEN CURRENT_DATE - INTERVAL '7 days' AND CURRENT_DATE
    AND Activity = 'Publish'
    )
    WHERE Reactions = 'Like' OR Reactions = 'Emoji'
    GROUP BY Post_ID
    HAVING
    SUM(IF(Reactions='Like', 1, 0)) < SUM(IF(Reactions='Emoji', 1, 0))

  • @fmxy
    @fmxy 3 года назад +1

    I think it should be just 2-3 metrics, I didn't like product section that much

  • @saarmadar
    @saarmadar 3 года назад +4

    The 2nd table doesn't make sense to me at all

    • @tankstocks
      @tankstocks 3 года назад +1

      I kinda agree, I am not sure the USER_ID refers to a user who provided the post or the reaction.

  • @songai2618
    @songai2618 2 года назад

    The interviewer is so patient given the situation....

  • @zoeyzhao4446
    @zoeyzhao4446 3 года назад +1

    this is supppper helpful!!!! is someone can help me understandig why expecting candidate elaborate more on the user experience before dive into the question?

    • @mindfuel-ness
      @mindfuel-ness 3 года назад +1

      Because they don't just want to test you for how fast you can solve a leetcode problem but also see how you can tie your numbers to business side

  • @weicaroline1563
    @weicaroline1563 3 года назад +5

    WITH du AS (SELECT
    u.Activity
    , u.Date
    ,count(UID) as sum_act
    FROM user_activity as u
    GROUP BY
    u.Activity, u.Date) ,
    Daily AS (
    SELECT u.Date
    ,COUNT(UID) as d_sum_act
    GROUP BY 1)
    SELECT du.*,du.sum_act/daily.d_sum_act
    FROM du
    INNER JOIN Daily
    ON du.Date=Daily.Date
    Here is my answer to the first part of the question, any comments/feedbacks would be appreciated thank you!

    • @ajitkirpekar4251
      @ajitkirpekar4251 2 года назад

      Thats how I would have done this if it makes you feel any better.