From Scratch: How to Code K-Nearest Neighbors in Python for Machine Learning Interviews!

  • Published: 4 Oct 2024

Comments • 29

  • @emma_ding
    @emma_ding  3 years ago +3

    Thanks Jiayi Wu -- At 4:49, for the implementation of the self.distance function, you can refer to this video. ruclips.net/video/uLs-EYUpGAw/видео.html

  • @stella123www
    @stella123www 3 years ago +4

    Hey Emma, I appreciate the time and effort you put into creating such amazing content. Your channel helped me get a DS offer from a top tech company; the A/B testing series is intuitive. Needless to say, your ML-related topics aren't as complex as other sources: they are easy to understand, and the implementation part is awesome. Looking forward to watching more ML-related videos!

  • @shanggao9970
    @shanggao9970 2 years ago

    Thanks Emma! This is helpful! Looking forward to the videos on optimizing naive KNN algo and more ML algo!

  • @techedu8776
    @techedu8776 8 months ago

    The time complexity is not O(MN), since N (the number of features) is actually constant; thus the complexity is O(M) + O(M log M) = O(M log M).
    Also, it is unnecessary to store all the distances: only the top k matter, and a constant k-size heap could hold them, so the space complexity is constant.
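    The bounded k-size heap described in this comment can be sketched with Python's heapq; the Euclidean distance computed inline here is an assumption, since the video's own distance helper isn't shown:

```python
import heapq

def k_nearest_labels(query, train_points, train_labels, k):
    """Keep only the k closest (distance, label) pairs in a max-heap of
    size k (distances negated so the root is the farthest kept neighbor),
    so extra space stays O(k) instead of O(M)."""
    heap = []  # entries are (-distance, label); ties break on label
    for point, label in zip(train_points, train_labels):
        # Assumed Euclidean distance; replace with the video's helper
        dist = sum((a - b) ** 2 for a, b in zip(query, point)) ** 0.5
        if len(heap) < k:
            heapq.heappush(heap, (-dist, label))
        elif -dist > heap[0][0]:  # closer than the farthest kept neighbor
            heapq.heapreplace(heap, (-dist, label))
    return [label for _, label in heap]

labels = k_nearest_labels((0,), [(0,), (1,), (2,), (3,)], [0, 0, 1, 1], 2)
print(sorted(labels))  # [0, 0]
```

    Each of the M training points costs at most O(log k) heap work, giving O(M log k) time overall, which matches the comment's point that the full O(M log M) sort is avoidable.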

  • @ZhensongRen
    @ZhensongRen 2 years ago +1

    log(n_samples) is not necessarily larger than n_features: 2**20 ≈ 1 million and 2**30 ≈ 1 billion, so even a billion samples gives log(n_samples) ≈ 30, and we could easily have more than 30 features.

  • @hankyujang7928
    @hankyujang7928 3 years ago

    I really appreciate your effort in preparing this content! By far, going over your videos is the most efficient way for me to recap the key concepts while preparing for data science interviews. Thanks a lot!

    • @emma_ding
      @emma_ding  3 years ago

      I'm glad you like them! Thanks for taking the time to tell me and best of luck with your interviews.

  • @shreekantgosavi4726
    @shreekantgosavi4726 3 years ago +3

    Helpful

  • @ashritkulkarni9186
    @ashritkulkarni9186 3 years ago

    Thanks a lot for your content and effort. Your videos are very helpful for revision. Looking forward to more videos on other algorithms.

  • @nanwang6927
    @nanwang6927 2 years ago

    Hi Emma, thank you so much for making awesome videos! They helped me a lot!

  • @mrfmorita
    @mrfmorita 2 years ago

    Incredible video! Awesome explanation

  • @MinhNguyen-lz1pg
    @MinhNguyen-lz1pg 1 year ago

    Thank you Emma! Great content as always. It's so hard to find reliable resources for MLE interview prep nowadays, haha.

    • @emma_ding
      @emma_ding  1 year ago

      Thanks for your comment, Minh. I'm glad to hear you're finding my content helpful! 😊

  • @licdad3066
    @licdad3066 2 years ago +1

    I think the predict function only handles a single query point. I have updated it to handle a whole dataset:
    def predict(self, data, k):
        predict_output = []
        for point in data:
            distance_label = [
                (self.get_distance(point, train_point), train_label)
                for train_point, train_label in zip(self.x, self.y)
            ]
            neighbors = sorted(distance_label)[:k]
            predict_output.append(sum(label for _, label in neighbors) / k)
        return predict_output
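    As a self-contained sanity check, this dataset-level predict can be dropped into a minimal KNN class; the get_distance helper here is a hypothetical Euclidean implementation, since the actual one is only shown in the linked video:

```python
class KNN:
    def __init__(self, x, y):
        self.x = x  # training points
        self.y = y  # training labels

    def get_distance(self, p, q):
        # Hypothetical Euclidean distance helper (assumed, not from the video)
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    def predict(self, data, k):
        predict_output = []
        for point in data:
            distance_label = [
                (self.get_distance(point, train_point), train_label)
                for train_point, train_label in zip(self.x, self.y)
            ]
            neighbors = sorted(distance_label)[:k]
            # Mean of neighbor labels: a class-1 probability for 0/1 labels
            predict_output.append(sum(label for _, label in neighbors) / k)
        return predict_output

model = KNN([(0, 0), (0, 1), (5, 5), (6, 5)], [0, 0, 1, 1])
print(model.predict([(0, 0), (5, 5)], k=2))  # [0.0, 1.0]
```

    Averaging the neighbor labels returns a score rather than a hard class; thresholding at 0.5 (or taking a majority vote) would turn it into a classifier.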

  • @jennywu799
    @jennywu799 3 years ago

    Hi Emma, thank you so much for all your videos, they are all super helpful! Can you please do more ML and Python coding videos in the future?

    • @emma_ding
      @emma_ding  3 years ago

      Yep! More to come, stay tuned!

  • @bettysi888
    @bettysi888 1 year ago

    Amazing content, thank you so much 🤟

  • @SuperLOLABC
    @SuperLOLABC 3 years ago +1

    Great video as always, Emma! Do you think they ask about A/B testing in new-grad interviews if it's not in the job description?

    • @emma_ding
      @emma_ding  3 years ago

      It depends on the company; if they need A/B testing experts, they might ask during the interview process. Also, don't fully trust the job description. I'd highly recommend asking the recruiter what kinds of questions will be asked so that you can prepare accordingly!

  • @leolloyd2813
    @leolloyd2813 9 months ago

    Isn't the space complexity of the distance_label array going to be O(m) + O(n)? We first calculate the distance over each of the m features, sum it into a single value, and then store that value for each of the n points in the training set.

  • @CC-ji1rb
    @CC-ji1rb 2 years ago

    Hi Emma, thank you so much for your videos! I learned so much from them. Can you do one on Decision Tree ?

    • @emma_ding
      @emma_ding  2 years ago

      Hi CC! Thank you for the feedback, glad you find my content helpful! Sure, we can do a video on Decision Tree, stay tuned! :)

  • @roguenoir
    @roguenoir 2 years ago

    Would it be possible to use a min heap instead of sorting the points by distance (and implement this in linear time instead of NlogN)?
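    One way to sketch the idea in this question: Python's heapq.nsmallest maintains a bounded heap of size k, selecting the k nearest pairs in O(N log k), which is linear in N for fixed k (truly linear selection would need quickselect):

```python
import heapq

def k_nearest(distance_label, k):
    """Select the k smallest (distance, label) pairs without a full sort.

    heapq.nsmallest scans the list once with a size-k heap, costing
    O(N log k) time, versus O(N log N) for sorted(distance_label)[:k].
    """
    return heapq.nsmallest(k, distance_label)

pairs = [(2.0, 1), (0.5, 0), (3.0, 1), (1.0, 0)]
print(k_nearest(pairs, 2))  # [(0.5, 0), (1.0, 0)]
```

    For k much smaller than N this is effectively linear; an exact O(N) bound is possible with a quickselect-style partition, at the cost of more code.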

  • @likhithadusanapudi8212
    @likhithadusanapudi8212 9 months ago

    Nice video! May I get the source code from this video?

  • @Ragnarik17
    @Ragnarik17 3 years ago

    The interviewer asked me about this. It was quite embarrassing to give the wrong answer.

  • @jiayiwu4101
    @jiayiwu4101 3 years ago

    Am I missing self.distance somewhere? Thank you!

    • @emma_ding
      @emma_ding  3 years ago +1

      Sorry about missing the function. You can refer to ruclips.net/video/uLs-EYUpGAw/видео.html for the implementation.

  • @hasszhao
    @hasszhao 2 years ago

    Hey Emma, I'm stuck on the definition of the data structures for x and y ruclips.net/video/P-mM9396Dn8/видео.html, would you mind explaining this in more depth?