Vector Database Search - Hierarchical Navigable Small Worlds (HNSW) Explained

  • Published: 16 Dec 2024

Comments • 29

  • @datamlistic
    @datamlistic  6 months ago +1

    Text tokenization is one of the most overlooked topics in LLMs, although it plays a key role in how they work. Take a look at the following video to see how the most popular tokenization methods work: ruclips.net/video/hL4ZnAWSyuU/видео.html

  • @poppop101010
    @poppop101010 6 months ago +3

    Best explanation on YouTube atm

  • @Andies450
    @Andies450 1 month ago

    You have explained a complex topic in very simple terms. Keep up the good work.

  • @ZoinkDoink
    @ZoinkDoink 5 months ago

    great explanation, massively underrated video

    • @datamlistic
      @datamlistic  5 months ago

      Thanks! Glad you liked the explanation! :)

  • @parth191079
    @parth191079 1 month ago

    Very good explanation with a good use of animations!

    • @datamlistic
      @datamlistic  28 days ago

      Thanks! Glad you think so! :D

  • @himanikumar7979
    @himanikumar7979 5 months ago

    Perfect explanation, exactly what I was looking for!

    • @datamlistic
      @datamlistic  5 months ago

      Thanks! Glad you found it helpful! :)

  • @AkshayKadamIN
    @AkshayKadamIN 3 months ago +1

    Good explanation provided. Thank you very much for this.

    • @datamlistic
      @datamlistic  2 months ago

      Thanks! Glad it was helpful! :)

  • @Lukas-il6xg
    @Lukas-il6xg 2 months ago +1

    At 6:37 there is a mistake: the "closest to query" point was already there in Level 2 but was not selected. Do you see what I mean?

  • @emanuelgerber
    @emanuelgerber 3 months ago

    Very nicely explained! Thank you for making this video

  • @Omarsayan
    @Omarsayan 1 month ago +1

    When initially building the small world, we need to iteratively look for the k nearest neighbors while inserting the new documents. How do we find those neighbors?
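
(For anyone wondering the same thing: in HNSW the neighbors of a newly inserted point are typically found by running the same greedy search over the graph built so far. Below is a minimal illustrative sketch that brute-forces the neighbor lookup for clarity; the `insert` helper and all names are hypothetical, not the video's code.)

```python
import numpy as np

def insert(new_id, new_vec, graph, vectors, k=5):
    """Link a new vector to its k nearest neighbors among the nodes
    already in the graph. A real HNSW build finds these neighbors with
    the same greedy graph search used at query time; this sketch
    brute-forces the distances so the idea stays visible."""
    if graph:
        dists = {n: np.linalg.norm(vectors[n] - new_vec) for n in graph}
        neighbors = sorted(dists, key=dists.get)[:k]
    else:
        neighbors = []  # the first inserted point has no neighbors yet
    graph[new_id] = set(neighbors)
    for n in neighbors:
        graph[n].add(new_id)  # keep edges bidirectional
    vectors[new_id] = new_vec
```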

  • @andrefu4166
    @andrefu4166 6 months ago

    great explanation

    • @datamlistic
      @datamlistic  6 months ago

      Thanks! Happy to hear that you liked the explanation! :)

  • @maheswaranparameswaran8532
    @maheswaranparameswaran8532 3 months ago

    Hey, I have a question: isn't there a risk of getting stuck in a local optimum when comparing similarity between the query node and the DB nodes in the graphs?

    • @datamlistic
      @datamlistic  3 months ago +1

      Good question! Of course there's always a chance of getting stuck in a local optimum, because you're basically using a greedy algorithm here. That's why you usually run the search multiple times, to reduce the chance of that happening.
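
(A minimal sketch of that restart idea, assuming the graph is a plain adjacency dict mapping node ids to neighbor sets and `vectors` maps ids to NumPy embeddings; names are illustrative, not from the video.)

```python
import random
import numpy as np

def greedy_search(query, graph, vectors, start):
    """One greedy walk: hop to any neighbor closer to the query;
    stop when no neighbor improves the distance."""
    current = start
    improved = True
    while improved:
        improved = False
        for n in graph[current]:
            if (np.linalg.norm(vectors[n] - query)
                    < np.linalg.norm(vectors[current] - query)):
                current, improved = n, True
    return current

def search_with_restarts(query, graph, vectors, n_restarts=3):
    """Repeat the greedy walk from random entry points and keep the
    best result, which lowers the odds that a single walk ends in a
    local optimum."""
    best, best_d = None, float("inf")
    for start in random.sample(list(graph), min(n_restarts, len(graph))):
        c = greedy_search(query, graph, vectors, start)
        d = np.linalg.norm(vectors[c] - query)
        if d < best_d:
            best, best_d = c, d
    return best
```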

  • @lynnwilliam
    @lynnwilliam 6 months ago

    How did you go from a group of random vectors to a skip-list-like structure?

    • @datamlistic
      @datamlistic  6 months ago +2

      The nodes between levels represent the same vectors. Basically, on the top level you have a sparse graph of vectors, and on the lowest level you have the entire graph. Similar to a skip list, you move to another node on the same level if it's closer to the query, or move down if no such node exists. This lets you cover longer distances, since you start at a higher level. Hope this makes sense, and please let me know if you need further clarification! :)
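
(A sketch of that layered descent, under the assumption that every level is an adjacency dict over the same node ids, with `layers[0]` the sparse top level and `layers[-1]` the full bottom graph; `hnsw_search` and all names here are illustrative, not the video's code.)

```python
import numpy as np

def hnsw_search(query, layers, vectors, entry_point):
    """Greedy descent through the levels: on each level, hop to any
    neighbor that is closer to the query; when no neighbor improves,
    drop down one level and continue from the same node."""
    def dist(node):
        return np.linalg.norm(vectors[node] - query)

    current = entry_point
    for graph in layers:
        improved = True
        while improved:
            improved = False
            for n in graph.get(current, []):
                if dist(n) < dist(current):
                    current, improved = n, True
    return current  # approximate nearest neighbor on the bottom level
```

Starting on the sparse top level is what lets the early hops cover long distances before the dense bottom level refines the answer.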

  • @alicetang8009
    @alicetang8009 5 months ago

    If K equals the total number of documents, will this approach also be like brute force? Because it needs to go through each linked document.

    • @datamlistic
      @datamlistic  5 months ago +1

      If k equals the number of documents, why not simply return all documents? :)

  • @saisaigraph1631
    @saisaigraph1631 27 days ago

    Could you please come up with an NLP basics course? Also a basics-of-ML course, please.

    • @datamlistic
      @datamlistic  21 days ago

      First of all, thanks for becoming a member of this channel! ❤️ I've thought about making an introductory course for either NLP or ML (although the latter is a bit saturated). Stay tuned for updates!