How We're Building AI Search Engines using LLM Embeddings

  • Published: 22 Aug 2024

Comments • 29

  • @engKanani · 2 months ago

    excellent video, much better than tons of other long "bla bla" videos out there, thanks!

  • @mrdatapsycho · 11 months ago

    Short and compact. An excellent video to get an overview of LLM-based search.

  • @bracodescammer · 11 months ago

    Awesome. I learned how to build inverted indices and this here now seems so simple, yet versatile in comparison.

  • @shubhamroy7403 · 11 months ago +5

    Great video man. Can you upload more videos like this explaining a bit more on the code side?

    • @thinknimble · 11 months ago +1

      William here - thanks for watching! And sure thing. I recorded myself building out the backend of this. I'll get that edited and posted soon.

    • @J3R3MI6 · 11 months ago +3

      @thinknimble this is worth a sub 🙏🏽💎

    • @thinknimble · 10 months ago +1

      Finally got the more in-depth video posted!
      ruclips.net/video/OPy4dLHdZng/видео.html

  • @WishyIwish · 11 months ago +1

    Very nice video - cool to get another perspective on RAGs and how to implement them with a very different stack.

  • @jeromeeusebius · 10 months ago

    Thanks for sharing. Great video. This is a useful, self-contained template for a search use case, and I plan to apply it to one of my own. I've watched the video twice now, and the second time around I got a much better understanding. Another interesting part, as you mentioned, is using another LLM call to potentially get an explanation for the output. One question I had, which you addressed towards the end, is how to logically split the document to ensure consistency, i.e., not splitting in the middle of thoughts or ideas. One could even try different schemes, e.g., using another higher-level ML model to evaluate different splitting schemes.
    Thanks once again.
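[Editor's note] The splitting concern raised above can be sketched in a few lines. This is a hypothetical illustration, not the video's actual code: it splits on blank-line paragraph boundaries and packs paragraphs into chunks up to a size limit, so no chunk ever cuts a thought mid-paragraph. The 500-character limit is an arbitrary example value.

```python
def chunk_by_paragraph(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_chars."""
    # Paragraphs are delimited by blank lines; drop empty fragments.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        # Start a new chunk if adding this paragraph would exceed the limit
        # (+2 accounts for the "\n\n" separator re-inserted below).
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

A single oversized paragraph would still become its own oversized chunk here; a real implementation would need a fallback (e.g. sentence-level splitting) for that case.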

  • @RustemShaimagambetov · 10 months ago +1

    To be honest, this is not search through an LLM; the embeddings are not generated by a large language model. The video just uses a sentence transformer (all-MiniLM-L6-v2).

    • @thinknimble · 10 months ago +1

      That's correct. The goal is not to generate an LLM, but to use an existing LLM to search natural language content.
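[Editor's note] Whichever model produces the embeddings (the video uses all-MiniLM-L6-v2), the retrieval step itself is just nearest-neighbor ranking by cosine similarity. The sketch below shows that ranking logic with tiny made-up 3-dimensional vectors so it stands alone; in practice the vectors would come from the embedding model.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec: list[float],
                   doc_vecs: dict[str, list[float]]) -> list[str]:
    # Sort document ids by descending similarity to the query embedding.
    return sorted(doc_vecs,
                  key=lambda d: cosine_similarity(query_vec, doc_vecs[d]),
                  reverse=True)

# Toy example: "a" matches the query exactly, "c" is close, "b" is orthogonal.
docs = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.9, 0.1, 0.0]}
print(rank_documents([1.0, 0.0, 0.0], docs))  # → ['a', 'c', 'b']
```

At scale this brute-force scan is replaced by an approximate-nearest-neighbor index (e.g. pgvector, which the linked repo's stack suggests), but the similarity measure is the same.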

  • @mvasa2582 · 10 months ago

    Lots of potential here: research papers, legal documents, etc. Nice job explaining. Now, how would you productize this so that a customer would simply need to install it on a target system or drive where their files are stored, and it could automatically (and efficiently) consume those files and run inference on any new additions?

    • @thinknimble · 9 months ago

      Great questions. We'll likely explore these in future videos.

  • @jpops8767 · 10 months ago

    Thanks for the vid!! Have you guys heard of the Bittensor / Opentensor foundation?

    • @thinknimble · 9 months ago +1

      We have not. We'll check it out!

    • @jpops8767 · 8 months ago

      @thinknimble You won't be disappointed!

  • @devd4001 · 11 months ago +1

    Great video, but I am unable to find the code in the given github link, could you please add the python script!

    • @thinknimble · 11 months ago +1

      Thank you for your question! The code on GitHub is an entire project with a frontend ('client' folder) and backend ('server' folder). The key Python code demonstrated in this video is a few folders deep in core models.py: github.com/thinknimble/embeddings-search-demo/blob/main/server/vector_demonstration/core/models.py
      I hope this helps!

    • @ShikharDadhich · 11 months ago

      Thanks a lot ☺ @thinknimble

  • @SiD-hq2fo · 11 months ago

    This may be a weird question, but as a beginner thinking about getting into this: do you have any other platform, like Discord, where you share and interact with users like me? :) Thank you

    • @thinknimble · 11 months ago

      Not weird at all! I know many RUclips channels have Discords. We appreciate your interest; we don't currently have a public Discord, but we'll consider setting one up in the future.

  • @andylee8283 · 11 months ago

    FYI!!!!

  • @khatharrmalkavian3306 · 10 months ago

    This honestly seems like the dumbest application of this technology.

  • @wildfotoz · 11 months ago

    Wow, there are so many things wrong with this video it's not funny. First, Excel documents are structured data, not unstructured. If the data was a bunch of resumes in Word or PDF format, then you'd have unstructured data. Second, your CSV files are not CSV. CSV stands for comma-separated values. You showed HTML files. I feel sorry for your clients.

    • @thinknimble · 11 months ago +4

      Hello! We appreciate your comment. William didn't mention Excel documents, but you raise an interesting and profound question about what "structured" vs. "unstructured" data even is. LLMs are fascinating, because they seem to be revealing a hidden structure in even the most unstructured, natural language.
      The files in the video and available in the open-source codebase are definitely CSVs. So, sorry, you are wrong about that! But as you observed, the 'description' column of each CSV file is HTML. Since HTML is plain text, it can be included inside a CSV.
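[Editor's note] The point above is easy to verify with the standard library: HTML is plain text, so it survives a round trip through a quoted CSV field. The 'description' column name comes from this thread; the row content is made up for illustration.

```python
import csv
import io

# Write one row whose 'description' field is an HTML fragment (with a
# comma, which forces the csv module to quote the field).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["title", "description"])
writer.writeheader()
writer.writerow({
    "title": "Backend Engineer",
    "description": "<p>Build APIs, with <b>Django</b></p>",
})

# Read it back: the HTML fragment round-trips intact, comma and all.
buf.seek(0)
rows = list(csv.DictReader(buf))
print(rows[0]["description"])
```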

    • @agentDueDiligence · 10 months ago +2

      LOL - if you think Excel files are structured data, then you have never really worked with Excel files in the real world 😂
      Nothing could be less structured than your average Excel file 😂

  • @timonweb_com · 11 months ago

    Hey, nice idea about chunks! Thanks a lot for the video!