Advanced RAG 03 - Hybrid Search BM25 & Ensembles

  • Published: 25 Jan 2025

Comments • 35

  • @SpenceDuke • 1 year ago +11

    Very thankful for these videos, please continue with the series

  • @kenchang3456 • 1 year ago +3

    Thanks Sam, this title caught my eye as something I could use in my POC for better search results. Really appreciate you sharing.

  • @sup5356 • 1 year ago +3

    Concise, interesting, and useful content as always. A super series. Many thanks!

  • @micbab-vg2mu • 1 year ago +1

    Great, thank you! This hybrid approach is quite interesting.

  • @flipper71100 • 1 year ago +2

    This was awesome and quite informative. I heard about BM25 a couple of days back, and now I know where it fits.
    Also, I would request a video on RAG Fusion if possible.

    • @samwitteveenai • 1 year ago +1

      Yeah, I'm certainly going to do a RAG Fusion video.

  • @muhammadhasnain8177 • 1 year ago

    Thanks for this video, please continue this series.

  • @aimaniaco • 5 months ago

    I’ve come across multiple videos of this channel and I am not even a subscriber. The way I’ve realized it’s always this channel is because the guy says “here” a lot 🤣 great content tho!

  • @arindamdas70 • 1 year ago +1

    @sam thanks for the content. Could you please explain how we can use hybrid search with the document you used for self-querying retrieval?

  • @K-Djoon • 1 year ago

    Thank you so much for sharing. This is so amazing!!

  • @robertsun161 • 3 months ago

    Okay, but I have a doubt. Does it make sense that if I search "Apple Phones" it returns "I love fruit juice"? In which cases would we want this in real projects? I think it adds more noise to the document context.

  • @ChairmanHehe • 1 year ago +2

    How is BM25 able to retrieve documents that do not contain any verbatim words from the query?

    • @SachinChavan13 • 7 months ago

      This is a very important question to understand. There are a lot of videos on YouTube, but many of them just do not explain things in depth. They explain what's working and ignore what's not working. I am facing tons of problems while implementing RAG in actual projects.
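
      BM25 is purely lexical: a query term that never appears in a document contributes zero to that document's score, so BM25 on its own has no signal for documents that share no words with the query; that gap is exactly what the dense retriever in the ensemble covers. A minimal sketch (not from the video) using the rank_bm25 package, which LangChain's BM25Retriever is built on, with a made-up three-document corpus:

      # Minimal sketch: show that BM25 gives zero score to documents that share
      # no terms with the query.
      from rank_bm25 import BM25Okapi

      corpus = [
          "hybrid search combines keyword and vector retrieval",
          "BM25 is a classic lexical ranking function",
          "I love fruit juice",
      ]
      bm25 = BM25Okapi([doc.lower().split() for doc in corpus])

      # "apple phones" appears in none of the documents, so every score is 0.0;
      # only the dense (embedding) retriever can bridge that vocabulary gap.
      print(bm25.get_scores("apple phones".split()))  # -> [0. 0. 0.]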

  • @mehmetbakideniz • 3 months ago

    I presume we can use the ensemble retriever's retrievals in a reranker, right?
    Would the chaining be the same? And would you recommend doing it?
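
    One way to do this (a sketch, not something covered in the video) is to wrap the ensemble retriever in LangChain's ContextualCompressionRetriever with a reranker as the compressor. CohereRerank is used here only as an example reranker (it needs a Cohere API key), and the ensemble_retriever object is assumed to have been built already:

    # Sketch only; import paths can differ between LangChain versions.
    from langchain.retrievers import ContextualCompressionRetriever
    from langchain.retrievers.document_compressors import CohereRerank

    reranker = CohereRerank(top_n=3)  # assumes COHERE_API_KEY is set in the environment
    reranking_retriever = ContextualCompressionRetriever(
        base_compressor=reranker,           # reranks whatever the base retriever returns
        base_retriever=ensemble_retriever,  # the hybrid BM25 + dense ensemble
    )

    docs = reranking_retriever.get_relevant_documents("your query here")

    The chaining stays the same: wherever the ensemble retriever was plugged into the chain, the wrapped retriever goes instead, since it is itself a retriever.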

  • @toddnedd2138 • 1 year ago

    While evaluating vector DBs for production, I came across Weaviate, which supports hybrid search out of the box, including weighting of the search results. Whether you go with LangChain or a built-in DB solution probably depends on your use case.

    • @sup5356 • 1 year ago

      Yes! Same here, it's excellent.

    • @morespinach9832 • 11 months ago

      How is Weaviate different from Pinecone or the functionality inside Neo4J?

    • @KeithBourne • 8 months ago

      Weaviate lets you pick the ranking algorithm you want to use, which is a step up from using LangChain directly. But not everyone uses Weaviate, for various important reasons, and there are only two algorithms currently available. I imagine that is something LangChain will eventually add directly, and Weaviate will continue to build out. The Reciprocal Rank Fusion algorithm that LangChain uses is probably good enough for most use cases, so you can probably live without Weaviate if you are already committed to something else. But it is definitely worth considering Weaviate if hybrid is important to you, plus for a whole bunch of other reasons.
      As far as this demo goes, it shows the default way to do hybrid search with LangChain, so it's very useful for anyone looking into this approach. Then you just build your knowledge from there! For example, you may want to write a function that adds a third retriever to the rankings, weighted in a way that benefits your specific use case. Start with this demo, and then replace the ensemble retriever with your own function within the LangChain chain.
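
      A minimal sketch of that custom-weighting idea, assuming LangChain's EnsembleRetriever and three retrievers that have already been built (bm25_retriever, dense_retriever, and a hypothetical third_retriever); the weights are arbitrary example values:

      # Sketch only: extend the two-retriever hybrid with a third retriever and
      # per-retriever weights, which feed into the Reciprocal Rank Fusion step.
      from langchain.retrievers import EnsembleRetriever

      ensemble_retriever = EnsembleRetriever(
          retrievers=[bm25_retriever, dense_retriever, third_retriever],
          weights=[0.4, 0.4, 0.2],
      )
      docs = ensemble_retriever.get_relevant_documents("your query here")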

  • @carlosperezcpe • 1 year ago

    Hey man, don't forget to use night mode. If watching on a big screen it's hard. Thanks for the video 👊

  • @ninonazgaidze1360 • 1 year ago

    Which model would you use in production for search over documents? Thank you

  • @tinyentropy • 1 year ago

    How do I use this with more than a single keyword?

  • @henkhbit5748 • 1 year ago

    Interesting add-on feature from LangChain. Can you search with BM25Retriever in source documents (PDF etc.)? Does it search directly in the source document or in the embedding? I suppose ensembling both searches will affect performance when querying a lot of documents/embeddings... Thanks for the update! 👍

    • @samwitteveenai • 1 year ago

      The ensemble allows you to do both. Yes, you can use BM25 alone if you just want that.
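
      To make that concrete, a minimal sketch along the lines of the video's setup (exact import paths and the embedding model vary by LangChain version and are assumptions here): BM25Retriever indexes the raw chunk text directly, with no embeddings involved, the FAISS retriever searches the embeddings, and the ensemble merges the two ranked lists.

      # Sketch only; newer LangChain versions import these from langchain_community
      # and langchain_openai instead.
      from langchain.retrievers import BM25Retriever, EnsembleRetriever
      from langchain.vectorstores import FAISS
      from langchain.embeddings import OpenAIEmbeddings

      texts = ["chunk one ...", "chunk two ...", "chunk three ..."]  # your document chunks

      bm25_retriever = BM25Retriever.from_texts(texts)  # keyword search over the raw text
      bm25_retriever.k = 2

      faiss_retriever = FAISS.from_texts(texts, OpenAIEmbeddings()).as_retriever(
          search_kwargs={"k": 2}
      )  # dense search over the embeddings

      ensemble_retriever = EnsembleRetriever(
          retrievers=[bm25_retriever, faiss_retriever], weights=[0.5, 0.5]
      )
      docs = ensemble_retriever.get_relevant_documents("apple phones")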

  • @cjp3288 • 1 year ago

    Hey Sam! These videos are great, thank you for taking the time to make them. In your opinion, what is the best managed, large-scale RAG solution provider? I'm helping a company that has around 2000+ documents.

  • @aifarmerokay • 1 year ago

    Thanks a lot. Please add a video on agents with a vector DB and RAG.

  • @hqcart1 • 1 year ago

    How do you do this if you have millions of text rows and you want your search to be sub-100 ms?

  • @dontknowmyname5973 • 1 year ago

    Very useful, thank you! Is it possible to update it with an OpenSearch vector store instead of FAISS?

    • @KeithBourne • 8 months ago

      That's what makes LangChain great: you just swap out vector stores/retrievers. For example, for ChromaDB:

      # Chroma drop-in for the dense retriever; the documents, embedding_function,
      # collection_name and chroma_client variables are assumed to be defined elsewhere.
      # Import path may vary by version (e.g. langchain_community.vectorstores).
      from langchain.vectorstores import Chroma

      vectorstore = Chroma.from_documents(
          documents=documents,
          embedding=embedding_function,
          collection_name=collection_name,
          client=chroma_client,
      )
      dense_retriever = vectorstore.as_retriever()

  • @perpetuallearner8257 • 1 year ago

    Hey Sam, can you please make a video on RAG using knowledge graphs? Thanks

    • @samwitteveenai • 1 year ago

      Yeah, I really want to do this. One challenge is finding a way for people to easily run Neo4j. I have been looking at alternatives; if you have any suggestions, please let me know.

  • @pavanpraneeth4659 • 1 year ago

    Awesome

  • @stephenthumb2912 • 1 year ago

    This is a nice idea, but the actual implementation is way different from the example. There's no way you can really implement this like the example, which is disappointing. Keyword-style searches are, IMO, very poorly supported in LangChain. There are tons of gotchas, e.g. in the Elasticsearch classes, which seem to me to be one of the only realistic ways to implement this.