Chunking Strategies for RAG Applications!

  • Published: Oct 14, 2024
  • LLMs are prone to hallucination, and there are several strategies for mitigating it. One such strategy is Retrieval Augmented Generation (RAG), where the LLM is given a knowledge base to retrieve information from. Because answers are grounded in that specified knowledge base, the LLM is much less likely to hallucinate.
    RAG involves a step-by-step process: loading the documents/data, splitting the documents into chunks using a framework such as LangChain or LlamaIndex, generating vector embeddings for the chunks, and storing these embeddings in a vector database.
    So, broadly, we can divide RAG into two main parts: storing and retrieval.
    When improving a RAG pipeline, one thing to look at is the retrieval strategy and the techniques involved. A suitable chunking strategy can improve retrieval, but finding the right chunk size for a given text is a hard problem in general.
    Today, we will see how semantic chunking works.
    Semantic chunking considers the relationships within the text and divides it into meaningful, semantically complete chunks. This helps preserve the information's integrity during retrieval, leading to more accurate and contextually appropriate results.
    Let's experiment with semantic chunking and see the results.
    Here is the complete Notebook code: github.com/pav...
    You need a free SingleStore account to get started with the tutorial.
    Sign up for free to SingleStore here: bit.ly/3Y2I4cV
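
The semantic chunking idea described above can be illustrated with a minimal sketch. This is not the code from the notebook: it assumes a simple approach where consecutive sentences stay in the same chunk as long as their embeddings are similar enough. The `toy_embed` function is a hypothetical bag-of-words stand-in for a real embedding model, and the `threshold` value is an arbitrary choice for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def toy_embed(sentence):
    """Stand-in embedding (assumption): a bag-of-words count vector.

    A real pipeline would call an embedding model here instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_chunks(text, threshold=0.2, embed=toy_embed):
    """Group consecutive sentences; start a new chunk when similarity drops."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    prev_vec = embed(sentences[0])
    for sentence in sentences[1:]:
        vec = embed(sentence)
        if cosine_similarity(prev_vec, vec) < threshold:
            chunks.append([sentence])    # topic shift: open a new chunk
        else:
            chunks[-1].append(sentence)  # same topic: extend the current chunk
        prev_vec = vec
    return [" ".join(chunk) for chunk in chunks]


if __name__ == "__main__":
    sample = (
        "Cats are small pets. Cats sleep all day. "
        "Vector databases store embeddings. "
        "Vector databases search embeddings quickly."
    )
    for chunk in semantic_chunks(sample):
        print(chunk)
```

With the toy embedding, the two sentences about cats end up in one chunk and the two about vector databases in another. Swapping `toy_embed` for a real embedding model (and tuning `threshold`) is what would make the split reflect meaning rather than shared words.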

Comments • 4

  • @digitalcoder5772 2 months ago +1

    Great video, Pavan. Your content on LinkedIn is really commendable. Thanks for sharing.

    • @pavanbelagatti 2 months ago

      Thanks a lot for the support here also. Really appreciate it:)

  • @chaithanyavamshi2898 2 months ago +1

    Excellent Video Pavan. Please share the slide deck

    • @pavanbelagatti 2 months ago +1

      Thanks. I did not prepare a slide deck for this one. The notebook link is in the description so you can try.