What is Retrieval Augmented Generation (RAG) and JinaAI?

  • Published: 31 Dec 2023
  • Retrieval Augmented Generation (RAG) is one of the big AI patterns you must know for 2024. In this tutorial I break down the RAG pattern, what the Jina AI embeddings model is, and why Jina AI is a game changer for LLMs such as GPT, Llama and Mistral.
    In the video, Chris breaks down the issues with LLMs such as GPT, Mixtral 7B and Llama-2, and how the RAG pattern helps solve the problems of hallucinations and extending data.
    Chris also shows you in detail how the RAG pattern works under the hood, so you can truly understand what's going on.
    He also talks about how Jina AI is different, how it works, how it compares to OpenAI's ada embeddings model, and how Jina AI will kick off the next model trend for 2024.
  • Science
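The RAG pattern the video describes can be sketched end-to-end in a few lines. This is a minimal illustration only: toy bag-of-words vectors stand in for a real embedding model such as Jina's, and the chunks, question, and helper names are all made up for the example.

```python
# Minimal sketch of the RAG pattern: index, retrieve, augment.
# The embed() function is a toy stand-in for a real embedding model.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for an embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. Index: chunk the knowledge base and embed each chunk.
chunks = [
    "Jina AI released an open 8k-context embeddings model",
    "Mistral 7B is a 7-billion-parameter open LLM",
    "RAG retrieves relevant chunks and adds them to the prompt",
]
index = [(c, embed(c)) for c in chunks]

# 2. Retrieve: embed the question and rank chunks by similarity.
question = "what does RAG add to the prompt?"
q_vec = embed(question)
best = max(index, key=lambda item: cosine(q_vec, item[1]))

# 3. Augment: prepend the retrieved chunk as grounding context.
prompt = f"Context: {best[0]}\n\nQuestion: {question}"
print(prompt)
```

In a real system step 2 is usually a vector-database lookup over millions of chunks rather than a linear scan, but the shape of the pipeline is the same.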

Comments • 21

  • @JAnders-oy2sv
    @JAnders-oy2sv 2 months ago

    Another clear and informative video, thank you! I agree, I think RAG will be huge in 2024. One thing I would like to know: is it possible to have the LLM list or identify the chunk or chunks used to produce a response? Perhaps metadata or indexes can be added to the chunks, which the LLM can use when generating a response.
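The idea in this comment is a common approach in practice: give each chunk an ID inside the prompt and instruct the model to cite the IDs it used. The sketch below is illustrative only (the IDs, chunk text, and instruction wording are made up, not something shown in the video).

```python
# Sketch: tag each retrieved chunk with an ID so the LLM can cite its sources.
# All names and content here are hypothetical examples.
chunks = {
    "doc1#p3": "Jina AI's embeddings model supports an 8k context window.",
    "doc2#p1": "RAG reduces hallucinations by grounding answers in sources.",
}

# Build the context section with an inline ID per chunk.
context = "\n".join(f"[{cid}] {text}" for cid, text in chunks.items())

prompt = (
    "Answer using ONLY the sources below, and cite the source IDs "
    "in square brackets after each claim.\n\n"
    f"{context}\n\n"
    "Question: How does RAG reduce hallucinations?"
)
print(prompt)
```

A well-behaved model will then emit answers like "RAG grounds answers in retrieved sources [doc2#p1]", which gives you the traceability the commenter is asking about.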

  • @mohamedghazal3630
    @mohamedghazal3630 4 months ago +2

    Thank you! Very simple, precise, yet very informative!

    • @chrishayuk
      @chrishayuk  4 months ago

      Glad it was helpful!

  • @jonb9806
    @jonb9806 4 months ago +1

    Great video Chris. Even I could understand your explanation!

    • @chrishayuk
      @chrishayuk  4 months ago

      Thank you, it was a really difficult one to find the right angle for. Glad it was useful.

  • @willarnold3121
    @willarnold3121 5 months ago +2

    Great video Chris!

  • @spheroid77
    @spheroid77 4 months ago +1

    Chris, great video as always. I learn so much from your channel, thanks. One thing I at least didn't quite "get" from this - where you talk about vectorization and embeddings - what actually *is* that process? The general concept I understand - turn the chunk into a numerical vector and compare them for similarity - but the vectorization itself - what is JinaAI doing at that point and how does it overcome e.g. the challenges of mismatching vocab between the question and the knowledge chunk without being externally trained itself on a bunch of stuff? Or maybe the embeddings are based on some other training from elsewhere? Was just a bit hazy on that point... maybe a thought for a future video if you're inclined :)
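On the vocabulary-mismatch point raised here: embedding models like Jina's are indeed pretrained on large text corpora, so words with different surface forms but similar meanings land near each other in vector space. The tiny example below illustrates the idea with made-up 3-dimensional vectors (a real model outputs hundreds of dimensions, and the values here are invented for the sketch).

```python
# Illustration: a pretrained embedding model maps "car" and "automobile"
# close together even though they share no characters.
# These 3-d vectors are fabricated for the example.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

vec = {
    "car":        [0.90, 0.10, 0.20],
    "automobile": [0.85, 0.15, 0.25],
    "banana":     [0.10, 0.90, 0.30],
}

# Synonyms score far higher than unrelated words.
assert cosine(vec["car"], vec["automobile"]) > cosine(vec["car"], vec["banana"])
```

So the answer to the question is the second guess in the comment: the embeddings carry knowledge from the model's own pretraining, which is what lets retrieval bridge a vocabulary gap between question and chunk.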

    • @chrishayuk
      @chrishayuk  4 months ago +1

      You're right... that's probably a really good video to do, as it's quite complex and quite interesting. My latest video on tiktoken explains vectorization for decoder models, but that's a little simplistic compared to embedding models. Will do a video on embeddings.

    • @spheroid77
      @spheroid77 4 months ago

      Fantastic, thanks! @@chrishayuk

  • @Kopp141
    @Kopp141 5 months ago +1

    Thank you for a fantastic breakdown of RAG. I can now see why my Copilot trial at work is so bad at information retrieval. I'm guessing that as the queries get more complex and the spread of the data becomes wider, this method will become less effective. Does that push us toward a rolling fine-tune approach to a base model?

    • @chrishayuk
      @chrishayuk  4 months ago

      I think the hybrid model, where you have a circle of RAG and fine-tune, is likely the way forward.

  • @harrykekgmail
    @harrykekgmail 3 months ago +2

    Quality explanations!

    • @chrishayuk
      @chrishayuk  3 months ago

      Thank you, glad it was useful

  • @juliussakalys9600
    @juliussakalys9600 5 months ago +1

    Hi Chris, I appreciate the high quality content! Could you do a video or just give a simple reply on where are you taking your expertise from? Maybe some communities, projects or anything of the sort. Personally (and I am sure that other people as well), I would like to become proficient in basically the same things that you are an expert at when it comes to engineering solutions for AI related problems :)

    • @chrishayuk
      @chrishayuk  5 months ago +2

      That is a pretty difficult one to answer; in all honesty I just let myself play. I've been using the RAG pattern a lot at work. However, I felt the explanations of RAG were either 1 level too high, or 2 levels too deep. That's usually when I'll do a video like this one, just to go under the covers and show what's really happening.

    • @juliussakalys9600
      @juliussakalys9600 5 months ago

      Thanks, I guess the takeaway is to try and play more with such things myself.

    • @chrishayuk
      @chrishayuk  5 months ago +1

      @@juliussakalys9600 It's tough to know where to look and start; I'll try and put some sort of guide together.

  • @path1024
    @path1024 5 months ago

    If you want facts, you need to pump up the determinism by lowering the temperature.

    • @chrishayuk
      @chrishayuk  5 months ago

      Only works if the data is in the training set, and it also doesn't solve the traceability issue. Finally, the models shouldn't be making up answers for Q&A-type questions; this is where models will get better, by routing questions to the correct expert with MoE.

    • @path1024
      @path1024 5 months ago

      @@chrishayuk Well, it just makes it stick to the highest confidence answer. You were saying it kept putting out a different answer. That's usually temperature and its effect on topP and topK. And the higher the temperature, the more it hallucinates. It sounds like you're just saying it should know when to lower the temperature on its own for the type of question.
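The mechanism this thread is debating can be shown concretely: during sampling, logits are divided by the temperature before the softmax, so a low temperature concentrates probability on the top token (more deterministic) while a high temperature flattens the distribution (more varied, more hallucination-prone). A minimal sketch with made-up logit values:

```python
# Sketch of temperature scaling in sampling: logits are divided by T
# before softmax. The logit values here are invented for the example.
import math

def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)                         # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                    # raw model scores for three tokens

cold = softmax_with_temperature(logits, 0.1)  # near-deterministic: top token dominates
hot = softmax_with_temperature(logits, 2.0)   # much flatter: more sampling variety

print(cold[0], hot[0])
```

This also illustrates why temperature alone doesn't fix hallucination: a low temperature only makes the model stick to its highest-confidence answer, which helps nothing if that answer was never grounded in the data, which is the gap RAG addresses.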