RAG explained step-by-step up to GROKKED RAG systems

  • Published: 9 Jun 2024
  • Today I try to answer all the questions from my subscribers about my last three videos, with a focus on the new Grokked LLM integration into traditional RAG systems.
    I'll cover a wide array of questions, including the ARM graph-based re-ranker for optimal RAG systems and the new "Buffer of Thoughts" (BoT) reasoning method for LLMs (so we have Chain-of-Thoughts, Tree-of-Thoughts, Graph-of-Thoughts, and now Buffer-of-Thoughts, to kind of force the LLM to solve a causal reasoning task, really?).
    With the final answer on GROKKED RAG systems. Smile.
    So in this video there is only a little bit on new AI research, but a lot of explanations for optimizing currently operating RAG systems that depend on a vector store, and how to improve their overall performance with grokked transformers (Grokked LLMs).
    #airesearch
    #grokkedLLM
    #grokkedRAG
  • Science

Comments • 33

  • @MBR7833
    @MBR7833 22 days ago +8

    Never commented on a RUclips video in 15 years, but I need to now. Thank you so much for your work and for questioning the status quo.

    • @code4AI
      @code4AI  21 days ago +2

      You are so welcome!

    • @wibulord926
      @wibulord926 17 days ago

      Well, actually, your channel was created just 10 years ago.

  • @gileneusz
    @gileneusz 22 days ago +6

    44:59 I came to the conclusion that maybe I should take 3 months off from AI and just go on a long holiday; I'll come back when it's all sorted out lol

  • @mulderbm
    @mulderbm 23 days ago +1

    Love the undertone, not sure everyone gets it 😅 Interesting take on the VCs trying to force the scientists to make it work so they can get their ROI out of failed tech.

  • @manslaughterinc.9135
    @manslaughterinc.9135 9 days ago

    The examples that you give are RAG specifically for causal reasoning. The gains are much higher when RAG is used for domain-specific knowledge.

    • @code4AI
      @code4AI  9 days ago

      Causal reasoning works on domain-specific knowledge.

  • @gileneusz
    @gileneusz 22 days ago +2

    38:10 1,000 dimensions and 10% efficiency. We need more dimensions, like 10,000 or 100,000,000,000, to get 15% efficiency 😵‍💫

  • @LamontCranston-qh2rv
    @LamontCranston-qh2rv 22 days ago +1

    A commenter on an earlier video suggested using quaternions instead of vectors. I wonder if that approach might actually save the VCs and startups from total ruin? Maybe worth a try? I do love that we are full circle back to building models though in the meantime! Outstanding work professor, as always! Thank you so much for creating and sharing these videos!

    • @BjornHeijligers
      @BjornHeijligers 15 days ago

      Nice idea. Quaternions solve the problem of mathematical singularities when calculating ANGLES between vectors. Since we are not working with the actual angles, but with inner products or similarity scores, quaternions, as far as I can see, are not a natural fit for working with high-dimensional semantic vectors.
      Would love to be proven wrong, though.
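The point about inner products can be made concrete: similarity search in a vector store computes normalized dot products between embeddings directly and never materializes an angle, so the angular singularities quaternions address don't come up. A minimal numpy illustration (an editorial sketch, not from the video):

```python
import numpy as np

def cosine_similarity(a, b):
    # Similarity is the normalized inner product; the angle itself (and
    # the arccos where numerical singularities would live) is never computed.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```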

  • @agsystems8220
    @agsystems8220 23 days ago +2

    The use of an 'apply logical reasoning' step should not be thought of as a weakness of the approach, because reasoning is inherently recursive. Any setup that doesn't have something like this is fundamentally bounded, no matter how well built or trained. That can be extremely powerful in an idiot-savant way, but can never really be called intelligent. There is no finite machine that could answer any question in one step.
    I am completely with you that we are not doing it quite right, but I don't agree with you that parametric memory is the way forward (you didn't bring it up here, but you did in the last one). It has a fixed size, it doesn't scale efficiently, and I don't think you can really call a system that needs to train heavily on a reasoning technique before using it a reasoning engine. For that moniker it needs to be able to have a logical technique explained to it and immediately apply it to a problem without retraining. It needs to be able to reason about reasoning, and do it in a one-shot setting.
    The reason it needs to be able to one-shot reasoning is that complex reasoning is not bounded in complexity, so it needs some way of unbounding its memory and run time. The obvious way of doing this is to let it fill in more tokens. At that point you might have some tokens saying something like "we should try induction; induction is done by ...", and the model needs to be able to follow that recipe. That will probably involve searching for further decompositions and relevant facts, often hitting dead ends, though training can improve how this search occurs. Importantly, you need to be able to mix information from global memory with local, context-specific information, so it absolutely makes sense for them to be in the same 'language'.
    Maintaining knowledge as a separate database is the only way to build something that really scales, and having this database also hold all but a minimal set of foundational logical tools seems sensible. The specifics of how you do this are hard, though.

  • @_Han-xk1zv
    @_Han-xk1zv 20 days ago

    Are you familiar with the reasons for conducting the re-ranking step? Specifically, given the premise of extracting relevant document candidates using only DPR, I'm curious about your perspective on why we'd need to re-rank with a cross-encoder, in addition to extracting relevant document candidates by computing cosine similarity with a bi-encoder.
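The two-stage pipeline this question describes can be sketched roughly as follows. Both scorers here are toy stand-ins (a real system would use trained bi- and cross-encoder models, e.g. from sentence-transformers); the point is only the pipeline shape: a cheap per-document dot product to shortlist candidates, then a more expensive joint scorer that sees query and document together.

```python
import numpy as np

def bi_encoder_retrieve(query_vec, doc_vecs, k=2):
    # Stage 1: cosine similarity between independently precomputed
    # embeddings -- one dot product per document, so it scales to
    # millions of docs, but query and document never "see" each other.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    return list(np.argsort(-scores)[:k])

def cross_encoder_rerank(query, docs, candidate_ids):
    # Stage 2: score each (query, doc) PAIR jointly. Stand-in scorer:
    # token overlap; a real cross-encoder runs a transformer over the
    # concatenated pair and can model fine-grained token interaction.
    def score(doc):
        return len(set(query.split()) & set(doc.split()))
    return sorted(candidate_ids, key=lambda i: -score(docs[i]))
```

The bi-encoder shortlist keeps the expensive joint scoring to a handful of candidates; the cross-encoder then corrects cases where raw embedding similarity is misleading.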

  • @iham1313
      @iham1313 21 days ago

    In regards to your aversion to RAG (which I can relate to): how does one build an AI system that is able to cite back from texts (or video/audio), if not by using embeddings, metadata & RAG?
    Let's say you want to build a domain-specific AI tool. It should gather information from different sources, and when asked about something from within that domain, an answer should be provided including the reference (page of a document, timestamp from audio/video, text block from a webpage).
    I struggle to see when to use which strategy.

    • @mattager5548
      @mattager5548 20 days ago +1

      I don’t think the end game of grokking is to get really good at giving users existing data; the hope is to be able to reason about novel things that humans might be unable to, or just haven’t gotten around to yet. RAG seems like the best solution for our current generation of AI, which isn’t that reliable.

  • @ngbrother
    @ngbrother 22 days ago

    With a long enough pre-prompt and context window, can you trigger a grokked-transformer phase transition through in-context learning only?

    • @adinsoftic
      @adinsoftic 22 days ago

      For in-context learning there is no actual learning and no updating of model parameters. So no grokking there.

  • @fire17102
    @fire17102 23 days ago

    So you have to train a transformer for this? Can we fine-tune a base model on our data? Is this what X's Grok is doing?

    • @code4AI
      @code4AI  23 days ago +1

      Unfortunately I have no insights into Musk.

  • @En1Gm4A
    @En1Gm4A 23 days ago

    Just a basic question: isn't reasoning also able to be done via search in a semantic graph, as shown in that one paper? They were able to visually show the trace needed to solve the task. Why does one need a grokked transformer? Shortest-path search between semantic concepts, or so on?

    • @En1Gm4A
      @En1Gm4A 23 days ago +1

      q* ?

    • @code4AI
      @code4AI  23 days ago

      Sure, we have all the graph-based message passing in the world. Like this video I did 2 years ago: ruclips.net/video/i_Tm3ZQScv8/видео.html&t
      But here we are talking about a different technology ... take a minute and think about it.

    • @En1Gm4A
      @En1Gm4A 23 days ago +1

      @@code4AI you mean a more expensive way to do the same thing?

    • @jaredgreen2363
      @jaredgreen2363 23 days ago

      Yes, but it’s inherently slow. If the graph of inferred facts branches at all, it will require time at least exponential in the length of the resulting line of reasoning. Providing heuristics to pick the most promising path to extend reduces that significantly. That is what a language model can be used for.
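The trade-off described here can be sketched as best-first search over a toy graph of inferred facts, where a pluggable scoring function (in practice, a language model's estimate of how promising a fact is) decides which frontier node to extend next. The graph and heuristic below are made up purely for illustration.

```python
import heapq

def best_first_search(graph, start, goal, heuristic):
    # Frontier is a priority queue of (score, fact, path); lower score =
    # more promising. Blind expansion of a branching fact graph is
    # exponential in reasoning depth; the heuristic prunes that search.
    frontier = [(heuristic(start), start, [start])]
    visited = set()
    while frontier:
        _, fact, path = heapq.heappop(frontier)
        if fact == goal:
            return path
        if fact in visited:
            continue
        visited.add(fact)
        for nxt in graph.get(fact, []):
            if nxt not in visited:
                heapq.heappush(frontier, (heuristic(nxt), nxt, path + [nxt]))
    return None
```

With a perfect heuristic the search walks straight to the goal; with a useless one it degrades back toward exhaustive expansion, which is the exponential blow-up the comment describes.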

  • @gileneusz
    @gileneusz 22 days ago

    20:11 This is a framework for large companies or suicidal startups

  • @fire17102
    @fire17102 23 days ago

    Is there a grokked GPT-4-level model on Ollama?

    • @code4AI
      @code4AI  23 days ago

      Smile.

    • @fire17102
      @fire17102 23 days ago

      @@code4AI how about a GPT-3.5-level model? Is this all purely hypothetical? Or is everyone just in stealth mode?

    • @code4AI
      @code4AI  23 days ago

      Stealth mode???? Meta is publishing about it. Google is publishing about it. Microsoft is publishing about it. OpenAI is publishing about it ......

    • @fire17102
      @fire17102 13 days ago

      @@code4AI maybe I misunderstood... Basically I'm asking if there's a grokked model to play with, or not yet.. thanks 🙏

  • @artur50
    @artur50 23 days ago

    Any GitHub project on that?

    • @code4AI
      @code4AI  22 days ago

      About 1000 on RAG ... for Grokked LLM you have to go with the research published by OpenAI, Microsoft, Meta and Google, just to name a few ... I haven't heard back from Apple yet.

  • @michaelmcwhirter
    @michaelmcwhirter 22 days ago

    Are you monetized on RUclips yet? 😃