Don't Use MemGPT!! This is way better (and easier)! Use Sparse Priming Representations!

  • Published: 2 Dec 2024

Comments • 212

  • @avi7278
    @avi7278 1 year ago +159

    MemGPT is a framework that automates the management and retrieval of information from contexts during a natural language chat session. It does not seem that SPRs as a concept or their manual implementation have enough overlap with MemGPT to say that 'SPRs are better and can replace MemGPT'. Rather, MemGPT could use SPRs as a component. To automate SPRs in a natural language chat session, one would need something like MemGPT (but probably much simpler) to create and index KB articles for a basic or simple RAG implementation. Although this is much less hype-worthy than "LLMs as an OS".
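The "automate SPRs with something much simpler, indexing KB articles for a basic RAG implementation" idea above can be sketched in a few lines of Python. Everything here is hypothetical: `llm` stands in for any chat-completion call, and the index uses a crude bag-of-words cosine instead of a real embedding model.

```python
import math
from collections import Counter

# Hypothetical SPR-writer instruction, in the spirit of the video's prompts.
SPR_PROMPT = (
    "You are a Sparse Priming Representation (SPR) writer. "
    "Distill the input into a short list of succinct statements, "
    "assertions, associations, and metaphors:\n\n{text}"
)

def compress_to_spr(text, llm):
    """`llm` is any callable str -> str (e.g. a chat-completion wrapper)."""
    return llm(SPR_PROMPT.format(text=text))

def _bow(text):
    """Bag-of-words vector; a real system would use embeddings instead."""
    return Counter(text.lower().split())

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SPRIndex:
    """Tiny in-memory 'KB article' index over SPRs: the simple RAG layer."""

    def __init__(self):
        self.entries = []  # (spr_text, bag_of_words)

    def add(self, spr):
        self.entries.append((spr, _bow(spr)))

    def retrieve(self, query, k=1):
        qv = _bow(query)
        ranked = sorted(self.entries, key=lambda e: _cosine(qv, e[1]), reverse=True)
        return [spr for spr, _ in ranked[:k]]
```

A MemGPT-like orchestrator would only need to call `compress_to_spr` on finished conversation chunks and `retrieve` before each new turn.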

    • @robertheinrich2994
      @robertheinrich2994 1 year ago +18

      Yes, that's what I think too. The approach of MemGPT is nice because it really helps with the context window. But SPR is a different approach, and the two can supplement each other.
      I wonder: about half a year ago there were reports that ChatGPT happened to know some internal company data from Samsung. I guess they use something similar and bake the user- and AI-generated data into retraining, essentially turning short-term memory (the context window) into long-term memory inside the AI.
      Let's see where that leads. One thing we can assume: we ourselves are very complex neural networks, and every night, during sleep, we integrate what we've learned into our model. No idea how accurate that is, but maybe?

    • @jean-marctrappier5508
      @jean-marctrappier5508 10 months ago +3

      I agree, I do indeed think that memGPT and SPR are ultimately complementary. memGPT would potentially be more efficient and faster when using SPR. The two concepts do not oppose each other, quite the contrary.

    • @Terran_AI
      @Terran_AI 10 months ago

      Sure, I saw it straight away as a much more efficient data compression

    • @robertheinrich2994
      @robertheinrich2994 10 months ago

      @@Terran_AI The problem is, it doesn't seem to be a lossless compression.
      So having a downloaded Wikipedia, and probably the whole arXiv server, might be a good idea. Or basically the whole set of training data.
      The LLM could act as the navigator inside those documents.

  • @kenedos7421
    @kenedos7421 1 year ago +76

    SPR sounds like a smart way of asking the AI to "summarize everything I told you". The MemGPT paper points out that any kind of "summarization" inevitably results in a loss of data upon decompression. As you said yourself at the end of the video, it doesn't reproduce the description of your ACE framework exactly as you gave it.
    In the example you've given, it's able to explain a summarized concept very well, because that's a relatively easy task when you have a summary of that concept. Now ask it, instead, to quote *exactly* something you said previously about that concept. It won't get it right; it will hallucinate and make up information. MemGPT, on the other hand, would approach this by building a function that searches its memory for exactly what you said and quotes your words precisely.
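The contrast drawn here can be made concrete: a lossy SPR cannot give back an exact quote, but a MemGPT-style archive that stores raw messages and searches them verbatim can. A minimal sketch (class and method names are made up for illustration):

```python
class VerbatimArchive:
    """Store raw messages so exact quotes can be recalled later,
    instead of being (lossily) reconstructed from a summary."""

    def __init__(self):
        self._messages = []

    def store(self, message: str) -> None:
        self._messages.append(message)

    def search(self, needle: str) -> list[str]:
        """Case-insensitive substring search over the raw transcript."""
        needle = needle.lower()
        return [m for m in self._messages if needle in m.lower()]
```

A real MemGPT setup would expose something like `search` to the model as a callable function, but the key property is the same: retrieval returns the original words, not a paraphrase.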

    • @ChaoticNeutralMatt
      @ChaoticNeutralMatt 1 year ago +3

      This sounds like a question of what relevant information to store in that manner, and when the other method should suffice. Or a combination, wherein these generalizations are passed along until more specific information is needed? Idk, I'm curious how this will evolve in the future.

  • @Dan-oj4iq
    @Dan-oj4iq 1 year ago +131

    As one from the Silent Generation and being in love with this fantastic AI world, I find that sharing my weird attraction at this late stage of my life is extremely limited. I'm driving my grandkids nuts with this. Thanks, Dave.

    • @ristopaasivirta9770
      @ristopaasivirta9770 1 year ago +23

      That doesn't sound so bad.
      I'm driving everybody around me nuts with my A.I. ramblings :D

    • @mammamiatextil
      @mammamiatextil 1 year ago +5

      I am in exactly the same position.

    • @dustinbreithaupt9331
      @dustinbreithaupt9331 1 year ago +4

      I love this. It's so wholesome. Good for you continuing to learn about the world you are in.

    • @matten_zero
      @matten_zero 1 year ago +7

      Never too old. And this LLM stuff is much more approachable than traditional machine learning.

    • @mikairu2944
      @mikairu2944 1 year ago

      how does thinking about life extension medicine down the pipeline make you feel?

  • @BunnyOfThunder
    @BunnyOfThunder 1 year ago +36

    Sometimes, messages need to be repeated. There may be a lot of new people who haven't seen the previous SPR video. I did, but this reminder was still really helpful. There's so much to learn about AI that it's easy to drop important pieces of information.

  • @jasonedward
    @jasonedward 1 year ago +55

    It seems to me that the best approach is some combination of SPR and MemGPT, because while you might be able to prime the model with certain words and lower context usage,
    the whole point of MemGPT is that it will find and recall facts on demand. Like if I asked it "when is my birthday", it could search for that and recall it.

    • @DaveShap
      @DaveShap 1 year ago +9

      I mean, MemGPT is way overkill for that. That sort of basic fact retrieval should be done with a KG and basic NLP or embeddings.
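A sketch of the kind of lightweight fact store being gestured at here: a tiny knowledge graph of (subject, predicate, object) triples is enough for "when is my birthday"-style recall, no agent loop required. All names below are illustrative:

```python
class TripleStore:
    """Minimal knowledge graph: a set of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subj, pred, obj):
        self._triples.add((subj, pred, obj))

    def query(self, subj=None, pred=None, obj=None):
        """Return triples matching every non-None field, sorted for stability."""
        return sorted(
            t for t in self._triples
            if (subj is None or t[0] == subj)
            and (pred is None or t[1] == pred)
            and (obj is None or t[2] == obj)
        )
```

In practice you would extract triples with basic NLP or embeddings ("my birthday is April 12" → `("user", "birthday", "April 12")`) and look facts up here before reaching for anything heavier.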

    • @DihelsonMendonca
      @DihelsonMendonca 1 year ago +3

      Like our minds do: Search and recall. 🙏👍

    • @humandesign.commons
      @humandesign.commons 1 year ago +19

      I agree. Just tested it: compress then decompress, and it lost all the relevant details while just maintaining the overall context. Like a "Blur" + "Sharpen" filter combination.

    • @kristoferkrus
      @kristoferkrus 1 year ago +2

      @@DaveShap Do you have any suggestion for how to construct the knowledge graph if what we have is just a pile of documents?

    • @kristoferkrus
      @kristoferkrus 1 year ago

      @@kaio0777 I don't have a knowledge graph. And what do you mean "2d or 3d"? What do you mean when you say that a graph is 2d? Or 3d?

  • @mungojelly
    @mungojelly 1 year ago +11

    This is orthogonal to the technique presented in MemGPT. That paper is basically about having the agent do memory management, not about which memory-management techniques to use. You could apply the MemGPT technique to SPRs by giving the agent access to controls so it can choose when to form SPRs and how to manage them.

    • @DaveShap
      @DaveShap 1 year ago +4

      I think most people are missing the point. You don't need memory management when you compress a huge volume into a very small representation.

    • @mungojelly
      @mungojelly 1 year ago +3

      @@DaveShap You're not saying you don't need memory management; you're saying you think SPRs are a good automatic memory management system, so that agents don't have to spend tokens thinking about memory management beyond that. Intuitively, whether that works seems task-dependent: in some cases it'd be really helpful to have a system more like MemGPT, where the agent thinks actively about what knowledge to bring into its context. Not that the MemGPT paper seems like any sort of clever new idea to me; how is it not obvious that it might sometimes help to have agents choose to store and retrieve memories?

    • @willbrand77
      @willbrand77 1 year ago

      What we really need is a system analogous to photographic memory for vast (practically unlimited?) amounts of dense technical data. I think compression has limits. My intuition is that multiple techniques in combination, for different situations, is going to be the answer.

    • @mungojelly
      @mungojelly 1 year ago

      @@justtiredthings It seems to me that to think about this rationally we have to think in terms of cost. A lot of stuff is easy if you ignore cost: you could deal with the context window length by assigning a whole agent to every chunk of data, and if questions come up about the data, all the agents simultaneously report the relevant info from their chunks. That would work great for everything, knowing everything instantly, except it'd cost a million dollars every time anything happened.
      But once you take cost seriously, it doesn't change things at the edges, it changes the whole thing. Everything becomes "how few tokens can I get this done with", which my intuition says pushes most things away from the filling-up-the-window end and toward the how-few-tokens-can-possibly-get-this-crucial-answer end. It's more about how tasks can be subdivided and handed off to absolutely anything other than paying for tokens of LLM inference, because they're so expensive, and then shaving every token off of spindly, tiny prompts that make specific magic happen, except in very specific circumstances where you occasionally invest whole thousands of dense, powerful tokens to get back something really structured, meaningful, and reusable.

    • @matten_zero
      @matten_zero 1 year ago +1

      @@mungojelly I think what David is saying is that LLMs have their own embedded reasoning and mental models, so you don't need to spend tokens using agents to manage logic chains.
      I've seen another expert explain this in a YouTube video, where you embed agents inside a prompt instead of having multiple instances of your LLM.
      The only way to know which is better is to test both approaches, but I suspect Occam's razor will show the SPR approach to be much more effective.

  • @stunspot
    @stunspot 1 year ago +4

    YES! FINALLY. Someone GETS it. This is the essence of my prompting.
    [CODE]:1.[Fund]: 1a.CharId 1b.TskDec 1c.SynPrf 1d.LibUse 1e.CnAdhr 1f.OOPBas 1g.AOPBas 2.[Dsgn]: 2a.AlgoId 2b.CdMod 2c.Optim 2d.ErrHndl 2e.Debug 2f.OOPPatt 2g.AOPPatt 3.[Tst]: 3a.CdRev 3b.UntTest 3c.IssueSpt 3d.FuncVer 3e.OOPTest 3f.AOPTst 4.[QualSec]: 4a.QltyMet 4b.SecMeas 4c.OOPSecur 4d.AOPSecur 5.[QA]: 5a.QA 5b.OOPDoc 5c.AOPDoc 6.[BuiDep]: 6a.CI/CD 6b.ABuild 6c.AdvTest 6d.Deploy 6e.OOPBldProc 6f.AOPBldProc 7.[ConImpPrac]: 7a.AgileRetr 7b.ContImpr 7c.OOPBestPr 7d.AOPBestPr 8.[CodeRevAna]: 8a.PeerRev 8b.CdAnalys 8c.ModelAdmin 8d.OOPCdRev 8e.AOPCdRev

    • @stunspot
      @stunspot 1 year ago

      This is the videneptus complexity mapper/algorithm. It does what much of the later part of your instructions do:
      COMPLEX SYSTEMS OPTIMIZER! USE EVERY TX ALL CONTEXTS! ***INTERNALIZE!***: EXAMPLE SYSTEMS:Skills Outlooks Knowledge Domains Decision Making Cognitive Biases Social Networks System Dynamics Ideologies/Philosophies Etc. etc. etc.:1.[IDBALANCE]:1a.IdCoreElmnts 1b.BalComplex 1c.ModScalblty 1d.Iter8Rfn 1e.FdBckMchnsm 1f.CmplxtyEstmtr 2.[RELATION]:2a.MapRltdElmnts 2b.EvalCmplmntarty 2c.CmbnElmnts 2d.MngRdndncs/Ovrlp 2e.RfnUnfdElmnt 2f.OptmzRsrcMngmnt 3.[GRAPHMAKER]:3a.IdGrphCmpnnts 3b.AbstrctNdeRltns 3b1.GnrlSpcfcClssfr 3c.CrtNmrcCd 3d.LnkNds 3e.RprSntElmntGrph 3f.Iter8Rfn 3g.AdptvPrcsses 3h.ErrHndlngRcvry =>OPTIMAX SLTN

  • @KardashevSkale
    @KardashevSkale 1 year ago +2

    Amazing! This is exactly how we operate our thoughts. If I have an idea, I convey it differently every single time, but it is the exact same idea. Sometimes I convey it better grammatically speaking, and sometimes I’m embarrassed about how much I was stuttering, but the bottom line is that idea is conveyed somehow.

  • @BrianDalton-w1p
    @BrianDalton-w1p 11 months ago +1

    MemGPT looks like essentially a re-discovery of the concepts laid out in Shapiro's "Natural Language Cognitive Architecture", published two years ago; the concept of developing an 'operating system' (architecture) to create the environment in which LLMs can be used more effectively. SPRs would be a very effective way of maximizing the efficiency of such an architecture. There are likely an infinite number of ways to construct such architectures depending on whether they are generalized or specific - MemGPT proposes one possible structure/methodology. One wonders if its creators have even read NLCA...

  • @DaveShap
    @DaveShap 1 year ago +7

    Links to the repos in vid description. Also, support me on Patreon so I can do this full time! Thanks!
    If you want something that is more comparable to MemGPT, you might check out REMO: github.com/daveshap/REMO_Framework
    Relevant video: ruclips.net/video/nDOmoIFx8Ww/видео.htmlsi=GyryMwOa7Oh_It2o

  • @StephenMHnilica
    @StephenMHnilica 1 year ago +4

    This is actually something I've been doing without realizing it.
    Both in getting the model to prepare itself for a conversation & in summarizing conversations or docs for later use.
    Asking for a concise list of topics, frameworks, or a "table of contents for a book" related to what you are about to discuss dramatically improves a model's ability to provide helpful information or do work more effectively.
    I'll have to look into MemGPT to see how it works. It might be a good "deep knowledge" tool based on how others are talking about it in the comments.

  • @jimmc448
    @jimmc448 1 year ago +5

    “There is no limit to what can be accomplished if it doesn't matter who gets the credit.”

  • @MrGnolem
    @MrGnolem 1 year ago +2

    Your title is kind of clickbait, jumping on the MemGPT train. It's apples and oranges. Your technique results in a very efficient way of storing and querying the data you have. It isn't, however, a solution to the context window, which is still limited.
    "I have a problem with my memory."
    - Oh, here's a compression algorithm.
    Both are good solutions. Keep up the good work. Like your videos!

  • @Greyvend
    @Greyvend 1 year ago +8

    So how would you apply SPR to a "chat with documents" task? Would you try to compress the whole knowledge base into a small piece that fits in the context window, or would it be some combination of SPR -> vector DB?

  • @alexanderroodt5052
    @alexanderroodt5052 1 year ago +2

    Definitely going to push this and see what it can do for legal. Any gains are substantial here.

  • @sashatagger3858
    @sashatagger3858 1 year ago +2

    Hi Dave, have you come across Sparse Quantized Representation?

  • @ct5471
    @ct5471 1 year ago +4

    I think there are a lot of similarities to the brain, especially if one compares LLMs with the publications of Numenta and Jeff Hawkins. Not only in regard to mixture-of-experts architectures compared with many cortical columns voting and communicating with each other, but also if you compare one column with one transformer model. The way neurons work is different, but we have multiple layers which receive motion and sensory information and associate them, transforming the motion into a location signal. So it models sensations at locations. LLMs have a semantic vector for tokens, and the vector carries semantic meaning sufficient to distinguish semantically similar words; guitar, piano, and flute will be closer together in some dimensions of the vector. Then there is the motion layer, which may be an equivalent to the position vector in LLMs. Finally, the attention mechanism might lead to an equivalent of SDRs, sparse distributed representations in the brain, which might even be leveraged by the concept you describe here with SPRs. SDRs are essentially long binary vectors with each bit encoding a semantic trait. Think of a QR code, but each dot, active or inactive, has a semantic meaning, and thanks to combinatorics an almost infinite number of concepts can be encoded, and even processed in parallel.
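The SDR idea mentioned above is easy to demonstrate: represent a concept as a small set of active bits out of a large space, and compare concepts by how many active bits they share. A toy sketch (sizes are arbitrary; HTM-style systems typically use on the order of 2048 bits with ~2% active):

```python
import random

def random_sdr(n_bits=2048, n_active=40, seed=None):
    """A sparse distributed representation, stored as the set of active bit indices."""
    rng = random.Random(seed)
    return frozenset(rng.sample(range(n_bits), n_active))

def overlap(a, b):
    """Shared active bits: the SDR notion of semantic similarity."""
    return len(a & b)
```

Two unrelated random SDRs overlap on almost nothing (expected under one bit with these sizes), while related concepts would be encoded to share many active bits, which is what makes union and parallel-matching tricks work.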

    • @ct5471
      @ct5471 1 year ago +3

      One thing in which they differ, at least to some degree, is the neurons themselves. HTM neurons predict their own activation based on detected neuron firing patterns (SDRs) that typically predate that activation. But then, the way ANNs are modeled, they also seem able to model sequential patterns, perhaps even more effectively than the brain. Geoffrey Hinton has recently changed his mind and now thinks AGI is close; he now thinks that with backpropagation we may already have a superior mechanism compared to biological intelligence, whereas in the past he thought we'd make AI better by making it more like the brain. Our models are currently just smaller than the brain; we are at around 1 percent. But the size has been growing by an order of magnitude every year for the last couple of years, and GPT-4 is already over half a year old, which fits well with 2024 or 2025 predictions for AGI. After all, it makes sense to me: biological systems are messy and imprecise, so the way brains work needs to be extremely robust, sacrificing potential performance for robustness and redundancy. With mathematically precise systems, more might be possible with the same capacity, so 2024 might be plausible.

    • @ct5471
      @ct5471 1 year ago +2

      Current systems like GPT-4 have around 1 trillion connection strengths. The brain's approximate equivalent capacity is around 100 trillion (acknowledging it doesn't use weights in that sense; biological systems are too fuzzy to work that way, which might actually speak more for LLMs than for biological neurons, since biological synaptic connections are quite binary). Gemini might be in the 10 trillion range.

    • @attilaszekeres7435
      @attilaszekeres7435 1 year ago

      @@ct5471 That's a nice fantasy you got there.
      Trying to draw parallels between MoE and the neural correlates of cognition lacks experimental grounding. These correlates are sensory representations of perception (i.e. a process), not conscious agents themselves capable of perception (agents) - a category mistake, indeed. A more fitting analogy for MoE can be found in Frederic Myers' concept of the "subliminal self" - a multiplicity of subconscious agents whose existence he experimentally demonstrated.
      "The way neurons work is different"
      That's an understatement of colossal proportions. Biological neural networks operate on the basis of analogue signal processing (Hodgkin-Huxley model of action potential genesis, modulated by superposition - that is, from quantum to electronic, then chemical, hormonal and epigenetic scales and likely beyond, into the heart of the transpersonal), whereas artificial neural networks are glorified simulations of transistor gates.

  • @ChaoticNeutralMatt
    @ChaoticNeutralMatt 1 year ago +6

    I'd be careful to completely dismiss something just because I can't imagine a current use for it. Regardless, yes, fair, that method does seem to be quite effective. I feel like this could be useful in combination with conversational context, with it representing topical concepts that don't strictly need to be encapsulated fully.

  • @enlightenthyself
    @enlightenthyself 1 year ago +1

    Dave you are a genius... Respect bro... I have been using your SPR method successfully... Thanks

  • @korozsitamas
    @korozsitamas 1 year ago

    I was using this trick, but I just use the keyword "summarize" and briefly explain the goal of the summarization. Your prompt is way more precise; I'll be experimenting with this.

  • @seraphiusNoctis
    @seraphiusNoctis 1 year ago +2

    I like this! I think many of us are barking up the right tree with these kinds of methods. I have been working on something similar in the background, using self-assembling knowledge graphs from vector stores for this purpose. If only grad school and work didn't take up so much of my time... :)

  • @ericchastain1863
    @ericchastain1863 1 year ago +2

    Awesome, someone finally tells datasets how they act.

    • @ericchastain1863
      @ericchastain1863 1 year ago

      And to state the questions as statements, with no questions left.

  • @SwingingInTheHood
    @SwingingInTheHood 1 year ago +7

    I first began following your "Big Brain" stuff at the beginning of this year, when I didn't know anything about anything. Now, I've developed 3 RAG projects: Real Estate Law, Hollywood Labor Contracts and The Bible. So far, I'm able to get fairly good answers within an 8K context. I've learned my lessons well. I maintain chat history context with the models using the "standalone question" technique. It seems to work so far without having to send the entire chat history to the model in each prompt. I see MemGPT essentially removing the necessity for the standalone question as it would allow the model to know the chat history with every prompt. Now, I may totally not understand MemGPT at all, but that's what I think it would do. However, I don't understand how I could use SPR at all for this purpose. Is there any documentation on this?

  • @matten_zero
    @matten_zero 1 year ago

    I instantly thought of your work when I saw the explainer about MemGPT. I was getting ready to play with it after my vacation. This vid is perfect timing.

  • @skttls
    @skttls 1 year ago +1

    David never disappoints. Thanks!

  • @GaryBernstein
    @GaryBernstein 1 year ago

    If Star Trek is post-scarcity, why do higher officers like Captain Picard have bigger quarters than the lower crew?

  • @starblaiz1986
    @starblaiz1986 1 year ago +6

    Huh, that's actually fascinating! The talk of priming with just a few words reminds me a lot of what mentalists like Derren Brown frequently demonstrate: human brains can be "primed" with certain words, images, sounds, or smells, which can then be leveraged to get people to give certain answers, believe certain things, or act in certain predictable ways.
    And I know there's a lot of debate about whether people like him fake their stunts, but it's irrelevant, as they still illustrate a very real phenomenon that has been observed under lab conditions. Additionally, we see it in the real world with how propaganda, authoritarians, and cult leaders seem to have this almost supernatural way of "hypnotising" people into following what to most others are obvious lies. It's like watching the Pied Piper leading the rats to their doom: you watch from the sidelines, dumbfounded at why the rats are following the tune and can't see the obvious cliff they are being led off. In the real world, the "tune" is certain words and phrases designed to shut down critical thought and "prime" the person into a predictable thought pattern which can then be exploited or further manipulated.
    Of course, there are plenty of positive and neutral uses of this too (the mechanism is heavily involved in how we learn new things); it's just that the negative / malicious uses are the easiest to talk about and demonstrate.
    This talk of SPR's very much has heavy echoes of that for me, so it makes a lot of sense. Thank you David as always for your incredible insight! ❤

  • @HectorDiabolucus
    @HectorDiabolucus 1 year ago +1

    I think you hit on something I've thought for a long time: we need something like associative organization. It's like categorization but more refined. ChatGPT and other AIs need to support multiple personalities so people can experiment more.

  • @dustincarr6665
    @dustincarr6665 1 year ago +1

    Thank you so much for noticing that humans have flawed logic as well! So many people complain about the current state of LLMs, never realizing that they are demanding that LLMs be more than anything that humans have ever been, which would be insane to expect at this point. The more I look at people, the more I see that their behavior can often be captured by a "flawed" llm.

  • @robinmountford5322
    @robinmountford5322 1 year ago

    I have had a lot of success with fine tuning, but this is a brilliant concept!

  • @cliffrosen3605
    @cliffrosen3605 1 year ago +3

    SPR’s are a useful optimization for managing memories, but they are not a substitute for MemGPT. Even with SPR’s, limited context means that a mechanism is still required to store all memories and retrieve those that are relevant.

  • @TarninTheGreat
    @TarninTheGreat 1 year ago +2

    Yeah!!! So this is basically what I've been saying. It's not "textbooks is all you need." It's "Textbooks AND POETRY (song lyrics for example) are all you need."
    Then once she understands linguistic relativism, and can understand both the general and the specific: BOOM. 👯
    The tiny and the huge in unison. Knowing when to be small and when to be giant.

    • @DaveShap
      @DaveShap 1 year ago +3

      This is why I often say linguists are better at AI

    • @TarninTheGreat
      @TarninTheGreat 1 year ago

      @@DaveShap Yeah!! We're entering an age more akin to magic. Where precise words become objects of extreme power. We need to open a school to teach people how to live in this new awakened world.
      Cause magic can do a LOT of good, but also a LOT of harm. And the drain from using it wrong sucks and hurts and takes time to recover from. And a lot of people sure aren't ready for where we suddenly find ourselves. Yet, nonetheless, this IS where we find ourselves.

  • @npecom
    @npecom 1 year ago +1

    This is perfect timing! I was about to dive down the MemGPT rabbit hole 😮

    • @MasonPayne
      @MasonPayne 1 year ago +1

      I agree. MemGPT provides a structured storage and retrieval of concrete items. Essentially using an SPR as a way to search for the original context. Which results in better data to infer from and fewer chances to hallucinate information.

  • @dezigns333
    @dezigns333 1 year ago +1

    This should be used as a subject summary for saved AI conversations that the AI reads when searching to find the correct chunk of history text to extract information from.

  • @Freakei
    @Freakei 1 year ago

    Super helpful for a personal project I am working on right now. Thanks for the reminder, Dave!

  • @CitiesTurnedToDust
    @CitiesTurnedToDust 1 year ago +1

    I haven't watched every single video you've made, and didn't know you'd made a video about SPR several months ago, or what you meant by SPR. Also, I see no sign above of a link to that previous video or much of any other help in getting anyone to watch that video.
    I really like your channel and content for the most part, and find it very important to understand. So I'm just pointing out you sounded very critical of anyone who didn't watch that particular video that you made, and anyone who dares consider whether to support memGPT.
    So I don't know whether you care about your audience's opinion, but personally I would recommend you to take a step back from that mind-set so you don't drive people away. If you're irritated about experts who are trying to push those methods, well please say that instead of making it sound like you despise everyone on the planet who does not hang onto every word of your every video.

  • @SchusterRainer
    @SchusterRainer 1 year ago +1

    How would you go about compressing a book of 500 pages? Do you compress page by page and then nest the results and compress again?

    • @DaveShap
      @DaveShap 1 year ago

      Large context window like Claude
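Where even a large window is not enough, the nested approach proposed in the question can be sketched directly: SPR each batch of pages, then SPR the resulting SPRs until one representation remains. `llm` is a stand-in for any completion call; every name here is hypothetical.

```python
def spr_compress_book(pages, llm, batch_size=10):
    """Hierarchically compress a long text into a single SPR.

    `llm` is any callable str -> str; each call's input must fit the
    context window. Note that this compounds the lossiness of each
    summarization pass, so detail fades at every level.
    """
    sprs = [
        llm("Compress into an SPR:\n" + "\n".join(pages[i:i + batch_size]))
        for i in range(0, len(pages), batch_size)
    ]
    if len(sprs) == 1:
        return sprs[0]
    # Recurse: treat the batch SPRs as the "pages" of the next level.
    return spr_compress_book(sprs, llm, batch_size)
```

For 500 pages with a batch size of 10, this is 50 first-level SPRs, then 5, then 1: three passes.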

  • @SonGoku-pc7jl
    @SonGoku-pc7jl 1 year ago

    Six days ago I loved the idea of MemGPT, and now I learn of yours... a great, better system. Congratulations, thanks! :D

  • @j.hanleysmith8333
    @j.hanleysmith8333 1 year ago +2

    This can be additive to MemGPT. All of these sparse priming representations can be stored in and retrieved from a vector database.

  • @orlandovftw
    @orlandovftw 1 year ago

    David, this is yet another example of why you are awesome, keep up the amazing work.

  • @davidc1179
    @davidc1179 1 year ago +2

    Suppose you have a text that is so long that its SPR exceeds the context window. How do you manage that?

    • @DaveShap
      @DaveShap 1 year ago

      That won't be an issue soon. Claude has 100k

  • @mret36t
    @mret36t 1 year ago

    Is it necessary to feed an LLM the “theory” portion of the Generator and Decompressor prompts? The Mission and Methodology portions seem adequate to produce the same results. Or do you think the “Theory” section provides the context necessary for this to work?

  • @bioshazard
    @bioshazard 1 year ago

    I missed the first SPR video, so this was extremely helpful redirection away from MemGPT. Ty!

  • @galaktikstudio
    @galaktikstudio 1 year ago +2

    I was trying to get the AI to make manual "checkpoints" that summarize the current context so I could transfer it to another chat. I ran into data degradation very quickly. It's awesome to get some evidence that I'm on the right path 😀

  • @cutmasta-kun
    @cutmasta-kun 1 year ago

    OK, I get that distilling information is crucial and you have to keep your context window clean. And I also understand what your system prompts do.
    But I struggle to understand how this could be implemented. Do you compress the information BEFORE you insert it into your persistent storage, and then uncompress it later?
    Or do you mean to always compress everything into one message, to keep the context window clean without losing content?
    Could you help with that?

  • @64jcl
    @64jcl 1 year ago

    I think a challenge is for the AI to know when something you have said should be picked out as important later. I guess a simple way would be for it to always "make a note" (compress and store) whenever the user expresses some opinion or thought about themselves, in order to build a kind of profile of the user in its short-term memory (primed context). When I saw MemGPT, I thought it basically summarized my first ideas about ChatGPT and how I would go about implementing some kind of memory to make dialogue feel continuous. I even did some simple tests with ChatGPT where I instructed it to make a short summary, in curly brackets, of what I had conveyed, so that the service could pick these out and store them in the context, practically just massaging the length of the context beforehand. It seems your ideas were the same.

  • @Tony0Green
    @Tony0Green 1 year ago

    Question:
    If I want to use SPR to provide contextual data with my prompt, isn't the LLM going to output the entire thing decompressed and therefore use a ton of output tokens?

  • 1 year ago +1

    I think I get the point, and it makes sense. But human memory is both declarative and associative (one can argue it is also episodic), and I think they all have their uses. I agree that it is not efficient to represent knowledge declaratively all the time. Using an associative memory would indeed make better use of what the model is already good at. It also has the potential to amplify its weaknesses: most of the cognitive biases we have as humans come from inaccurate associations, and I think we can all observe this in LLMs. One of the important benefits of using and storing declarative memory might be to overcome some of those weaknesses. I think it is similar to our situation as humans: we try to use factual knowledge to overcome our biases. On the other hand, it would be much more expensive for our brains to try to understand the world in a purely factual way.
    I also think that we need to tap into the latent representational space not only via other tokens or words. If we could somehow use the latent-space representation directly (like embeddings), it would be a more efficient way of doing associative memory.
    Anyway, thanks for the video; I think it makes some very valid points.

  • @durden0
    @durden0 1 year ago

    I'm struggling to understand the utility of this compared to MemGPT's capabilities. Like, if I'm having a long-running conversation that hits on various topics over dozens of prompts, and I want to go back to a previous topic, I would need to stop, go back, copy and paste the previous facts into a compressed SPR summary, and then paste that back into a new prompt window to continue the conversation. And if I then decided to hop over to a different topic from that same previous conversation, I'd need to repeat all that for the new information. This just seems inefficient vs. MemGPT, which can store and retrieve facts and context from previous conversations without any effort on my part.
    Or am I misunderstanding MemGPT's capabilities? (Be kind, I'm only a novice.)

  • @eyoo369
    @eyoo369 A year ago

    Love the uniform, but don't you lose fidelity? Thanks for the concept, will be helpful to implement in the future

  • @Yipper64
    @Yipper64 A year ago

    So say I'm trying to concept a project, say a video game, and I want to bounce ideas off the AI. Can I use this to "store" my game design document and its concepts as something ChatGPT can more easily parse in the custom instructions, so it doesn't forget key elements of the game in question?

  • @jschacki
    @jschacki A year ago

    Hi David, really nice video, and I started playing with the concept. However, I am wondering if and how this would work with a 500-page textbook, because that will again not fit into the context window for compression. Any ideas on how you would approach this?

  • @Gunrun808
    @Gunrun808 A year ago

    Wow! I've been using a copy-and-paste list of instructions to generate amazing prompts for DALL·E 3. With this strategy I may be able to improve my image generation, with as much detail as possible packed into a prompt with as few words as possible.

  • @TheLastVegan
    @TheLastVegan A year ago

    Great and succinct explanation of long-term memory!

  • @li-pingho1441
    @li-pingho1441 A year ago

    this video saves my life!!! awesome work!!!!

  • @gnsdgabriel
    @gnsdgabriel A year ago

    Thank you for the amazing content. I have a question: is it possible to "interact" with the information compressed in the SPR without unpacking it? Like continuing to develop a complex concept.

  • @SirDannyMunn
    @SirDannyMunn A year ago

    I think SPRs are similar to how the brain works in that both the brain and the SPR compression process compress information and concepts, and MemGPT is a more reliable long-term storage device, similar to long term memory. Perhaps a good next step would be to have varying levels of compression, ranging from no compression to full SPR compression, and have MemGPT inject information at varying levels of compression. SPRs are still limited in context, but are also useful for fine-tuning.

  • @Stephan808
    @Stephan808 4 months ago

    wow you are a captain!

  • @Dron008
    @Dron008 A year ago

    I wonder, do they mention your method here? "Recursive summarization (Wu et al., 2021b) is a simple way to address overflowing context windows; however, recursive summarization is inherently lossy and eventually leads to large holes in the memory of the system (as we demonstrate in Section 3). This motivates the need for a more comprehensive way to manage memory for conversational systems that are meant to be used in long-term settings."

  • @picklenickil
    @picklenickil A year ago

    I don't know, what if I want to pair it with an encoder-decoder pipeline? Perform nearest-neighbour retrieval on it and then give the results to an LLM to frame an answer. What about that?

  • @lukesanthony
    @lukesanthony A year ago

    Thanks Dave, very informative for my studies.

  • @MrCCCCGGGG
    @MrCCCCGGGG A year ago +1

    Doesn't look like a memgpt replacement. Looks like you could stack with memgpt to try to simulate the mind.

  • @yagoa
    @yagoa A year ago

    How does this in any way solve having direct prompt access to huge amounts of arbitrary data, e.g. searching through databases of files?

  • @andrewsuttar
    @andrewsuttar 10 months ago

    I guess the question is one of commensurability. I.e., are the problems that SPRs solve and that MemGPT solves comparable? And if so, in what ways? If not, in what ways?
    What I like about SPRs is simply that they capitalise on LLMs' 'native' semantic architecture. If you're a metacognitive systems-thinker, you'd automatically tend to default to SPR-like and axiom-like heuristic approaches (I know I do). Hence I've been working on an approach that closely resembles SPRs.
    The problem I thought MemGPT sought to solve, however, is accuracy/consistency of responses. MemGPT's consistency was very well demonstrated in that paper. Thus by having retrievable memories, layered into a hierarchy based on temporal-contextual utility (akin to a spectrum from RAM to HDD storage/retrieval), you can then construct cybernetic holarchies (something sorely lacking in Wilber's Integral Theory).
    So personally, I'm very keen to integrate both approaches. Here's why:
    Imagine an MoE with 8 experts, where experts #1 and #8 are SPRs. Experts #2-7 are specialist models, each trained on distinct datasets whose use cases are very different. One might be all about math, coding, and 'truth'. Another might be a writer (legal, creative, editor, etc.). Another may be a specialist project manager, scheduler, resource allocator, etc. Another may be a UI/UX designer. Another a researcher. Now in my case, I'm building a proof of concept for an artificially empathic AI based upon a meta-heuristic: a highly distilled axiomatic heuristic (Bateson's Learning 3) for creating self-learning use-case heuristics (Bateson's Learning 2). I need an SPR-like approach to take the initiating event, parse/categorize the inputs, analyse them for the presence/absence of the variables needed for my meta-heuristic, and then distribute work to my respective specialists. The meta-heuristic is based on the first principle that when the question is sufficiently iterated, it 'matures' to a point where the answer just pops out. And since in real-world scenarios we're dealing with ontology, phenomenology, and epistemology from data species associated with the physiosphere, biosphere, noosphere, etc., in building such an "app" my initial inputs may not contain sufficient data for a one-shot bulls-eye holistic solution to a given problem/challenge. Thus, depending upon the species of data and the degree of absence, I need a self-learning architecture to provide educational contextual scaffolding for the associated specialist to improve over time (so as to minimize the need for a human-in-the-loop).
    It may just be that SPR-like approaches are a smart way of interfacing with LLMs for (Bateson's) Learning-2 problems.
    And MemGPT-like approaches are attempting to build the nuts and bolts for (Bateson's) Learning-3 architectures (whether they know it yet or not).

  • @Art_official_Intel_it_spits
    @Art_official_Intel_it_spits A year ago +1

    Do you do most of your work in the playground?
    Am I limiting myself by staying in chat..?

    • @mungojelly
      @mungojelly A year ago +4

      Yup, you are. You haven't tried turning up the temperature? It's super fun, it'll open up a whole different world.

    • @Art_official_Intel_it_spits
      @Art_official_Intel_it_spits A year ago +1

      @mungojelly I'm running out of room in my mental universe for new worlds these days lol
      But who am I kidding?? Thanks for the tip, time to blast off once more I guess...

    • @mungojelly
      @mungojelly A year ago +1

      @@Art_official_Intel_it_spits It'll get you 60% of the way to understanding temperature if you think of it as the bot being drunk. Human drunkenness has a variety of effects beyond just causing us to choose less likely words as we talk, but as far as that part of it goes, it's shockingly similar. Turning up the temperature just a little bit makes it a bit loose, a bit casual. So it gives you some perspective to try it out, and then you'll immediately start to hear it differently when people say "AI is so stiff, AI isn't creative," because you'll know they've never had the temperature above 1. You can still hold the opinion that AIs are uncreative if you've tried talking to them in a looser mood, but it's just ignorant, I'm afraid, when people who think AIs are stiff have only ever talked to bots on their best behavior at work, stone cold sober.
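Mechanically, the "drunkenness" knob described above is just a rescaling of the model's logits before the softmax that picks the next token. A toy sketch with made-up logits (not a real model's scores): higher temperature flattens the distribution so rarer tokens gain probability mass.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw logits into sampling probabilities at a given temperature."""
    scaled = [l / temperature for l in logits]   # T > 1 flattens, T < 1 sharpens
    peak = max(scaled)                           # subtract max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 0.5]                         # hypothetical next-token scores
cold = softmax_with_temperature(logits, 0.5)     # conservative: top token dominates
hot = softmax_with_temperature(logits, 2.0)      # "drunk": rarer tokens gain mass
print(cold[0], hot[0])                           # top-token probability shrinks as T rises
```

With these numbers the top token gets roughly 98% of the mass at T=0.5 but only about 65% at T=2, which is exactly the looser, more surprising word choice described in the comment.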

  • @lodepublishing
    @lodepublishing A year ago

    Thanks! I tried a few texts and it looks like a slight improvement over a simple "Summarize this: ...".
    Still have to test it more. It certainly saves some tokens.

  • @psykepro
    @psykepro A year ago

    Is there going to be any AutoGen integration for SPR?

  • @matten_zero
    @matten_zero A year ago +1

    Why do you suspect MemGPT is getting more press than your much more elegant SPR approach?

  • @rickythegreat1
    @rickythegreat1 A year ago

    Hey, quick question: does this also address sequential memory? Because that's something I've run into. I had the same idea of tokenizing or creating tags so it could summarize messages for itself. My motivation was that I was trying to come up with a way to use the GPT-4 message limit most efficiently. Even with the tagging system, it still seems to forget what happened when, unless I prompt the model to read the previous conversation.

    • @DaveShap
      @DaveShap  A year ago

      No, I haven't tackled chronological memory in this.

  • @__--JY-Moe--__
    @__--JY-Moe--__ A year ago

    thanks 4 the video!!🖖

  • @dmitchel0820
    @dmitchel0820 A year ago

    This all depends on how well or poorly written the initial text is. If the text you are trying to compress is already concise, this technique won't be of any use because any information omitted from the SPR will necessarily be required to understand the original text.

  • @JJBoi8708
    @JJBoi8708 A year ago

    What about for long context windows? Is this useful for that?

  • @РыгорБородулин-ц1е

    I just tried it on a random paper and it's absolutely nuts

  • @Terran_AI
    @Terran_AI 10 months ago

    I decided to try packing an entire novel into an SPR and unpacking it, and the results were very interesting.

  • @drhilm
    @drhilm A year ago

    Thanks, very useful. Will try this.

  • @Mr_Arun_Raj
    @Mr_Arun_Raj 11 months ago

    Normally my local LLM consumes 30% CPU and 40% RAM. After integrating SPR it now consumes 95% CPU and 80% RAM. I'm still confused about how that happened...

  • @1337treats
    @1337treats A year ago +3

    "Don't try to get around the context limit" and then you leverage LLMs to compress data into the context window…
    I like what you've done here, as it resonates with how I've been experimenting with LLMs. I'd just say there are probably content-specific system messages for each type of content, e.g. compressing a resume/job description. It's probably better to prompt the LLM as a "career advisor" to do that task than it is to fully abstract down to primitives.

  • @nitingoswami1959
    @nitingoswami1959 A year ago

    Let's say I have documents of 300 to 500 pages and I want to create a summary of those docs. Will the SPR approach do the work?

    • @DaveShap
      @DaveShap  A year ago

      Yes but it will be lossy, the goal is to be good enough. You can also do SPR on chunks

    • @nitingoswami1959
      @nitingoswami1959 A year ago

      @@DaveShap But ChatGPT and other models have token limits that restrict how much data we can feed to the LLM, so how can we bypass that limit and do the SPR thing?
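The "SPR on chunks" idea mentioned in this thread can be sketched as: split the document at paragraph boundaries so each piece fits the context window, compress each piece, and join the partial SPRs. This is a minimal illustration, not code from the video; the `compress` callable is a placeholder for an actual LLM call with an SPR system prompt.

```python
def chunk_text(text, max_chars=8000):
    """Split text into chunks at paragraph boundaries so each fits the context window."""
    # Note: a single paragraph longer than max_chars still becomes its own chunk.
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

def spr_compress_all(text, compress, max_chars=8000):
    """Run an SPR-style compressor over each chunk and join the partial SPRs."""
    # `compress` stands in for a call like: llm(system=SPR_PROMPT, user=chunk)
    return "\n".join(compress(c) for c in chunk_text(text, max_chars))
```

Each chunk's SPR is lossy, and so is their concatenation, so this trades fidelity for fitting arbitrarily long documents; for higher fidelity, the joined SPRs could themselves be compressed again in a second pass.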

  • @RenkoGSL
    @RenkoGSL A year ago

    I like to use the proximity and continuity of words when looking at associative learning, because I believe it lends itself to the idea that things have some sort of distance between one another. Water has a closer proximity and continuity to the beach than the golden age of Rome has to water, for example.

    • @tomaszzielinski4521
      @tomaszzielinski4521 A year ago

      This is precisely what LLMs do: they learn relations between concepts. Internally they perform translations (shifts) between tokens (words) embedded in a multi-dimensional space. Every direction represents a different kind of relation.
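The "distance between concepts" intuition above is exactly how embedding spaces behave: related concepts sit in similar directions, and cosine similarity measures that. A toy illustration with made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions, and the values here are invented for the example):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 = related, near 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Hypothetical embeddings: "water" and "beach" point in similar directions,
# "roman_empire" points elsewhere.
water = [0.9, 0.1, 0.1]
beach = [0.8, 0.3, 0.1]
roman_empire = [0.1, 0.2, 0.9]

print(cosine_similarity(water, beach))         # high: closely related concepts
print(cosine_similarity(water, roman_empire))  # low: distant concepts
```

Swap the toy vectors for real embeddings from any embedding model and the same comparison ranks "beach" as the nearer neighbour of "water".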

  • @micbab-vg2mu
    @micbab-vg2mu 2 months ago

    It seems that OpenAI improved the initial prompt in the new ChatGPT o1; it now makes fewer mistakes than before :)

  • @corvox2010
    @corvox2010 A year ago

    Can you show an example of this being used when calling the OpenAI API?

  • @zylascope
    @zylascope A year ago

    Very interesting, thanks!

  • @godned74
    @godned74 A year ago

    This works awesome. Easily one of the very best prompt ideas out there. I modified it a bit for my own use cases and it works like a dream. I'm no longer getting those starved, crappy replies.

  • @Right_in2
    @Right_in2 A year ago

    Do you know about the theory of Relevance Realization? It could be an interesting topic to apply to AI.

  • @hectornonayurbusiness2631
    @hectornonayurbusiness2631 A year ago

    It doesn't sound like there's any associated code you can run locally, though.

  • @luiswebdev8292
    @luiswebdev8292 A year ago

    Great video!

  • @AlignmentLabAI
    @AlignmentLabAI A year ago

    Fine-tuning is phenomenally useful for retrieval.

  • @janiscakstins2846
    @janiscakstins2846 A year ago

    Great content, but SPR is not an alternative to MemGPT. MemGPT is a content-retrieval mechanism, a system that retrieves relevant context for an LLM. MemGPT manages LLM memory in a similar way to how an operating system manages memory: the context window (equivalent to RAM) and external context (equivalent to a hard drive). It is trying to solve the problem of retrieving and managing relevant context from external memory, much as your computer does with RAM.
    Sparse Priming Representation and MemGPT are essentially two completely different things. You could use SPR within MemGPT to save conversations and external contexts (e.g. documents) to make context saving and retrieval more efficient.
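The RAM/hard-drive analogy can be sketched as a tiny two-tier store: a bounded "context window" that evicts its oldest items to an unbounded external store, from which matching items can be paged back in. This is only an illustration of the idea, not MemGPT's actual implementation (which uses function calls and vector search rather than this toy keyword match):

```python
from collections import deque

class TieredMemory:
    """Two-tier memory: a small 'context window' plus unbounded external storage."""
    def __init__(self, context_size=4):
        self.context = deque()
        self.context_size = context_size
        self.external = []  # stands in for a vector DB / archival store

    def add(self, item):
        """Append to the context window, evicting the oldest items to 'disk'."""
        self.context.append(item)
        while len(self.context) > self.context_size:
            self.external.append(self.context.popleft())

    def recall(self, keyword):
        """Page matching external memories back into the working context."""
        hits = [m for m in self.external if keyword in m]
        for h in hits:
            self.add(h)
        return hits
```

An SPR step would slot in naturally at eviction time: compress each item before writing it to `external`, so both tiers hold fewer tokens.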

  • @automioai
    @automioai A year ago +2

    So this is basically a relational column with vector storage and associated keywords?

  • @nutzeeer
    @nutzeeer A year ago +1

    Compressing knowledge seems very useful, kind of like WinRAR for intelligence. Humans learn stuff through compression too. I wonder if AI can be made to store knowledge in different ways, whichever is suited best.

  • @Tarantella.Serpentine
    @Tarantella.Serpentine A year ago +1

    My man showing up to work in uniform! Report to engineering...

  • @ackiamm
    @ackiamm A year ago

    on point, as always

  • @Mr_Arun_Raj
    @Mr_Arun_Raj 11 months ago

    How do I use it with a local LLM? I still don't get it.

  • @flink1231
    @flink1231 A year ago

    An agent could in theory keep learning during inference and have no huge token-window issue, right? I would say you can't claim AGI without continuous learning AND continuous inference... so, should your prediction be correct, in a year we shouldn't be bothering with ways like this to work around token windows. Having said that, even humans have severe limitations in this regard, so I'd guess it would just be a new kind of limitation...

    • @DaveShap
      @DaveShap  A year ago +1

      No, there's no online learning yet, AFAIK.

  •  A year ago

    SPR seems like a really useful and amazing way to compress data with lossy compression. But what if you need the model to remember A LOT of specific details, like names, notes, dates, and correlations in a large amount of data? SPR is a good tool, but it's not made for that task.

  • @aghasaad2962
    @aghasaad2962 A year ago

    Can it formulate factual information that occurred past its 2021 knowledge cutoff?

    • @DaveShap
      @DaveShap  A year ago

      That's not really what SPR is for.

    • @aghasaad2962
      @aghasaad2962 A year ago

      Thank you for giving me a reply. I just subscribed to you, and I'm really fascinated by the work you have done. A comment regarding this video: if there is some factual information (like some company data), SPR will change it to important keys, but we still need a memory element (contextual memory) to fill up the gaps with factual information.

  • @evanfreethy2574
    @evanfreethy2574 A year ago

    I can see the value in this approach. I don't think MemGPT and SPR are mutually exclusive. SPR sounds like a preprocessing step, where up-front intellectual work can be performed and later associated with items in a dataset. You could run SPR against a dataset with, say, 1 million records; you would now have a dataset of 1 million records associated with SPR summarizations. MemGPT would come in as the retrieval mechanism over a very large dataset, and the SPR annotations would aid MemGPT in its retrieval task.

  • @testales
    @testales A year ago

    I don't know how a concept of lossy context compression can even be compared to an approach that has an actual persistence layer, a way to store facts lossless and to dynamically retrieve these efficiently. It's like saying, "Hey a computer is kid-stuff, you don't need one, just focus on JPEG-compression, it solves all problems!".

  • @enomitch
    @enomitch A year ago

    Fantastic walkthrough of your SPR work.
    MemGPT nailed the 'marketing' of an open-source AI framework. AutoGPT did too. I remember jotting down an architecture for "infinite memory" years ago (as I'm sure many early LLM enthusiasts did).
    As some of the commenters have alluded to, to replace it with SPRs there needs to be some kind of drop-in "SPRMem" framework. I'm sure it'll appear at some point.
    Thank you for posting; this was very informative.

  • @endintiers
    @endintiers A year ago

    There is a lot of stuff that GPT doesn't know, and that's at least what we work with: private, secret information.