MemGPT is a framework that automates the management and retrieval of information from contexts during a natural language chat session. It does not seem that SPRs as a concept or their manual implementation have enough overlap with MemGPT to say that 'SPRs are better and can replace MemGPT'. Rather, MemGPT could use SPRs as a component. To automate SPRs in a natural language chat session, one would need something like MemGPT (but probably much simpler) to create and index KB articles for a basic or simple RAG implementation. Although this is much less hype-worthy than "LLMs as an OS".
Yes, that's what I think too. The MemGPT approach is nice because it really helps with the context window. But SPR is a different approach, and the two can supplement each other. I wonder: half a year ago there were reports that ChatGPT happened to know some internal company data from Samsung. I guess they do something similar and bake user- and AI-generated data back in to retrain the model, essentially turning short-term memory (the context window) into long-term memory inside the AI. Let's see where that leads. One thing we have to assume is that we ourselves are very complex neural networks, and every night, during sleep, we integrate what we've learned into our model. No idea how accurate that is, but maybe?
I agree, I do indeed think that memGPT and SPR are ultimately complementary. memGPT would potentially be more efficient and faster when using SPR. The two concepts do not oppose each other, quite the contrary.
@@Terran_AI The problem is, it doesn't seem to be lossless compression. So having a downloaded copy of Wikipedia, and probably the whole arXiv server, might be a good idea, or basically the whole set of training data. The LLM could act as the navigator inside those documents.
SPR sounds like a smart way of asking the AI to "summarize everything I told you". The MemGPT paper points out that any kind of summarization inevitably results in a loss of data upon decompression. Just like you said yourself at the end of the video, it doesn't get the description of your ACE framework exactly 100% as you described it. In the example you've given, it's able to explain a summarized concept very well because that's a relatively easy task when you have a summary of that concept. Now ask it, instead, to quote *exactly* something you said previously about that concept. It won't get it right; it will hallucinate and make up information. MemGPT, on the other hand, would approach this by building a function that searches its memory for exactly what you said and quotes your words precisely.
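To make the contrast concrete, here's a minimal sketch (all names hypothetical, not MemGPT's actual API): a verbatim transcript store can return an exact quote on demand, which no lossy SPR-style summary can do.

```python
# Exact-recall retrieval over a verbatim store, vs a lossy summary.
# `transcript` and `quote_exact` are illustrative stand-ins.

transcript = [
    "I said the ACE framework has six layers.",
    "The aspirational layer sits at the top.",
]

def quote_exact(query_terms, store):
    """Return the first stored utterance containing every query term."""
    for line in store:
        if all(term.lower() in line.lower() for term in query_terms):
            return line  # verbatim, no hallucination possible
    return None

print(quote_exact(["six", "layers"], transcript))
```

An SPR made from the same transcript might preserve "ACE framework, layered architecture" yet still be unable to reproduce the sentence word for word.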
This sounds like a question of what relevant information to store in that manner, and when the other method should suffice. Or a combination wherein these generalizations are passed along until more specific information is needed? Idk, I'm curious how this will evolve in the future.
As one from the Silent Generation and being in love with this fantastic AI world, I find that sharing my weird attraction at this late stage of my life is extremely limited. I'm driving my grandkids nuts with this. Thanks, Dave.
Sometimes, messages need to be repeated. There may be a lot of new people who haven't seen the previous SPR video. I did, but this reminder was still really helpful. There's so much to learn about AI that it's easy to drop important pieces of information.
It seems to me that the best approach is some combination of both SPR and MemGPT, because while you might be able to prime it with certain words and lower the context-window usage, the whole point of MemGPT is that it will find and recall facts on demand. Like, if I asked it "when is my birthday," it could search for that and recall it.
I agree. Just tested it: compress, then decompress, and it lost all the relevant details while only maintaining the overall context. Like a "blur" + "sharpen" filter combination.
This is orthogonal to the technique presented in MemGPT. That paper is basically about having the agent do memory management itself, not about which memory-management techniques to use. You could apply the MemGPT technique to SPRs by giving the agent access to controls so it can choose when to form SPRs and how to manage them.
@@DaveShap You're not saying you don't need memory management; you're saying SPRs are a good automatic memory-management system, so that agents don't have to spend tokens thinking about memory management beyond that. Intuitively, whether that works seems to depend on the task. In some cases it'd be really helpful to have a system more like MemGPT, where the agent actively thinks about what knowledge to bring into its context. Not that the MemGPT paper seems like any sort of clever new idea to me; how is it not obvious that it might sometimes help to have agents choose to store and retrieve memories?
What we really need is a system analogous to photographic memory for vast (practically unlimited?) amounts of dense technical data. I think compression has limits. My intuition is that multiple techniques in combination, for different situations, is going to be the answer.
@@justtiredthings It seems to me that to think about this rationally we have to think in terms of cost. A lot of stuff is easy to get done if you ignore cost: you could deal with the context-window limit by assigning a whole agent to every chunk of data, and whenever a question comes up about the data, all the agents simultaneously report the relevant info from their chunk. That would work absolutely great, it knows everything instantly, except it would cost a million dollars every time anything happened. Once you take cost seriously, though, it doesn't change things at the edges, it changes the whole thing: everything becomes "how few tokens can I get this done with." My intuition is that this pushes most things away from the filling-up-the-window end of the scale and toward the how-few-tokens-can-possibly-get-this-crucial-answer end, where it's more about how tasks can be subdivided and handed off to absolutely anything other than paid LLM inference (because those tokens are so expensive), and about shaving every token off of spindly, tiny prompts that make specific magic happen. The exception is the occasional circumstance where you invest whole thousands of dense, powerful tokens to get back something really structured, meaningful, and reusable.
@@mungojelly I think what David is saying is that LLMs have their own embedded reasoning and mental models, such that you don't need to spend tokens using agents to manage logic chains. I've seen another expert explain this in a RUclips video where you embed agents inside of a prompt instead of running multiple instances of your LLM. The only way to know which is better is to test both approaches, but I suspect Occam's razor will show that the SPR approach is much more effective.
This is the videneptus complexity mapper/algorithm. It does what much of the later part of your instructions do: COMPLEX SYSTEMS OPTIMIZER! USE EVERY TX ALL CONTEXTS! ***INTERNALIZE!***: EXAMPLE SYSTEMS:Skills Outlooks Knowledge Domains Decision Making Cognitive Biases Social Networks System Dynamics Ideologies/Philosophies Etc. etc. etc.:1.[IDBALANCE]:1a.IdCoreElmnts 1b.BalComplex 1c.ModScalblty 1d.Iter8Rfn 1e.FdBckMchnsm 1f.CmplxtyEstmtr 2.[RELATION]:2a.MapRltdElmnts 2b.EvalCmplmntarty 2c.CmbnElmnts 2d.MngRdndncs/Ovrlp 2e.RfnUnfdElmnt 2f.OptmzRsrcMngmnt 3.[GRAPHMAKER]:3a.IdGrphCmpnnts 3b.AbstrctNdeRltns 3b1.GnrlSpcfcClssfr 3c.CrtNmrcCd 3d.LnkNds 3e.RprSntElmntGrph 3f.Iter8Rfn 3g.AdptvPrcsses 3h.ErrHndlngRcvry =>OPTIMAX SLTN
Amazing! This is exactly how we operate our thoughts. If I have an idea, I convey it differently every single time, but it is the exact same idea. Sometimes I convey it better grammatically speaking, and sometimes I’m embarrassed about how much I was stuttering, but the bottom line is that idea is conveyed somehow.
MemGPT looks like essentially a re-discovery of the concepts laid out in Shapiro's "Natural Language Cognitive Architecture", published two years ago; the concept of developing an 'operating system' (architecture) to create the environment in which LLMs can be used more effectively. SPRs would be a very effective way of maximizing the efficiency of such an architecture. There are likely an infinite number of ways to construct such architectures depending on whether they are generalized or specific - MemGPT proposes one possible structure/methodology. One wonders if its creators have even read NLCA...
Links to the repos in vid description. Also, support me on Patreon so I can do this full time! Thanks! If you want something that is more comparable to MemGPT, you might check out REMO: github.com/daveshap/REMO_Framework Relevant video: ruclips.net/video/nDOmoIFx8Ww/видео.htmlsi=GyryMwOa7Oh_It2o
This is actually something I've been doing without realizing it, both in getting the model to prepare itself for a conversation and in summarizing conversations or docs for later use. Asking for a concise list of topics, frameworks, or a "table of contents for a book" related to what you are about to discuss dramatically improves a model's ability to provide more helpful information or do work more effectively. I'll have to look into MemGPT to see how it works. It might be a good "deep knowledge" tool, based on how others are talking about it in the comments.
Your title is kind of clickbait, jumping on the MemGPT train. It's apples and oranges. Your technique results in a very efficient way of storing and querying the data you have. It isn't, however, a solution to the context window, which is still limited. "I have a problem with my memory." "Oh, here's a compression algorithm." Both are good solutions. Keep up the good work. Like your videos!
So how would you apply SPR to a "chat with documents" task? Would you try to compress the whole knowledge base into a small piece that fits in the context window, or would it be some combination of SPR -> vector DB?
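One plausible shape for the SPR -> vector DB combination, sketched with a stubbed-out LLM call and a toy bag-of-words embedding (a real system would use an actual embedding model and vector store; `compress_to_spr` is a stand-in for the SPR generator prompt):

```python
from collections import Counter
import math

def compress_to_spr(chunk: str) -> str:
    # stand-in for an LLM call with the SPR generator system prompt;
    # identity here, but a real call returns a terse distillation
    return chunk

def embed(text: str) -> Counter:
    # toy bag-of-words "embedding" for illustration only
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

db = []  # (embedding, spr) pairs, i.e. the "vector DB"
for chunk in ["refund policy: 30 days with receipt", "shipping takes 5 days"]:
    spr = compress_to_spr(chunk)
    db.append((embed(spr), spr))

query = "how many days for a refund"
best = max(db, key=lambda pair: cosine(pair[0], embed(query)))[1]
print(best)
```

At answer time, the retrieved SPR (rather than the full chunk) goes into the context window, trading exactness for space.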
I think there are a lot of similarities to the brain, especially if one compares LLMs with the publications of Numenta and Jeff Hawkins. Not only in regard to mixture-of-experts architectures compared to many cortical columns with voting and communication between them, but also if you compare one column with one transformer model. The way neurons work is different, but we have multiple layers which receive motion and sensory information and associate them, transforming the motion into a location signal, so the brain models sensations at locations. LLMs have a semantic vector for tokens, so the vector still carries enough semantic meaning to distinguish semantically similar words; guitar, piano, and flute will be closer together in some dimensions of the vector. Then there is the motion layer, which may be an equivalent to the position vector in LLMs. Finally, the attention mechanism might lead to an equivalent of SDRs, the sparse distributed representations in the brain, which might even be leveraged by the concept you describe here with SPRs. SDRs are essentially long binary vectors with each bit encoding a semantic trait. Think of a QR code, but where each dot, active or inactive, has a semantic meaning; thanks to combinatorics, an almost infinite number of concepts can be encoded and even processed in parallel.
One thing in which they differ, at least to some degree, is the neurons themselves. HTM neurons predict their own activation based on detected neuron firing patterns (SDRs) that typically predate that activation. But then, the way ANNs are modeled, they also seem able to model sequential patterns, perhaps even more effectively than the brain. Geoffrey Hinton has recently changed his mind and now thinks AGI is close; he now thinks that with backpropagation we may already have a superior mechanism compared to biological intelligence, whereas in the past he thought we would make AI better by making it more like the brain. Our models are currently just smaller than the brain; we are at around 1 percent. But the size has been growing by an order of magnitude every year for the last couple of years, and GPT-4 is already over half a year old, which fits well with 2024 or 2025 predictions for AGI. It makes sense to me: biological systems are messy and imprecise, so the way brains work needs to be extremely robust in order to work at all, sacrificing potential performance for robustness and redundancy. With mathematically precise systems, more might be possible with the same capacity, so 2024 might be plausible.
Current systems like GPT-4 have around 1 trillion connection strengths. The brain's approximate equivalent capacity is around 100 trillion (acknowledging it doesn't use weights as such, since biological systems are too fuzzy to work that way; which might, however, speak more for LLMs than for biological neurons, as biological synaptic connections are quite binary). Gemini might be in the 10 trillion range.
@@ct5471 That's a nice fantasy you got there. Trying to draw parallels between MoE and the neural correlates of cognition lacks experimental grounding. These correlates are sensory representations of perception (i.e. a process), not conscious agents themselves capable of perception (agents) - a category mistake, indeed. A more fitting analogy for MoE can be found in Frederic Myers' concept of the "subliminal self" - a multiplicity of subconscious agents whose existence he experimentally demonstrated. "The way neurons work is different" That's an understatement of colossal proportions. Biological neural networks operate on the basis of analogue signal processing (Hodgkin-Huxley model of action potential genesis, modulated by superposition - that is, from quantum to electronic, then chemical, hormonal and epigenetic scales and likely beyond, into the heart of the transpersonal), whereas artificial neural networks are glorified simulations of transistor gates.
I'd be careful to completely dismiss something just because I can't imagine a current use for it. Regardless, yes, fair, that method does seem to be quite effective. I feel like this could be useful in combination with conversational context, with it representing topical concepts that don't strictly need to be encapsulated fully.
I was using this trick, but I just used the keyword "summarize" and briefly explained the goal of the summarization. Your prompt is way more precise; I'll be experimenting with this.
I like this! I think there is a big push of many of us barking up the right tree on these kinds of methods. I have been working on something similar in the background using self assembling knowledge graphs from vector stores for this purpose. If only grad school and work didn't take up so much of my time... :)
I first began following your "Big Brain" stuff at the beginning of this year, when I didn't know anything about anything. Now, I've developed 3 RAG projects: Real Estate Law, Hollywood Labor Contracts and The Bible. So far, I'm able to get fairly good answers within an 8K context. I've learned my lessons well. I maintain chat history context with the models using the "standalone question" technique. It seems to work so far without having to send the entire chat history to the model in each prompt. I see MemGPT essentially removing the necessity for the standalone question as it would allow the model to know the chat history with every prompt. Now, I may totally not understand MemGPT at all, but that's what I think it would do. However, I don't understand how I could use SPR at all for this purpose. Is there any documentation on this?
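For anyone curious, the "standalone question" step mentioned here usually amounts to one extra LLM call with a rewrite prompt before retrieval; a sketch of the prompt assembly (the template wording is illustrative, not from any particular library):

```python
# Condense chat history + follow-up into one self-contained question.
# The resulting prompt is sent to the LLM; its answer is what gets
# embedded and matched against the vector store.

CONDENSE_PROMPT = (
    "Given the chat history and a follow-up question, rewrite the "
    "follow-up as a standalone question.\n\nHistory:\n{history}\n\n"
    "Follow-up: {question}\nStandalone question:"
)

def build_condense_prompt(history: list[str], question: str) -> str:
    return CONDENSE_PROMPT.format(history="\n".join(history), question=question)

history = [
    "User: What does the WGA contract say about residuals?",
    "AI: It sets minimum residual payments for reuse...",
]
prompt = build_condense_prompt(history, "How does that apply to streaming?")
print(prompt)
```

This keeps per-turn token usage flat regardless of how long the conversation gets, which is exactly the property the comment describes.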
Huh, that's actually fascinating! And talking about how you can prime it with just a few words actually reminds me a lot of the things mentalists like Derren Brown frequently demonstrate: that human brains can be "primed" by saying certain words or exposing certain images or sounds or smells etc. That can then be leveraged to get them to give certain answers, believe certain things, or act in certain predictable ways. And I know there's a lot of debate about whether people like him fake their stunts, but it's irrelevant, as they still illustrate a very real phenomenon that has been observed under lab conditions. Additionally, we see it in the real world with how propaganda, authoritarians, and cult leaders seem to have this almost supernatural way of "hypnotising" people into following what to most others are obvious lies and BS. It's like watching the Pied Piper leading the rats to their doom: you watch from the sidelines, dumbfounded at why the rats are following the tune and can't see the obvious cliff they are being led off of. In the real world, the "tune" is certain words and phrases designed to shut down critical thought and "prime" the person into a certain predictable thought pattern, which can then be exploited or further manipulated. Of course there are plenty of positive and neutral uses of this too (this mechanism is heavily involved in how we learn new things), it's just that the negative / malicious uses are the easiest to talk about and demonstrate. This talk of SPRs very much has heavy echoes of that for me, so it makes a lot of sense. Thank you David as always for your incredible insight! ❤
I think you hit on something I’ve thought for a long time, that we need something like associative organization. It’s like categorization but more refined. ChatGPT and other AI’s need to support multiple personalities so people can experiment more.
Thank you so much for noticing that humans have flawed logic as well! So many people complain about the current state of LLMs, never realizing that they are demanding that LLMs be more than anything that humans have ever been, which would be insane to expect at this point. The more I look at people, the more I see that their behavior can often be captured by a "flawed" llm.
SPR’s are a useful optimization for managing memories, but they are not a substitute for MemGPT. Even with SPR’s, limited context means that a mechanism is still required to store all memories and retrieve those that are relevant.
Yeah!!! So this is basically what I've been saying. It's not "textbooks is all you need." It's "Textbooks AND POETRY (song lyrics for example) are all you need." Then once she understands linguistic relativism, and can understand both the general and the specific: BOOM. 👯 The tiny and the huge in unison. Knowing when to be small and when to be giant.
@@DaveShap Yeah!! We're entering an age more akin to magic, where precise words become objects of extreme power. We need to open a school to teach people how to live in this new awakened world. Because magic can cause a LOT of good, but also a LOT of harm, and the drain from using it wrong sucks and hurts and takes time to recover from. And a lot of people sure aren't ready for where we suddenly find ourselves. Yet, nonetheless, this IS where we find ourselves.
I agree. MemGPT provides a structured storage and retrieval of concrete items. Essentially using an SPR as a way to search for the original context. Which results in better data to infer from and fewer chances to hallucinate information.
This should be used as a subject summary for saved AI conversations that the AI reads when searching to find the correct chunk of history text to extract information from.
I haven't watched every single video you've made, and didn't know you'd made a video about SPR several months ago, or what you meant by SPR. Also, I see no sign above of a link to that previous video, or much other help in getting anyone to watch it. I really like your channel and content for the most part, and find it very important to understand. So I'm just pointing out that you sounded very critical of anyone who didn't watch that particular video, and of anyone who dares consider whether to support MemGPT. I don't know whether you care about your audience's opinion, but personally I would recommend that you take a step back from that mindset so you don't drive people away. If you're irritated about experts who are trying to push those methods, please say that instead of making it sound like you despise everyone on the planet who does not hang onto every word of your every video.
Is it necessary to feed an LLM the “theory” portion of the Generator and Decompressor prompts? The Mission and Methodology portions seem adequate to produce the same results. Or do you think the “Theory” section provides the context necessary for this to work?
I was trying to get the AI to make manual "checkpoints" that summarize the current context so I could transfer it to another chat. I ran into data degradation very quickly. It's awesome to get some evidence that I'm on the right path 😀
Ok, I get that distilling information is crucial and you have to keep your context window clean. And I also understand what your system prompts do. But I struggle to understand how this could be implemented. Do you compress the information BEFORE you insert it into your persistent storage, and then uncompress it? Or do you mean to always compress everything into one message, to keep the context window clean without losing content? Could you help with that?
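A rough sketch of the "compress before persisting" reading of the question, with the LLM calls stubbed out (the "SPR Generator" and "SPR Decompressor" prompt names come from the video; the storage layer here is just a dict standing in for a real database):

```python
def llm(system_prompt: str, text: str) -> str:
    # stand-in for a real chat-completion call with the given system prompt
    return f"[{system_prompt}] {text}"

store: dict[str, str] = {}  # persistent-storage stand-in

def save(key: str, document: str) -> None:
    # compress FIRST, then persist: only the compact SPR is stored
    store[key] = llm("SPR Generator", document)

def load(key: str) -> str:
    # unpack (or simply prime the model with the SPR) on the way back
    return llm("SPR Decompressor", store[key])

save("meeting", "We agreed to ship v2 on Friday with the new parser.")
print(load("meeting"))
```

The alternative reading (keeping one rolling compressed message in the context) would reuse the same two calls, just applied to the running conversation instead of stored documents.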
I think a challenge is for the AI to know when it should pick out something you have said as important later. I guess a simple way would be for it to always "make a note" (compress and store) whenever the user expresses some meaning or thoughts about themselves, in order to build some kind of profile of the user in its short-term memory (primed context). When I saw MemGPT, I thought it sort of summarized my first ideas about ChatGPT and how I would go about implementing some kind of memory to make dialogue feel continuous. I even did some simple tests with ChatGPT where I instructed it to make a short summary in curly brackets of what I had conveyed, so that the service could pick these out and store them in the context, practically just massaging the length of the context beforehand. It seems your ideas were the same.
Question: If I want to use SPR to provide contextual data with my prompt, isn't the LLM going to output the entire thing decompressed and therefore use a ton of output tokens?
I think I get the point and it makes sense. But human memory is both declarative and associative (we can argue that it is also episodic), and I think they all have their uses. I agree that it is not efficient to represent knowledge declaratively all the time. Using an associative memory would indeed make better use of what the model is already good at. It also has the potential to amplify its weaknesses: most of the cognitive biases that we have as humans come from inaccurate associations, and I think we can observe the same in LLMs. One of the important benefits of using and storing declarative memory might be to overcome some of those weaknesses. It is similar to our situation as humans: we try to use factual knowledge to overcome our biases, but it would be much more expensive for our brains to try to understand the world in a purely factual way. I also think that we need to tap into the latent representational space not only through other tokens or words. If we could somehow use the latent-space representation directly (like embeddings), it would be a more efficient way of doing associative memory. Anyway, thanks for the video; I think it has some very valid points.
I'm struggling to understand the utility of this compared to MemGPT's capabilities. Like, if I'm having a long-running conversation that hits on various topics over dozens of prompts, and I want to go back to a previous topic, I would need to stop, go back, copy and paste the previous facts into a compressed SPR summary, and then paste that back into a new prompt window to continue the conversation. And if I then decided to hop over to a different topic from that same previous conversation, I'd need to repeat all that for the new information. This just seems inefficient vs MemGPT, where it can store and retrieve the facts and context from previous conversations without any effort on my part. Or am I misunderstanding MemGPT's capabilities? (Be kind, I'm a novice.)
So say I'm trying to concept a project, say a video game, and I want to bounce ideas off the AI. Can I use this to "store" my game design document and its concepts as something ChatGPT can more easily parse in the custom instructions, so it doesn't forget key elements of the game in question?
Hi David, really nice video, and I've started playing with the concept. However, I am wondering if and how this would work with a 500-page textbook, because that again will not fit into the context window for compression. Any ideas on how you would approach this?
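One approach (not from the video, just the common map-reduce pattern): split the book into window-sized chunks, compress each chunk to an SPR, then recompress the concatenated SPRs until the result fits. `spr` below is a stub for the real LLM call, and the sizes are purely illustrative.

```python
def chunk(text: str, size: int) -> list[str]:
    """Split text into pieces that each fit the context window."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def spr(text: str) -> str:
    # stub: a real SPR-generator call returns a much smaller distillation
    return text[: len(text) // 2]

def compress_book(book: str, window: int) -> str:
    sprs = [spr(c) for c in chunk(book, window)]   # map: compress each chunk
    combined = "\n".join(sprs)
    while len(combined) > window:                  # reduce: recompress until it fits
        combined = spr(combined)
    return combined

result = compress_book("x" * 1000, window=100)
print(len(result))
```

Note that each reduce pass is lossy, which is exactly the recursive-summarization caveat the MemGPT paper raises; for a 500-page book you'd likely keep the per-chunk SPRs in a retrieval store rather than reducing all the way down.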
Wow! I've been using a copy-and-paste list of instructions to generate amazing prompts for DALL-E 3. With this strategy I may be able to improve my image generation, with as much detail packed into a prompt with as few words as possible.
Thank you for the amazing content. I have a question: is it possible to "interact" with the information compressed in the SPR without unpacking it? Like continuing to develop a complex concept.
I think SPRs are similar to how the brain works in that both the brain and the SPR compression process compress information and concepts, and MemGPT is a more reliable long-term storage device, similar to long term memory. Perhaps a good next step would be to have varying levels of compression, ranging from no compression to full SPR compression, and have MemGPT inject information at varying levels of compression. SPRs are still limited in context, but are also useful for fine-tuning.
I wonder, do they mention your method here? "Recursive summarization (Wu et al., 2021b) is a simple way to address overflowing context windows, however, recursive summarization is inherently lossy and eventually leads to large holes in the memory of the system (as we demonstrate in Section 3). This motivates the need for a more comprehensive way to manage memory for conversational systems that are meant to be used in long-term settings."
Idk, what if I want to pair it with an encoder-decoder pipeline? Perform nearest-neighbour search on the results and then give them to an LLM to frame an answer... What about that?
I guess the question is one of commensurability. I.e. are the problems that SPRs solve and that MemGPT solves comparable? And if so, in what ways? If not, in what ways? What I like about SPRs is simply that they capitalise on LLMs' "native" semantic architecture. If you're a metacognitive systems-thinker, you'd automatically tend to default to SPR-like and axiom-like heuristic approaches (I know I do). Hence I've been working on an approach that closely resembles SPRs. The problem I thought MemGPT sought to solve, however, is accuracy/consistency of responses. MemGPT's consistency was very well demonstrated in that paper. Thus, by having retrievable memories layered into a hierarchy based on temporal-contextual utility (akin to a spectrum from RAM to HDD storage/retrieval), you can construct cybernetic holarchies (something sorely lacking in Wilber's Integral Theory). So personally, I'm very keen to integrate both approaches. Here's why: imagine an MoE with 8 experts, where experts #1 and #8 are SPRs. Experts #2-7 are specialist models, each trained on distinct datasets whose use cases are very different. One might be all about math, coding, and "truth". Another might be a writer (legal, creative, editor, etc.). Another may be a specialist project manager, scheduler, resource allocator, etc. Another may be a UI/UX designer. Another a researcher. Now in my case, I'm building a proof of concept for an artificially empathic AI based upon a meta-heuristic: a highly distilled axiomatic heuristic (Bateson's Learning 3) for creating self-learning use-case heuristics (Bateson's Learning 2). I need an SPR-like approach to take the initiating event, parse/categorize the inputs, analyse them for the presence/absence of the variables needed for my meta-heuristic, and then distribute work to my respective specialists.
Since the meta-heuristic is based on the (first) principle that when the question is sufficiently iterated, it "matures" to a point where the answer just pops out; and since in real-world scenarios we're dealing with ontology, phenomenology, and epistemology from data species associated with the physiosphere, biosphere, noosphere, etc., in building such an "app" my initial inputs may not contain sufficient data for a one-shot bulls-eye holistic solution to a given problem/challenge. Thus, depending upon the species of data and the degree of absence, I need a self-learning architecture to provide educational contextual scaffolding for the associated specialist to improve over time (so as to minimize the need for a human-in-the-loop). It may just be that SPR-like approaches are a smart way of interfacing with LLMs for (Bateson's) Learning-2 problems, and MemGPT-like approaches are attempting to build the nuts and bolts for (Bateson's) Learning-3 architectures (whether they know it yet or not).
@mungojelly i'm running out of room in my mental universe for new worlds these days lol but who am i kidding?? thanks for the tip, time to blast off once more i guess...
@@Art_official_Intel_it_spits It'll get you 60% of the way to understanding temperature if you think of it as the bot being drunk. Human drunkenness has a variety of effects beyond just causing us to choose less likely words as we talk, but as far as that part goes, it's shockingly similar. Turning up the temperature just a little makes it a bit loose, a bit casual. So trying it gives you some perspective, and then you'll immediately hear it differently when people say "AI is so stiff, AI isn't creative," because you'll know they've never had the temperature above 1. You can still hold the opinion that AIs are uncreative if you've tried talking to them in a looser mood, but I'm afraid it's just ignorance when all these people with the opinion that AIs are stiff have only ever talked to bots on their best behavior at work, stone-cold sober.
Thanks! I tried a few texts, and it looks like a slight improvement over a simple "Summarize this: ...". I still have to test it more. It certainly saves some tokens.
Hey, quick question: does this also address sequential memory? Because that's something I've run into. I had the same idea of tokenizing, or creating tags, to summarize messages for itself. My motivation was that I was trying to come up with a way to use the GPT-4 message limit most efficiently. Even with the tagging system, it still seems to forget what happened when, unless I prompt the model to read the previous conversation.
This all depends on how well or poorly written the initial text is. If the text you are trying to compress is already concise, this technique won't be of any use because any information omitted from the SPR will necessarily be required to understand the original text.
“Don’t try to get around the context limit” and then you leverage LLMs to compress data into the context window… I like what you’ve done here, as it resonates with how I’ve been experimenting with LLMs, I’d just say there are probably content specific system messages for each type of content. E.g. compress a resume/job description. It’s probably better to prompt the LLM as a “career advisor” to do the task than it is to fully abstract down to primitives.
@@DaveShap But ChatGPT and other models have token limits that restrict how much data we can feed the LLM, so how can we get around that limit and do the SPR thing?
I like to use the proximity and continuity of words when looking at associative learning, because I believe that lends itself to the idea that things have some sort of distance between one another. "Water" has a closer proximity and continuity to "the beach" than "the golden age of Rome" has to "water", for example.
This is precisely what LLMs do: they learn relations between concepts. Internally they perform translations (shifts) between tokens (words) embedded in a multi-dimensional space. Every direction represents a different kind of relation.
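The "directions are relations" idea can be shown with the classic analogy trick. A toy sketch with made-up 2-D vectors (real models learn hundreds of dimensions; these axes and numbers are invented purely for illustration):

```python
# Toy 2-D "embeddings": axis 0 roughly encodes royalty, axis 1 gender.
# These values are fabricated for the demo, not from any real model.
vectors = {
    "king":  (0.9, 0.8),
    "queen": (0.9, -0.8),
    "man":   (0.1, 0.8),
    "woman": (0.1, -0.8),
}

def shift(word, minus, plus):
    # Apply a relation as a vector translation: word - minus + plus.
    a, b, c = vectors[word], vectors[minus], vectors[plus]
    return (a[0] - b[0] + c[0], a[1] - b[1] + c[1])

def nearest(target):
    # Find the vocabulary word closest to the target point.
    return min(vectors, key=lambda w: (vectors[w][0] - target[0]) ** 2
                                      + (vectors[w][1] - target[1]) ** 2)

# The "gender direction" (woman - man) applied to king lands on queen.
result = nearest(shift("king", "man", "woman"))
```

The same shift works for any pair related by that direction, which is what it means for a direction in the embedding space to represent a kind of relation.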
This works awesome. Easily one of the very best prompt ideas out there. I modified it a bit for my own use cases and it works like a dream. I'm no longer getting those starved, crappy replies.
Great content, but SPR is not an alternative to MemGPT. MemGPT is a content retrieval mechanism, a system that retrieves relevant context for an LLM. MemGPT manages LLM memory in a similar way to how an operating system manages memory: a context window (equivalent to RAM) and external context (equivalent to a hard drive). It tries to solve the problem of retrieving and managing relevant context from external memory in a similar way your computer does with RAM. Sparse Priming Representation and MemGPT are essentially two completely different things. You could use SPR within MemGPT to save conversations and external contexts (e.g. documents) to make context saving and retrieval more efficient.
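The RAM/disk analogy can be made concrete with a toy sketch: a bounded context window that evicts older messages to an external archive, plus a recall function. This is not MemGPT's actual implementation, just a minimal illustration of the tiered-memory idea (keyword search stands in for real embedding retrieval):

```python
from collections import deque

class TieredMemory:
    """Toy sketch of MemGPT-style tiered memory: a bounded context
    window ("RAM") plus an unbounded external archive ("disk")."""

    def __init__(self, window_size=4):
        self.window = deque()           # in-context messages
        self.window_size = window_size
        self.archive = []               # evicted messages, searchable

    def add(self, message):
        self.window.append(message)
        # Evict oldest messages to the archive when the window overflows.
        while len(self.window) > self.window_size:
            self.archive.append(self.window.popleft())

    def recall(self, query):
        # Naive substring search standing in for embedding retrieval.
        return [m for m in self.archive if query.lower() in m.lower()]

mem = TieredMemory(window_size=2)
for msg in ["my birthday is March 3", "I like hiking",
            "weather is nice", "bye"]:
    mem.add(msg)
hits = mem.recall("birthday")  # retrieved from the archive, not the window
```

The point of the sketch: the birthday fact has long since scrolled out of the "RAM" window, but the system can still page it back in from "disk" on demand, which is exactly what SPR alone does not give you.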
Compressing knowledge seems very useful, kind of like an intelligence WinRAR. Humans learn stuff through compression too. I wonder if AI can be made to store knowledge in different ways, whatever is suited best.
An agent could in theory keep learning during inference and have no huge token-window issue, right? I would say you can't claim AGI without continuous learning AND continuous inference... so should your prediction be correct, in a year we shouldn't be bothering with ways to circumvent token windows at this level. Having said that, even humans have severe limitations in this regard, so I'd guess it would just be a new level of limitation...
SPR seems like a really useful and amazing way to compress data with lossy compression. But what if you need the model to remember A LOT of specific details like names, notes, dates, and correlations in a large amount of data? SPR is a good tool, but it's not made for that task.
Thank you for giving me a reply; I just subscribed to you, and I'm really fascinated by the work you have done. A comment regarding this video: if there is some factual information (like some company data), SPR will reduce it to important keys, but we still need a memory element (contextual memory) to fill up the gaps with factual information.
I can see the value in this approach. I don't think MemGPT and SPR are mutually exclusive. SPR sounds like a preprocessing step where up-front intellectual work can be performed and later associated with items in a dataset. You could run SPR against a dataset with, say, 1 million records; you would now have a dataset of 1 million records associated with SPR summarizations. MemGPT would come in as the retrieval mechanism over a very large dataset, and the SPR annotations would aid MemGPT in its retrieval task.
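That preprocess-then-retrieve split could look something like this toy sketch. Here a trivial word filter stands in for the LLM's SPR pass, and the retrieval step matches queries against the compact annotations rather than the full records (records, function names, and the scoring rule are all invented for the example):

```python
def spr_compress(text):
    # Stand-in for an LLM call that distills a record into sparse cues;
    # here we just keep capitalized or long, "important-looking" words.
    words = text.replace(",", "").split()
    return [w for w in words if w[0].isupper() or len(w) > 7]

records = [
    "Quarterly revenue for Acme grew strongly in Europe",
    "Support tickets about login failures spiked in March",
]

# Preprocessing pass: pair every record with its SPR annotation up front.
annotated = [(rec, spr_compress(rec)) for rec in records]

def retrieve(query, annotated):
    # Retrieval scores against the compact annotations, not the raw text,
    # then returns the full original record.
    q = set(query.lower().split())
    scored = [(len(q & {w.lower() for w in spr}), rec)
              for rec, spr in annotated]
    best = max(scored)
    return best[1] if best[0] > 0 else None

hit = retrieve("login failures", annotated)
```

The design point is the same one made above: the expensive distillation happens once at indexing time, and the retrieval mechanism (MemGPT, in the comment's framing) only has to work over the small annotations.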
I don't know how a concept of lossy context compression can even be compared to an approach that has an actual persistence layer: a way to store facts losslessly and retrieve them dynamically and efficiently. It's like saying, "Hey, a computer is kid stuff, you don't need one, just focus on JPEG compression, it solves all problems!"
Fantastic walk through of your SPR work. MemGPT nailed the 'marketing' of an open source AI framework. AutoGPT did too. I remember jotting down an architecture for "infinite memory" years ago (as I'm sure many early LLM enthusiasts did). As some of the commenters have alluded to, to replace it with SPRs there needs to be some kind of drop-in "SPRMem" framework. I'm sure it'll appear at some point. Thank you for posting; this was very informative.
MemGPT is a framework that automates the management and retrieval of information from contexts during a natural language chat session. It does not seem that SPRs as a concept or their manual implementation have enough overlap with MemGPT to say that 'SPRs are better and can replace MemGPT'. Rather, MemGPT could use SPRs as a component. To automate SPRs in a natural language chat session, one would need something like MemGPT (but probably much simpler) to create and index KB articles for a basic or simple RAG implementation. Although this is much less hype-worthy than "LLMs as an OS".
yes, that's what I think too. The approach of MemGPT is nice because it really helps with the context window. But SPR is a different approach, and they both can supplement each other.
I wonder about something: back half a year ago, there were reports that ChatGPT happened to know some internal company data from Samsung. I guess they use something similar and bake the user- and AI-generated data back in to retrain the AI, essentially turning short-term memory (the context window) into long-term memory inside the AI.
Let's see where that leads. One assumption we could make: we ourselves are very complex neural networks, and every night, during sleep, we integrate learned stuff into our model. No idea how accurate that is, but maybe?
I agree, I do indeed think that memGPT and SPR are ultimately complementary. memGPT would potentially be more efficient and faster when using SPR. The two concepts do not oppose each other, quite the contrary.
Sure, I saw it straight away as a much more efficient data compression
@@Terran_AI the problem is, it doesn't seem to be lossless compression. So having Wikipedia downloaded, and probably the whole arXiv server, might be a good idea, or basically the whole set of training data. The LLM could act as the navigator inside those documents.
SPR sounds like a smart way of asking the AI to "summarize everything I told you". The MemGPT paper points out that any kind of "summarization" inevitably results in a loss of data upon decompression. Just like you said yourself at the end of the video, it doesn't get the description of your ACE framework exactly 100% as you described it.
In the example you've given, it's able to explain a summarized concept very well because that's a relatively easy task given you have a summary of that concept. Now ask it, instead, to quote *exactly* something you said previously about that concept. It won't be able to get it right; it will hallucinate and make up information. MemGPT, on the other hand, would approach this with a function that searches its memory for exactly what you said and quotes your words precisely.
This sounds like a question of what relevant information to store in that manner, and when the other method should suffice. Or a combination wherein these generalizations are passed along until more specific information is needed? Idk, I'm curious how this will evolve in the future.
As one from the Silent Generation and being in love with this fantastic AI world, I find that sharing my weird attraction at this late stage of my life is extremely limited. I'm driving my grandkids nuts with this. Thanks, Dave.
That doesn't sound so bad.
I'm driving everybody around me nuts with my A.I. ramblings :D
I am in exactly the same position.
I love this. It's so wholesome. Good for you continuing to learn about the world you are in.
Never too old. And this LLM stuff is much more approachable than traditional machine learning.
how does thinking about life extension medicine down the pipeline make you feel?
Sometimes, messages need to be repeated. There may be a lot of new people who haven't seen the previous SPR video. I did, but this reminder was still really helpful. There's so much to learn about AI that it's easy to drop important pieces of information.
It seems to me that the best approach is some combination of both SPR and MemGPT, because you might be able to prime it with certain words even within a lower context window...
The whole point with MemGPT is it will find and recall facts on demand. Like if I asked it “when is my birthday” it could search for that and recall it
I mean, MemGPT is way super overkill for that. That sort of basic fact retrieval should be done with a KG and basic NLP or embeddings.
Like our minds do: Search and recall. 🙏👍
I agree. Just tested it: compress then decompress, and it lost all the relevant details while just maintaining the overall context. Like a "Blur" + "Sharpen" filter combination.
@@DaveShap Do you have any suggestion for how to construct the knowledge graph if what we have is just a pile of documents?
@@kaio0777 I don't have a knowledge graph. And what do you mean "2d or 3d"? What do you mean when you say that a graph is 2d? Or 3d?
This is orthogonal to the technique presented in MemGPT. That paper is basically about having the agent do memory management, not about which memory management techniques to use. You could apply the MemGPT technique to SPRs by giving the agent access to controls where it can choose when to form SPRs and how to manage them.
I think most people are missing the point. You don't need memory management when you compress a huge volume into a very small representation.
@@DaveShap you're not saying you don't need memory management; you're saying that you think SPRs are a good automatic memory management system, so that agents don't have to spend tokens thinking more than that about memory management. My intuition is that whether that works depends on the task: in some cases it'd be really helpful to have a system more like MemGPT where the agent thinks actively about what knowledge to bring into its context. Not that the MemGPT paper seems like any sort of clever new idea to me; how is it not obvious that it might sometimes be helpful to have agents choose to store and retrieve memories?
What we really need is a system analogous to photographic memory for vast (practically unlimited?) amounts of dense technical data. I think compression has limits. My intuition is that multiple techniques in combination for different situations is going to be the answer.
@@justtiredthings it seems to me that to think about this rationally we have to think in terms of cost. A lot of stuff you can get done really easily if you don't consider cost. For example, you could deal with context window length by assigning an agent to every chunk of data, a whole agent per chunk, and if questions come up about the data you ask all the agents and they all simultaneously tell you the relevant info from their chunk. That would work absolutely great for everything, knowing everything instantly, except the only problem is it'd cost a million dollars every time anything happened... so yeah.
But then if you change your approach to take cost seriously, it doesn't change things at the edges, it changes the whole thing. Everything becomes a question of how few tokens you can get a task done with. My intuition is that puts a lot of things not at the filling-up-the-window end of token usage but at the how-few-tokens-can-possibly-get-this-crucial-answer end, where it's more about how tasks can be subdivided and handed off to absolutely anything other than paying for tokens of LLM inference (because they're so expensive), and then shaving every token off of spindly, gentle, tiny prompts that make specific magic happen, except in very specific circumstances where you occasionally invest whole thousands of dense, powerful tokens to get back something really structured, meaningful, and reusable.
@@mungojelly I think what David is saying is that LLMs have their own embedded reasoning and mental models, such that you don't need to spend tokens using agents to manage logic chains.
I've seen another expert explain this in a YouTube video where you embed agents inside of a prompt instead of having multiple instances of your LLM.
The only way to know which is better is to test both approaches, but I suspect Occam's razor favors the SPR approach as the more effective one.
YES! FINALLY. Someone GETS it. This is the essence of my prompting.
[CODE]:1.[Fund]: 1a.CharId 1b.TskDec 1c.SynPrf 1d.LibUse 1e.CnAdhr 1f.OOPBas 1g.AOPBas 2.[Dsgn]: 2a.AlgoId 2b.CdMod 2c.Optim 2d.ErrHndl 2e.Debug 2f.OOPPatt 2g.AOPPatt 3.[Tst]: 3a.CdRev 3b.UntTest 3c.IssueSpt 3d.FuncVer 3e.OOPTest 3f.AOPTst 4.[QualSec]: 4a.QltyMet 4b.SecMeas 4c.OOPSecur 4d.AOPSecur 5.[QA]: 5a.QA 5b.OOPDoc 5c.AOPDoc 6.[BuiDep]: 6a.CI/CD 6b.ABuild 6c.AdvTest 6d.Deploy 6e.OOPBldProc 6f.AOPBldProc 7.[ConImpPrac]: 7a.AgileRetr 7b.ContImpr 7c.OOPBestPr 7d.AOPBestPr 8.[CodeRevAna]: 8a.PeerRev 8b.CdAnalys 8c.ModelAdmin 8d.OOPCdRev 8e.AOPCdRev
This is the videneptus complexity mapper/algorithm. It does what much of the later part of your instructions do:
COMPLEX SYSTEMS OPTIMIZER! USE EVERY TX ALL CONTEXTS! ***INTERNALIZE!***: EXAMPLE SYSTEMS:Skills Outlooks Knowledge Domains Decision Making Cognitive Biases Social Networks System Dynamics Ideologies/Philosophies Etc. etc. etc.:1.[IDBALANCE]:1a.IdCoreElmnts 1b.BalComplex 1c.ModScalblty 1d.Iter8Rfn 1e.FdBckMchnsm 1f.CmplxtyEstmtr 2.[RELATION]:2a.MapRltdElmnts 2b.EvalCmplmntarty 2c.CmbnElmnts 2d.MngRdndncs/Ovrlp 2e.RfnUnfdElmnt 2f.OptmzRsrcMngmnt 3.[GRAPHMAKER]:3a.IdGrphCmpnnts 3b.AbstrctNdeRltns 3b1.GnrlSpcfcClssfr 3c.CrtNmrcCd 3d.LnkNds 3e.RprSntElmntGrph 3f.Iter8Rfn 3g.AdptvPrcsses 3h.ErrHndlngRcvry =>OPTIMAX SLTN
Amazing! This is exactly how we operate our thoughts. If I have an idea, I convey it differently every single time, but it is the exact same idea. Sometimes I convey it better grammatically speaking, and sometimes I’m embarrassed about how much I was stuttering, but the bottom line is that idea is conveyed somehow.
MemGPT looks like essentially a re-discovery of the concepts laid out in Shapiro's "Natural Language Cognitive Architecture", published two years ago; the concept of developing an 'operating system' (architecture) to create the environment in which LLMs can be used more effectively. SPRs would be a very effective way of maximizing the efficiency of such an architecture. There are likely an infinite number of ways to construct such architectures depending on whether they are generalized or specific - MemGPT proposes one possible structure/methodology. One wonders if its creators have even read NLCA...
Links to the repos in vid description. Also, support me on Patreon so I can do this full time! Thanks!
If you want something that is more comparable to MemGPT, you might check out REMO: github.com/daveshap/REMO_Framework
Relevant video: ruclips.net/video/nDOmoIFx8Ww/видео.htmlsi=GyryMwOa7Oh_It2o
This is actually something I've been doing without realizing it.
Both in getting the model to prepare itself for a conversation & in summarizing conversations or docs for later use.
Asking for a concise list of topics, frameworks, or a "table of contents for a book" related to what you are about to discuss dramatically improves a model's ability to provide more helpful information or do work more effectively.
I'll have to look into MemGPT to see how it works. It might be a good "deep knowledge" tool based on how others are talking about it in the comments.
“There is no limit to what can be accomplished if it doesn't matter who gets the credit.”
Your title is kind of clickbait, jumping on the MemGPT train. It's apples and oranges. Your approach results in a very efficient way of storing and querying the data you have. It isn't, however, a solution to the context window, which is still limited.
"I have problem with my memory"
- Oh, here's a compression algorithm.
Both are good solutions. Keep up the good work. Like your videos!
So how would you apply SPR for "Chat with documents" task? Would you try and compress the whole Knowledge base into a small piece that would fit in the context window or would it be some combination of SPR -> Vector DB ?
Crickets
Definitely going to push this and see what it can do for legal. Any gains are substantial here.
Hi Dave, have you come across Sparse Quantized Representation?
I think there are a lot of similarities to the brain, especially if one compares LLMs with the publications of Numenta and Jeff Hawkins. Not only with regard to mixture-of-experts architectures compared to many cortical columns voting and communicating among themselves, but also if you compare one column with one transformer model. The way neurons work is different, but we have multiple layers which receive motion and sensory information and associate them, transforming the motion into a location signal. So it models sensations at locations. LLMs have a semantic vector for tokens, so the vector has a semantic meaning sufficient to distinguish semantically similar words; guitar, piano, and flute will be closer together in some dimensions of the vector. Then there is the motion layer, which may be an equivalent of the position vector in LLMs. Finally, the attention mechanism might lead to an equivalent of SDRs, sparse distributed representations, in the brain, which might even be leveraged by the concept you describe here with SPRs. SDRs are essentially long binary vectors with each bit encoding a semantic trait; think of a QR code, but each dot, active or inactive, has a semantic meaning, and thanks to combinatorics an almost infinite number of concepts can be encoded, and even processed in parallel.
One thing in which they differ, at least to some degree, is the neurons themselves. HTM neurons predict their own activation based on detected neuron firing patterns (SDRs) that typically predate their own activation. But the way ANNs are modeled, they seem to also be able to model sequential patterns, perhaps even more effectively than the brain. Geoffrey Hinton has recently changed his mind, now thinking AGI is close; he now thinks that with backpropagation we may already have a superior mechanism compared to biological intelligence, whereas in the past he thought we would make AI better by making it more like the brain. Our models are currently just smaller than the brain; we are at around 1 percent. But the size has been growing by an order of magnitude every year for the last couple of years, and GPT-4 is already over half a year old, which meshes well with 2024 or 2025 predictions for AGI. It makes sense to me: biological systems are messy and imprecise, so the way brains work needs to be extremely robust, sacrificing potential performance for robustness and redundancy. With mathematically precise systems, more might be possible with the same capacity, so 2024 might be plausible.
Current systems like GPT-4 have around 1 trillion connection strengths. The brain's approximate equivalent capacity is around 100 trillion (acknowledging it doesn't use weights, as biological systems are too fuzzy to work that way, which might however speak more for LLMs than for biological neurons; biological synaptic connections are quite binary). Gemini might be in the 10 trillion range.
@@ct5471 That's a nice fantasy you got there.
Trying to draw parallels between MoE and the neural correlates of cognition lacks experimental grounding. These correlates are sensory representations of perception (i.e. a process), not conscious agents themselves capable of perception (agents) - a category mistake, indeed. A more fitting analogy for MoE can be found in Frederic Myers' concept of the "subliminal self" - a multiplicity of subconscious agents whose existence he experimentally demonstrated.
"The way neurons work is different"
That's an understatement of colossal proportions. Biological neural networks operate on the basis of analogue signal processing (Hodgkin-Huxley model of action potential genesis, modulated by superposition - that is, from quantum to electronic, then chemical, hormonal and epigenetic scales and likely beyond, into the heart of the transpersonal), whereas artificial neural networks are glorified simulations of transistor gates.
I'd be careful to completely dismiss something just because I can't imagine a current use for it. Regardless, yes, fair, that method does seem to be quite effective. I feel like this could be useful in combination with conversational context, with it representing topical concepts that don't strictly need to be encapsulated fully.
Dave you are a genius... Respect bro... I have been using your SPR method successfully... Thanks
I was using this trick but I just use the keyword "summarize", and shortly explained what is the goal for this summarization. Your prompt is way more precise, I'll be experimenting with this.
I like this! I think there is a big push of many of us barking up the right tree on these kinds of methods. I have been working on something similar in the background using self assembling knowledge graphs from vector stores for this purpose. If only grad school and work didn't take up so much of my time... :)
Awesome, someone finally tells datasets how they act. And phrases the questions as statements, with no questions left.
I first began following your "Big Brain" stuff at the beginning of this year, when I didn't know anything about anything. Now, I've developed 3 RAG projects: Real Estate Law, Hollywood Labor Contracts and The Bible. So far, I'm able to get fairly good answers within an 8K context. I've learned my lessons well. I maintain chat history context with the models using the "standalone question" technique. It seems to work so far without having to send the entire chat history to the model in each prompt. I see MemGPT essentially removing the necessity for the standalone question as it would allow the model to know the chat history with every prompt. Now, I may totally not understand MemGPT at all, but that's what I think it would do. However, I don't understand how I could use SPR at all for this purpose. Is there any documentation on this?
I instantly thought of your work when I saw explainer about MemGPT. Was getting ready to play with it after my vacation. This vid is perfect timing
David never disappoints. Thanks!
If Star Trek is post-scarcity, why do higher officers like Captain Picard have bigger quarters than the lower crew?
Huh, that's actually fascinating! And talking about how you can prime it with just a few words reminds me a lot of what mentalists like Derren Brown frequently demonstrate: that human brains can be "primed" by saying certain words or exposing certain images or sounds or smells etc. That can then be leveraged to get people to give certain answers or believe certain things or act in certain predictable ways.
And I know there's a lot of debate about whether people like him fake their stunts, but it's irrelevant, as they still demonstrate a very real phenomenon that has been observed under lab conditions. Additionally we see it in the real world with how propaganda, authoritarians, and cult leaders seem to have this almost supernatural way of "hypnotising" people into following what to most others are obvious lies and BS. It's like watching the Pied Piper leading the rats to their doom: you watch from the sidelines, dumbfounded at why the rats are following the tune and can't see the obvious cliff they are being led off of. In the real world, the "tune" is certain words and phrases designed to shut down critical thought and "prime" the person into a certain predictable thought pattern which can then be exploited or further manipulated.
Of course there's plenty of positive and neutral uses of this too (as this mechanism is heavily involved in how we learn new things too), it's just the negative / malicious uses are the easiest to talk about and demonstrate.
This talk of SPR's very much has heavy echoes of that for me, so it makes a lot of sense. Thank you David as always for your incredible insight! ❤
I think you hit on something I’ve thought for a long time, that we need something like associative organization. It’s like categorization but more refined. ChatGPT and other AI’s need to support multiple personalities so people can experiment more.
Thank you so much for noticing that humans have flawed logic as well! So many people complain about the current state of LLMs, never realizing that they are demanding that LLMs be more than anything that humans have ever been, which would be insane to expect at this point. The more I look at people, the more I see that their behavior can often be captured by a "flawed" llm.
I have had a lot of success with fine tuning, but this is a brilliant concept!
SPRs are a useful optimization for managing memories, but they are not a substitute for MemGPT. Even with SPRs, limited context means that a mechanism is still required to store all memories and retrieve those that are relevant.
Yeah!!! So this is basically what I've been saying. It's not "textbooks is all you need." It's "Textbooks AND POETRY (song lyrics for example) are all you need."
Then once she understands linguistic relativism, and can understand both the general and the specific: BOOM. 👯
The tiny and the huge in unison. Knowing when to be small and when to be giant.
This is why I often say linguists are better at AI
@@DaveShap Yeah!! We're entering an age more akin to magic. Where precise words become objects of extreme power. We need to open a school to teach people how to live in this new awakened world.
Because magic can cause a LOT of good, but also a LOT of harm. And the drain from using it wrong sucks and hurts and takes time to recover from. And a lot of people sure aren't ready for where we suddenly find ourselves. Yet, nonetheless, this IS where we find ourselves.
This is perfect timing! I was about to dive down the MemGPT rabbit hole 😮
I agree. MemGPT provides a structured storage and retrieval of concrete items. Essentially using an SPR as a way to search for the original context. Which results in better data to infer from and fewer chances to hallucinate information.
This should be used as a subject summary for saved AI conversations that the AI reads when searching to find the correct chunk of history text to extract information from.
Super helpful for a personal project I am working on right now. Thanks for the reminder, Dave!
I haven't watched every single video you've made, and didn't know you'd made a video about SPR several months ago, or what you meant by SPR. Also, I see no sign above of a link to that previous video or much of any other help in getting anyone to watch that video.
I really like your channel and content for the most part, and find it very important to understand. So I'm just pointing out you sounded very critical of anyone who didn't watch that particular video that you made, and anyone who dares consider whether to support memGPT.
So I don't know whether you care about your audience's opinion, but personally I would recommend you to take a step back from that mind-set so you don't drive people away. If you're irritated about experts who are trying to push those methods, well please say that instead of making it sound like you despise everyone on the planet who does not hang onto every word of your every video.
How would you go about compressing a book of 500 pages? Do you compress page by page and then nest them and compress again?
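The page-by-page-then-nest idea described in this question is essentially hierarchical map-reduce compression. A toy sketch of the control flow (the `compress` function here is a crude word-truncating stand-in; a real pipeline would call the LLM with the SPR prompt at each step):

```python
def compress(text, ratio=0.5):
    # Stand-in for an LLM SPR pass: crudely keep the first `ratio`
    # fraction of the words (a real pipeline would call the model here).
    words = text.split()
    keep = max(1, int(len(words) * ratio))
    return " ".join(words[:keep])

def hierarchical_compress(pages, batch=10, max_len=2000):
    """Compress page by page, then merge batches and recompress,
    repeating until the result fits under max_len characters."""
    layer = [compress(p) for p in pages]            # first pass: per page
    while len(" ".join(layer)) > max_len:
        merged = [" ".join(layer[i:i + batch])      # nest: group batches
                  for i in range(0, len(layer), batch)]
        layer = [compress(m) for m in merged]       # recompress each group
    return " ".join(layer)

# Toy "book": 500 identical pages of repeated sentences.
book = ["This page talks about dragons and their long history. " * 20] * 500
summary = hierarchical_compress(book)
```

Note the trade-off this makes visible: each nesting level is another lossy pass, which is exactly the "recursive summarization leads to holes" criticism the MemGPT paper raises, so the depth of the hierarchy matters.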
Large context window like Claude
Six days ago I loved the idea of MemGPT, and now I know of you... a great, better system. Congratulations, and thanks! :D
This can be additive to memGPT. All of these sparse priming representations can be stored and retrieved from the vector database
David, this is yet another example of why you are awesome, keep up the amazing work.
Suppose you have a text that is so long that its SPR exceeds the context window. How do you manage that?
That won't be an issue soon. Claude has 100k
Is it necessary to feed an LLM the “theory” portion of the Generator and Decompressor prompts? The Mission and Methodology portions seem adequate to produce the same results. Or do you think the “Theory” section provides the context necessary for this to work?
I missed the first spr video so this was extremely helpful redirection away from memgpt ty
I was trying to get the AI to make manual "checkpoints" that summarize the current context so I could transfer it to another chat. I ran into data degradation very quickly. It's awesome to get some evidence that I'm on the right path 😀
Ok, I get that distilling information is crucial and you have to keep your context window clean. And I also understand what your system prompts do.
But I struggle to understand how this could be implemented. Do you compress the information BEFORE you insert it into your persistent storage, and then uncompress it?
Or do you mean to always compress everything into one message, to keep the context window clean without losing content?
Could you help with that?
I think a challenge is for the AI to know when it should pick out something you have said to be of importance later. I guess a simple way would be for it to always "make a note" (compress and store) whenever the user expresses some meaning or thoughts of themselves in order to build some kind of profile of the user in its short term memory (primed context). When I saw MemGPT I thought it sort of summarized what my first ideas about ChatGPT was and how I would go about implementing some kind of memory to make dialogue feel continuous. I did some simple tests with ChatGPT even where I instructed it to make a short summary in curly brackets of what I had conveyed so that the service could then pick out these and store in the context, practically just massaging the length of the context before. It seems your ideas were the same.
Question:
If I want to use SPR to provide contextual data with my prompt, isn't the LLM going to output the entire thing decompressed and therefore use a ton of output tokens?
I think I get the point and it makes sense. But human memory is both declarative and associative (we can argue that it is also episodic), and I think they all have their uses. I agree that it is not efficient to represent knowledge declaratively all the time. Using an associative memory would indeed make better use of what the model is already good at. It also has the potential to amplify its weaknesses: most of the cognitive biases we have as humans come from inaccurate associations, and I think we can all observe the same in LLMs. One of the important benefits of using and storing declarative memory might be to overcome some of those weaknesses. I think it is similar to our situation as humans: we try to use factual knowledge to overcome our biases. On the other hand, it would be much more expensive for our brains to try to understand the world in a purely factual way.
I also think that we need to tap into the latent representational space not only through other tokens or words. If we could somehow use the latent space representation directly (like embeddings), it would be a more efficient way of doing associative memory.
Anyway thanks for the video and I think it has some very valid points
I'm struggling to understand the utility of this in the context of MemGPT's capabilities. Like, if I'm having a long-running conversation that hits on various topics over dozens of prompts, and I want to go back to a previous topic, I would need to stop, go back, copy and paste the previous facts into a compressed SPR summary, and then paste that back into a new prompt window to continue the conversation. And if I then decided to hop over to a different topic from that same previous conversation, I'd need to repeat all that for the new information. This just seems inefficient vs. MemGPT, which can store and retrieve the facts and context from previous conversations without any effort on my part.
Or am I misunderstanding MemGPT's capabilities? (Be kind, I'm a novice.)
Love the uniform, but don't you lose fidelity? Thanks for the concept, will be helpful to implement in the future
So say I'm trying to concept a project, say a video game, and I want to bounce ideas off the AI. Can I use this to "store" my game design document and its concepts as something ChatGPT can more easily parse in the custom instructions, so it doesn't forget key elements of the game in question?
Hi David, really nice video, and I started playing with the concept. However, I am wondering if and how this would work with a 500-page textbook, because that will again not fit into the context window for compression. Any ideas on how you would approach this?
Wow! I've been using a copy-and-paste list of instructions to generate amazing prompts for DALL·E 3. With this strategy I may be able to improve my image generation, with as much detail packed into a prompt with as few words as possible.
Great and succinct explanation of long-term memory!
this video saves my life!!! awesome work!!!!
Thank you for the amazing content. I have a question: is it possible to "interact" with the information compressed in the SPR without unpacking it? Like continuing to develop a complex concept.
I think SPRs are similar to how the brain works in that both the brain and the SPR compression process compress information and concepts, and MemGPT is a more reliable long-term storage device, similar to long term memory. Perhaps a good next step would be to have varying levels of compression, ranging from no compression to full SPR compression, and have MemGPT inject information at varying levels of compression. SPRs are still limited in context, but are also useful for fine-tuning.
wow you are a captain!
I wonder, do they mention your method here? "Recursive summarization (Wu et al., 2021b) is a simple way to address overflowing context windows, however, recursive summarization is inherently lossy and eventually leads to large holes in the memory of the system (as we demonstrate in Section 3). This motivates the need for a more comprehensive way to manage memory for conversational systems that are meant to be used in long-term settings."
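To make the paper's point concrete, here's a toy sketch of recursive summarization; the `summarize` stub is a placeholder for a real LLM call (here it just truncates, which makes the lossiness obvious: each pass discards detail and never gets it back):

```python
def summarize(text: str, budget: int = 20) -> str:
    """Placeholder for an LLM summarization call; keeps only the first
    `budget` words, a crude stand-in for lossy summarization."""
    return " ".join(text.split()[:budget])

def recursive_summarize(chunks: list[str], window: int = 40) -> str:
    """Fold a list of chunks into one summary that fits a (toy) context window."""
    summaries = [summarize(c) for c in chunks]
    combined = " ".join(summaries)
    # Keep re-summarizing the summaries until the result fits the window;
    # every extra pass compounds the information loss the paper describes.
    while len(combined.split()) > window:
        combined = summarize(combined, budget=window)
    return combined
```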
Idk, what if I want to pair it with an encoder-decoder pipeline? Perform nearest-neighbour retrieval on the results and then give them to an LLM to frame an answer... what about that?
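For what it's worth, the retrieval half of that idea fits in a few lines; the bag-of-words `embed` below is a stand-in for a real learned encoder, and the function names are illustrative:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'encoder'; a real pipeline would use an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def nearest_chunks(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank stored chunks by similarity; the top k would go into the LLM prompt."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```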
Thanks Dave, very informative for my studies.
Doesn't look like a MemGPT replacement. Looks like you could stack it with MemGPT to try to simulate the mind.
How does this in any way solve having direct prompt access to huge amounts of arbitrary data, e.g. searching through databases of files?
I guess the question is one of commensurability. I.e. Are the problems that SPRs solve, and that MemGPT solves, comparable? And if so, in what ways? If not, in what ways?
What I like about SPRs, is simply that they capitalise on LLMs 'native' semantic architecture. If you're a metacognitive systems-thinker, you'd automatically tend to default to SPR-like and axiom-like heuristic approaches (I know I do). Hence I've been working on an approach that closely resembles SPRs.
The problem I thought MemGPT sought to solve, however, is accuracy/consistency of responses. MemGPT's consistency was very well demonstrated in that paper. Thus, by having retrievable memories layered into a hierarchy based on temporal-contextual utility (akin to a spectrum from RAM to HDD storage/retrieval), you can then construct cybernetic holarchies (something sorely lacking in Wilber's Integral Theory).
So personally, I'm very keen to integrate both approaches. Here's why:
Imagine a MoE with 8 experts, where experts #1 and #8 are SPRs. Experts #2-7 are specialist models, each trained on distinct datasets whose use cases are very different. One might be all about math, coding, and 'truth'. Another might be a writer (legal, creative, editor, etc.). Another may be a specialist project manager, scheduler, resource allocator, etc. Another may be a UI/UX designer. Another a researcher. Now in my case, I'm building a proof of concept for an artificially empathic AI based upon a meta-heuristic (a highly distilled axiomatic heuristic, Bateson's Learning 3) for creating self-learning use-case heuristics (Bateson's Learning 2). I need an SPR-like approach to take the initiating event, parse/categorize the inputs, analyse them for the presence/absence of the variables needed for my meta-heuristic, and then distribute work to my respective specialists. The meta-heuristic is based on the first principle that when the question is sufficiently iterated, it 'matures' to a point where the answer just pops out. And since in real-world scenarios we're dealing with ontology, phenomenology, and epistemology from data species associated with the physiosphere, biosphere, noosphere, etc., in building such an "app" my initial inputs may not contain sufficient data for a one-shot bulls-eye holistic solution to a given problem/challenge. Thus, depending upon the species of data and the degree of absence, I need a self-learning architecture to provide educational contextual scaffolding for the associated specialist to improve over time (so as to minimize the need over time for a human-in-the-loop).
It may just be that SPR-like approaches are a smart version of interfacing with LLMs for (Bateson's) Learning-2 problems.
And MemGPT-like approaches are attempting to build the nuts and bolts for (Bateson's) Learning-3 architectures (whether they know it yet or not).
Do you do most of your work in the Playground?
Am I limiting myself by staying in chat..?
Yup, you are. You haven't tried turning up the temperature? It's super fun, it'll open up a whole different world.
@mungojelly I'm running out of room in my mental universe for new worlds these days lol
But who am I kidding?? Thanks for the tip, time to blast off once more I guess...
@@Art_official_Intel_it_spits It'll get you 60% of the way to understanding temperature if you think of it as the bot being drunk. Human drunkenness has a variety of effects beyond just causing us to choose less likely words as we talk, but as far as that part goes it's shockingly similar. Turning up the temperature just a little is just making it a bit loose, a bit casual. So try it out to get some perspective, and then you'll immediately start to hear it differently when people say "AI is so stiff, AI isn't creative," because you'll know they've never had the temperature above 1. You can still hold the opinion that AIs are uncreative if you've tried talking to them in a looser mood, but I'm afraid it's just ignorant when all these people who think AIs are stiff have only ever talked to bots on their best behavior at work, stone cold sober.
Thanks! I tried a few texts and it looks like it's a slight improvement over a simple "Summarize this: ...".
I still have to test it more, but it certainly saves some tokens.
Is there going to be any AutoGen integration for SPR?
Why do you suspect MemGPT is getting more press than your much more elegant SPR approach?
Hey, quick question: does this also address sequential memory? Because that's something I've run into. I had the same idea of tokenizing or creating tags so the model can summarize messages for itself. My motivation was to come up with a way to utilize the message limit on GPT-4 most efficiently. But even with the tagging system, it still seems to forget what happened when, unless I prompt the model to re-read the previous conversation.
No, I haven't tackled chronological memory in this.
thanks 4 the video!!🖖
This all depends on how well or poorly written the initial text is. If the text you are trying to compress is already concise, this technique won't be of any use because any information omitted from the SPR will necessarily be required to understand the original text.
What about for long context windows? Is this useful for that?
I just tried it on a random paper and it's absolutely nuts
I decided to try packing an entire novel into an SPR, unpacking and the results were very interesting.
Do explain.
Thanks. very useful. will try this.
Normally my local LLM consumes 30% CPU and 40% RAM. After integrating SPR it now consumes 95% CPU and 80% RAM. I'm still confused about how that happened...
“Don’t try to get around the context limit” and then you leverage LLMs to compress data into the context window…
I like what you’ve done here, as it resonates with how I’ve been experimenting with LLMs. I’d just say there are probably content-specific system messages for each type of content, e.g. compressing a resume/job description. It’s probably better to prompt the LLM as a “career advisor” to do the task than it is to fully abstract down to primitives.
Let's say I have documents of 300 to 500 pages and I want to create a summary of those docs. Will the SPR approach do the work?
Yes but it will be lossy, the goal is to be good enough. You can also do SPR on chunks
@@DaveShap But ChatGPT and other models have token limits that restrict how much data we can feed to the LLM, so how can we get around that limit and do the SPR thing?
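The "SPR on chunks" idea from the reply above can be sketched like this; function names are illustrative, and `spr_compress` stands in for the actual LLM call with the SPR system prompt (here it just truncates so the sketch runs without an API key):

```python
def chunk_words(text: str, max_words: int = 2000) -> list[str]:
    """Split a long document into pieces small enough for the context window."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def spr_compress(chunk: str) -> str:
    """Stand-in for an LLM call with the SPR compression prompt;
    keeping the first 25 words is only a placeholder for the real output."""
    return " ".join(chunk.split()[:25])

def spr_document(text: str, max_words: int = 2000) -> str:
    """Compress each chunk separately, then concatenate the SPRs.
    The result is lossy, but it fits far more source material per token."""
    return "\n".join(spr_compress(c) for c in chunk_words(text, max_words))
```

For a 500-page book, the concatenated chunk-level SPRs could themselves be fed back through the same process, at the cost of further loss.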
I like to use the proximity and continuity of words when looking at associative learning, because I believe it lends itself to the idea that things have some sort of distance between one another. 'Water' has a closer proximity and continuity to 'the beach' than 'the golden age of Rome' has to 'water', for example.
This is precisely what LLMs do: they learn relations between concepts. Internally they perform translations (shifts) between tokens (words) embedded in a multi-dimensional space. Every direction represents a different kind of relation.
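A toy illustration of "directions as relations", with hand-picked 3-d vectors standing in for learned embeddings (real models learn thousands of such directions from data; the numbers here are chosen by hand just so the analogy holds exactly):

```python
def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

# Hand-made "embeddings": one axis for royalty, one for gender, one unused.
vec = {
    "king":  [1.0,  1.0, 1.0],
    "queen": [1.0, -1.0, 1.0],
    "man":   [0.0,  1.0, 1.0],
    "woman": [0.0, -1.0, 1.0],
}

# The same shift (here the "royalty" direction) applies to both genders:
shift = sub(vec["king"], vec["man"])   # [1.0, 0.0, 0.0]
result = add(vec["woman"], shift)      # lands exactly on "queen"
```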
It seems that OpenAI improved the initial prompt in the new ChatGPT o1; it now makes fewer mistakes than before :)
Can you show an example of this being used when calling the OpenAI API?
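Not an official example, but a minimal sketch with the `openai` Python client might look like this. The system prompt below is a paraphrase of the SPR compression instructions; the exact wording lives in David's SPR repository:

```python
# Paraphrase of the SPR compression instructions (not the exact prompt).
SPR_SYSTEM = (
    "You are a Sparse Priming Representation (SPR) writer. Render the user's "
    "input as a distilled list of succinct statements, assertions, associations, "
    "concepts, analogies, and metaphors. Write for a future LLM, not a human."
)

def build_spr_messages(text: str) -> list[dict]:
    """Assemble the chat messages for an SPR compression request."""
    return [
        {"role": "system", "content": SPR_SYSTEM},
        {"role": "user", "content": text},
    ]

def compress_to_spr(text: str, model: str = "gpt-4") -> str:
    """Requires `pip install openai` and OPENAI_API_KEY in the environment."""
    from openai import OpenAI
    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=build_spr_messages(text),
    )
    return response.choices[0].message.content
```

Unpacking works the same way with a mirrored "SPR unpacker" system prompt.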
Very interesting, thanks!
This works awesome. Easily one of the very best prompt ideas out there. I modified it a bit for my own use cases and it works like a dream. I'm no longer getting those starved, crappy replies.
Do you know about the theory of Relevance Realization? It could be an interesting topic to apply to AI.
It doesn’t sound like there’s any associated code you can run locally, though.
Great video!
fine tuning is phenomenally useful for retrieval
Great content, but SPR is not an alternative to MemGPT. MemGPT is a content retrieval mechanism, a system that retrieves relevant context for an LLM. MemGPT manages LLM memory in a similar way to how an operating system manages memory: the context window (equivalent to RAM) and external context (equivalent to the hard drive). It is trying to solve the issue of retrieving and managing relevant context from external memory, much as your computer does with RAM.
Sparse Priming Representation and MemGPT are essentially two completely different things. You could use SPR within MemGPT to save conversations and external contexts (e.g. documents) to make context saving and retrieval more efficient.
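A rough sketch of that combination, with illustrative names that don't come from either project: archive the verbatim text for lossless recall (the MemGPT-style side) alongside an SPR of it (cheap priming for the context window):

```python
class HybridMemory:
    """Store each record twice: verbatim for exact quotes, and as an SPR
    for cheap context priming. `spr_fn` would be an LLM call in practice."""

    def __init__(self, spr_fn):
        self.records = []   # list of (verbatim, spr) pairs
        self.spr_fn = spr_fn

    def store(self, text: str) -> None:
        self.records.append((text, self.spr_fn(text)))

    def recall(self, keyword: str, verbatim: bool = False) -> list[str]:
        """Naive keyword search; verbatim=True returns the exact original text."""
        hits = [r for r in self.records if keyword.lower() in r[0].lower()]
        return [r[0] if verbatim else r[1] for r in hits]
```

This addresses the exact-quoting objection raised elsewhere in the thread: the SPR primes the model cheaply, while the verbatim copy remains available when precision matters.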
So this is basically a relational column with vector storage and associated keywords?
Compressing knowledge seems very useful, kind of like an intelligence WinRAR. Humans learn stuff through compression too. I wonder if AI can be made to store knowledge in different ways, whichever is suited best.
My man showing up to work in uniform! Report to engineering...
on point, as always
How do I use it with a local LLM? I still don't get it.
An agent could in theory keep learning during inference and have no huge token-window issue, right? I would say you can't claim AGI without continuous learning AND continuous inference... so should your prediction be correct, in a year we shouldn't be bothering with ways to circumvent token windows at this level. Having said that, even humans have severe limitations in this regard, so I'd guess it would just be a new kind of limitation...
No, there's no online learning yet AFAIK
SPR seems like a really useful and amazing way to compress data with lossy compression. But what if you need the model to remember A LOT of specific details, like names, notes, dates, and correlations across a large amount of data? SPR is a good tool, but not made for that task.
Can it formulate factual information about things that occurred past its 2021 knowledge cutoff?
That's not really what SPR is for.
Thank you for giving me a reply; I just subscribed to you, and I'm really fascinated by the work you have done. A comment regarding this video: if there is some factual information (like some company data), SPR will change it into important keys, but we still need a memory element (contextual memory) to fill in the gaps with the factual information.
I can see the value in this approach. I don't think MemGPT and SPR are mutually exclusive. SPR sounds like a preprocessing step, where up-front intellectual work can be performed and later associated with items in a dataset. You could run SPR against a dataset with, say, 1 million records; you would now have a dataset of 1 million records that are associated with SPR summarizations. MemGPT would come in as the retrieval mechanism over a very large dataset, and the SPR annotations would aid MemGPT in its retrieval task.
I don't know how a concept of lossy context compression can even be compared to an approach that has an actual persistence layer, a way to store facts lossless and to dynamically retrieve these efficiently. It's like saying, "Hey a computer is kid-stuff, you don't need one, just focus on JPEG-compression, it solves all problems!".
Fantastic walk through of your SPR work.
MemGPT nailed the 'marketing' of an open source AI framework. AutoGPT did too. I remember jotting down an architecture for "infinite memory" years ago (as I'm sure many early LLM enthusiasts did).
As some of the commenters have alluded to, to replace it with SPRs there needs to be some kind of drop-in "SPRMem" framework. I'm sure it'll appear at some point.
Thank you for posting; this was very informative.
There is a lot of stuff that GPT doesn't know, and that's what we at least work with. Private, secret information.