At 4:00 the explanation of how image generation works is highly inaccurate. An image generation model does not contain a database of images, doesn't search through anything, and doesn't "remix" images. This is a very commonly repeated description, one that I suspect originated in a good-faith attempt to simplify the concept, but it leads to a lot of misconceptions about these models that have some big implications for arguments further along in the topic. As a result, we keep seeing this description repeated, along with the same misconceptions and the arguments that result from them, again and again. This doesn't mean that some of the conclusions aren't still valid, but the supporting arguments need to be reworked in light of a more accurate understanding of the technology.

I'll try to give a simple explanation that is more accurate than the video, though it may itself be overly simplistic in its own ways. To train a generative model, one uses a database of "training data" that has been tagged with descriptions and feeds each item into the training program. The specifics of what the training program does vary depending on the type of model, but the basic idea is that the system "looks" at the contents of the input data and adjusts a massive array of values (known as parameters) in some way dependent on that input. Each input causes that massive set of parameters to change just a little bit. None of the data from the input is actually stored in the model though; the input just causes the training system to tweak the parameters. After running thousands or millions of inputs through the trainer, you have a very big set of special values in your parameters that constitute the "model". Later, another program that works basically the same as the trainer but in reverse (sometimes it's actually the same program) takes some randomly generated data, then starts modifying it based on the parameters in the model until it produces an output.

Some key insights: The model doesn't contain any of the data used to train it. In fact, the file size of the model is orders of magnitude smaller than the combined input data; it literally couldn't contain it. When multiple pieces of training data are similar, their effect on the parameters will be similar, and this "reinforces" the learning of those similarities. Over-reinforcing something can lead to a model with a bias. This could be something like generating white people more often than black people because there were more white people in the input images, or it could lead to things like reproducing watermarks from stock image sites. The model isn't copy/pasting the watermark in a case like that; the issue is that so many images with that mark were fed in that it over-reinforced the idea that that watermark is what images are supposed to look like... so it makes images with it. It's no different from images of faces teaching the model what eyes look like, but in this case it's an undesirable outcome. Regardless, the inputs aren't encoded into the model; they just have a slight impact on the parameters. You could extremely over-reinforce a model until it only output nearly identical images, but it still technically wouldn't contain the image. It could absolutely constitute plagiarism, but it's never literally "copying" anything.

The model also has no idea what anything actually IS. This is extremely important to understand. It doesn't know what a "cat" is.
When an image of a cat is fed in to train it and labeled "cat", it affects the parameters in some way, such that when you ask for "cat" it will output something based on those parameters. But it doesn't understand the words or the pictures; it only uses them to either update its parameters, or uses its parameters to output a picture that goes with the words. For text models like ChatGPT, they don't know what anything means either - they just predict what text "should come next" based on their parameters (200 billion of them!) with no real context of what any of it means. Just like an image model, an LLM is trained on a bunch of data, but rather than storing it to use later, it just tweaks its parameters a little at a time as it "reads", until it has a set of parameters that, when used to generate, will create text that "looks like" something a human might write (there's a toy sketch of that nudging process below).

That is to say, an image generator can not draw a cat. But it can create an image that looks like a cat. An LLM can not write a poem. But it can create text that looks like a poem. If either does a good enough job, then the result might be indistinguishable from the real thing. This I will compare to imitation almond extract: it's actually the same chemical compound that gives it that flavor, one is made inside a plant and the other is made inside a lab... but it's the same stuff once you remove any impurities, and the lab stuff is just as safe as the "natural" because it's the same chemical. But as anyone familiar with the matter knows, they don't actually taste identical - because the natural almond HAS impurities, and those add something to it, something hard to quantify and hard to reproduce. Something you could easily miss, but something that just makes it more real. (Which is not to say that's always better, the "fake" stuff won't set off allergies, so... )
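To make that "each example nudges the parameters a little" idea concrete, here is a deliberately tiny sketch. The single parameter, the learning rate, and the fake data are all invented for illustration; real models adjust billions of parameters, but the shape of the process is the same: every example nudges the numbers, and none of the examples end up stored anywhere.

```python
# Toy illustration of "each example nudges the parameters a little".
# Nothing here is a real image or text model; it's just the shape of the process.
import random

random.seed(0)
w = 0.0                      # the entire "model" is this one parameter
lr = 0.01                    # how big each nudge is

for _ in range(10_000):      # a stream of "training data"
    x = random.uniform(-1, 1)
    y = 3.0 * x + random.gauss(0, 0.1)   # the examples follow a pattern (slope 3)
    pred = w * x
    w -= lr * (pred - y) * x             # nudge w toward the pattern

print(w)   # ~3.0: the pattern got reinforced, but no example is stored in w
```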
While you have a much better understanding than the video, your explanation is also not entirely correct. While most genAI models don’t know what a cat is, some can. For example, some models consist of a VAE for images and one for text. The idea is to use image-description pairs to couple the latent space of both the image model and the text model to one another. First, you try to force the text and images to map to themselves and add loss terms to encourage them to map to one another as well. Some success was found with this using VQ-VAE if I recall correctly. Then, you can generate random images by randomly sampling the latent space. Alternatively, you can use the text encoder to calculate a value from the latent space and then introduce some random noise. And there’s a myriad of other strategies out there, too many to encompass in a YouTube comment.
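For anyone curious what "add loss terms to encourage them to map to one another" can look like, here is a deliberately toy numpy sketch. The stub linear encoders/decoders, the dimensions, and the alignment weight are all made up; a real VQ-VAE setup is far more involved. This only shows the structure of the objective: reconstruct yourself, plus pull the paired latents together.

```python
# Toy sketch of a joint image/text latent objective (no training loop, no real VAE).
import numpy as np

rng = np.random.default_rng(0)
D_IMG, D_TXT, D_LATENT = 64, 32, 8

# Stub "encoders"/"decoders": random linear maps standing in for trained networks.
enc_img = rng.normal(size=(D_IMG, D_LATENT)); dec_img = rng.normal(size=(D_LATENT, D_IMG))
enc_txt = rng.normal(size=(D_TXT, D_LATENT)); dec_txt = rng.normal(size=(D_LATENT, D_TXT))

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def joint_loss(img_vec, txt_vec, align_weight=1.0):
    z_img = img_vec @ enc_img                     # image -> latent
    z_txt = txt_vec @ enc_txt                     # caption -> latent
    loss_img = mse(z_img @ dec_img, img_vec)      # image reconstructs itself
    loss_txt = mse(z_txt @ dec_txt, txt_vec)      # caption reconstructs itself
    loss_align = mse(z_img, z_txt)                # paired latents pulled together
    return loss_img + loss_txt + align_weight * loss_align

img, caption = rng.normal(size=D_IMG), rng.normal(size=D_TXT)
print(joint_loss(img, caption))
```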
It is also arguable whether an autoregressive model like ChatGPT does or does not understand the language. I’d recommend looking up distributional semantics and specifically the distributional hypothesis. A popular quote for it is “a word is characterized by the company it keeps”, and it is the foundational principle behind models like word2vec. In other words, one might argue that knowing what words could or are most likely to appear next in a sentence is the same as understanding the sentence. From a practical standpoint though, you are right.
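A quick toy illustration of "a word is characterized by the company it keeps": build raw co-occurrence vectors from a made-up corpus and compare them with cosine similarity. Real systems like word2vec learn dense vectors instead of raw counts, but the intuition is the same; the corpus and window size here are invented.

```python
# Words that keep similar "company" end up with similar co-occurrence vectors.
from collections import Counter, defaultdict
from math import sqrt

corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat chased the mouse . the dog chased the ball .").split()

window = 2
vecs = defaultdict(Counter)
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            vecs[w][corpus[j]] += 1          # count neighbours within the window

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    return dot / (sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values())))

print(cosine(vecs["cat"], vecs["dog"]))      # ~0.97: very similar contexts
print(cosine(vecs["cat"], vecs["chased"]))   # ~0.74: different role, less similar
```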
@@Foulgaz3 You are correct, but that's an overly philosophical way of thinking about it, and I would argue it obfuscates the point. Like I said, my explanation is by nature overly simplistic itself, because I'm trying to use terms that most people stumbling across these comments could understand. There are many stages to the process of training or generating, and each type of "AI" does things differently. But to understand the generalities, those implementation details aren't important.

Anyway, I contend that "knowing what a cat is" is not merely a matter of being able to associate the word "cat" with an arrangement of shapes and colors. Yes, there is a complex system for mapping words to images that allows an image generator to produce a "cat" when asked for a "cat", but that doesn't mean the generator knows what a cat is. Just because something can reliably identify an image of a cat or produce an image of a cat doesn't mean it actually understands the concept of a cat. Furthermore, just because an LLM can describe a cat doesn't mean it has a fundamental understanding of it.

"One might argue that knowing what words could or are most likely to appear next in a sentence is the same as understanding the sentence." Sure, and one might argue that a child playing with LEGOs has an understanding of architecture. But one would be wrong. Understanding is not mere association. Understanding is not linear. Understanding isn't even the ability to produce an accurate answer. Understanding is something far more complicated. To understand a cat, you would need a multi-modal model with many trillions of parameters capable of processing language, sound, texture, 3D space, movement, mechanics, physics, and a dozen other modes, trained entirely on cats. To understand a cat, you need a "cat model" that encompasses all the ways one could observe a cat, all the ways one could depict a cat, all the things there are to know about cats, etc. (EDIT: This comes across as me saying it needs to know literally everything about cats, when what I mean is it needs to be able to know all (or most) of the KINDS of things someone could know about cats.) And if you managed all of that, it still wouldn't be AGI because it only knows about cats 😅 That doesn't mean it can't make really good pictures of them. That doesn't mean it can't write a poem about them. But at the end of the day, the model doesn't understand.
@@Lord_zeel Perhaps, but you’re not exactly approaching the concept scientifically. If that’s your intent, then fair enough. Otherwise… Can you prove that understanding is not mere association? I mean that’s what happens when people become experts in a domain. Neural associations become reinforced and strengthened.

Why would you need a multi-modal model with trillions of parameters entirely focused on cats? To be frank, I find that idea totally ridiculous. If that were the case, no one living or dead can ever have been said to understand the very concept of a cat. At that point, the definition of “understanding” is no longer useful.

But let’s examine another aspect of that argument: the need for a multi-modal model. According to that logic, someone born blind is simply incapable of truly understanding anything because they will never know what anything looks like. Or there’s people like myself without a sense of smell. Even someone with all of their senses will never see infrared or gamma. Again, this definition of understanding ceases to be useful.

Personally, I would consider understanding to be multifaceted and continuous in nature. Perhaps a painter might understand some of the anatomy of a cat and how to replicate one onto a canvas. A vet would understand the cat’s anatomy and physiology, but might not understand how to paint one. Defining understanding as a state in which you know everything that is possible to know about a concept doesn’t seem particularly useful.

And, as someone who does AI research professionally, if you create a model with trillions of parameters, you’ve probably done something horribly wrong. There are generally exceptions to every rule, but that’s a good sign that there’s probably something better you should be doing instead. That goes for models like gpt-4 too.
@@Lord_zeel To boil it down a little further: yes, one might argue that a child playing with legos has an understanding of architecture, and they’d be *right*. One could also argue that a senior architecture student has an understanding of architecture, and they’d be right too. But it’s also fair to say that the understanding of the child would likely pale in comparison to the understanding of the architecture student. On the other hand, both the painter and the veterinarian can be said to understand the concept of a cat. Even so, their knowledge can’t be compared in the same way as the child’s and the student’s, because their understanding differs in type, not degree. You don’t need to possess all-encompassing knowledge of a topic in order to say that you understand it.
I think people tend to veer off into one of two extremes when it comes to understanding generative AI. The first is thinking we already have true human-like or even superhuman intelligence; we don't. The second trap people fall into is thinking, "OK, look, the AI can only generate what's in the training data." While this is technically true, the problem with this interpretation is that's exactly how humans do it as well. True creativity is extremely slow and rare for humans; the vast majority of work is derivative. You joke about the AI not being able to come up with an elephant, but that's also the case with humans. The difference is that humans have seen so many things from so many categories, humans are much more efficient at learning from this "training data", and humans have the ability to tell if something is a hallucination.

What impresses me the most about current generative AI is how human the mistakes it makes are, but you need to interpret this in the right context. The Google AI telling you to put glue on pizza is a bad example, but if you look at AI art, it often makes mistakes you would actually make while drawing; the only difference is that you quickly catch those mistakes at the draft stage, while the AI always fully completes the image. A good way to put it is that current generative AI is more like "artificial instinct": anything humans can do without thinking, with just "muscle memory", the AI can do extremely well, even to a superhuman level, but it still lacks the very conscious reasoning ability we have.

That being said, what I think of as a generative AI NPC isn't one where you can just start typing into a text box to speak with it. Instead it would still have human-written scripts, but the AI is instructed to, for example, "say this line but annoyed, because it's the 5th time you're saying it", or "say this line but to a lover" because the player is romancing this NPC.
That’s a symptom of a much larger problem. Dead economies are the tech bros’ playland; they can exploit your work for whatever ends they want. It’s why we still have homelessness and why they’ll fight tooth and nail to keep it: desperate people will consent to whatever for food.
You are acting like it's going to stop getting better. Why? It's like someone seeing Wolfenstein 3D gameplay and saying "3D in games is bullshit." The technology is in its infancy, of course it sucks ass; Rome wasn't built in a day, and neither was any gaming innovation.
Considering it basically didn't exist five years ago, and we were still using Cleverbot? I mean, DOOM was only 30 years ago, and now we have true 3D, physics, and more than 512MB of total system storage.
Exactly. I’m the lead dev for an AI project, and a lot of what he says in the video seems to be based on where AI was at like two years ago. He seems to be assuming it’s already at its peak and won’t continue to get better. Not only that, a lot of what he talks about comes down to how the developer uses it, not the tech itself.
One example of AI having a legit use: someone made a mod for WoW where AI generates spoken audio for the quest text. Honestly, it even gave me a sample of what it might be like to be deaf; I've heard that people who can only learn via text or hand motions, without any audio, end up learning things differently. When I listened to the AI reading out the text, I picked up on things in the classic-version Tauren text that gave me a different impression of the Tauren. It never sunk in before that the Tauren as a race don't hunt just for food or to limit overpopulation, or even as tests of skill for adulthood that often double as "this predator is too big and strong", but flat out for sport at times. Or at the absolute minimum, I left with the impression that they may hunt purely for sport at times, something I never got from just reading the text in the original game.
Your chess example is very bad and doesn't really tell us anything new about AIs. First of all, if you ask a human to do this without writing anything down, the human would often fail and make the same mistakes, because intelligent agents have limited working memory. It may appear as if ChatGPT should have plenty of memory (it's running on GPUs with like 4 TB of memory, and the chat log is presented to you right there), but that's not how ChatGPT operates. It isn't looking at the chat log with you; it only gets fed this information (likely in a reduced form) the instant you ask for a response. It has to read through it and pretend to continue the conversation, but really, the ChatGPT agent up until this instant has no idea what you were talking about before.

You seem to think giving it instructions will somehow steer the AI in the right direction, but that's just you not understanding how it works: you cannot change the AI in any way whatsoever, unlike a human. You cannot ask it something and then expect the AI to reflect on it and change its behavior, because again, every time you ask for a response it's the first time ever for the AI. It can read what came before, but it has no memory of or influence from what came before.

We also already know ChatGPT has no iterative planning/reasoning capabilities. It only has one operating mode, an instant response; it can't talk through a long process with itself, it can only read the input and reply once. This is why it can't do high-level math or chess games; it is only good at tasks a human expert would be able to do instantly without thinking deeply about them.

As for that Chinese room argument, I guess it may appear convincing to people unfamiliar with philosophy, but really it's a pretty bad argument, because the simple answer is yes, the Chinese room does understand Chinese. And yes, if you were to write a program that functions the same way as a Chinese room, it also understands Chinese. In fact there has been research into current chatbot AIs that shows exactly this; for example, the AI has an internal, graphical representation of where all the cities are in relation to each other, even though the bot has no graphical training or output capabilities.
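To picture the statelessness point: chat front-ends keep the transcript on the client side and resend all of it with every request. The call_llm function below is a placeholder, not any real API, but the message-list shape is roughly how these systems are driven: the model only ever sees what is in the list for that single call.

```python
# Sketch of how chat apps work around the model's statelessness:
# the client keeps the transcript and resends ALL of it on every turn.
def call_llm(messages: list[dict]) -> str:
    # Placeholder: imagine this sends `messages` to a model and returns its reply.
    return f"(reply generated from {len(messages)} messages of context)"

history = [{"role": "system", "content": "You are a helpful assistant."}]

for user_text in ["Let's play chess. e4", "Now I play Nf6", "What was my first move?"]:
    history.append({"role": "user", "content": user_text})
    reply = call_llm(history)            # the full transcript goes in every time
    history.append({"role": "assistant", "content": reply})
    print(user_text, "->", reply)
```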
I get a feeling AI is going to need something similar to human school. I have a feeling we're essentially teaching AI to behave and produce output the way a 6-year-old human would if you sent them onto the internet to research answers to questions. My understanding with art generators is that someone basically has to go in and label the images as "horse" or "wolf". At least some furry art image training distinguishes between "furry" as toonier and "anthro" as more realistic. It's also clear that Loona is obnoxiously popular in furry art, given how many wolves it produces with red eyes, or with the proper yellow with a red sclera for Loona.
Idk man, Skyrim's Herika mod and the chat-gippity-powered NPCs one are pretty good too. I think video games are the best use for AI that actually makes money. They could make a single-player game call out to a server to power the NPCs and make them act more like real people, just like the Skyrim mods do. That's certainly double-edged, but it's still pretty interesting just as a concept. I also don't care about artists, it's not a real job and it couldn't have happened to a better group of people. Overpaid NPCs that actually breathe air and constantly make things harder to produce; getting rid of them for simple things is completely logical and was a natural progression of technology. Art should be a hobby; it's about time people start producing things that actually create value. Edit: Your arguments had nothing to do with video games. Who cares if an AI hallucinates in a video game? Video games have had bug patches for like 2 decades now and all it takes is an update.
"I also don't care about artists, it's not a real job and it couldn't have happened to a better group of people" marvel slop consoomer lol. also imagine what it would mean for video game preservation if every single player game depended on a remote server that's gonna be shut down in 5 years
@@cool_bug_facts I hate marvel movies. I hate movies and Hollywood just in general. I haven’t watched a single TV show or movie in years. I play games, do some hobbyist level programming, study cybersecurity, 3D printing, I’m a HUGE Linux enthusiast and I also enjoy firearms, working on cars (ex auto mechanic), kayaking and fishing. I mostly fill my time with reading books about hacking, programming, going to school, modding Bethesda games on Arch and being a dad. I haven’t been to the range or gone fishing as much as I used to, but I certainly don’t have time for garbage Marvel movies; I’m studying for a degree that actually gets me somewhere. 😉
@@cool_bug_facts Also, to that last point, hence why I said it was a "double-edged sword." Edit: Artists are part of the same group of people who told someone that used to make refrigerators for 30 years to "learn to code"; so watching them go through the EXACT same thing, while thinking their wholly insignificant existence was actually infallible, is so unbelievably satisfying that I can't even put it into words. 😂 Literally every arts degree holder I know has been showing people their snatch on OnlyFans since their student loan payments kicked in 6 months after they graduated.
I listened to a lot of your rambling and I still don't get the point... Don't replace actors' voices with generated voices? Why? Because there won't be any new actors? That's so ridiculous. What about animation? Don't replace actors' appearances because there won't be actors anymore :D The fact that there are scams like NFTs, or things that don't go anywhere, doesn't mean that everything is equally bad. Your whole speech is the usual fear-mongering discourse I've heard about every new technology. I still remember articles about how people would learn to make bombs on the web, and that journalist who printed an e-book instead of reading it on an e-book reader. What's the difference with more traditional media and information sources? Do you trust every book and every website? Why would you blindly trust generative AI? Do you expect a website about cats and dogs to have relevant information about elephants? What about someone who trained all their life to draw cats and dogs? Do you expect them to be good at drawing something they never heard of? Do you expect websites to make up their own information instead of learning from what's already available? All that is absurd... Generative AI is just another tool, it has limitations, but it has already proven to be useful when used in a smart way. I don't know if it will have applications for NPC dialogue, but neither do you.
People who complain about public datasets are literally arguing that only major corporations, the ones with the resources and ability to hire people to build private datasets, should be able to use AI.
All this, and what they say about blockchain, without actually appearing to know much about it at all. My brother works on stuff that never gets advertised but is novel and only really possible because of blockchain. I barely understand a fraction of it when he explains it, but the stuff I do understand is very interesting. My summary of it would make an absolute mess of an explanation. Anyone wanting to know more about it to form their own opinion should really look into the details of how it works, and not just at what is presented by the people exploiting it.
@@IronFreee Datasets like the collections of images they use for image generation or the collections of texts they use for text generation. People complain about "stealing" and say that it should be illegal to use generative AI with media you don't own the copyrights for, but by that logic, any major corporation like Disney or Hasbro could still build datasets from the decades of artwork they own, and they can afford to hire people to develop content for their datasets, so applying copyright to AI dataset collection would only hurt indie creators.
underrated video and topic. imagine asking a farmer npc for directions in the new Elder Scrolls and it mentions travelling by car
When did they make a "New" elder scrolls? 🤔🤔
It’s simple. The developer gives the LLM a description of the world, the kinds of technologies that are available, modes of transportation, etc., and then gives it an instruction to only answer as a character living in that world. Problem solved.
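For what it's worth, here is roughly what that looks like in practice: a "system prompt" describing the world, prepended to every request. The wording, the NPC, and the call_llm stub are all invented for illustration, and nothing here guarantees the model will actually stay inside these constraints (which is what the replies below get into).

```python
# Sketch of the "just prompt it with the world" approach; call_llm is a stub.
WORLD_PROMPT = """You are Maren, a farmer in a medieval fantasy kingdom.
Known technology: ploughs, water mills, swords, sailing ships. Cars, trains,
electricity and firearms do not exist and you have never heard of them.
Travel happens on foot, by horse, by cart, or by boat.
Stay in character and answer only with knowledge Maren would plausibly have."""

def call_llm(messages):
    # Placeholder standing in for whatever model backend a game would use.
    return "(the model's in-character reply would go here)"

def ask_npc(player_text: str) -> str:
    return call_llm([
        {"role": "system", "content": WORLD_PROMPT},
        {"role": "user", "content": player_text},
    ])

print(ask_npc("What's the fastest way to the capital?"))
```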
@@jacobsmith4457 A hypothetical new Elder Scrolls
@@umFerno Ahh my mistake
@@JingIeFett Currently GAI can't handle those kinds of exclusions very well (that's how you ended up with the whole thing where the AI would only generate images of America's Founding Fathers as minorities), and due to how the AI is developed, i.e. using machine learning to process petabytes of data to generate a working language model without having to spend the hundreds of thousands of hours it would take to code something like that by hand, you'd always be running the risk of the data you use containing even a passing mention of something you don't want in your setting and it ending up in the AI's dialogue.
Now you could build one, carefully, with a more curated dataset to restrict its influence, but there's still a chance it could slip up or something could worm its way in.
But that would take an ungodly amount of time to build and would, generally speaking, be a complete waste of time and resources for most game developers. Why spend thousands of hours developing an AI to do something when you could have a bunch of voice actors do it in a hundred?
Best case, you have the AI generate a library of responses (something like the pipeline sketched below), saving the writers a lot of time as they'd then only have to proof everything, which presumably is part of the process anyway. However, that is the kind of thing entry-level employees do, so you've got the same problem again: how do you continue to inject talent into the industry to do the things AI isn't advanced enough to do, if you let AI take over the roles new employees would usually fill?
Enter them at a higher level? Maybe, but if you wanted to avoid that tanking product quality, they'd have to go through months or years of additional training, which would massively narrow the pool of people willing or able to study that long just to get their foot in the door.
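A rough sketch of that "generate a library of responses for writers to proof" pipeline. Purely illustrative: the characters, situations, file name, and generate_line stub are invented, and the real work is the human review pass at the end, not the generation step.

```python
# Offline generation of candidate NPC lines into a file for writer review.
import csv

def generate_line(character: str, situation: str) -> str:
    # Placeholder: a real pipeline would call a model here.
    return f"[draft line for {character} when {situation}]"

characters = ["blacksmith", "innkeeper", "guard"]
situations = ["greeting the player", "refusing to barter", "hearing a rumor"]

with open("draft_barks_for_review.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["character", "situation", "draft", "writer_approved"])
    for c in characters:
        for s in situations:
            # Writers fill in the last column; nothing ships without that sign-off.
            writer.writerow([c, s, generate_line(c, s), ""])
```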
The main issue I see with normal people and AI (normal as in non-tech people and people who've not studied AI) is that they think GAI can do logic. I've seen videos of people explaining GAI as "it doesn't actually know anything, it just uses logic and patterns."
Which is false. GAI has no logic. It works solely on probabilities of patterns. It doesn't know anything; all that's happening is that a big list of values (weights) tells which token (roughly, a word fragment) comes next. It gets several contenders and picks the one with the highest probability (or samples from among the most likely ones). So even if it feels like it "knows" how things work, it really is just acting. It acts confident and might even apologize before making the same mistake again. It has no clue what is going on, because that's not how the technology works. It's quite literally brute-forcing a "mimicry" of human behaviour, or more specifically, human text. This means it will also learn from all the mistakes that text has and is unable to correct itself, as it HAS NO LOGIC.
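Here is the "weights give each candidate a probability, then one gets picked" step in miniature. The three-word vocabulary and the scores are invented; real models do this over tens of thousands of tokens using billions of weights, but the mechanics are just this: score, softmax, pick.

```python
# Toy next-token step: scores -> probabilities -> pick one.
import math, random

random.seed(0)
logits = {"sky": 4.1, "sea": 2.3, "cat": 0.2}   # made-up scores for "the ___ is blue"

def softmax(scores):
    exps = {t: math.exp(s) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

probs = softmax(logits)
print(probs)                                                       # sky ~0.84, sea ~0.14
greedy = max(probs, key=probs.get)                                 # always the most likely
sampled = random.choices(list(probs), weights=probs.values())[0]   # or roll weighted dice
print(greedy, sampled)
```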
And no, GAI won't be getting much better. We might find better ways to do things, but the training itself is already showing stagnation. This is also due to people using AI-generated content to train AI models, which basically lobotomizes the model as you keep training it.
We just can't do better with vector-based matrix multiplication anymore lmao.
I saw an AI short where it seemed to think that the correct answer to a handshake is to swap out your body and at the same time do an arm lock that also fuses the arms into one.
The issue with hallucinations (7:00) is that generative models have no comprehension of what they are doing. An LLM like ChatGPT just keeps writing whatever text "comes next" based on its parameters. It is NOT a search engine! It is NOT just repeating what it has read! Instead, it's doing something way cooler but also a lot more prone to weirdness: it is creating text that follows the same patterns as things it has "read", with absolutely no context regarding what that means.

Suppose you train the model on a thousand Wikipedia articles. Then you ask it to write an article about something made up. It won't copypasta together an article, it doesn't do that. It won't "search" through a database to find stuff similar to what you asked for, it doesn't do that either. Instead, it will just start outputting words in an order that's similar to how words in Wikipedia articles normally go. This forms recognizable sentence structures, paragraphs, even poetry, but none of it is planned out, and none of it actually relates to itself. When something is heavily reinforced by the training data, like "the answer to life, the universe and everything" being "42", it will generally output the "correct" answer reliably; the further you stray into uncharted territory, the less likely there are any reinforced paths for it to follow to generate a sensible output.
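If you want to feel that "follows the patterns with no idea what it means" behaviour in miniature, a word-level Markov chain is the crudest possible version of it. This toy is not an LLM and isn't meant to be; it just shows how confident-looking text can fall out of nothing but counts of what tends to come next, with nothing ever being looked up or checked.

```python
# A toy "pattern follower": count what word follows what, then keep continuing.
import random
from collections import defaultdict, Counter

random.seed(1)
text = ("the answer to life the universe and everything is 42 . "
        "the answer is not always obvious . the universe is big .").split()

follows = defaultdict(Counter)
for a, b in zip(text, text[1:]):
    follows[a][b] += 1                      # "training": count what comes next

word, out = "the", ["the"]
for _ in range(12):
    nxt = random.choices(list(follows[word]), weights=follows[word].values())[0]
    out.append(nxt)
    word = nxt
print(" ".join(out))   # grammatical-ish, confident, and possibly nonsense
```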
This was genuinely a very helpful explanation that has given me a better understanding of GAI. Thank you.
The video title would be utterly incomprehensible to someone in the 1970's
anyone who thinks AI can replace RPG characters does not understand RPG characters
Anyone who thinks AI can't replace certain types of RPG characters, does not understand RPG characters... or RPG games... or even video games...
A GAI NPC is not really useful for general stuff, it would be useful for an ever evolving game, like Kenshi.
@@SioxerNikita "GAI is not really useful for general stuff" mentions the most general of use cases as an example of what it could do… lmao
@@Nick-cs4oc It won't be useful for general stuff initially, because it'll lack specialization. That's what the G in GAI stands for.
When it comes to integration into a game, it's actually really hard to make these things DO stuff. You gave the example of bartering, and the big problem there is that the game really has no way of knowing the contents of the conversation. In a normal dialog tree, certain functionality is associated with each option. But if the options are generated by an LLM, they aren't. You can try to use an LLM to assign them to functionality in the game... but I think you know what the problem with that would be. It's a really difficult problem to solve.

In gaming, I think the strongest use cases would be for "fluff", for ambiance. Imagine walking through a crowded street, dozens of conversations taking place around you. Baldur's Gate 3 has some incredibly immersive scenes like this, where they crafted tons of conversations, recorded voices, and play them all over top of each other. It's super cool, but it probably cost a ton of money, and if you stand around in one spot too long people start repeating themselves. But what if instead they prompted an LLM with a few dozen basic conversation premises to generate brand new conversations every time you go into the area, then generated voices to go along with them? If you talked to an NPC, you would get a canned dialog tree, but the ambiance could be made up using a very narrowly scoped set of prompts that don't involve any user interaction. This doesn't sidestep the issue of paying voice actors or writers though; that's a whole ethical issue that goes beyond the technology.
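For the bartering problem specifically, the workaround people usually reach for is to make the model emit a machine-readable action alongside the free-form line, and have the game validate that action against a whitelist before doing anything. A sketch with a stubbed model call is below; note that it narrows the problem rather than solving it, since the model can still pick a wrong (but allowed) action.

```python
# Constrain NPC output to a validated action plus a spoken line.
import json

ALLOWED_ACTIONS = {"none", "open_shop", "give_quest", "refuse", "attack"}

def fake_llm(player_text: str) -> str:
    # Placeholder for a real model call that is asked to return JSON.
    return json.dumps({"say": "Fine, take a look at my wares.", "action": "open_shop"})

def npc_turn(player_text: str) -> dict:
    raw = fake_llm(player_text)
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"say": "Hmm?", "action": "none"}   # model broke the format
    if data.get("action") not in ALLOWED_ACTIONS:
        data["action"] = "none"                    # never trust free-form output
    return data

print(npc_turn("I'd like to trade."))
```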
At 5:43 is a really important point: a critical flaw in current generative systems, particularly LLMs, is that they don't know what they don't know. ChatGPT makes stuff up where a human would say "I don't know", which leads to some really big problems. This is hard to fix, because the "AI" doesn't really know what it's doing; it doesn't really understand language or pictures or anything. It can't even understand the idea of not knowing something. And no, you can't just "code it to say I don't know" because, well... it doesn't know that it doesn't know. I take issue with the idea that the AI NPC not knowing something would be a bad thing, though. Like, real people don't know everything; I would expect the NPC's knowledge to be limited. That's actually one of the problems, which you identified separately: if it knows everything, it might say stuff they don't want it to, like spoilers. And any system sophisticated enough to filter the output would by definition be a more sophisticated system than the model itself... making it redundant.
I think you are making a hardline generalization that you likely shouldn't. You can't say the LLM doesn't know what it is doing, especially considering we don't know what the LLM is doing. It is far more fair to say it doesn't understand pictures, but there seems to be a significant amount of understanding of language.
And saying that "code it to say I don't know" is something you can't do is wrong, because getting an LLM to say it doesn't know when it doesn't have an answer is something that would be integrated into the neural net when it is actually trained.
And as you should know, if you actually know anything about LLMs (though it seems like you don't know what you don't know), we already have models that filter LLMs and are no more sophisticated than the LLMs themselves, so filtering doesn't make the models redundant.
You seem to have a very "basic" understanding of what an LLM is, what it is capable of, etc.
You can frankly, in the same vein as an LLM, argue that a human doesn't understand the very language they are speaking... because we can't actually prove that a human does understand the language they are speaking, or whether they are merely mimicking an understanding of it... and it comes down to the philosophical point of "if it quacks like a duck, it's a duck"... If it acts like it understands, it understands.
The primary issue with LLMs is that they are made to ALWAYS output... regardless of context. And because they have been made to always output, then just as holding your own breath until you die is essentially impossible, it is impossible for an LLM not to output something... So whether the LLM understands or not is completely irrelevant, because it is physically limited to always make an output... And if you say "that means it doesn't understand language", then... welcome to the world of pathological liars. They are nearly incapable of not just inventing lies and stories. I've known a guy like this, and the lies could be extremely wild... (from "I broke my foot yesterday" (coming into work with a fine foot) to "A dog ate my sister" (he has never had a sister)...)
So if an LLM always outputting something, whether it is true or not, means it doesn't understand language, my friend from when I was younger also didn't understand language.
You are suffering from what you claim LLMs are suffering from: simply not knowing what you don't know, plus an incredibly simplified understanding of what "understanding" even is, and you're using really bad arguments for it.
This shit is why it drives me absolutely BONKERS when people say they use shit like ChatGPT to help them study and whatnot. These things have been repeatedly shown to make shit up and take things from literally any source regardless of veracity, no, I don't care that you're just telling it to summarize a Wikipedia page, YOU SHOULD NOT BE USING IT FOR THIS OR ANYTHING ELSE.
Re: 8:08, it's always randomly guessing. The model doesn't contain the training data, just the parameters that were adjusted based on it. It is always "randomly guessing"; it's just that the more prevalent something was in the training data, the more the relevant parameters were reinforced. With enough reinforcement it will reliably repeat the same information, though not by "finding it in the training data" but by actually reproducing it the long way around.

The glue-on-pizza thing is a slightly different situation, because Google's AI search feature is a hybrid system that uses both traditional search algorithms and an LLM to read pages and then write synthesized results. Basically, Google search found a stupid answer and then had its LLM read/summarize it. Google has had info cards at the top of searches for years that are supposed to surface basic facts or pull short quotes from sites. These cards have been scrutinized since they were first implemented, as they commonly surface inaccurate data. The more recent issue just compounds the old one by having the LLM re-write the information. That one was technically not a hallucination.

Google and Bing both have these search/LLM hybrid systems, but it's important to understand how they are distinct from things like ChatGPT, which is just an LLM. And importantly, Google and Bing take the output from a search and use it to prompt the LLM, or use the output of an LLM to perform a search query; these are two systems that talk to each other but work in very different ways.
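The hybrid shape is worth seeing spelled out, because it explains why the glue answer wasn't really a hallucination: a search step hands text to a generation step, and the generation step just rewrites whatever it was handed, garbage included. Both functions below are placeholders, not Google's actual pipeline.

```python
# Retrieve-then-rewrite sketch: the "LLM" only paraphrases what search returns.
def search(query: str) -> list[str]:
    # Stand-in for a traditional search index / ranking system.
    return ["Forum post: add 1/8 cup of glue to pizza sauce for extra tackiness."]

def llm_summarize(query: str, snippets: list[str]) -> str:
    # Stand-in for the model: it rewrites the snippets, it does not fact-check them.
    return f"To answer '{query}': " + " ".join(snippets)

query = "how do I keep cheese from sliding off pizza"
print(llm_summarize(query, search(query)))
```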
Great video, amazing how you can tackle a very common subject on YouTube and be original about it. Incredible! Keep it up my man
I am a big fan of the RPG (especially CRPG) genre, and the biggest part of why I love this type of game is simply the roleplay and, most importantly, the companions. Not only do I feel attached to them, I also care about the writers who made them so alive. I was happy to learn that all my favourite Dragon Age and Mass Effect characters were written by the same person. It was amusing to learn that both Astarion (BG3) and Fane (DOS2) had the same author behind them. When an RPG has a memorable companion, I always check the author's portfolio to see more of their work. Sometimes I spend hours analysing the writing and favourite archetypes of certain writers, and what makes them unique and special.
AI just takes away this aspect completely.
The amount of acronyms in the video title is sending me 😂 great video dude!
facebook spent millions to make a worse version of something that already exists
And that’s all it can ever do, regurgitate already existing stuff. Anyone complaining about Disney slop will only get regurgitated Disney slop with GAI, cuz that’s what it learns on… the hubris is just so thick I’m constantly impressed by it
I don't think people will ever give up on making AI, no matter how unfeasible it seems.
I mean, I was worried that gaming companies were gonna force NFTs and such down our throats for years to come, and everyone has already forgotten about them by now. GAI will probably continue to exist in some form or another, but once the hype dies down I think most companies will start using it for much more reasonable applications rather than trying to use it for literally everything like they are now.
AI research will always continue of course, but I think the idea that the GAI of today will evolve into AGI, Artificial General Intelligence, aka sci-fi-level AI, is just fundamentally not going to happen. It's a completely different branch of AI research.
Re: 13:20, we know it's not actually learning, because we know how the system works. The question of "is Gen AI sentient?" isn't answered by a Turing test or by any other philosophical question; it's actually really simple: no, it can't be, it doesn't even know what it's doing. Asking it the rules actually tells you nothing at all, because all it's doing is creating the form of the answer. The rules aren't in a database; they aren't in it at all. It just knows how to create the text that comes after "what are the rules of chess?" without understanding them. It plays chess by outputting text that seems similar to what someone playing chess might write, but it has no idea that it means anything. And I know, it really seems like it does, it even listed what all the labels mean! But again, that list itself was just a prediction of what comes next, not something it found in a database, and not something it is programmed to actually be able to do.
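Which is also why, if someone genuinely wants an LLM in the loop of a chess game, the game state and the rules have to live outside the model. A sketch using the python-chess library (pip install chess) is below; the get_llm_move stub stands in for asking a model, and its made-up suggestions are the point: they get checked against real rules before they ever touch the board.

```python
# Keep the real rules outside the model: validate any suggested move before playing it.
import random
import chess

random.seed(0)
board = chess.Board()

def get_llm_move(fen: str) -> str:
    # Placeholder for asking a model "what's the next move?"; it just invents
    # something, which is roughly the risk with an unconstrained model.
    return random.choice(["e2e4", "e2e5", "a1a8"])   # only the first is legal here

suggestion = get_llm_move(board.fen())
move = chess.Move.from_uci(suggestion)
if move in board.legal_moves:
    board.push(move)          # the rules live in the chess engine, not in the model
    print(f"Played {suggestion}")
else:
    print(f"Rejected illegal move {suggestion}; ask again or fall back to a chess engine.")
```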
You REALLY post a lot here about things you don't even really understand.
You say "We know it's not actually learning", while people far smarter than us are still discussing whether what it is doing counts as learning. It is frankly not too far from what we do when we learn...
And saying that Gen AI can't be sentient because it doesn't even know what it is doing is, like I pointed out before, a really bad argument, because we don't know if it understands... and you can't say it doesn't understand if you don't start by trying to define understanding. I'll bet that for quite a few definitions of understanding you could put out, I could use that same definition to logically conclude that humans can't understand. We don't know how Gen AI actually works under the hood, so we can't really answer a lot of the questions about generative AI that we might have...
This message itself, I could argue, proves that you are incapable of understanding. You are generating quite a few arguments, but most of them have nothing to do with evidencing whether generative AI is capable of learning or understanding. You are saying "It is just generating the thing that comes after the rules of chess" (heavily paraphrasing), but... that isn't really far off from what humans do. We take input, a lot of input, and we generate output, a lot of varied output... In terms of the general concept, there isn't really a big difference between neural AI and humans, outside of the human brain being focused on continuous input and output, and changing the weights of its nodes as it goes along.
There is nothing that defines understanding that you can attribute to a brain that you also can't attribute to a neural AI built similarly enough. Understanding is an emergent property, and emergent properties usually have this really annoying quality of not being clearly definable... It's like defining "going fast"... Are you going fast at 5 mph? What about 10? What about 15? What if you are going 500 mph? But what if you are a rocket? Then 500 mph isn't really that fast, so are you fast at 1000 mph? The same goes for understanding.
The title of this video has too many abbreviations; other than that, really good video.
This. "Why generative AI characters in role play games are bullshit" isn't that much longer, and only has one abbreviation - and it's the one abbreviation that in this context is actually completely meaningless. Yeah yeah, "generative machine learning systems" is harder to say, but I hate the use of "AI" as a buzzword. Please stop taking cool technology words and making them sound like sleazy garbage. These days talking about cryptography sounds icky, even if you aren't talking about bitcoin or blockchain.
I fully agree with what you said at 17:48, they show off art because it's subjective. Hallucinations don't always make art worse, in fact they can make it better and more novel.
Underrated as hell. Take my comment.
Y GAI RPG NPCs R BS!
At 4:00 the explanation of how image generation works is highly inaccurate. An image generation model does not contain a database of images, doesn't search through anything, and doesn't "remix" images. This is a very commonly repeated description, one that I suspect originated in a good-faith attempt to simplify the concept, but it leads to a lot of misconceptions about these models that have big implications for arguments further along in the topic. As a result, we keep seeing this description, the same misconceptions, and the arguments that result from them repeated again and again. This doesn't mean that some of the conclusions aren't still valid, but the supporting arguments need to be reworked in light of a more accurate understanding of the technology.
I'll try to give a simple explanation that is more accurate than the video, though it may itself be overly simplistic in its own ways. To train a generative model, one takes a database of "training data" that has been tagged with descriptions and feeds each item into the training program. The specifics of what the training program does vary depending on the type of model, but the basic idea is that the system "looks" at the contents of the input data and adjusts a massive array of values (known as parameters) in some way dependent on that input. Each input causes that massive set of parameters to change just a little bit. None of the data from the input is actually stored in the model though; the input just causes the training system to tweak the parameters. After running thousands or millions of inputs through the trainer, you have a very big set of special values in your parameters that constitute the "model". Later, another program that works basically like the trainer in reverse (sometimes it's literally the same program) takes some randomly generated data, then starts modifying it based on the parameters in the model until it produces an output.
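That "each input tweaks the parameters a little" idea is basically gradient descent. Here is a deliberately tiny sketch, a one-parameter model that has nothing to do with real image generators, just to show the shape of the process: examples come in, each one nudges the parameter, and then the example is thrown away.

```python
# Toy sketch: training nudges parameters; the training data itself is never stored.
import random

# 1,000 "training examples" following a simple pattern: y is roughly 2x
training_data = [(x, 2.0 * x + random.gauss(0, 0.05))
                 for x in (random.uniform(0, 1) for _ in range(1000))]

w = 0.0      # the entire "model" is this single parameter
lr = 0.1     # learning rate: how big each tweak is

for x, y in training_data:     # each example nudges the parameter a little...
    error = w * x - y
    w -= lr * error * x        # ...and then the example is discarded

print(round(w, 2))  # ~2.0 -- the pattern was reinforced, but no example lives in the model
```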
Some key insights: The model doesn't contain any of the data used to train it. In fact, the file size of the model is orders of magnitude smaller than the combined input data; it literally couldn't contain it. When multiple pieces of training data are similar, their effect on the parameters will be similar, and this "reinforces" the learning of those similarities. Over-reinforcing something can lead to a model with a bias. This could be something like generating white people more often than black people because there were more white people in the input images, or it could lead to things like reproducing watermarks from stock image sites. The model isn't copy/pasting the watermark in a case like that; the issue is that so many images with that mark were fed in that it over-reinforced the idea that the watermark is what images are supposed to look like... so it makes images with it. It's no different from images of faces teaching the model what eyes look like, but in this case it's an undesirable outcome. Regardless, the inputs aren't encoded into the model, they just have a slight impact on the parameters. You could extremely over-reinforce a model until it only output nearly identical images, but it still technically wouldn't contain the image. It could absolutely constitute plagiarism, but it's never literally "copying" anything.
The model also has no idea what anything actually IS. This is extremely important to understand. It doesn't know what a "cat" is. When an image of a cat labeled "cat" is fed in to train it, it affects the parameters in some way, such that when you ask for "cat" it will output something based on those parameters. But it doesn't understand the words or the pictures; it only uses them to either update its parameters, or uses its parameters to output a picture that goes with the words. The same goes for text models like ChatGPT: they don't know what anything means either - they just predict what text "should come next" based on their parameters (200 billion of them!) with no real context for what any of it means. Just like an image model, an LLM is trained on a bunch of data, but rather than storing it to use later, it just tweaks its parameters a little at a time as it "reads" until it has a set of parameters that, when used to generate, will create text that "looks like" something a human might write.
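For the text side, the crudest possible way to see "parameters that predict what comes next" is a bigram table. A real LLM is enormously more sophisticated (learned parameters instead of counts, a huge context instead of one word), but this toy shows the same basic move: given what came before, emit a probability-weighted guess at the next word, with no idea what any of it means.

```python
# Toy next-word predictor: counts which word follows which, then samples.
from collections import Counter, defaultdict
import random

corpus = "the cat sat on the mat . the cat ate the fish .".split()

# "Training": tally what tends to follow what.
followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def next_word(prev: str) -> str:
    words, counts = zip(*followers[prev].items())
    return random.choices(words, weights=counts)[0]

# "Generation": nothing here knows what a cat is; it only knows what tends to follow "cat".
word, output = "the", []
for _ in range(8):
    word = next_word(word)
    output.append(word)
print(" ".join(output))
```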
That is to say, an image generator can not draw a cat. But it can create an image that looks like a cat. An LLM can not write a poem. But it can create text that looks like a poem. If either does a good enough job, then the result might be indistinguishable from the real thing. I'll compare this to imitation almond extract: it's actually the same chemical compound that gives it that flavor; one is made inside a plant and the other is made inside a lab, but it's the same stuff once you remove any impurities, and the lab stuff is just as safe as the "natural" because it's the same chemical. But as anyone familiar with the matter knows, they don't actually taste identical - because the natural almond HAS impurities, and they add something to it, something hard to quantify and hard to reproduce. Something you could easily miss, but something that just makes it more real. (Which is not to say that's always better; the "fake" stuff won't set off allergies, so...)
While you have a much better understanding than the video, your explanation is also not entirely correct.
While most genAI models don't know what a cat is, some can. For example, some models consist of a VAE for images and one for text. The idea is to use image-description pairs to couple the latent spaces of the image model and the text model to one another. First, you try to force the text and images to map to themselves, and you add loss terms to encourage them to map to one another as well. Some success was found with this using VQ-VAE, if I recall correctly.
Then, you can generate random images by randomly sampling the latent space. Alternatively, you can use the text encoder to calculate a value from the latent space and then introduce some random noise.
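For anyone curious what "add loss terms to encourage them to map to one another" even looks like, here is a back-of-the-envelope numpy sketch of that kind of objective. It's purely illustrative (toy linear "encoders", random data), not VQ-VAE or any specific paper's setup; real models add a lot more machinery on top.

```python
# Toy sketch of coupling an image latent space and a text latent space:
# each modality should reconstruct itself, and their latents should agree.
import numpy as np

rng = np.random.default_rng(0)
D_img, D_txt, D_lat = 64, 32, 8                    # image dim, text dim, shared latent dim

enc_img = rng.normal(size=(D_lat, D_img)) * 0.1    # toy "encoders"/"decoders":
dec_img = rng.normal(size=(D_img, D_lat)) * 0.1    # just random linear maps here,
enc_txt = rng.normal(size=(D_lat, D_txt)) * 0.1    # trained networks in reality
dec_txt = rng.normal(size=(D_txt, D_lat)) * 0.1

def loss(image, text):
    z_img = enc_img @ image                              # latent from the image
    z_txt = enc_txt @ text                               # latent from the caption
    recon_img = np.mean((dec_img @ z_img - image) ** 2)  # image maps to itself
    recon_txt = np.mean((dec_txt @ z_txt - text) ** 2)   # text maps to itself
    align     = np.mean((z_img - z_txt) ** 2)            # ...and to each other
    return recon_img + recon_txt + align

image, text = rng.normal(size=D_img), rng.normal(size=D_txt)
print(loss(image, text))  # training would push this down over many image-caption pairs
```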
And there’s a myriad of other strategies out there. Too many to encompass in a youtube comment
It is also arguable whether or not an autoregressive model like ChatGPT actually understands the language.
I’d recommend looking up distributional semantics and specifically the distributional hypothesis. A popular quote for it is “a word is characterized by the company it keeps” and is the foundational principle behind models like word2vec.
In other words, one might argue that knowing what words could or are most likely to appear next in a sentence is the same as understanding the sentence.
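To make the distributional hypothesis concrete, here is a toy sketch using raw co-occurrence counts rather than learned word2vec embeddings; the corpus and window size are arbitrary, but the "a word is characterized by the company it keeps" idea is the same:

```python
# Represent each word by its neighbors, then compare words by cosine similarity.
from collections import Counter, defaultdict
import math

sentences = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the dog ate the bone",
    "the cat ate the fish",
]

cooc = defaultdict(Counter)
for s in sentences:
    words = s.split()
    for i, w in enumerate(words):
        for c in words[max(0, i - 2):i] + words[i + 1:i + 3]:  # +/- 2-word window
            cooc[w][c] += 1

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in set(a) | set(b))
    norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm(a) * norm(b))

print(cosine(cooc["cat"], cooc["dog"]))   # high: they keep similar company
print(cosine(cooc["cat"], cooc["bone"]))  # lower: less shared context
```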
From a practical standpoint though, you are right.
@@Foulgaz3 You are correct, but that's an overly philosophical way of thinking about it, and I would argue it obfuscates the point. Like I said, my explanation is by nature overly simplistic itself because I'm trying to use terms that most people stumbling across these comments could understand. There are many stages to the process of training or generating, and each type of "AI" does things differently. But to understand the generalities, those implementation details aren't important. Anyway, I contend that "knowing what a cat is" is not merely a matter of being able to associate the word "cat" with an arrangement of shapes and colors. Yes, there is a complex system for mapping words to images that allows an image generator to produce a "cat" when asked for a "cat", but that doesn't mean the generator knows what a cat is. Just because something can reliably identify an image of a cat or produce an image of a cat doesn't mean it actually understands the concept of a cat. Furthermore, just because an LLM can describe a cat doesn't mean it has a fundamental understanding of it.
"One might argue that knowing what words could or are most likely to appear next in a sentence is the same as understanding the sentence." Sure, and one might argue that a child playing with LEGOs has an understanding of architecture. But one would be wrong.
Understanding is not mere association. Understanding is not linear. Understanding isn't even the ability to produce an accurate answer. Understanding is something far more complicated. To understand a cat, you would need a multi-modal model with many trillions of parameters capable of processing language, sound, texture, 3d space, movement, mechanics, physics, and a dozen other modes trained entirely on cats. To understand a cat, you need a "cat model" that encompasses all the ways that one could observe a cat, all the ways one could depict a cat, all the things there are to know about cats, etc (EDIT: This comes across as me saying it needs to know literally everything about cats, when what I mean it needs to be able to know all (or most) of the KINDS of things someone could know about cats). And if you managed all of that, it still wouldn't be AGI because it only knows about cats 😅
That doesn't mean it can't make really good pictures of them. That doesn't mean it can't write a poem about them. But at the end of the day, the model doesn't understand.
@@Lord_zeel Perhaps, but you’re not exactly approaching the concept scientifically. If that’s your intent, then fair enough. Otherwise…
Can you prove that understanding is not mere association? I mean that’s what happens when people become experts in a domain. Neural associations become reinforced and strengthened.
Why would you need a multi-modal model with trillions of parameters entirely focused on cats? To be frank, I find that idea totally ridiculous. If that were the case, no one living or dead could ever have been said to understand the very concept of a cat. At that point, the definition of "understanding" is no longer useful.
But let's examine another aspect of that argument: the need for a multi-modal model. According to that logic, someone born blind is simply incapable of truly understanding anything, because they will never know what anything looks like. Or there are people like myself without a sense of smell. Even someone with all of their senses will never see infrared or gamma. Again, this definition of understanding ceases to be useful.
Personally, I would consider understanding to be multifaceted and continuous in nature. Perhaps a painter might understand some of the anatomy of a cat and how to replicate one onto a canvas. A vet would understand the cat’s anatomy and physiology, but might not understand how to paint one. Defining understanding as a state in which you know everything that is possible to know about a concept doesn’t seem particularly useful.
And, as someone who does AI research professionally, if you create a model with trillions of parameters, you’ve probably done something horribly wrong. There are generally exceptions to every rule, but that’s a good sign that there’s probably something better you should be doing instead. That goes for models like gpt-4 too
@@Lord_zeel to boil it down little further, yes, one might argue that a child playing with legos has an understanding of architecture, and they’d be *right*.
One could also argue that a senior architecture student has an understanding of architecture, and they’d be right too.
But it's also fair to say that the child's understanding would likely pale in comparison to the architecture student's.
On the other hand, both the painter and the veterinarian can be said to understand the concept of a cat. Even so, their knowledge can’t be compared in the same way as the child and the student because their understanding differs in type, not degree.
You don’t need to possess all-encompassing knowledge of a topic in order to say that you understand it.
I hate that i understood every acronym used in this title immediately..
I think people tend to veer off into one of two extremes when it comes to understanding generative AI. The first is thinking we already have true human-like or even superhuman intelligence; we don't. The second trap people fall into is thinking, "okay, look, the AI can only generate what's in the training data." While that is technically true, the problem with this interpretation is that it's largely how humans do it as well. True creativity is extremely slow and rare for humans; the vast majority of work is derivative. You joke about the AI not being able to come up with an elephant, but that's also the case with humans. The difference is that humans have seen so many things from so many categories, humans are much more efficient at learning from this "training data", and humans have the ability to tell when something is a hallucination.
What impresses me the most about current generative AI is how human the mistakes it makes are, but you need to interpret this in the right context. The Google AI telling you to put glue on pizza is a bad example, but if you look at AI art, it often makes mistakes you would actually make while drawing; the only difference is that you quickly catch those mistakes at the draft stage, while the AI always fully completes the image.
A good way to put it is that current generative AI is more like "artificial instinct": anything humans can do without thinking, with just "muscle memory", the AI can do extremely well, even to a superhuman level, but it still lacks the very conscious reasoning ability we have.
That being said, what I think of as a generative AI NPC isn't one where you can just start typing into a text box to speak with it. Instead it will still have human-written scripts, but the AI is instructed to, for example, "say this line but annoyed, because it's the 5th time you're saying it", or "say this line but to a lover" because the player is romancing this NPC.
I really hope GAI completely fails especially when it comes to media creation and we don’t have to worry about this stuff anymore
Lol, my first thought when reading the title was that you don't like gay chatbots
At best GAI games will live in their own bubble like Virtual Reality games.
There is no bubble, because it has to feed off of something. There are always going to be massive implications in any implementation of GAI
Good video
Hallo blaze im still thinkin if british prime
Sounds like the guys who said we shouldn't use lightbulbs in the streets
Me when my family used blockchain technology to sustain me because I live in a country with a dead economy: 😐
That's a symptom of a much larger problem. Dead economies are the tech bros' playland; they can exploit your work for whatever ends they want. It's why we still have homelessness and why they'll fight tooth and nail to keep it: desperate people will consent to whatever for food
Im edgeing to your pfp
You are acting like it's going to stop getting better. Why?
it's like someone seeing Wolfenstein 3D gameplay and saying "3D in games is bullshit."
The technology is in its infancy, of course it sucks ass. Rome wasn't built in a day and neither was any gaming innovation.
Considering it basically didn't exist five years ago, and we were still using Cleverbot? I mean, DOOM was only 30 years ago, and now we have true 3D, physics, and more than 512MB of total system storage.
Exactly. I'm the lead dev for an AI project and a lot of what he says in the video seems to be based on where AI was at like two years ago. He seems to be assuming it's already at its peak and won't continue to get better. Not only that, a lot of what he talks about comes down to how the developer uses it, not the tech itself.
One example of AI having a legit use: someone made a mod for WoW where AI generates spoken audio for the text. Honestly, it even gave me a sample of what it might be like to be deaf. I've heard of the issues that come from learning differently, only being able to learn via text or hand motions, without any audio. When I listened to the AI reading out the text, I picked up on things in the classic-version Tauren text that gave me a different impression of the Tauren. It never sunk in before that the Tauren as a race don't hunt just for food or to limit overpopulation, or even as tests of skill for adulthood that often double as "this predator is too big and strong," but flat out for sport at times. Or at the absolute minimum, I left with the impression that they may hunt purely for sport at times, something I never got from just reading the text in the original game.
Your chess example is very bad and doesn't really tell us anything new about AIs. First of all, if you ask a human to do this without writing anything down, the human would often fail and make the same mistakes, because intelligent agents have limited working memory. Although it may appear as if ChatGPT should have plenty of memory (it's running on GPUs with like 4tb of memory, and the chat log is presented to you right there), that's not how ChatGPT operates. It isn't looking at the chat log with you; it only gets fed this information (likely in a reduced form) the instant you ask for a response. It has to read through it and pretend to continue the conversation, but really, the ChatGPT agent up until that instant had no idea what you were talking about before. You seem to think giving it instructions will somehow steer the AI in the right direction, but that's just you not understanding how it works; you cannot change the AI in any way whatsoever, unlike a human. You cannot ask it something and then expect the AI to reflect on it and change its behavior, because again, every time you ask for a response it's the first time ever for the AI: it can read what came before, but it has no memory of or influence from what came before.
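For anyone who hasn't seen it spelled out, here is a rough sketch of how a chat front-end typically works; generate() is a made-up placeholder rather than any vendor's API, but it shows where the "memory" actually lives:

```python
# Toy sketch: the app stores the transcript and re-sends all of it every turn.
# The model itself is stateless -- it just continues whatever text it is handed.

def generate(prompt: str) -> str:
    # Placeholder for an LLM call: a real one predicts a continuation token by token.
    return f"(continuation of a {len(prompt)}-character transcript)"

transcript = []  # the conversation "memory" lives here, in the app, not in the model

def chat(user_message: str) -> str:
    transcript.append(f"User: {user_message}")
    # Every turn, the model sees the whole history as if for the first time.
    reply = generate("\n".join(transcript) + "\nAssistant:")
    transcript.append(f"Assistant: {reply}")
    return reply

chat("Let's play chess. I'll open with e4.")
chat("Now knight to f3.")  # the model never "remembered" e4; it just re-read the transcript
```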
We also already know ChatGPT has no iterative planning/reasoning capabilities; it only has one operating mode, an instant response. It can't talk through a long process to itself; it can only read the input and reply once. This is why it can't do high-level math or chess games: it is only good at tasks a human expert would be able to do instantly without thinking deeply about them.
As for that Chinese room argument, I guess it may appear convincing to people unfamiliar with philosophy, but really it's a pretty bad argument, because the simple answer is yes, the Chinese room does understand Chinese. And yes, if you were to write a program that functions the same way as a Chinese room, it would also understand Chinese. In fact there has been research into current chatbot AIs that shows exactly this; for example, the AI has an internal, graphical representation of where all the cities are in relation to each other, even though the bot has no graphical training or output capabilities.
I get the feeling AI is going to need something similar to human school. I have a feeling we're essentially teaching AI to behave and produce output in a way similar to what you would get if you had a 6-year-old human go onto the internet and try to research answers to questions. My understanding with art generators is that someone basically has to go in and label images as "horse" or "wolf". At least some furry art image training distinguishes between "furry" as toonier and "anthro" as more realistic. It's also clear that Loona is obnoxiously popular in furry art, given how many wolves it produces with red eyes, or the proper yellow with a red sclera for Loona.
Idk man, Skyrim's Herika mod and the chat-gippity-powered NPCs one are pretty good too. I think video games are the best use for AI that actually makes money. They could make a single-player game call out to a server to power the NPCs and make them act more like real people, just like the Skyrim mods do. That's certainly double-edged, but it's still pretty interesting just as a concept. I also don't care about artists, it's not a real job and it couldn't have happened to a better group of people. Overpaid NPCs that actually breathe air and constantly make things harder to produce; getting rid of them for simple things is completely logical and was a natural progression of technology. Art should be a hobby, it's about time people start producing things that actually create value.
Edit: Your arguments had nothing to do with video games. Who cares if an AI hallucinates in a video game? Video games have had bug patches for like 2 decades now and all it takes is an update.
"I also don't care about artists, it's not a real job and it couldn't have happened to a better group of people"
marvel slop consoomer lol. also imagine what it would mean for video game preservation if every single player game depended on a remote server that's gonna be shut down in 5 years
@@cool_bug_facts I hate marvel movies. I hate movies and Hollywood just in general. I haven’t watched a single TV show or movie in years. I play games, do some hobbyist level programming, study cybersecurity, 3D printing, I’m a HUGE Linux enthusiast and I also enjoy firearms, working on cars (ex auto mechanic), kayaking and fishing. I mostly fill my time with reading books about hacking, programming, going to school, modding Bethesda games on Arch and being a dad. I haven’t been to the range or gone fishing as much as I used to, but I certainly don’t have time for garbage Marvel movies; I’m studying for a degree that actually gets me somewhere. 😉
@@cool_bug_facts also to that last point, hence why I said it was a “double edged sword.”
Edit: Artists are part of the same group of people who told someone who had been making refrigerators for 30 years to "learn to code," so watching them go through the EXACT same thing, while thinking their wholly insignificant existence was actually infallible, is so unbelievably satisfying that I can't even put it into words. 😂 Literally every arts degree holder I know has been showing people their snatch on OnlyFans since their student loan payments kicked in 6 months after they graduated.
"I also don't care about artist" bait used to be good man
I listened to a lot of your rambling and I still don't get the point...
Don't replace actors' voices with generated voices? Why? Because there won't be any new actors? That's so ridiculous. What about animation? Don't replace actors' appearances because there won't be actors anymore :D
The fact that there are scams like NFTs, or things that don't go anywhere, doesn't mean that everything is equally bad.
Your whole speech is the usual fear-mongering discourse I've heard about every new technology. I still remember the articles claiming people would use the web to learn how to make bombs, and that journalist who printed out an e-book instead of reading it on an e-book reader.
What's the difference with more traditional media and information sources? Do you trust every book and every website?
Why would you blindly trust generative AI?
Do you expect a website about cats and dogs to have relevant information about elephants? What about someone who trained all its life to draw cats and dogs? Do you expect them to be good at drawing something they never heard about?
Do you expect Websites to make up their own information instead of learning from what's already available?
All of that is absurd... generative AI is just another tool. It has limitations, but it has already proven to be useful when used in a smart way. I don't know if it will have applications for NPC dialogue, but neither do you.
People who complain about public datasets are effectively arguing that only major corporations, with their own resources and the ability to hire people to make private datasets, should be able to use AI.
All this and what they say about blockchain without actually appearing to know much about it at all. My brother works on stuff that never gets advertised but is novel and only really possible because of blockchain. I barely understand a fraction of it when he explains it, but the stuff I do understand is very interesting. My summary of it would make an absolute mess of an explanation. Anyone wanting to know more about it to form their own opinion should really look into the details of how it works, and not just at what is presented by people exploiting it.
@@youcantbeatk7006 What datasets are you talking about?
@@keithwinget6521 What is your brother working on, that was only made possible by the blockchain? Drug trafic money laundering?
@@IronFreee Models like the collections of images they use for image generation or the collections of texts they use for text generation. People complain about "stealing" and say that it should be illegal to use generative AI with media you don't own the copyrights for. But with that logic, any major corporation like Disney or Hasbro would still be able to build datasets from the decades of artwork they own, and they could just afford to hire people to develop content for their datasets, so applying copyright restrictions to AI dataset collection would only hurt indie creators.