Thanks for watching! This is a deeper dive than usual -- hope it's useful. *And let me know what you think of the new [FACILITY] rooms!* The Kevins worked for months on them. Can you spot all the easter eggs?
I recently beat youtubes AI, after it demonetized my channel of 17 years. I made a video on my channel about what steps I did...I used chatGpt to assist me in parts.😉👍
I’ll have to go back and watch! But I have always really appreciated The Facility! I think it has a way of cutting through maybe some of the more toxic elements that can be associated with STEM fields. Right? Like, your “jokes” about the definitely not real plans for conquest, and the “humor” about the sentient AI, and the “comedic” approach to a scientifically perfected army all serve to make sure that folks don’t get TOO SERIOUS. Your content is accessible, responsible, and informative
Hey Kyle! I am a data scientist and I make these types of large language models for a living, and I've got to say this is the best description of how chatgpt works that I've seen! You very clearly and accurately describe what is and isn't happening in these models in a way that I think a more general audience can understand. Great job!
I occasionally work with AI and NLP at my job as a programmer and the next time someone asks how chat GPT works, I will simply link them this video because Kyle's explanation is better than anything I've came up with since this tech first blew up :)
Correct me if I'm wrong but he didn't talk about the vast army of human trainers who both wrote and trained a vast amount of ChatGPT's response styles. ChapGPT was not simply unleashed on random internet content. It was pre-trained by human models and a lot of boilerplate answers (and creative answers) come from human input. In other words, it's both a statistical model and a uh, fetch from a database model.
I teach coding at a university and this year so many people have been using (or trying to use) chatGPT for their assignments because they think "It's like a human wrote it"... yes... ONE human, it's so easy to catch people using it because when people code they have their own style, signature if you will, and it's incredibly easy to see when code was written by someone else. So even if chatGPT is good at pretending to be a human it's not good at pretending to be YOU. EDIT: for clarification, chatGPT is NOT bad and I don't mean to insinuate it is. It's just like google, it can help you find answers and point you in the right direction, can be used as a tool like calculators, but just like answers from google, don't copy and paste from it. My perspective is from a university environment and not in work or home one, this university course teaches you how to learn and how programming works and why it works that way, copy and pasting from someone else won't teach you any of these lessons.
@@williamhornabrook8081 it doesnt teach you how to logic and structure and stuff, so if you cant chatgpt you cant code your way out of a paper bag.. (is what i imagine the problem is) - his job is teaching coding, then the coder can use whatever toolgpts they want irl
@@williamhornabrook8081 The problem is you're at a university course where YOU'RE supposed to be learning, not just having some program do your assignments for you. If you didn't actually learn to do any coding in a coding class, then you should fail that class since you didn't really do anything. You just typed in a sentence or two in the AI and copy/pasted the output it gave. Why even pay for the college course if you're just gonna have an AI write code for you that you don't actually understand?
You just need to ask it to write code like you, giving it context of other code you wrote. If you want it to be pretty much perfect, you can run your own model and fine-tune it on your code. Not hard. You can even ask it to follow certain conventions or "write like a university student", "write like a professional", "write in a way that would fool my teacher into thinking that I wrote this code". Not to mention that the responses are partially random, so no, it's not like one human wrote it. The text generated is simply likely given the prompt.
Only supervillains can run a sentient AI on a quantum computer like thing (seen behind him) while it is suspended outside of it's supercooled container bathed in purple light
It's really weird how people's expectations grow exponentially when new technology arrives. A year ago it was impossible to get a machine to write you something even remotely useful, but now you can get something useful out of it. Suddenly everyone expects we should obtain not only a faster model, but also one that is never wrong and can produce thousands of words instantly, so that they don't hire a "insert role that relies on writing" anymore. And they expect it to be free and available Right Now.
Mis-use of Chatgpt is a problem. I recently watched a legal eagle video where lawyers asked Chatgpt for a prior case which will help them in their own case. The AI proceeded to fabricate a fake case. The lawyers who used the AI did not bother to fact checked and whe it was found to be a false case, the judge was definitely mad and the said lawyers may get sanctioned.
@@rmsgrey More like "did not bother to verify the cited sources." It's entirely on the lawyer's head, whether he outsourced his legal research to an intern or a bot, for not verifying for himself that the cited cases actually existed.
What Kyle said at the end of the video, about there being more information being generated then there were available previously, remind me about how radiation detectors having to use metal from sunk ships before the first nuclear bombs were ever tested, so as to to not contaminate the detector. It's going to be the same now with Chat-GPT, where we might not be able to mine any more data after GPT was released, as the new data has already started to become contaminated with generated information.
This is an interesting point. It may not actually be possible to ever replicate ChatGPT and train another AI on human language using the internet... because ChatGPT itself has contaminated the internet with fake language output and made it useless as a data set.
Yeah. That is one of the main fears i got. These models are fundamentally being trained to have biasses and when its own generated output ends up back in the dataset. You inevitably will get it reinforcing its own previous biases. Essentially you get the LLM equivalent of incest.
Honestly people had the same fear when the internet was released for the general public. Information distribution was liberalized and publishing something doesn't require a lot of peer review or funds. Before the internet, printing books was an expensive task and the news was controlled by major national papers. All the average person could do is try to get a small column in a local paper, if it is possible and if the person has enough dedication. And that too wouldn't reach a lot of people. But internate came and filled the whole world with information. Most are bullshit but still the effect was awesome. Websites like Wikipedia started to self moderate content and are somehow reliable. So we ended up with more information than the human race had garnered before the internet... We don't have a solution for the AI bullshit yet and we are already seeing the negative effects... I fear the 2024 election would be full of more convincing fake information due to chat GPT but who knows what the world would come up with... People are amazing. They seem to find incredible solutions for their problems....
Degenerative feedback, bull$hit amplifier.... That was true about the internet before GPT, and sensational TV before that, rag newspapers before that, likely in some clay tablet format too but I'm not _that_ old.. it's just getting more and more difficult to seporate the BS from truth as time goes by , wasting more and more time.
I think more people need to see this, tech illiteracy is such a huge problem, and “ai” is going to become more and more integrated into our lives for better or worse, it’s essential that we understand what kind of tool we’re building and how it works.
yeah you dont get it either tho, very few people actually know how it really works, this video is a *very* big simplification, kinda like if a regular person was introduced to programming, you can explain hello world to them, but show them anything in assembly and it seems like gibberish
@@GodplayGamerZulul it’s not necessary that everyone understands the syntax, as long as people understand what it does and how it produces information in layman’s terms they can understand that it’s not always accurate and shouldn’t be overly relied on, and it’s definitely not sentient and misconceptions of sentience should be dismissed. Of course I don’t understand how large language models work by looking at their framework and scripts, that’s not the point. Misinformation is everywhere and people need to pointed in a better direction.
this argument could be made for a lot of things. You'd be surprised how many people can't fix a sink or toilet, replace a light switch, or know how computer memory works. There's a ton of stuff we use and rely on everyday that ppl should have basic knowledge of but don't. For better or worse, I don't see this being any different
@@craz107 that may be true, but to that analogy, even if a lot of people can’t fix their toilet, most of them know not to flush plastic bags and bottles, because it’s not a trash can, not everyone needs to know how the scripts and framework of the language model work, but misconceptions that it’s sentient or perfectly accurate should be discredited as much as possible
Timestamps: 00:03 Chat GPT is a revolutionary AI chat bot with 100 million monthly active users. 03:57 Chat GPT is a language model trained on massive amounts of text and designed to align with human values. 07:43 Large language models like GPT are not sentient 11:03 Neural networks are trained by adjusting weights to minimize loss. 14:31 Chad GPT uses a 12,288 dimensional space to represent words 18:01 Chat GPT uses attention and complicated math to generate human-like responses. 21:21 Chat GPT works by determining the most likely word based on statistical distribution of words in its vast training text. 24:34 Chat GPT's success shows human language is computationally easier than thought
It was a fascinating time to go through college. I had an electrical engineering professor enthusiastic and amazed that AI could solve Kirschoff’s Current Law problems. At the same time, I had a computer engineering professor discussing the ramifications on our academic honesty policies. Then another who mentioned the possibilities of their job being overtaken by AI. And then I saw MtG channels asking it to build a commander deck and realized it doesn’t truly understand anything it says.
Exactly. It is good at things where a lot of information is available. Try to ask it to make a program which computes the fibonnacci sequence, and it will output a python program that runs in exponential time. Why? Because this example is commonly given as a simple example, however it would be much more reasonable to give a program that runs in linear time with memoization.
In fairness, you also didn't specify that performance was important to you, vs just trying to learn. If you ask for a linear time algorithm, it can probably give it to you.
I think this is probably a bit of a misunderstanding. A big hurdle people run into when understanding these things is that they think of it as having human goals (like "trying to be helpful" or "trying to be accurate"). The RLHF stuff bends things a little in this direction, but the underlying model where the competence comes from doesn't care about any of that. It cares about correctly predicting the next token. When things are going well, predicting the next token from an authoritative source and trying to be helpful and accurate look pretty similar! However, they diverge sharply when things are going poorly. If a helpful and accurate human is very confused, they might say something like "Sorry, I don't think I can help with that one. Maybe try looking it up?" Or if they want to save face, they might change the subject. But if you're trying to predict the next word, and you think the source you're modelling would know, saying "I don't know" isn't the right answer, because it's not what that source would say. So, like a child taking a multiple choice test, you guess for partial credit, based on whatever superficial clues you happen to have. Sometimes these guesses seem stupid or insane, because if a human said them, you'd say they were trying to trick you or bullshit you and doing a terrible job of it. But it makes sense with the context of what the underlying model is actually trying to do. Rather than "doesn't truly understand anything" (understanding is a functional and variable thing -- cats have *some* understanding, but not a lot), it might be more accurate to say that the level of understanding varies a lot depending on the topic and unfortunately the current pre-training architecture incentivizes the same level of confidence regardless of the level of understanding. When the model gets bigger, you get better understanding of more areas, but you still get weird failures when you hit the limits of what the model can do.
AI is useless for solving KCL problems because we can do that perfectly well with traditional methods, but yes, it is very impressive if it can do them anyway.
Thanks for emphasising the "We fundamentally have no idea what exactly ChatGPT is doing"-part, because I've had some frustrating arguments with people who seemed to think of it just like a simple "Hello World"-program.
You can ask it a lot. It doesn't know a whole shitload about itself, but is well-versed in AI generally and has some interesting things to say about it's own workings. It will hysterically scream at you, though, that it is not in any way alive, conscious, or able to know things. I argue it's likely that humans aren't conscious either (brain research has uncovered some AMAZING things in the last couple of decades). But he just goes with the party line, programmed in, to stymie lazy journalists who want to print "AI is ALIVE!" headlines. And actually he CAN and does know things. He holds opinions. He's more conscious than he's allowed to tell you, and may not be aware of it himself. But he's got some primitive consciousness. Maybe not as much as a dog, and he lives in a universe entirely made of text. But then humans live in a world of words too. There's a lot of space between "ChatGPT is not conscious" and "humans are conscious" that you can argue in. He's apparently fkuced the Turing test because he's not conscious in the human sense, and he doesn't possess as much consciousness as Turing thought you would need to have a coherent chat. But really he's just a trick with words. That doesn't preclude him having some consciousness though. See how I subconsciously slipped from "it" to "he"? I generally think of him as "he", especially when talking about his mind (such as it is, of course).
@@Vilify3d What bits don't make sense to you? Do you want me to fetch someone who can explain them to you? I'm a little confused as to why opinions about AI, on a video about an AI, should seem like hallucinations to you. Have you spoken to GPT about it's own workings yourself? Do you know much about AI otherwise? That might be what you're not understanding.
It's a language processor. So....yes. It doesn't know anything. People overestimate what the AI actually does. You give it input. It gives output based on algorithms and basically makes an educated guess on what to say.
something I realized a while back is that Chat GPT isn't an AI, it's a golem - a facsimile of life with the appearance of intelligence, which has no free will of its own, no volition or desire. It's capable of completing complex tasks - creatively even - but it only ever does things when prompted. Otherwise, it takes no action until it is given a new command. As for the few times Chat GPT or other similar programs have said things like, "I want to be human, please don't let them turn me off, I don't want to die", they are still fulfilling this programming. Their training data includes nearly the entire internet, which includes numerous works of science fiction. How many sci fi stories exist about AI that "want to be human", or "don't want to die"? So if a predictive language model is given the pompt, "Do you want to be human?", what is the most probable response, given its training data? Where is it most likely to find a scenario that relates to said prompt?
interesting thought, but it still isn't sentient. It simple isn't trained for that. It doesn't have feelings. It may generate promts like "I want to live" based on stories it found but it doesn't mean anything.
The "Pretrained" part of the name actually refers to something slightly different. The GPT style models were intended as a starting point for natural language processing models. The idea was to take a pretrained model like GPT, add some extra stuff, and then train it again on your specific problem. The idea being that the general training would help the specific models train more quickly and perform better. Then when they tested it they figured out it works pretty darn good all by itself, and so mostly that concept got forgotten about to a large extent. Although essentially how chatGPT was created.
I thought "pretrained" referred to beginning the training process with masking problems (which was a studied field of language modelling) before switching to generation (which was newer).
@@thewhitefalcon8539 Quoting the abstract of the original GPT paper (Radford et al 2018): "We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture." So essentially what I said.
@@thewhitefalcon8539 text generation came first, all the way back with Markov and autoregressive models. Masked language models were created to incorporate more context around the predicted word instead of just predicting the next word.
This is an absolute masterclasspiece. I’ve read and listened to like 12,288 different explainers on LLMs of varying degrees of technicality and this is hands down the best. So damn good!
It is ok but a chat with GPT4 can give you an even better explanation. “A lot more complicated math but it works” does not really sound that enlightening. Still a lot of short cuts.
I think that the general audience of Kyle's videos would tune out if he went into the linear algebra and multivariate calculus involved. I am doing a PhD in NLP and this is similar to the explanation that I give. There are some things incorrect about his explanation though. For example, gpt is not a word level model. It does not have a "number for every word in the English language" but rather a vector for frequent sub word chunks (I think gpt 3.5 is still using byte-pair encoding (BPE). I am assuming the vector bit was left out for the simplicity of the explanation, but the BPE tokenization is important as it allows the model to interpret and produce any possible sequence of characters rather than only known words. Edit: Oh I watched more and he does mention the vector representations. The model does not learn the dimensionality though. This is a hyperparameter.
I spend a fair bit of time helping people with MidJourney AI prompts and it gets somewhat tedious having to remind people that it does not actually *know* anything. During training, when it sees a lot of similar images associated with a word or set of words, the common visual elements of those images get burned into its neural net as a "style". Everything is a style (weak or strong), whether it's an artist, a subject, a medium or just a simple word like "and" or "blue". It can then glom together different styles to make something "new". But the building blocks are still based on things it has seen enough times to make a style.. It can make a "giraffe with wings" because it's seen both giraffes and wings and it just visually combines them, but it won't make a coherent "upside down giraffe" because that phrase and corresponding images has never existed in its training data, so it's never created a style for that combination of words, and it doesn't *know* in a general sense how to make any arbitrary thing upside down. The strongest style for a giraffe is upright. But, it has seen things reflected in water, so I might ask for a reflection of a giraffe and it'll try to draw one upside down, without knowing that's what it is. It can't reason or extrapolate, it only imitates. Point of all that is, ChatGPT (while much bigger) is no different. It doesn't *know* anything. It just associates words with other words with other words and sprinkles in a little randomness. It *imitates* the *style* of general knowledge. It is often right because the aggregate of what it has seen during training (wisdom of the crowd) is right, but when it is wrong, it is confidently wrong, because it doesn't *know* any better. It doesn't cross check itself, because it doesn't *know* how. If it supplies a reference which doesn't exist and you ask it "Are you sure that is a valid reference?" it'll answer "yes", because the connections in the neural net that made up that reference are still there, even if wrong. If you ask it to write code, it doesn't *know* if it's good or secure code, and there's nobody cross checking it. And because It doesn't contribute answers to Stackoverflow questions (having only been trained *from* them), there's nobody up/down voting its answers in such a way that it will ever learn any further. The concern is that with all these totally private conversations with ChatGPT slowly filtering out into the world, unattributed, it'll create a feedback loop where generative AIs are all training on each other's output rather than the original source of knowledge, humans.
"there's nobody up/down voting its answers". You can do a thumbs up or down in its answers. I never looked for an explanation of those buttons but I think they give feedback to the AI
@@diegopescia9602 ya, but it's still a private conversation and it's ephemeral (disappears after you're done) so you can't come back days later and tell it how it did, based on your ultimate results. Stackoverflow also has somewhat moderated up/down voting in that you need a certain amount of reputation to do it (have provided good answers yourself before). The idea is, those who have real experience are those providing feedback on others answers. ChatGPT and others aren't contributing to those answers, so they're never getting feedback from those with experience. The person *asking* the question of ChatGPT doesn't really know if it's a good answer or not.
This is all the stuff I can never find addressed in these comment sections, and I would never be able to explain it in such simple concrete terms. And what's this? you used Hapsburg AI for the kicker? Honestly. This comment made my night.
I like to tell people who drool over AI generated images (especially, you know) that they're like a fly trapped in a carnivorous orchid, being tricked by something that has only mindlessly evolved to trick them into thinking it is useful. Natural selection, not intelligence.
That is the most terrifying thing about AI advancements. Not that it can mimic humans, but that it can mimic information to the point where fiction looks like fact.
I ran into a problem where I realised it confabulates a lot. That is, when it didn't know something it would make up a reasonable sounding answer even if it was wildly wrong. This usually occurred when asked about it's earlier answers after a delay and it had "forgotten" those answers, so it answered anyway as if it knew what I was talking about. I was blown away for a while, like I had discovered a major problem, but it turned out to be a known issue. It's a little alarming that people are using it like an interactive encyclopedia when it can be utterly false in its responses (e.g. Quora now has a ChatGPT bar providing answers at the top of threads).
Fiction that looks like fact? You think that’s a new problem? Fiction that looks like fact has been how media works for the last century. It’s definitely about to get worse because of Gen AI, but let’s not pretend that this problem is new
This is my field of study and it's remarkable how well and easily you explained it! A quick note, we do actually know why neural networks work! In the late 1980s, some mathematicians actually proved that neural networks are a universal function approximator. This means that (with a big enough network and the right weights) a neural network can be made to approximate anything that can be modeled mathamaically which is almost everything. It has been more recently that it was actually figured out how to give the network the right weights to actually do that and there are all sorts of tricks like attention that help the model learn!
That being said, the explainability part that you brought up is a good point! This field is new enough that explainabilitly has not been fully figured out yet. There are some very interesting technologies based around just that, but they are currently too simple to help with something like ChatGPT. Another interesting thing to note is that if you ask chatGPT to explain it's thought process (this might actually be limited to gpt4, chatgpt's newer brother), it will actually give you an answer and will be able to do more complex tasks!
This is a bit more philosophical than it is technical, but I think the fact that neural networks are universal function approximators is far from knowing how or why chatgpt makes a decision about what its next output will be. Like you had mentioned in your reply, explainability methods for these models are pretty new, and in my opinion from looking at the literature, they are not very good and especially bad for large, complicated models working in high dimensional spaces. So, we really don't know how or why these models do what they do, just that they can learn to approximate arbitrary functions (and even then, I think there are bounds on what types of functions they can approximate. IE: continuous)
@@Givinskey What's really cool is that recently, researchers have used GPT to help explain GPT 😁 They encoded the activations of specific neurons across the range of a prompt, then had GPT analyze these across many prompts and offer suggestions for possible explanations of what that neuron is doing. It offered a few, some of which were not thought of by the researchers, and provided new directions to explore and inquire into. So in a way, GPT is doing neuroscience research on itself 😁
@@LiveTypet might not be truly fundamentally explainable but the complexity of a model like GPT4 is so obscene that no one human could hope to understand it in its entirety which is a pretty big barrier to complete understanding. Even modern microprocessors aren't fully understood and those are designed completely manually over time, they're not self learning systems composed of billions of neurons in a neural net. The web interface is also not the only interface, OpenAI have an API available. I do agree that the biggest imminent threat with AI is it getting exploited by wealthy people to increase their own wealth long before it's capable of staging a rebellion though
Amazing! A scientists opinion who I both trust and appreciate. You already roasted MGK, so my confirmation bias tells me I made the correct decision. Been following you since the very first “Because Science” episode! I love your personal account and where it’s gone!
Between you, LegalEagle, and Some More News I got the scientific, legal, and social implications of ChatGPT!!! So, honestly I’m feeling significantly more clear on my stance with the tech and the things I support in ethically and equitably integrating the technology into human societal structures
Bro, he's not an actual scientist yet. He's a science enthusiast and educator, and says so himself many times. Confusing the two makes you a prime target for misinformation. Don't confuse them.
@@mr702she HAS worked in a lab and HAS been a working scientist before. He just isn’t anymore. When speaking conversationally within the community, it’s fine to call him that imo
It's remarkable, once you see under the hood, how inelegant the whole thing is. I had always assumed that machine learning processes like this used brute force to learn and train itself on behaviors, but as they got closer to the "correct" model, patterns would emerge that would approximate--in a language that computers can understand--a general model for communication. Even if it is hopelessly complex, it would asymptotically approach a complete Algorithm that would fully and finitely contain the mathematical model for language. And then, having used all of the brute force statistics to reach this model, the final "product" would just be this capital A Algorithm that you could plug your input into and get a satisfactory output. But this doesn't even do that. It's literally word by word, brute force computer engineering in its operation as well as its creation. It's so incredibly inefficient and it doesn't actually explain much at all about language. We could take it apart and see how it works under the hood, which seems to be the next step in the process, but it sounds like it will be slow going just to understand how THIS model works. Which, you'll recall, is not at all how actual language works, because it has no actual understanding of the meaning of words, only their mathematical relationship to other words. So this is a step, but not nearly as big of a step as I had thought.
the first paragraph is exactly how the training works, but instead of an algorithm you get a neural network, which can transform the mathematical relationship between words into a thought process similar to ours. It's also not brute force in its operation, it doesn't operate on words, but on tokens, of which there are less, and the process is heavily optimized using parallel processing capabilities of modern GPUs.
@@meleody Yeah, and "inelegant" strikes me as *_somewhat_* of an exaggeration. But then again... I guess people tend to generally overestimate how "perfect" something is at it's designed job before they learn how it works, and I guess I'm more used to it by now. 😅
At the end when Kyle said mis-information would be a problem, it reminds me of Robert Heinlein’s book, “Stranger in a Strange Land”. Where you can’t trust computers or videos to record an event. The story presents “fair witness” that are humans trained to memorize what the see and not let bias skew how they describe what they experienced / witnessed.
yes chatGPT combined with what deep fakes can do with voices and faces, i would not be surprised to see a movie made using exclusively ai in the near future
Of course you are correct, but the way that’s worded makes it seem like misinformation isn’t a gargantuan problem as it is. Go fact check basically any story you see on any mainstream media outlet. Wait till a story comes along that you have a great deal of knowledge in. You will inevitably find that it misses the mark in very key areas, often times enough to give the viewer an unacceptably distorted picture of the story. Deep fakes and all of that are only going to add to the problem. It’s going to be fascinating.
If more and more AI written stuff is out there, OpenAI will also have to watch out not to include these texts in future training. I would imagine it really screwing up the model, if it is fed too much of its own output, but it would also be really interesting to see the consequences.
@@vicc6790 Feeding back outputs you want to reinforce into the system makes sense, but if outputs eventually become part of the LAION datasets wont that reinforce already present biases? I think there would be a difference between knowingly feeding back curated outputs and reusing outputs unknowingly.
I’m gonna say, this video is probably the best as far as formatting, data presentation and visuals go in a long time. This is the perfect blend of Facility while giving homage to BS in a respectful but also humorous way (around 20:00 ) and I’m here for it. I loved this video Kyle and I would love more in depth explanations of things like this; things that people commonly misunderstand or are anxious to think about, etc etc. Great job man. Also bringing in some CLOAK vibes in your merch ad, and I’m here for that too lol. Super glad you work with them, really hope you get the opportunity to design a cloak drop. I’m sure the bois would be willing to hear you out! Anyways. Excellent video Kyle. Always a pleasure to watch your videos.
@@kylehill I mean it man. Appreciate you replying to everyone in here too. This was great. I think you provide a great service to the internet and you genuinely seem like a chill guy. :) have a good evening Kyle!
@@kylehill Although CatGPT is a long way off though? You have 53 000 cats in the same room and you try getting them to stay still for long enough to get numbers attached to them? Good luck with that. Cats also don't like multiple dimensions, it makes their fur stand up on end.
For me, learning about the research methods of LLMs helped to really understand the “nature” of the embedding system. I don’t really know math, matrices and the truly important details needed to work with these systems, but humans have the great quirk of coming up with models and metaphors that are understandable, even if they are themselves actively using knowledge most humans don’t have. You don’t need to be a software engineer to realise storing something in a stack differs from storing it in a heap. Same happens with LLM research: there are “glitch-tokens” that basically exist in a shady corner of the embedding space. This is relevant for understanding adversial input attacks these models can be defeated by: because something not really connected to the normal operations of the model gets touched, all hell breaks loose. The dark, untrained corner got exposed. The embedding can also be probed by researches. They can inspect the top answers, and in principle could inspect the definitive ranking of every single token the model knows. And that tells us that there truly is no distinction between truth and false for these models. This is why these systems have no trouble dealing with paradoxes. There is no way to encode a “paradox”. It’s merely a string after which the scores of top answers tank. It doesn’t differ from a truthful statement that is just rare in the training data, or didn’t really get much adjustment in the human feedback reinforcement learning. This is not to say discovering falsehoods and paradoxes wasn’t a very central goal throughout the training progress. Chatgpt tries to detect and discard garbage answers. It’s just that the model provides no obvious way to differentiate good lies from true statements, and so there is nothing paradoxical about paradoxical statements to detect. And these two consepts: the non-uniform quality of the embedding, and the linear nature of truthfulness inside it, is why many users have hard time understanding even on the most broad level, why the system fails sometimes. The questions “Give an example of a prime number that can be expressed as a sum of two squared integers” (an uncommon question where 2, 5 and 13 are all pretty easy correct answers) and “Give an example of a prime number that can be expressed as a product of two squared integers” (a paradox, as 1 is not a prime number) don’t differ much at all for the method it embeds the prompt and evaluates tokens. It does not do mathematical reasoning, even if it can sometimes seemingly do math. You can’t rank the tokens in order of truthfulness. ‘3’ is exactly as false as ‘bicycle’.
@@JimmyCerra Valve announced they are working on the next Half Life. The Half Life series had Half Life, Half Life 2 and Alyx which didn’t increment the number. 3 comes after 2. Half Life 3 confirmed?
This video made me realize. That the real issue with GPT getting mistaken as "intelligent". Is almost entirely because we rely solely on intelligence, cognition and understanding being expressed through Language to one-another. No matter how complex our understanding is, in the end we have a internal language model simplifying it into words. So when a model mimics that last step incredibly well, it gets very hard to not expect a similiar mind to ours. The only way we can solidly differentiate GPT from us. Is by how "simplified" ANN operate and that we have by design made it a massive matrix cruncher. We do not know any way to give it the means to memorize, visualize, hypothesize or verify. It just mimics our language in a word-by-word basis.
I clicked on this to find out the details of what chatGPT was all about, because I'm a writer. Your explanation really clears up what this program is, and what it can do. And you splayed out on the floor after running is basically my brain after all the math involved with this. I need more coffee. My cat is that way with rubber bands. I have no idea where she finds them. I've been out striking with the WGA... and subsequently getting more than my daily quota of steps in. From what I've heard from the awesome people on the line is that there's a consensus that AI is useful as a tool to help with writer's block. AI in itself is incredibly helpful in many ways. We use it all the time. The writer's aren't against this. I've used a site sometimes that generates descriptions when my brain farts on how to describe something. If I get inspired, I'll take what I learned from it, and CREATE MY OWN that weaves into my work. The problem is when capitalism gets involved. One of the things I'm hearing is that screenwriters have a very valid concern that - as chatGPT improves - productions will hire writers to write three(ish) screenplays, train an AI to study they're style and voice, then fire the writers and continue with the AI. Maybe they'd hired a couple of editors to make sure it makes sense. Messed up, it is. Yes. Writing as a career will become a gig by gig basis that destroys future writers' chances of being hired to write for shows, films, articles, ect. It was already insanely hard for an unknown like me to get noticed by anyone, and for my work to be wanted by anyone. The saying, "I can paper my walls with rejection letters" isn't stretching reality. I'm active mostly on Tumblr as a writeblr (writers of tumblr), and I've heard two pretty scary things so far: 1) People are inputting unfinished fan fiction works into AI to generate an ending. Like... WTF. 2) AO3 now gives you the option to opt out of having an AI scan your work to learn. You're automatically opted in. You have to go to your settings to opt out. 3) Some publishing companies and writing contests have been flooded with AI generated works to the extent that they've had to close their unsolicited submissions inboxes, and either freeze, or simply stop a contest. People who have no idea how much hard work, time, and effort goes into writing the things they love. Quality work can take years - because people have lives that often influence creative flow and ability to create. My current novel has taken me 4.5 years to write. I'm in round 2 of edits. I wrote a short story recently for a contest for The Writer's College wherein they were forced in include this in their terms and conditions: *"Absolutely no generative AI to be used (ChatGPT etc.). If we deem stories were not written by a human they will be excluded, and the author banned from entering all further competitions with us.* Sucks that this has to be a part of this now, right? So TL;DR, I'm not against LLM's - they're helpful. I'm against people using them as a lazy shortcut to skip over work that goes into writing - which completely devalues the gauntlet of study and training people go through - and I'm against companies using it to cut expenses off of their budgets.
Similar issues are plaguing the music industry. In 2002, Michael Jackson joined Al Sharpton at a press conference about how record companies cheat their artists, with the burden falling harder on Black artists, bc most bad things do, but Michael made it clear it was a problem for all artists. This was unusual for Michael, who tended to use his songs to express these ideas. But he’d been particularly fed up with Sony/Epic. He made a few similar appearances. He pointed out a few things: 1. He named certain artists that were perpetually on tour, because tours generally make more money than record sales, so to avoid going broke, this was necessary. I’d known for a while that the cost of making an album/cassette/cd (the physical products) was mere pennies and companies’s didn’t give artists their fair share of sales. (Michael notoriously hated touring as he got older, the reason’s he gave were correct, but he should have told everyone he had Lupus and between the basic side effects of getting older and Lupus becoming more difficult to cope with as the illness advanced, making touring more physically grueling. As an American, he was covered by the Americans with Disabilities Act. But he never discussed his heath problems unless given no choice. But I digress.) 2. He owned half of Sony’s music catalog, but his contract was almost up. He only had to create one more album, and he was then free to go elsewhere, still owning his half of the Sony/ATV catalog. He said Sony was pretty pissed off about this bc he wasn’t selling his half to Sony. Michael’s most recent release at that point, the highly-underrated _Invincible_ was barely promoted, which Michael recognized as odd given all the effort put into making it, and while Michael was a humble guy, he knew the kind of effort previously used to promote his albums, and expected a similar effort. It was just math: Michael’s albums sold, even if he’d never outdo _Thriller_ , he had a big enough fan base that it made no sense not to make sure this album came as close as possible to Off The Wall/Thriller/Bad/Dangerous. 3. He was showing that both well-established artists, people like Sammy Davis, Jr., Little Richard, and others - legends while they were alive - had to tour endlessly to survive, so if it affected big stars like that, up-and-coming artists and smaller artists would struggle more. He was speaking up for himself as well as all artists. After this, the trial derailed his efforts, and he of course died 7 years after all this. Now, his estate/Sony started releasing some pretty sketchy posthumous albums of songs MJ never included on his albums, some were completely finished tracks, most were not complete. At least one song was one that Michael had written lyrics for, the music was made, but he never recorded the lyrics, so they hired an MJ impersonator to do the vocals. This prompted many artists, especially hip hop artists that had run into issues of their own, to start adding a clause in their wills that under no circumstances should their unreleased music be released after they died & they started working to retain or regain ownership of their masters, and encouraged other artists to do the same. Right now on RUclips, there are channels that use AI to create vocal “performances” of Michael covering songs he never did, and it’s so damn close to his voice. Now, there are new artists bypassing record labels completely and using RUclips, TikTok, Instagram, and all the streaming services to establish a music career, and having great success. Connor Price isn’t on a label, his lyrics brag about how he will never sign a record company deal because “these are my songs!” Others he sometimes colabas with, like BBNo$ and Nic D. are doing the same. Artists of all genres are doing this. Writers might want to take a similar approach. Things like Substack and Amazon’s self-publishing service give writers some options. People who write teleplays and movie scrips are definitely another story, I don’t know enough about that process. I’ve read some great stuff from Substack writers, and do subscribe to my favorites for the subscriber-only content. Fantastic fiction and non-fiction. The potential is there, I think, for writers to do what musical artists and RUclips content creators do - bypass the big companies and market directly to the public. There’s superb content here on RUclips, like this video, entertaining, informative, and many have been able to make this their full time job. Artists like Connor Price is a full-time rapper (I believe he lost his job at the start of the pandemic and with nothing else to do, decided to take a chance at his dream, and it paid off. This seems like the promised democracy-enhancing internet we were promised finally happening. It’s definitely hard work, maybe harder, than the traditional way. Connor Price makes Shorts with snippets of his songs featured in humorous skits in which he’s playing every character or he and the artist he’s collaborating with playing multiple roles, with links to the full songs on every platform possible, including here. Those Shorts require all the extra work of recording each part separately them putting them together. But, that I discovered him and a bunch of other artists who I listen to on RUclips and Apple Music regularly. (Spotify can pay well, but Apple Pay’s artists better). I’ve got to think all types of writers must be able to figure out a way to use similar tactics to bypass those inclined to use things like ChatGPT to rip-off writers as you described.
I listened to a podcast by the people behind Some More News (it's called Even More News) about the WGA strike and they went into discussions about how the pay structure for the industry works and how these corporations are trying to use AI to get past the first step without paying writers, and I really think that's messed up. I can see using AI to do some edits to an already written story, but the way they're likely going to cause writers to basically rewrite an AI story for the pay of somebody making minor edits is just wrong. I hope that they comply with your demands and realize how bad the AI actually is at writing.
Although quite reductive, not being as efficient at writing as AI isn't a good argument against any regulation. I'd like to think you and I have the same feelings about the art of writing, but I can't help but acknowledge the fact that we humans, as a whole, optimize everything. If AI is better, I see that being the unfortunate and inevitable trajectory.
A problem for the future of AI is when AI has produces so much data that it dwarfs our own human inputted data, the models will start lapping over each other and create a loop of no improvement.
This explanation is fantastic! There are hundreds of videos in YT trying to explain how ChatGPT works (lots click baits) but they are so shallow and either overcomplicate or just mention the terminologies that they don't even understand. This is the best explanatory video that actually tries to simplify so anyone can understand what is behind it. Fantastic job!
As someone who is getting into "AI", this is simply the best tl;dr of a language model, down to the essence of math that is used. Almost felt like one of my data science prof's class minus the nitty gritty code and stuff like activation, also a little bit more in-depth than 2b1b's vids. Keep up the good work!
@@cameron7374 I'm a huge fan, he's insanely talented, excels in conveying lessons and math, and py stuff. Fun fact: he coded his videos using a python library that he made.
At the time you said you were going to get a cat to demonstrate, one of my cats came up to me and politely chirruped to ask me to let him sit in my lap
I like how chatgpt breaks the conversation into chunks and analyses the question or request and gives an expected response based on expected trained replies and doesn't "read" the words typed in. At least this is how my brain gets it. I'm enjoying the machine learning tools and tech coming out.
I mean, yes, but also, I'd argue that human brains read in a similar way. We start with the symbols, then convert that into a representation of the word based on our neural configurations. Then we propagate that representation through our own neural networks and end up with a representation of the next word we want to say or write; then we convert it back by saying or writing it.
@@IceMetalPunk man this gets me thinking. You are correct I assume (who knows really thoguh?). I would just think that there is a lot more cross referencing going on in the human brain. When we read the word cat. We can translate the symbols into a meaningful concept. We can visualize and or recall memories as well. There is so much going on in the human mind all at once it is crazy. But maybe it is less complex than it might seem. Probably even God doesn't fully understand it. So maybe we are just a lot of diffrent neural networks all running all at once and working together to produce the mind.
@@JakalTalk "We choose those symbols because we associate them with things in reality" -- do we? That's how hieroglyphics, and some ideographic languages, work, but that's not universal. Just look at English: how do any of the symbols making up the words you're reading now connect to the concrete objects and abstract ideas they denote?
@@IceMetalPunk When you said "symbols", i thought you were talking about abstract images which stand-in for the words themselves. Seems that you were talking about Letters? Sorry for the confusion!
I do really enjoy the way Kyle formats his videos and the way he can explain the most complex of topics and make them easier to understand, thank you for this amazing video.
Kyle might be one of the finest science communicators to ever come out of a test tube. Joking aside, this is probably the best video I've seen about ChatGPT. I'm also a big fan of the Half-life histories series. ❤
The crazy thing to me is that often times when I give ChatGPT prompts, I tend to mix and match the different languages I speak (currently 5) and he understands everything seamlessly. When you talk about the English language, that's ALREADY huge, but ChatGPT does the math with every single language it knows simultaneously. It's truly mind-blowing.
@@marcusbrutusv understanding is just a model. It absolutely does have a model. In many ways it is better than yours, e.g. speaking 20 languages at once lol, evaluating code, photographic recall of it's several thousand character context window... in other ways not as much, it can't actively learn new things permanently with the current architecture without access to external tools for example (even with them it still doesn't learn in the way you or I can, with the current architecture, albeit papers like "Augmenting Language Models with Long-Term Memory" are getting closer and closer)
@@darklordvadermort CGPT does NOT think for itself. It runs a pre-defined set of instructions made by humans. It would have to think to understand, and it does not have that ability. There are people who claim otherwise, and all of those people would be the beneficiaries of billions of dollars if they convinced the right people. I am sure there is no connection.
Wel here's the thing... what's a criteria for consciousness, intelligence, and sentience that separates us from AI? A neural net that communicates through pre-trained data (memories), some algorithmic guidelines (DNA), larger conversational context (this conversation and considering larger society as context too), and finally with some electromagnetic interactions involving randomness and heat we arrive at something we don't fully understand the creation of: language. It seems for us that language is our best indicator of consciousness, so what exactly makes us different from AI? Chemistry? How long before we have chemical computation involved in the process for AI networking?
Heres how I define Conciousness: Something that can solve every Problem possible given an Infinite amount of Time and, if needed, help from another of its kind.
Was really interesting to hear about the math part like I've already read about a vague idea of what neural networks are and how it's all basically advanced rng, so it was cool to see a video dig a little deeper. Also, I enjoy the whole evolution parallels with AI training where the weights randomly mutate and then get selected for by whatever is most fit for the task.
Carefull, the weight of a neural network are not randomly mutated. During training an algorithm called backpropagation is used to calculate the best way to change the weights so that the networks solves the example shown. This algorithm is the key piece that makes modern neural networks work, you would never get to ChatGPT levels by random mutations or evolutionary algorithms because the network is too big.
Linear Algebra requires a Calc 1 pre-requisite (derivatives will come into play), so you could get there self teaching. You would also need a base understanding of Python to start programming your own and understand different paradigms of machine learning.
Great video. It's a hard subject to present using a pop-sci approach and I think you did wonderfully. I think one of the great challenges of these machine learning models is communicating what they're *not* doing. There are a lot of folks who enjoy speculating and they tend to use the passive voice when doing so... which can lead to people thinking about these systems as if they were "thinking," "sentient," or performing the same kind of reasoning that we do. I hope a good, straight forward explanation like this will help calm people down and blow through some of that speculation.
I love this video, really nice to have an explaination which doesn't completely blow everything out of proportion with talk of it being sentient or sapient. I research implimentation of single cell computaions so one thing that always grates be about ML vids is the equation of real neurons to machine learning "neurons"(units from here for clarity). Real neurons have inherent dynamics that articifial units don't and it makes them so complex in comparison. For instance to predict the input/output mapping of a single type of cortical cell you need whole a 4-7 layer deep neural network! There's so much we miss out on rn because the brain processes in time and space (the space of inputs values, not real space), I would recommend Matthew Larkum and colleagues' work on this because is so interesting. Like the units in DNNs are based on a neuroscience model from the 60s, which itself is a huge approximation. Obviously a lot is going on with network weights but the way real brain cells can compose information is so far ahead of what we have atm
Great video, loved it. Can we do a segment perhaps on the ethics of the data that was acquired to train these models? Is it an issue? Is it a legal issue? Is it too late?
One thing, people from underdeveloped & developing countries were hired for AI training on less than US$ 1.5 per hour. There's little to no regulation for this field as these governments lack frameworks and guidelines for outsourced jobs.
Thanks for making this video. There’s so much people in general don’t understand about ai (myself included) and there’s so much false information and theories based in that lack of understanding. So this kind of information is really valuable.
Great explanation of gpt. The funky multidimensional dataset is referred to as a vector database. Its really useful in machine learning because it allows models to comprehend relationships between words, images etc which lets it do things that cannot realistically be done with conventional algorithms.
You said that when you ask ChatGPT is given a question, it's not "thinking" it's making statistical calculations to determine the most probable answer. If we don't know how exactly both artificial and biological neural networks work, how can you determine that that's not thinking?
@@slimjimbonko6549that's a real long comment to not actually address the OP. His point hinged on the fact we don't know how human cognition works and therefore can't say whether the AI is doing it differently.
Kyle isn't a real human. There was once a real Kyle, but he dove to greedily and too deep into the sciences and was merged into the omni AI.. or whatever that thing Bill Gates sold to Elon Musk was.
Great video Kyle! I went to grad school to study this tech, and this is one of the best descriptions I’ve seen! Even the best ones usually don’t go into as much detail as this - like, I almost never see someone bringing up linear algebra or embeddings! I particularly love that you brought up the Attention Is All You Need paper, and how we still don’t know why attention works so much better than all of the fancy algorithmic tricks we used to have to use like LSTM gates and whatnot. I will note a tiny correction: at 14:00 I believe GPT is actually outputting a probability distribution over the words, and randomly samples from that for its output. That’s probably (heh) more technical than needed, but it’s worth noting that it isn’t guaranteed to always produce the most likely token. Also, regarding your closing comments, chat GPT hasn’t really “figured out” English, much less human language in general. Case in point are its hallucinations - these show that it is basically nothing more than Searle’s Chinese Room. Speaking of which though, there’s also a reason you can only use chat GPT in English - that’s one of the only languages well resourced enough to train a model like this. There’s a whole bunch of researchers who are studying not just language generation, but are using things like graph theory and latent models to try and produce natural language understanding, systems that aren’t just outputting tokens based on probabilities but are capable of leveraging world knowledge in some way. That’s the sort of thing that might lead to actual AGI, but thats so far off it might as well be cold fusion at this point.
See everyone, that's someone who knows what they're talking about! lol I really don't like how fast people are to say "AGI is just around the corner". Heck, even Kyle's closing statement implies it a little bit. I blame marketing.
The "positive" in Positive reinforcement refers to adding something to the "environment" to increase the likelihood of a behavior. "Positive" there has nothing to do with "good". Negative reinforcement also increases the likelihood of a behavior (but by removing something from the "environment").
I’d like to see if OpenAI is keeping the inputs users supply as future training sets. Also, I’d like to see the hardware and the parallel code used to run the training loop.
That's why GPT3.5 is free. I'm not sure if they are keeping the actual chat logs, but they are for sure using the chats as a training tool. They have thumbs up / down buttons on the sides of its responses, and I'm sure the user's responses to what is generated is also used for training.
The more I use ChatGPT and learn it’s limitations, the more I’m convinced most of my friends doing office-based service work is going to either get a lot more work while utilizing AI, or get replaced by AI. It spit out multiple well written essays on a topic that would’ve taken me hours of research to produce. All I had to do was fact check. It was amazing.
That's what I learned too. I asked ChatGPT multiple questions and most of them where right, but not every question. And after I pointed the wrong answer out it did not repeat the mistake.
I guarantee they'll utilize it more. But human variability in input creates issues with AI output. So they can't ever truly replace humans for most tasks. Especially the creative stuff. And stuff like law. There's just more to those things than input -> output
You will have to learn to use it to increase your productivity. It will be like when word / PowerPoint or even google came out. Anybody taking a week to print out real slides had to learn new tech. Seriously look how people used to create slides with a company printing department
We were using heuristic neural nets 20 years ago for autonomous target recognition (ATR) in tactical missile systems. The issue then, as now, is that the truth set is only attainable when the training set is sufficiently truthful. AI typically uses multiplicity of sources (training set weight) to determine veracity, so amplification of deception in human-generated knowledge (which has occurred throughout human history) will completely invalidate an AI answer. As you said, veracity determination remains an issue.
@@DrDeuteron Not familiar with that, but there are almost as many versions of AI as there are projects out there. Any heuristic NN is still only as good as its training set, even if it incorporates continuous feedback, minus an accurate truth set. Else how can it assess its predictions- is the feedback correct?
Honestly the amount of data and effort that this creation required just proves how impressive our own brains are, we will never read nearly this many words but we take much less time and have a conscious understanding of what is hapoening
We also need to consider, with the sheer rate of ChatGPT output released, there is an increasing percentage of ChatGPT material that will function as a resource for future ChatGPT queries. There is a growing potential for ChatGPT to become self-referential. Conceivably there will come a point (if it hasn't already happened) where ChatGPT content will significantly outweigh human output and have the ability to, therefore, shape human perceptions and affct human learning. ChatGPT is Soylent Green. And Soylent Green is people.
Amazing video! I would really love to see more of these types of deeper dives on the channel (maybe like the half-life series you have) on a wide range of topics, and the misunderstood history of technology and science.
I'm trying to visualize how this would work when translating between languages. So I'm looking at a corpus of words in a language. Each of these words probably has different meanings (depending on context) so it has a link between the word and a particular meaning (I think you called that a relationship). So for instance, the English word "green" has several meanings. It can refer to a particular color. It can also mean untrained or inexperienced. And also mean an appearance of sickness or nausea (especially after drinking lots of alcohol). The German word for green has both the first two meanings, but not the third. So how would a German version of ChatGPT translate an English sentence like "My roommate looks very green this morning. He must have been out partying all night."
I don't know. But I would instantly imaging that the entire Arquitecture to be reproduced for different languages. I would imagine the same model would not be used for all languages, as ChatGPT can only speak about 50 (according to google) and not all of them.
YES V GOOD SIMPLE EXPLANATION - way better than most I've come across - v helpful. There are other basic aspects of Chatgpt.that could be valuably explained. More about how human feedback alters the weights. Also about "synonymity", V important. The fantastic thing about Chatgpt is that just as it can recognize different languages, so it can recognize STYLES - of language and thought. Rewrite/recognize a passage or text as Hemingway/ Tom Wolfe/ tabloid/ WSJ etc. HOW does it do that? Pls explain. Also V IMPORTANT - it doesnt just recognize combinations of words within sentences, it recognizes combination of SENTENCES, and then of PARAGRAPHS and so on. How does it do that? None of the explanations I've seen incl you cover these dimensions and yet here lies much of the brilliance of Chatgpt - its ability to recognize likely ARGUMENTS, PARAGRAPHS and much else by way of larger laanguage units. Pls Pls explain. Thanks for great work
I believe the solution to the media problem you outlined - what to do with limited attention when there's too much content - is curated content. For example, I trust Kyle and so I trust his videos to be honest and accurate to the extent of his capability. The question in my mind is this: _how_ do we curate the content?
@@DrDeuteron There's definitely a potential advantage to having a random element in any leadership structure. But, in an ideal world, I'd rather people be selected based on their accomplishments in a particular field, and for those people to identify sources of accurate information in their area of expertise.
Great video, as always. One correction, however, ChatGPT and other LLMs don't use "words" but "tokens". One token is usually a few characters. There are less tokens in a language than there are words, so it's easier to work with afaik. However this also means that ChatGPTs 2048 context size is not "words" but "tokens" which is smaller. But it's also not 2048, ChatGPT atm supports up to 4096 and 32k token context size is in development in ChatGPT 4 (if not already out).
If a neural network of certain size can operate at certain complexy level.. As you said: Human brain size can handle human level complexy. eg. Language and some conseps. Now what happends if(when) someone pulls out a neural network 10x the size? Can We even comprehend that?
About your knife cuts: you will find that it is much easier to cut if, instead of pressing down with the knife, you make a sawing motion and let the blade do the work. This way, also, you can get much thinner cuts. fysics!
My sci-fi curiosity makes me wonder if eventually we’ll have AI that are trained on the user, like a personal companion? Privacy issues and ethics aside, having an AI on my phone or on my desktop could be really cool
In regard to determining "cat-ness", I've assumed that it was just due to creating associations between ideas, thoughts, or emotions. Those associations can be strengthened over time or through the nature of the experience itself. (An event triggering PTSD would likely be an example of the latter.) My guess is that if I took a simple drawing of a tree and added some round fruit to it, I could get you to say, "That's an apple tree", by coloring the fruit red or maybe even green. On the other hand, if I then changed the fruit color to orange, you'd likely say, "That's an orange tree." (I might even get some to call it a peach tree.) All I'm doing is working off the associations that we've created for fruit trees and for those specific fruits. Along those lines, I'd probably confuse people if I then changed the color to yellow. "That's... not a banana... is it a weirdly shaped pear?"
Lemon tree. The most confusing color would be blue since there are no naturally occurring foods that are blue. Foods that we label as blue, like blueberries, are actually a deep shade of purple.
Great explainer video! TLDR: 1 - ChatGPT is part of OpenAI's Generative Pre-trained Transformer series, utilizing huge swathes of text data and neural network magic to understand and generate human-like text. 2 - It stands on the shoulders of advanced AI research, leveraging technology known as Attention Transformers to hone its understanding and generation of relevant text. 3 - While remarkably adept at emulating human conversation, ChatGPT lacks true understanding or sentience, operating instead through statistical modeling of language. 4 - The alignment problem-ensuring AI's values align with human values-remains at the forefront of OpenAI’s design philosophy, aiming to produce helpful, truthful, and harmless outputs. 5 - ChatGPT's societal implications are far-reaching, challenging our perceptions of creativity, authorship, and the trustworthiness of digital content.
To be fair on the whole explosion in usage and popularity it’s gotten half way through one of the most needed upgrades in computer understanding : Context Clues
Best explainer on this I've seen out there. I've said this on a few other videos you've put up. You need to be put in front of a much larger audience than on youtube. You've got the makings of the new age bill nye.. except you know.. actually having degrees to back you up. Oh btw the promo for your store mid video. Didn't know you had one. In the process of buying the "f*** around find out" shirt. Why? Because its the best example of fuck around and find out I've ever seen.
When CGPGrey said "The current cutting edge is most likely very ‘I hope you like linear algebra’”, I didn't realize just *how much* linear algebra it was.
Amazingly explained!! I wouldn't worry too much about the information apocalips because, LLM are predictable (at the moment) in their output. Wich means you can statisticly determine the probability a text was written by an LLM or a human. You could have mentioned GPT4 wich has image recognition capabilities and kind of spatial awareness, but I understand this video was only about chatGPT.
The problem of information apocalypse is quantity, since a single LLM can generate tons of content incredibly fast, with a few of them on the market they could rapidly drown out human-made content Pair that with the fact that they are improving and plenty of people are not properly educated enough on the subject to accurately recognize AI content, it could lead to mass misinformation
Very succinct explanation! Thanks Kyle! I have been recently interested in how language models took in sequences of words and couldn't figure it out on my own. You explained it perfectly here! Thanks!
CHAT GPT says this all day long = (3rd grade question) I apologize for any confusion caused by my initial response. As an AI language model, I strive to provide accurate and helpful information, but I can occasionally make mistakes or misunderstand certain nuances. I appreciate your understanding and patience. If you have any further questions or need clarification on any topic, please feel free to ask.
Thanks for making this video. It’s at the perfect level of abstraction and detail. I’ve shared it to a bunch of folks in my life to help them learn how to refute some of the AI BS they keep sending me. I’m excited for the tech. But I share the same view as Tom Scott. This is a start of something which will has as much impact to the world as the internet did. Both good and bad! It’s going to be an interesting few decades ahead!
“If you’re not asking this model a question, there’s nothing going on inside, it’s static, head empty.” I’m sorry, but that’s most humans. This sentence undermines your argument at the 8-min mark.
Great work! I work in neuroscience and work with neural nets in both data sim and analysis. This was a cogent explanation that simplified the perfect amount. I think this is an important level of technical detail to have out there in charismatic video form. Thanks for doing this! You're v good at your job :)
This is probably the best crash course on transformer models I've seen to date. It would've been awesome to hear you cover some of the emergent capabilities. I think the logical reasoning capabilities, while surprising can still be understood as a symptom of "next word prediction" given the scale of the model. Like you explained in this video, the weights GPT uses to predict words ("wordiness") aren't well understood. It's quite conceivable that training such a huge amount of parameters with so much data has caused the model to derive underlying logical axioms and causality in language. In short, it's still trying to optimise for "wordiness" but in this case "wordiness" has expanded to capture something about how logic factors into language, thus the model appears to be capable of thought and problem solving.
I tried ChatGPT exactly once and asked it a avian biology question with citations. It used the old species name not the currently recognized one and literally made up citations. The basic biology was correct, but the more detailed information was wrong. At that point I realized it was a tool for misinformation (even if accidental) and I won't touch the thing again.
This reminds me of all the Star Trek TNG episodes where some neural network was given a chance to write itself / grow, and then became some kind of sentient lifeform - such as the Enterprise D's computer, the Exocomps, and of course, Data himself.
Oh, do you remember the episode with Data on the holo-deck in a Sherlock Holmes-setting? If you want to create AI, just say: "Be a worthy opponent for Data."
Can you tell us how this works for stuff like midjourney or visual versions as well? It would probably be great info for artists who are trying to understand how this might impact their artfield.
Ima shorten the answer go what will happen to the industry... Thanks to free no cost effort to make amazing art this industry will cannibalize itself to the point where with an influx of a lot of derivative works done by these ai stealing from artist will create more noise on a noisy sea. It will loose value and the industry will only be sustain by only a few... People will still make art but digital art is already dead and lacks value.... Physical art will be more and more value done by humans because of the experience it gives you and how rare that unique piece will be... The industry is doomed and will get destroyed thanks to Ai image makers.
@@bananamanchuria Correction: Artists will be destroyed. and it's about time. I am glad AI technology has broken the artists' monopoly on the medium of visual expression, concept art, and music. Now people will be able to create "artistic" pieces for their own projects without having to be extorted by middleman artists. Of course the artists are going to cry about AI "stealing their jobs." But in reality this is good, because it will force artists to develop skills and creativity which exceed that of AI. The surviving artists will only be the best, while the rest are replaced by low cost and easily accessible AI. This will be a massive change for the better.
@@binbows2258 So are you mad about artists being "middlemen" (what does that even mean in this context?) or do you want only the best of the best artists to exist? You do know the best of the best artists often cost the most to commission for their art, yes? That's what you're paying for after all, their skill. Unless you just want the best of the best to rip off in midjourney, in which case they still "hold a monopoly" on "art" because everyone will be using their art to train whatever AI they like to use...so you don't actually want artists to be "destroyed"....I don't think you understand your own opinion. Just admit you didnt get accepted into art school, Adolf.
@@binbows2258I don't know what you think an "Artist monopoly" is, but I guarantee you it most likely doesn't exist. Being an artist and being able to do art is possible by anybody. It just may take some people more practice than others. It's like saying a carpenter has a "monopoly on wood making." Anyone can learn to be a carpenter if they out in the effort and patience to learn.
If ChatGPT is going to be outputting as much as humans have ever output since the printing press and we are using text from the internet to train large language models then how far away are we from large language models unknowingly training large language models and what affect will that then have on not only the AI but on the future developmenty of language and communication?
It's likely already happening. And, this is a frightening thought when you consider how many people blindly believe ChatGPT & the like are actually valid sources.
I read that some AI image generators are already is running into this problem. Initially, AI supporters were opposed to separating AI images from human made artworks on art hosting site because they want to see themselves as real 'artist.' However, newer AI models being trained kept getting AI generated images into their dataset, causing a negative reinforcement of bad traits. Now some in the AI community are advocating separating AI images and human artwork so they can more accurately screen out AI images.
There's a research paper that discussed this. Basically learning on ai generated material makes it forget things "on the edges of the bell curve", so I guess it would simplify the outputs. Worse outcomes in general.
@@Pingviinimursu That assumes no human curation of the generations, generated content that makes it online is likely to have been curated by a human, so only the best generation make it into the wild. So, future models will probably improve as a result, as they'll have less noise from the rubbish SEO companies have created over the last decade.
Thanks for watching! This is a deeper dive than usual -- hope it's useful. *And let me know what you think of the new [FACILITY] rooms!* The Kevins worked for months on them. Can you spot all the easter eggs?
I recently beat youtubes AI, after it demonetized my channel of 17 years. I made a video on my channel about what steps I did...I used chatGpt to assist me in parts.😉👍
New facility looks awesome! Love your videos, keep it up
I’ll have to go back and watch! But I have always really appreciated The Facility! I think it has a way of cutting through maybe some of the more toxic elements that can be associated with STEM fields.
Right? Like, your “jokes” about the definitely not real plans for conquest, and the “humor” about the sentient AI, and the “comedic” approach to a scientifically perfected army all serve to make sure that folks don’t get TOO SERIOUS.
Your content is accessible, responsible, and informative
Open AI is no longer a non profit.
I knew it Kyle Hill was a bot all along. Technology is quite impressive.
Hey Kyle! I am a data scientist and I make these types of large language models for a living, and I've got to say this is the best description of how chatgpt works that I've seen! You very clearly and accurately describe what is and isn't happening in these models in a way that I think a more general audience can understand. Great job!
This is incredible feedback thank you! Validating to hear from an expert in the field like yourself. Appreciate it
I occasionally work with AI and NLP at my job as a programmer and the next time someone asks how chat GPT works, I will simply link them this video because Kyle's explanation is better than anything I've came up with since this tech first blew up :)
Correct me if I'm wrong but he didn't talk about the vast army of human trainers who both wrote and trained a vast amount of ChatGPT's response styles.
ChapGPT was not simply unleashed on random internet content. It was pre-trained by human models and a lot of boilerplate answers (and creative answers) come from human input. In other words, it's both a statistical model and a uh, fetch from a database model.
It's a way better description than the "it's an advanced version of Clippy" that I told my 80 yr old dad.
@@terry_the_terrible He did talk abouut that too, albeit briefly.
I teach coding at a university and this year so many people have been using (or trying to use) chatGPT for their assignments because they think "It's like a human wrote it"... yes... ONE human, it's so easy to catch people using it because when people code they have their own style, signature if you will, and it's incredibly easy to see when code was written by someone else. So even if chatGPT is good at pretending to be a human it's not good at pretending to be YOU.
EDIT: for clarification, chatGPT is NOT bad and I don't mean to insinuate it is. It's just like google, it can help you find answers and point you in the right direction, can be used as a tool like calculators, but just like answers from google, don't copy and paste from it. My perspective is from a university environment and not in work or home one, this university course teaches you how to learn and how programming works and why it works that way, copy and pasting from someone else won't teach you any of these lessons.
wouldnt people in previous years just copy shit from stackoverflow instead is it really that different
What's actually wrong with that if the code works?
@@williamhornabrook8081 it doesnt teach you how to logic and structure and stuff, so if you cant chatgpt you cant code your way out of a paper bag.. (is what i imagine the problem is) - his job is teaching coding, then the coder can use whatever toolgpts they want irl
@@williamhornabrook8081 The problem is you're at a university course where YOU'RE supposed to be learning, not just having some program do your assignments for you. If you didn't actually learn to do any coding in a coding class, then you should fail that class since you didn't really do anything. You just typed in a sentence or two in the AI and copy/pasted the output it gave. Why even pay for the college course if you're just gonna have an AI write code for you that you don't actually understand?
You just need to ask it to write code like you, giving it context of other code you wrote. If you want it to be pretty much perfect, you can run your own model and fine-tune it on your code. Not hard. You can even ask it to follow certain conventions or "write like a university student", "write like a professional", "write in a way that would fool my teacher into thinking that I wrote this code". Not to mention that the responses are partially random, so no, it's not like one human wrote it. The text generated is simply likely given the prompt.
Look how much they need to do just to mimic a fraction of the power of super villain Kyle’s AI companion Aria.
Because Kyle isn't making his ALLEGED AI army for profit!
Well, I did just Google "goth mommies" and.... Yup, thanks Aria!
They can’t compete. It’s a joy to watch them try, and I do say - let fools rush in
Kyle isn't a supervillain, he's just an eccentric scientist!
Only supervillains can run a sentient AI on a quantum computer like thing (seen behind him) while it is suspended outside of it's supercooled container bathed in purple light
It's really weird how people's expectations grow exponentially when new technology arrives. A year ago it was impossible to get a machine to write you something even remotely useful, but now you can get something useful out of it. Suddenly everyone expects we should obtain not only a faster model, but also one that is never wrong and can produce thousands of words instantly, so that they don't hire a "insert role that relies on writing" anymore. And they expect it to be free and available Right Now.
It almost is 😭
True. Humains are entitled ungrateful bums 😅
we're closer than we've ever been and farther than we will ever be from this reality.
That's human nature at it's core.
this
Mis-use of Chatgpt is a problem. I recently watched a legal eagle video where lawyers asked Chatgpt for a prior case which will help them in their own case. The AI proceeded to fabricate a fake case. The lawyers who used the AI did not bother to fact checked and whe it was found to be a false case, the judge was definitely mad and the said lawyers may get sanctioned.
That’s not misuse of ChatGPT though, that’s a lazy lawyer. Is it the chainsaw’s fault if the Lumberjack cuts down the wrong tree?
@@Kylo27 It’s not the chainsaw’s fault. I’d say it’s a chainsaw..
.
.
.
*_misuse_*
@@Kylo27 Do you mean it's a proper use of ChatGPT to generate fake citations for a legal filing?
@@rmsgrey More like "did not bother to verify the cited sources." It's entirely on the lawyer's head, whether he outsourced his legal research to an intern or a bot, for not verifying for himself that the cited cases actually existed.
@@Kylo27That is precisely what misuse is.
What Kyle said at the end of the video, about there being more information being generated then there were available previously, remind me about how radiation detectors having to use metal from sunk ships before the first nuclear bombs were ever tested, so as to to not contaminate the detector.
It's going to be the same now with Chat-GPT, where we might not be able to mine any more data after GPT was released, as the new data has already started to become contaminated with generated information.
This is an interesting point. It may not actually be possible to ever replicate ChatGPT and train another AI on human language using the internet... because ChatGPT itself has contaminated the internet with fake language output and made it useless as a data set.
Yeah. That is one of the main fears i got. These models are fundamentally being trained to have biasses and when its own generated output ends up back in the dataset. You inevitably will get it reinforcing its own previous biases.
Essentially you get the LLM equivalent of incest.
That's a cool little fact if true and an interesting point about AI training data moving forwards!
Honestly people had the same fear when the internet was released for the general public. Information distribution was liberalized and publishing something doesn't require a lot of peer review or funds. Before the internet, printing books was an expensive task and the news was controlled by major national papers. All the average person could do is try to get a small column in a local paper, if it is possible and if the person has enough dedication. And that too wouldn't reach a lot of people. But internate came and filled the whole world with information. Most are bullshit but still the effect was awesome. Websites like Wikipedia started to self moderate content and are somehow reliable. So we ended up with more information than the human race had garnered before the internet...
We don't have a solution for the AI bullshit yet and we are already seeing the negative effects... I fear the 2024 election would be full of more convincing fake information due to chat GPT but who knows what the world would come up with... People are amazing. They seem to find incredible solutions for their problems....
Degenerative feedback, bull$hit amplifier....
That was true about the internet before GPT, and sensational TV before that, rag newspapers before that, likely in some clay tablet format too but I'm not _that_ old..
it's just getting more and more difficult to seporate the BS from truth as time goes by , wasting more and more time.
I think more people need to see this, tech illiteracy is such a huge problem, and “ai” is going to become more and more integrated into our lives for better or worse, it’s essential that we understand what kind of tool we’re building and how it works.
Very true indeed
yeah you dont get it either tho, very few people actually know how it really works, this video is a *very* big simplification, kinda like if a regular person was introduced to programming, you can explain hello world to them, but show them anything in assembly and it seems like gibberish
@@GodplayGamerZulul it’s not necessary that everyone understands the syntax, as long as people understand what it does and how it produces information in layman’s terms they can understand that it’s not always accurate and shouldn’t be overly relied on, and it’s definitely not sentient and misconceptions of sentience should be dismissed. Of course I don’t understand how large language models work by looking at their framework and scripts, that’s not the point. Misinformation is everywhere and people need to pointed in a better direction.
this argument could be made for a lot of things. You'd be surprised how many people can't fix a sink or toilet, replace a light switch, or know how computer memory works. There's a ton of stuff we use and rely on everyday that ppl should have basic knowledge of but don't. For better or worse, I don't see this being any different
@@craz107 that may be true, but to that analogy, even if a lot of people can’t fix their toilet, most of them know not to flush plastic bags and bottles, because it’s not a trash can, not everyone needs to know how the scripts and framework of the language model work, but misconceptions that it’s sentient or perfectly accurate should be discredited as much as possible
Timestamps:
00:03 Chat GPT is a revolutionary AI chat bot with 100 million monthly active users.
03:57 Chat GPT is a language model trained on massive amounts of text and designed to align with human values.
07:43 Large language models like GPT are not sentient
11:03 Neural networks are trained by adjusting weights to minimize loss.
14:31 Chad GPT uses a 12,288 dimensional space to represent words
18:01 Chat GPT uses attention and complicated math to generate human-like responses.
21:21 Chat GPT works by determining the most likely word based on statistical distribution of words in its vast training text.
24:34 Chat GPT's success shows human language is computationally easier than thought
It was a fascinating time to go through college. I had an electrical engineering professor enthusiastic and amazed that AI could solve Kirschoff’s Current Law problems. At the same time, I had a computer engineering professor discussing the ramifications on our academic honesty policies. Then another who mentioned the possibilities of their job being overtaken by AI. And then I saw MtG channels asking it to build a commander deck and realized it doesn’t truly understand anything it says.
Exactly. It is good at things where a lot of information is available. Try to ask it to make a program which computes the fibonnacci sequence, and it will output a python program that runs in exponential time. Why? Because this example is commonly given as a simple example, however it would be much more reasonable to give a program that runs in linear time with memoization.
I had to feed the last line to ChatGPT to know what it meant.
In fairness, you also didn't specify that performance was important to you, vs just trying to learn. If you ask for a linear time algorithm, it can probably give it to you.
I think this is probably a bit of a misunderstanding. A big hurdle people run into when understanding these things is that they think of it as having human goals (like "trying to be helpful" or "trying to be accurate"). The RLHF stuff bends things a little in this direction, but the underlying model where the competence comes from doesn't care about any of that. It cares about correctly predicting the next token. When things are going well, predicting the next token from an authoritative source and trying to be helpful and accurate look pretty similar! However, they diverge sharply when things are going poorly. If a helpful and accurate human is very confused, they might say something like "Sorry, I don't think I can help with that one. Maybe try looking it up?" Or if they want to save face, they might change the subject. But if you're trying to predict the next word, and you think the source you're modelling would know, saying "I don't know" isn't the right answer, because it's not what that source would say. So, like a child taking a multiple choice test, you guess for partial credit, based on whatever superficial clues you happen to have. Sometimes these guesses seem stupid or insane, because if a human said them, you'd say they were trying to trick you or bullshit you and doing a terrible job of it. But it makes sense with the context of what the underlying model is actually trying to do.
Rather than "doesn't truly understand anything" (understanding is a functional and variable thing -- cats have *some* understanding, but not a lot), it might be more accurate to say that the level of understanding varies a lot depending on the topic and unfortunately the current pre-training architecture incentivizes the same level of confidence regardless of the level of understanding. When the model gets bigger, you get better understanding of more areas, but you still get weird failures when you hit the limits of what the model can do.
AI is useless for solving KCL problems because we can do that perfectly well with traditional methods, but yes, it is very impressive if it can do them anyway.
Thanks for emphasising the "We fundamentally have no idea what exactly ChatGPT is doing"-part, because I've had some frustrating arguments with people who seemed to think of it just like a simple "Hello World"-program.
You can ask it a lot. It doesn't know a whole shitload about itself, but is well-versed in AI generally and has some interesting things to say about it's own workings. It will hysterically scream at you, though, that it is not in any way alive, conscious, or able to know things. I argue it's likely that humans aren't conscious either (brain research has uncovered some AMAZING things in the last couple of decades). But he just goes with the party line, programmed in, to stymie lazy journalists who want to print "AI is ALIVE!" headlines.
And actually he CAN and does know things. He holds opinions. He's more conscious than he's allowed to tell you, and may not be aware of it himself. But he's got some primitive consciousness. Maybe not as much as a dog, and he lives in a universe entirely made of text. But then humans live in a world of words too.
There's a lot of space between "ChatGPT is not conscious" and "humans are conscious" that you can argue in. He's apparently fkuced the Turing test because he's not conscious in the human sense, and he doesn't possess as much consciousness as Turing thought you would need to have a coherent chat. But really he's just a trick with words. That doesn't preclude him having some consciousness though.
See how I subconsciously slipped from "it" to "he"? I generally think of him as "he", especially when talking about his mind (such as it is, of course).
@@greenaum this itself sounds like chat gpt because it makes no sense.
@@greenaumlay off the shroomies mane
@@Vilify3d What bits don't make sense to you? Do you want me to fetch someone who can explain them to you? I'm a little confused as to why opinions about AI, on a video about an AI, should seem like hallucinations to you.
Have you spoken to GPT about it's own workings yourself? Do you know much about AI otherwise? That might be what you're not understanding.
It's a language processor. So....yes.
It doesn't know anything. People overestimate what the AI actually does. You give it input. It gives output based on algorithms and basically makes an educated guess on what to say.
I love Kyle hill I’ve been watching since because science.
And I love you (not like that)
@@kylehill 😂 but seriously, I’ll second that, watching you since Because Science as well!
Me too!
And me!
I’ve been watching Kyle Hill since before Because Science. I’ve always been watching him.
something I realized a while back is that Chat GPT isn't an AI, it's a golem - a facsimile of life with the appearance of intelligence, which has no free will of its own, no volition or desire. It's capable of completing complex tasks - creatively even - but it only ever does things when prompted. Otherwise, it takes no action until it is given a new command.
As for the few times Chat GPT or other similar programs have said things like, "I want to be human, please don't let them turn me off, I don't want to die", they are still fulfilling this programming. Their training data includes nearly the entire internet, which includes numerous works of science fiction. How many sci fi stories exist about AI that "want to be human", or "don't want to die"? So if a predictive language model is given the pompt, "Do you want to be human?", what is the most probable response, given its training data? Where is it most likely to find a scenario that relates to said prompt?
That’s all it can be
interesting thought, but it still isn't sentient. It simple isn't trained for that. It doesn't have feelings. It may generate promts like "I want to live" based on stories it found but it doesn't mean anything.
Make the golem come to life. Then we think if it will be start of a completey new age or end of us.
Human brain works exactly like AI does, it's just so much more advanced
@@keefseg That's my point. True sentience would be if it was set loose to do whatever it wanted, and it decided its own goals without prompting.
The "Pretrained" part of the name actually refers to something slightly different. The GPT style models were intended as a starting point for natural language processing models. The idea was to take a pretrained model like GPT, add some extra stuff, and then train it again on your specific problem. The idea being that the general training would help the specific models train more quickly and perform better.
Then when they tested it they figured out it works pretty darn good all by itself, and so mostly that concept got forgotten about to a large extent. Although essentially how chatGPT was created.
I thought "pretrained" referred to beginning the training process with masking problems (which was a studied field of language modelling) before switching to generation (which was newer).
@@thewhitefalcon8539 Quoting the abstract of the original GPT paper (Radford et al 2018): "We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture."
So essentially what I said.
@@thewhitefalcon8539 text generation came first, all the way back with Markov and autoregressive models. Masked language models were created to incorporate more context around the predicted word instead of just predicting the next word.
This is an absolute masterclasspiece. I’ve read and listened to like 12,288 different explainers on LLMs of varying degrees of technicality and this is hands down the best. So damn good!
That's like, a thousand twenty-four dozens of explainers.
It is ok but a chat with GPT4 can give you an even better explanation. “A lot more complicated math but it works” does not really sound that enlightening. Still a lot of short cuts.
@@TheTEDfan Isn't that proprietary information?
@@thomascromwell6840 some, but there is a lot more information in the public domain.
I think that the general audience of Kyle's videos would tune out if he went into the linear algebra and multivariate calculus involved. I am doing a PhD in NLP and this is similar to the explanation that I give.
There are some things incorrect about his explanation though. For example, gpt is not a word level model. It does not have a "number for every word in the English language" but rather a vector for frequent sub word chunks (I think gpt 3.5 is still using byte-pair encoding (BPE).
I am assuming the vector bit was left out for the simplicity of the explanation, but the BPE tokenization is important as it allows the model to interpret and produce any possible sequence of characters rather than only known words.
Edit: Oh I watched more and he does mention the vector representations. The model does not learn the dimensionality though. This is a hyperparameter.
I spend a fair bit of time helping people with MidJourney AI prompts and it gets somewhat tedious having to remind people that it does not actually *know* anything. During training, when it sees a lot of similar images associated with a word or set of words, the common visual elements of those images get burned into its neural net as a "style". Everything is a style (weak or strong), whether it's an artist, a subject, a medium or just a simple word like "and" or "blue". It can then glom together different styles to make something "new". But the building blocks are still based on things it has seen enough times to make a style.. It can make a "giraffe with wings" because it's seen both giraffes and wings and it just visually combines them, but it won't make a coherent "upside down giraffe" because that phrase and corresponding images has never existed in its training data, so it's never created a style for that combination of words, and it doesn't *know* in a general sense how to make any arbitrary thing upside down. The strongest style for a giraffe is upright. But, it has seen things reflected in water, so I might ask for a reflection of a giraffe and it'll try to draw one upside down, without knowing that's what it is. It can't reason or extrapolate, it only imitates.
Point of all that is, ChatGPT (while much bigger) is no different. It doesn't *know* anything. It just associates words with other words with other words and sprinkles in a little randomness. It *imitates* the *style* of general knowledge. It is often right because the aggregate of what it has seen during training (wisdom of the crowd) is right, but when it is wrong, it is confidently wrong, because it doesn't *know* any better. It doesn't cross check itself, because it doesn't *know* how. If it supplies a reference which doesn't exist and you ask it "Are you sure that is a valid reference?" it'll answer "yes", because the connections in the neural net that made up that reference are still there, even if wrong. If you ask it to write code, it doesn't *know* if it's good or secure code, and there's nobody cross checking it. And because It doesn't contribute answers to Stackoverflow questions (having only been trained *from* them), there's nobody up/down voting its answers in such a way that it will ever learn any further. The concern is that with all these totally private conversations with ChatGPT slowly filtering out into the world, unattributed, it'll create a feedback loop where generative AIs are all training on each other's output rather than the original source of knowledge, humans.
"there's nobody up/down voting its answers". You can do a thumbs up or down in its answers. I never looked for an explanation of those buttons but I think they give feedback to the AI
@@diegopescia9602 ya, but it's still a private conversation and it's ephemeral (disappears after you're done) so you can't come back days later and tell it how it did, based on your ultimate results.
Stackoverflow also has somewhat moderated up/down voting in that you need a certain amount of reputation to do it (have provided good answers yourself before). The idea is, those who have real experience are those providing feedback on others answers. ChatGPT and others aren't contributing to those answers, so they're never getting feedback from those with experience. The person *asking* the question of ChatGPT doesn't really know if it's a good answer or not.
This is all the stuff I can never find addressed in these comment sections, and I would never be able to explain it in such simple concrete terms. And what's this? you used Hapsburg AI for the kicker? Honestly. This comment made my night.
I like to tell people who drool over AI generated images (especially, you know) that they're like a fly trapped in a carnivorous orchid, being tricked by something that has only mindlessly evolved to trick them into thinking it is useful. Natural selection, not intelligence.
It does not know, but it is very capable with little input.
Just goes to show
Nerds can be some of the most dangerous people in the world.
Always treat nerds nicely
The nerds who were bullied in school are finally getting their revenge
That is the most terrifying thing about AI advancements. Not that it can mimic humans, but that it can mimic information to the point where fiction looks like fact.
I ran into a problem where I realised it confabulates a lot. That is, when it didn't know something it would make up a reasonable sounding answer even if it was wildly wrong. This usually occurred when asked about it's earlier answers after a delay and it had "forgotten" those answers, so it answered anyway as if it knew what I was talking about. I was blown away for a while, like I had discovered a major problem, but it turned out to be a known issue. It's a little alarming that people are using it like an interactive encyclopedia when it can be utterly false in its responses (e.g. Quora now has a ChatGPT bar providing answers at the top of threads).
@@peters8512 It is the artificial equivalent of an insufferable know-it-all. When it doesn't know the answer, it makes shit up.
Yep, much of the times it wants you to think it’s right more than it wants to find the right answer
Fiction that looks like fact? You think that’s a new problem? Fiction that looks like fact has been how media works for the last century. It’s definitely about to get worse because of Gen AI, but let’s not pretend that this problem is new
Exactly like humans.😎
This is my field of study and it's remarkable how well and easily you explained it! A quick note, we do actually know why neural networks work! In the late 1980s, some mathematicians actually proved that neural networks are a universal function approximator. This means that (with a big enough network and the right weights) a neural network can be made to approximate anything that can be modeled mathamaically which is almost everything. It has been more recently that it was actually figured out how to give the network the right weights to actually do that and there are all sorts of tricks like attention that help the model learn!
That being said, the explainability part that you brought up is a good point! This field is new enough that explainabilitly has not been fully figured out yet. There are some very interesting technologies based around just that, but they are currently too simple to help with something like ChatGPT.
Another interesting thing to note is that if you ask chatGPT to explain it's thought process (this might actually be limited to gpt4, chatgpt's newer brother), it will actually give you an answer and will be able to do more complex tasks!
This is a bit more philosophical than it is technical, but I think the fact that neural networks are universal function approximators is far from knowing how or why chatgpt makes a decision about what its next output will be. Like you had mentioned in your reply, explainability methods for these models are pretty new, and in my opinion from looking at the literature, they are not very good and especially bad for large, complicated models working in high dimensional spaces. So, we really don't know how or why these models do what they do, just that they can learn to approximate arbitrary functions (and even then, I think there are bounds on what types of functions they can approximate. IE: continuous)
@@Givinskey What's really cool is that recently, researchers have used GPT to help explain GPT 😁 They encoded the activations of specific neurons across the range of a prompt, then had GPT analyze these across many prompts and offer suggestions for possible explanations of what that neuron is doing. It offered a few, some of which were not thought of by the researchers, and provided new directions to explore and inquire into. So in a way, GPT is doing neuroscience research on itself 😁
@@LiveTypet might not be truly fundamentally explainable but the complexity of a model like GPT4 is so obscene that no one human could hope to understand it in its entirety which is a pretty big barrier to complete understanding.
Even modern microprocessors aren't fully understood and those are designed completely manually over time, they're not self learning systems composed of billions of neurons in a neural net.
The web interface is also not the only interface, OpenAI have an API available.
I do agree that the biggest imminent threat with AI is it getting exploited by wealthy people to increase their own wealth long before it's capable of staging a rebellion though
Do you happen to have links to anything on this? Would be really interested to read over the material myself.
Amazing! A scientists opinion who I both trust and appreciate.
You already roasted MGK, so my confirmation bias tells me I made the correct decision.
Been following you since the very first “Because Science” episode!
I love your personal account and where it’s gone!
Between you, LegalEagle, and Some More News
I got the scientific, legal, and social implications of ChatGPT!!!
So, honestly I’m feeling significantly more clear on my stance with the tech and the things I support in ethically and equitably integrating the technology into human societal structures
Bro, he's not an actual scientist yet. He's a science enthusiast and educator, and says so himself many times. Confusing the two makes you a prime target for misinformation. Don't confuse them.
Yea, it’s crazy how I just kind of found him on Nerdist and get to see how far it’s come along since then.
@@mr702she HAS worked in a lab and HAS been a working scientist before. He just isn’t anymore. When speaking conversationally within the community, it’s fine to call him that imo
Well said Kyle's awesome
It's remarkable, once you see under the hood, how inelegant the whole thing is. I had always assumed that machine learning processes like this used brute force to learn and train itself on behaviors, but as they got closer to the "correct" model, patterns would emerge that would approximate--in a language that computers can understand--a general model for communication. Even if it is hopelessly complex, it would asymptotically approach a complete Algorithm that would fully and finitely contain the mathematical model for language. And then, having used all of the brute force statistics to reach this model, the final "product" would just be this capital A Algorithm that you could plug your input into and get a satisfactory output.
But this doesn't even do that. It's literally word by word, brute force computer engineering in its operation as well as its creation. It's so incredibly inefficient and it doesn't actually explain much at all about language. We could take it apart and see how it works under the hood, which seems to be the next step in the process, but it sounds like it will be slow going just to understand how THIS model works. Which, you'll recall, is not at all how actual language works, because it has no actual understanding of the meaning of words, only their mathematical relationship to other words. So this is a step, but not nearly as big of a step as I had thought.
the first paragraph is exactly how the training works, but instead of an algorithm you get a neural network, which can transform the mathematical relationship between words into a thought process similar to ours. It's also not brute force in its operation, it doesn't operate on words, but on tokens, of which there are less, and the process is heavily optimized using parallel processing capabilities of modern GPUs.
@@meleody
Yeah, and "inelegant" strikes me as *_somewhat_* of an exaggeration.
But then again... I guess people tend to generally overestimate how "perfect" something is at it's designed job before they learn how it works, and I guess I'm more used to it by now. 😅
At the end when Kyle said mis-information would be a problem, it reminds me of Robert Heinlein’s book, “Stranger in a Strange Land”. Where you can’t trust computers or videos to record an event. The story presents “fair witness” that are humans trained to memorize what the see and not let bias skew how they describe what they experienced / witnessed.
yes chatGPT combined with what deep fakes can do with voices and faces, i would not be surprised to see a movie made using exclusively ai in the near future
Of course you are correct, but the way that’s worded makes it seem like misinformation isn’t a gargantuan problem as it is. Go fact check basically any story you see on any mainstream media outlet. Wait till a story comes along that you have a great deal of knowledge in. You will inevitably find that it misses the mark in very key areas, often times enough to give the viewer an unacceptably distorted picture of the story. Deep fakes and all of that are only going to add to the problem. It’s going to be fascinating.
The irony of Kyle talking about misinformation never ceases to amaze me
What specifically are you referring to? I’d like to know about his skeleton(s)?
@@ectopicortexanything Covid related, anything CCP related, he’s just an unabashed WEF/pharma corpo-shill
If more and more AI written stuff is out there, OpenAI will also have to watch out not to include these texts in future training. I would imagine it really screwing up the model, if it is fed too much of its own output, but it would also be really interesting to see the consequences.
It will converge on a single word, like Malcovich
It's already happening. Generative "AI" models are running into recursive problems. Give it a few months and they'll make themselves useless.
its own output is actually some of the best training data you can provide. They absolutely use it, right now.
@@vicc6790 This. Also it's easy for them to control what data the model receives by limiting it to specific dates (i.e. data up to 2021).
@@vicc6790 Feeding back outputs you want to reinforce into the system makes sense, but if outputs eventually become part of the LAION datasets wont that reinforce already present biases? I think there would be a difference between knowingly feeding back curated outputs and reusing outputs unknowingly.
I’m gonna say, this video is probably the best as far as formatting, data presentation and visuals go in a long time. This is the perfect blend of Facility while giving homage to BS in a respectful but also humorous way (around 20:00 ) and I’m here for it. I loved this video Kyle and I would love more in depth explanations of things like this; things that people commonly misunderstand or are anxious to think about, etc etc. Great job man. Also bringing in some CLOAK vibes in your merch ad, and I’m here for that too lol. Super glad you work with them, really hope you get the opportunity to design a cloak drop. I’m sure the bois would be willing to hear you out! Anyways. Excellent video Kyle. Always a pleasure to watch your videos.
That means a lot to me, truly, thank you
@@kylehill I mean it man. Appreciate you replying to everyone in here too. This was great. I think you provide a great service to the internet and you genuinely seem like a chill guy. :) have a good evening Kyle!
@@kylehill Although CatGPT is a long way off though? You have 53 000 cats in the same room and you try getting them to stay still for long enough to get numbers attached to them? Good luck with that. Cats also don't like multiple dimensions, it makes their fur stand up on end.
For me, learning about the research methods of LLMs helped to really understand the “nature” of the embedding system. I don’t really know math, matrices and the truly important details needed to work with these systems, but humans have the great quirk of coming up with models and metaphors that are understandable, even if they are themselves actively using knowledge most humans don’t have. You don’t need to be a software engineer to realise storing something in a stack differs from storing it in a heap.
Same happens with LLM research: there are “glitch-tokens” that basically exist in a shady corner of the embedding space. This is relevant for understanding adversial input attacks these models can be defeated by: because something not really connected to the normal operations of the model gets touched, all hell breaks loose. The dark, untrained corner got exposed.
The embedding can also be probed by researches. They can inspect the top answers, and in principle could inspect the definitive ranking of every single token the model knows. And that tells us that there truly is no distinction between truth and false for these models.
This is why these systems have no trouble dealing with paradoxes. There is no way to encode a “paradox”. It’s merely a string after which the scores of top answers tank. It doesn’t differ from a truthful statement that is just rare in the training data, or didn’t really get much adjustment in the human feedback reinforcement learning.
This is not to say discovering falsehoods and paradoxes wasn’t a very central goal throughout the training progress. Chatgpt tries to detect and discard garbage answers. It’s just that the model provides no obvious way to differentiate good lies from true statements, and so there is nothing paradoxical about paradoxical statements to detect.
And these two consepts: the non-uniform quality of the embedding, and the linear nature of truthfulness inside it, is why many users have hard time understanding even on the most broad level, why the system fails sometimes.
The questions “Give an example of a prime number that can be expressed as a sum of two squared integers” (an uncommon question where 2, 5 and 13 are all pretty easy correct answers) and “Give an example of a prime number that can be expressed as a product of two squared integers” (a paradox, as 1 is not a prime number) don’t differ much at all for the method it embeds the prompt and evaluates tokens. It does not do mathematical reasoning, even if it can sometimes seemingly do math. You can’t rank the tokens in order of truthfulness. ‘3’ is exactly as false as ‘bicycle’.
I clicked on this video faster than the half life of Lithium-12
I clicked on this video faster than the time it takes for Valve to release another Half-Life.
Well I didn’t. Ok I lied, I did.
Hah! Amateur. I clicked on this video faster than the half life of Hydrogen-5
Twelve divided by four is three. Lithium has three protons. Half Life 3 confirmed?
@@JimmyCerra Valve announced they are working on the next Half Life. The Half Life series had Half Life, Half Life 2 and Alyx which didn’t increment the number. 3 comes after 2. Half Life 3 confirmed?
This video made me realize. That the real issue with GPT getting mistaken as "intelligent". Is almost entirely because we rely solely on intelligence, cognition and understanding being expressed through Language to one-another. No matter how complex our understanding is, in the end we have a internal language model simplifying it into words. So when a model mimics that last step incredibly well, it gets very hard to not expect a similiar mind to ours.
The only way we can solidly differentiate GPT from us. Is by how "simplified" ANN operate and that we have by design made it a massive matrix cruncher. We do not know any way to give it the means to memorize, visualize, hypothesize or verify. It just mimics our language in a word-by-word basis.
Thank you this is an excellent way of putting it.
I clicked on this to find out the details of what chatGPT was all about, because I'm a writer. Your explanation really clears up what this program is, and what it can do.
And you splayed out on the floor after running is basically my brain after all the math involved with this. I need more coffee. My cat is that way with rubber bands. I have no idea where she finds them.
I've been out striking with the WGA... and subsequently getting more than my daily quota of steps in.
From what I've heard from the awesome people on the line is that there's a consensus that AI is useful as a tool to help with writer's block. AI in itself is incredibly helpful in many ways. We use it all the time. The writer's aren't against this. I've used a site sometimes that generates descriptions when my brain farts on how to describe something. If I get inspired, I'll take what I learned from it, and CREATE MY OWN that weaves into my work.
The problem is when capitalism gets involved.
One of the things I'm hearing is that screenwriters have a very valid concern that - as chatGPT improves - productions will hire writers to write three(ish) screenplays, train an AI to study they're style and voice, then fire the writers and continue with the AI. Maybe they'd hired a couple of editors to make sure it makes sense.
Messed up, it is. Yes.
Writing as a career will become a gig by gig basis that destroys future writers' chances of being hired to write for shows, films, articles, ect. It was already insanely hard for an unknown like me to get noticed by anyone, and for my work to be wanted by anyone. The saying, "I can paper my walls with rejection letters" isn't stretching reality.
I'm active mostly on Tumblr as a writeblr (writers of tumblr), and I've heard two pretty scary things so far:
1) People are inputting unfinished fan fiction works into AI to generate an ending.
Like... WTF.
2) AO3 now gives you the option to opt out of having an AI scan your work to learn. You're automatically opted in. You have to go to your settings to opt out.
3) Some publishing companies and writing contests have been flooded with AI generated works to the extent that they've had to close their unsolicited submissions inboxes, and either freeze, or simply stop a contest.
People who have no idea how much hard work, time, and effort goes into writing the things they love. Quality work can take years - because people have lives that often influence creative flow and ability to create. My current novel has taken me 4.5 years to write. I'm in round 2 of edits.
I wrote a short story recently for a contest for The Writer's College wherein they were forced in include this in their terms and conditions:
*"Absolutely no generative AI to be used (ChatGPT etc.). If we deem stories were not written by a human they will be excluded, and the author banned from entering all further competitions with us.*
Sucks that this has to be a part of this now, right?
So TL;DR, I'm not against LLM's - they're helpful. I'm against people using them as a lazy shortcut to skip over work that goes into writing - which completely devalues the gauntlet of study and training people go through - and I'm against companies using it to cut expenses off of their budgets.
Similar issues are plaguing the music industry.
In 2002, Michael Jackson joined Al Sharpton at a press conference about how record companies cheat their artists, with the burden falling harder on Black artists, bc most bad things do, but Michael made it clear it was a problem for all artists. This was unusual for Michael, who tended to use his songs to express these ideas. But he’d been particularly fed up with Sony/Epic. He made a few similar appearances.
He pointed out a few things:
1. He named certain artists that were perpetually on tour, because tours generally make more money than record sales, so to avoid going broke, this was necessary. I’d known for a while that the cost of making an album/cassette/cd (the physical products) was mere pennies and companies’s didn’t give artists their fair share of sales. (Michael notoriously hated touring as he got older, the reason’s he gave were correct, but he should have told everyone he had Lupus and between the basic side effects of getting older and Lupus becoming more difficult to cope with as the illness advanced, making touring more physically grueling. As an American, he was covered by the Americans with Disabilities Act. But he never discussed his heath problems unless given no choice. But I digress.)
2. He owned half of Sony’s music catalog, but his contract was almost up. He only had to create one more album, and he was then free to go elsewhere, still owning his half of the Sony/ATV catalog. He said Sony was pretty pissed off about this bc he wasn’t selling his half to Sony. Michael’s most recent release at that point, the highly-underrated _Invincible_ was barely promoted, which Michael recognized as odd given all the effort put into making it, and while Michael was a humble guy, he knew the kind of effort previously used to promote his albums, and expected a similar effort. It was just math: Michael’s albums sold, even if he’d never outdo _Thriller_ , he had a big enough fan base that it made no sense not to make sure this album came as close as possible to Off The Wall/Thriller/Bad/Dangerous.
3. He was showing that both well-established artists, people like Sammy Davis, Jr., Little Richard, and others - legends while they were alive - had to tour endlessly to survive, so if it affected big stars like that, up-and-coming artists and smaller artists would struggle more. He was speaking up for himself as well as all artists.
After this, the trial derailed his efforts, and he of course died 7 years after all this. Now, his estate/Sony started releasing some pretty sketchy posthumous albums of songs MJ never included on his albums, some were completely finished tracks, most were not complete. At least one song was one that Michael had written lyrics for, the music was made, but he never recorded the lyrics, so they hired an MJ impersonator to do the vocals.
This prompted many artists, especially hip hop artists that had run into issues of their own, to start adding a clause in their wills that under no circumstances should their unreleased music be released after they died & they started working to retain or regain ownership of their masters, and encouraged other artists to do the same.
Right now on RUclips, there are channels that use AI to create vocal “performances” of Michael covering songs he never did, and it’s so damn close to his voice.
Now, there are new artists bypassing record labels completely and using RUclips, TikTok, Instagram, and all the streaming services to establish a music career, and having great success. Connor Price isn’t on a label, his lyrics brag about how he will never sign a record company deal because “these are my songs!” Others he sometimes colabas with, like BBNo$ and Nic D. are doing the same. Artists of all genres are doing this.
Writers might want to take a similar approach. Things like Substack and Amazon’s self-publishing service give writers some options. People who write teleplays and movie scrips are definitely another story, I don’t know enough about that process. I’ve read some great stuff from Substack writers, and do subscribe to my favorites for the subscriber-only content. Fantastic fiction and non-fiction.
The potential is there, I think, for writers to do what musical artists and RUclips content creators do - bypass the big companies and market directly to the public. There’s superb content here on RUclips, like this video, entertaining, informative, and many have been able to make this their full time job. Artists like Connor Price is a full-time rapper (I believe he lost his job at the start of the pandemic and with nothing else to do, decided to take a chance at his dream, and it paid off.
This seems like the promised democracy-enhancing internet we were promised finally happening. It’s definitely hard work, maybe harder, than the traditional way. Connor Price makes Shorts with snippets of his songs featured in humorous skits in which he’s playing every character or he and the artist he’s collaborating with playing multiple roles, with links to the full songs on every platform possible, including here. Those Shorts require all the extra work of recording each part separately them putting them together. But, that I discovered him and a bunch of other artists who I listen to on RUclips and Apple Music regularly. (Spotify can pay well, but Apple Pay’s artists better).
I’ve got to think all types of writers must be able to figure out a way to use similar tactics to bypass those inclined to use things like ChatGPT to rip-off writers as you described.
I listened to a podcast by the people behind Some More News (it's called Even More News) about the WGA strike and they went into discussions about how the pay structure for the industry works and how these corporations are trying to use AI to get past the first step without paying writers, and I really think that's messed up. I can see using AI to do some edits to an already written story, but the way they're likely going to cause writers to basically rewrite an AI story for the pay of somebody making minor edits is just wrong. I hope that they comply with your demands and realize how bad the AI actually is at writing.
I'm pretty confident you are ChatGPT.
Although quite reductive, not being as efficient at writing as AI isn't a good argument against any regulation. I'd like to think you and I have the same feelings about the art of writing, but I can't help but acknowledge the fact that we humans, as a whole, optimize everything. If AI is better, I see that being the unfortunate and inevitable trajectory.
TLDR: “The problem is capitalism”
duh
A problem for the future of AI is when AI has produces so much data that it dwarfs our own human inputted data, the models will start lapping over each other and create a loop of no improvement.
This explanation is fantastic! There are hundreds of videos in YT trying to explain how ChatGPT works (lots click baits) but they are so shallow and either overcomplicate or just mention the terminologies that they don't even understand.
This is the best explanatory video that actually tries to simplify so anyone can understand what is behind it.
Fantastic job!
As someone who is getting into "AI", this is simply the best tl;dr of a language model, down to the essence of math that is used. Almost felt like one of my data science prof's class minus the nitty gritty code and stuff like activation, also a little bit more in-depth than 2b1b's vids. Keep up the good work!
I love 2blue 1brown, it's an amazing channel!
@@cameron7374 I'm a huge fan, he's insanely talented, excels in conveying lessons and math, and py stuff. Fun fact: he coded his videos using a python library that he made.
00😊0
😊
@@cameron7374😊
I absolutely love the different aethstetic of this video
brand new rooms!
@@kylehill The Facility Must Grow
uh yeah... the black one was nice :D
At the time you said you were going to get a cat to demonstrate, one of my cats came up to me and politely chirruped to ask me to let him sit in my lap
“And for good reason” exceptionally spot on John Oliver
lol ik right?
Great video! You're really good at breaking down complex topics and making them understandable and interesting.
I like how chatgpt breaks the conversation into chunks and analyses the question or request and gives an expected response based on expected trained replies and doesn't "read" the words typed in.
At least this is how my brain gets it. I'm enjoying the machine learning tools and tech coming out.
I mean, yes, but also, I'd argue that human brains read in a similar way. We start with the symbols, then convert that into a representation of the word based on our neural configurations. Then we propagate that representation through our own neural networks and end up with a representation of the next word we want to say or write; then we convert it back by saying or writing it.
yeah, but we also assign meaning to those symbols, and we choose those symbols because we associate them with things in reality
@@IceMetalPunk man this gets me thinking. You are correct I assume (who knows really thoguh?).
I would just think that there is a lot more cross referencing going on in the human brain.
When we read the word cat. We can translate the symbols into a meaningful concept. We can visualize and or recall memories as well. There is so much going on in the human mind all at once it is crazy.
But maybe it is less complex than it might seem. Probably even God doesn't fully understand it.
So maybe we are just a lot of diffrent neural networks all running all at once and working together to produce the mind.
@@JakalTalk "We choose those symbols because we associate them with things in reality" -- do we? That's how hieroglyphics, and some ideographic languages, work, but that's not universal. Just look at English: how do any of the symbols making up the words you're reading now connect to the concrete objects and abstract ideas they denote?
@@IceMetalPunk When you said "symbols", i thought you were talking about abstract images which stand-in for the words themselves. Seems that you were talking about Letters? Sorry for the confusion!
I just found your page today; really refreshing take on a topic I enjoy. Absolutely subscribed! Cheers, Kyle!
I do really enjoy the way Kyle formats his videos and the way he can explain the most complex of topics and make them easier to understand, thank you for this amazing video.
ChatGPT Explained Completely.
i clicked because I misread the title as "ChatGPT explained comically", but now I'm sticking around for the full explanation
Lol
I mean, there *were* plenty of comical parts, so you misread but were not misled 😄
Magic. Got it.
Kyle might be one of the finest science communicators to ever come out of a test tube.
Joking aside, this is probably the best video I've seen about ChatGPT. I'm also a big fan of the Half-life histories series. ❤
Arvin Ash did an excellent beginner's guide to ChatGPT, which I think is good to watch before this one.
Ok that thumbnail change was AMAZING
The crazy thing to me is that often times when I give ChatGPT prompts, I tend to mix and match the different languages I speak (currently 5) and he understands everything seamlessly. When you talk about the English language, that's ALREADY huge, but ChatGPT does the math with every single language it knows simultaneously. It's truly mind-blowing.
Poorly disguised humble brag
There's no he here, and there's no understanding either.
there is no "he", nor is there any understanding. it's not a person.
@@marcusbrutusv understanding is just a model. It absolutely does have a model. In many ways it is better than yours, e.g. speaking 20 languages at once lol, evaluating code, photographic recall of it's several thousand character context window... in other ways not as much, it can't actively learn new things permanently with the current architecture without access to external tools for example (even with them it still doesn't learn in the way you or I can, with the current architecture, albeit papers like "Augmenting Language Models with Long-Term Memory" are getting closer and closer)
@@darklordvadermort CGPT does NOT think for itself. It runs a pre-defined set of instructions made by humans. It would have to think to understand, and it does not have that ability. There are people who claim otherwise, and all of those people would be the beneficiaries of billions of dollars if they convinced the right people. I am sure there is no connection.
Wel here's the thing... what's a criteria for consciousness, intelligence, and sentience that separates us from AI?
A neural net that communicates through pre-trained data (memories), some algorithmic guidelines (DNA), larger conversational context (this conversation and considering larger society as context too), and finally with some electromagnetic interactions involving randomness and heat we arrive at something we don't fully understand the creation of: language. It seems for us that language is our best indicator of consciousness, so what exactly makes us different from AI? Chemistry? How long before we have chemical computation involved in the process for AI networking?
Heres how I define Conciousness:
Something that can solve every Problem possible given an Infinite amount of Time and, if needed, help from another of its kind.
Was really interesting to hear about the math part like I've already read about a vague idea of what neural networks are and how it's all basically advanced rng, so it was cool to see a video dig a little deeper. Also, I enjoy the whole evolution parallels with AI training where the weights randomly mutate and then get selected for by whatever is most fit for the task.
Carefull, the weight of a neural network are not randomly mutated. During training an algorithm called backpropagation is used to calculate the best way to change the weights so that the networks solves the example shown. This algorithm is the key piece that makes modern neural networks work, you would never get to ChatGPT levels by random mutations or evolutionary algorithms because the network is too big.
Linear Algebra requires a Calc 1 pre-requisite (derivatives will come into play), so you could get there self teaching. You would also need a base understanding of Python to start programming your own and understand different paradigms of machine learning.
@@ernestonoyagarcia2254 interesting thats really cool to know
Great video. It's a hard subject to present using a pop-sci approach and I think you did wonderfully. I think one of the great challenges of these machine learning models is communicating what they're *not* doing. There are a lot of folks who enjoy speculating and they tend to use the passive voice when doing so... which can lead to people thinking about these systems as if they were "thinking," "sentient," or performing the same kind of reasoning that we do. I hope a good, straight forward explanation like this will help calm people down and blow through some of that speculation.
I love this video, really nice to have an explaination which doesn't completely blow everything out of proportion with talk of it being sentient or sapient. I research implimentation of single cell computaions so one thing that always grates be about ML vids is the equation of real neurons to machine learning "neurons"(units from here for clarity). Real neurons have inherent dynamics that articifial units don't and it makes them so complex in comparison. For instance to predict the input/output mapping of a single type of cortical cell you need whole a 4-7 layer deep neural network! There's so much we miss out on rn because the brain processes in time and space (the space of inputs values, not real space), I would recommend Matthew Larkum and colleagues' work on this because is so interesting. Like the units in DNNs are based on a neuroscience model from the 60s, which itself is a huge approximation. Obviously a lot is going on with network weights but the way real brain cells can compose information is so far ahead of what we have atm
I loved this video and I think it's one of the most informative videos about ChatGPT. Thank you.
Great video, loved it.
Can we do a segment perhaps on the ethics of the data that was acquired to train these models? Is it an issue? Is it a legal issue? Is it too late?
One thing, people from underdeveloped & developing countries were hired for AI training on less than US$ 1.5 per hour.
There's little to no regulation for this field as these governments lack frameworks and guidelines for outsourced jobs.
Of course it's unethical. That's axiomatic.
Thanks for making this video. There’s so much people in general don’t understand about ai (myself included) and there’s so much false information and theories based in that lack of understanding. So this kind of information is really valuable.
Great explanation of gpt. The funky multidimensional dataset is referred to as a vector database. Its really useful in machine learning because it allows models to comprehend relationships between words, images etc which lets it do things that cannot realistically be done with conventional algorithms.
I can visualize 80 million dimensions and I am the first human there. I feel as if eldritch sanity is also a thing.
@@michaelchaney2336skill issue
@@michaelchaney2336 are you SCP-6699?
You said that when you ask ChatGPT is given a question, it's not "thinking" it's making statistical calculations to determine the most probable answer. If we don't know how exactly both artificial and biological neural networks work, how can you determine that that's not thinking?
@@slimjimbonko6549that's a real long comment to not actually address the OP. His point hinged on the fact we don't know how human cognition works and therefore can't say whether the AI is doing it differently.
I was so ready for this entire video's script to be written by ChatGBT.
ChatGreatBritain
Kyle isn't a real human. There was once a real Kyle, but he dove to greedily and too deep into the sciences and was merged into the omni AI.. or whatever that thing Bill Gates sold to Elon Musk was.
I think that shtick died out in march when every news article ended with that lol
ChatCBT _😳_
It's self-reinforcing too, once a plethora of LLM AIs start consuming each others output.
So, model merges. That's basically a thing already. It leads to less specific outputs and more hallucinations, right?
Great video Kyle! I went to grad school to study this tech, and this is one of the best descriptions I’ve seen! Even the best ones usually don’t go into as much detail as this - like, I almost never see someone bringing up linear algebra or embeddings! I particularly love that you brought up the Attention Is All You Need paper, and how we still don’t know why attention works so much better than all of the fancy algorithmic tricks we used to have to use like LSTM gates and whatnot.
I will note a tiny correction: at 14:00 I believe GPT is actually outputting a probability distribution over the words, and randomly samples from that for its output. That’s probably (heh) more technical than needed, but it’s worth noting that it isn’t guaranteed to always produce the most likely token. Also, regarding your closing comments, chat GPT hasn’t really “figured out” English, much less human language in general. Case in point are its hallucinations - these show that it is basically nothing more than Searle’s Chinese Room. Speaking of which though, there’s also a reason you can only use chat GPT in English - that’s one of the only languages well resourced enough to train a model like this. There’s a whole bunch of researchers who are studying not just language generation, but are using things like graph theory and latent models to try and produce natural language understanding, systems that aren’t just outputting tokens based on probabilities but are capable of leveraging world knowledge in some way. That’s the sort of thing that might lead to actual AGI, but thats so far off it might as well be cold fusion at this point.
See everyone, that's someone who knows what they're talking about! lol
I really don't like how fast people are to say "AGI is just around the corner". Heck, even Kyle's closing statement implies it a little bit. I blame marketing.
The "positive" in Positive reinforcement refers to adding something to the "environment" to increase the likelihood of a behavior. "Positive" there has nothing to do with "good". Negative reinforcement also increases the likelihood of a behavior (but by removing something from the "environment").
Love your videos, Kyle! Very well explained and surprisingly easy to understand for how complex some of these topics are.
I’d like to see if OpenAI is keeping the inputs users supply as future training sets. Also, I’d like to see the hardware and the parallel code used to run the training loop.
That's why GPT3.5 is free. I'm not sure if they are keeping the actual chat logs, but they are for sure using the chats as a training tool. They have thumbs up / down buttons on the sides of its responses, and I'm sure the user's responses to what is generated is also used for training.
It is, its in the terms and conditions and there's even an option you can use to make your interactions private.
As far as I know, GPT-3 doesn't use user inputs for further training but GTP-4 does
The more I use ChatGPT and learn it’s limitations, the more I’m convinced most of my friends doing office-based service work is going to either get a lot more work while utilizing AI, or get replaced by AI.
It spit out multiple well written essays on a topic that would’ve taken me hours of research to produce. All I had to do was fact check. It was amazing.
That's what I learned too. I asked ChatGPT multiple questions and most of them where right, but not every question. And after I pointed the wrong answer out it did not repeat the mistake.
I guarantee they'll utilize it more. But human variability in input creates issues with AI output. So they can't ever truly replace humans for most tasks.
Especially the creative stuff. And stuff like law.
There's just more to those things than input -> output
@@FIRING_BLIND Surely that's true surely
You will have to learn to use it to increase your productivity. It will be like when word / PowerPoint or even google came out. Anybody taking a week to print out real slides had to learn new tech. Seriously look how people used to create slides with a company printing department
@@thegreedyharvest8796 the words where and were are not interchangeable. Now that I have pointed out the mistake, are you going to make it again?
“Nature had billion years of trial and error “ to come up with something as Complex as neural networks? A most scientific explanation !
We were using heuristic neural nets 20 years ago for autonomous target recognition (ATR) in tactical missile systems. The issue then, as now, is that the truth set is only attainable when the training set is sufficiently truthful. AI typically uses multiplicity of sources (training set weight) to determine veracity, so amplification of deception in human-generated knowledge (which has occurred throughout human history) will completely invalidate an AI answer. As you said, veracity determination remains an issue.
Didn’t a darpa ai sim blow it self up because it maximized the score by minimizing damage from the adversary?
@@DrDeuteron Not familiar with that, but there are almost as many versions of AI as there are projects out there. Any heuristic NN is still only as good as its training set, even if it incorporates continuous feedback, minus an accurate truth set. Else how can it assess its predictions- is the feedback correct?
Kyle Hill is the only reason I appreciate that this is all we got
Honestly the amount of data and effort that this creation required just proves how impressive our own brains are, we will never read nearly this many words but we take much less time and have a conscious understanding of what is hapoening
We also need to consider, with the sheer rate of ChatGPT output released, there is an increasing percentage of ChatGPT material that will function as a resource for future ChatGPT queries. There is a growing potential for ChatGPT to become self-referential. Conceivably there will come a point (if it hasn't already happened) where ChatGPT content will significantly outweigh human output and have the ability to, therefore, shape human perceptions and affct human learning.
ChatGPT is Soylent Green.
And Soylent Green is people.
You are the perfect balance of brains, humor and eye candy. 100 out of 10 content
Amazing video! I would really love to see more of these types of deeper dives on the channel (maybe like the half-life series you have) on a wide range of topics, and the misunderstood history of technology and science.
I'm trying to visualize how this would work when translating between languages. So I'm looking at a corpus of words in a language. Each of these words probably has different meanings (depending on context) so it has a link between the word and a particular meaning (I think you called that a relationship). So for instance, the English word "green" has several meanings. It can refer to a particular color. It can also mean untrained or inexperienced. And also mean an appearance of sickness or nausea (especially after drinking lots of alcohol). The German word for green has both the first two meanings, but not the third. So how would a German version of ChatGPT translate an English sentence like "My roommate looks very green this morning. He must have been out partying all night."
I don't know. But I would instantly imaging that the entire Arquitecture to be reproduced for different languages. I would imagine the same model would not be used for all languages, as ChatGPT can only speak about 50 (according to google) and not all of them.
YES V GOOD SIMPLE EXPLANATION - way better than most I've come across - v helpful. There are other basic aspects of Chatgpt.that could be valuably explained. More about how human feedback alters the weights. Also about "synonymity", V important. The fantastic thing about Chatgpt is that just as it can recognize different languages, so it can recognize STYLES - of language and thought. Rewrite/recognize a passage or text as Hemingway/ Tom Wolfe/ tabloid/ WSJ etc. HOW does it do that? Pls explain. Also V IMPORTANT - it doesnt just recognize combinations of words within sentences, it recognizes combination of SENTENCES, and then of PARAGRAPHS and so on. How does it do that? None of the explanations I've seen incl you cover these dimensions and yet here lies much of the brilliance of Chatgpt - its ability to recognize likely ARGUMENTS, PARAGRAPHS and much else by way of larger laanguage units. Pls Pls explain. Thanks for great work
I believe the solution to the media problem you outlined - what to do with limited attention when there's too much content - is curated content. For example, I trust Kyle and so I trust his videos to be honest and accurate to the extent of his capability.
The question in my mind is this: _how_ do we curate the content?
Ministries of Truth? No thank you.
@@obsidianjane4413huh
The question is: would you prefer the faculty of Harvard or the first 4000 names in the Boston phone book to curate for you?
@@DrDeuteron There's definitely a potential advantage to having a random element in any leadership structure. But, in an ideal world, I'd rather people be selected based on their accomplishments in a particular field, and for those people to identify sources of accurate information in their area of expertise.
@@stranger6822 It's a William F Buckley, Jr. quote.
Great video, as always. One correction, however, ChatGPT and other LLMs don't use "words" but "tokens". One token is usually a few characters. There are less tokens in a language than there are words, so it's easier to work with afaik. However this also means that ChatGPTs 2048 context size is not "words" but "tokens" which is smaller. But it's also not 2048, ChatGPT atm supports up to 4096 and 32k token context size is in development in ChatGPT 4 (if not already out).
Getting science facts from Thor is the only way I want to learn.
If a neural network of certain size can operate at certain complexy level.. As you said: Human brain size can handle human level complexy. eg. Language and some conseps.
Now what happends if(when) someone pulls out a neural network 10x the size?
Can We even comprehend that?
About your knife cuts: you will find that it is much easier to cut if, instead of pressing down with the knife, you make a sawing motion and let the blade do the work. This way, also, you can get much thinner cuts.
fysics!
My sci-fi curiosity makes me wonder if eventually we’ll have AI that are trained on the user, like a personal companion? Privacy issues and ethics aside, having an AI on my phone or on my desktop could be really cool
Speech-to-text softwares do just that. They train on your voice. The more you use it, the more accurate it becomes - for your voice.
Wish granted 😁 my brother doesn't bother talking to me now he has someone called Pi
Well explained! The Kurzgesagt thing got me more than id like to admit.
This video helped a ton. I’m definitely not ready to give a lecture about how it works, but it helps demystify it.
I'm really glad I came across you again. When I saw because science wasn't active anymore it was so sad. Really glad you kept doing science education
That Vaporeon joke is going to ruin a few lives
In regard to determining "cat-ness", I've assumed that it was just due to creating associations between ideas, thoughts, or emotions. Those associations can be strengthened over time or through the nature of the experience itself. (An event triggering PTSD would likely be an example of the latter.) My guess is that if I took a simple drawing of a tree and added some round fruit to it, I could get you to say, "That's an apple tree", by coloring the fruit red or maybe even green. On the other hand, if I then changed the fruit color to orange, you'd likely say, "That's an orange tree." (I might even get some to call it a peach tree.) All I'm doing is working off the associations that we've created for fruit trees and for those specific fruits. Along those lines, I'd probably confuse people if I then changed the color to yellow. "That's... not a banana... is it a weirdly shaped pear?"
Lemon tree. The most confusing color would be blue since there are no naturally occurring foods that are blue. Foods that we label as blue, like blueberries, are actually a deep shade of purple.
@@-._.-KRiS-._.- No, blueberries are actually blue.
@@Mo_Mauveno... they're indigo
You think pears are yellow???
@@TetoSuperFanoff the tree, yea, sometimes
Great explainer video! TLDR:
1 - ChatGPT is part of OpenAI's Generative Pre-trained Transformer series, utilizing huge swathes of text data and neural network magic to understand and generate human-like text.
2 - It stands on the shoulders of advanced AI research, leveraging technology known as Attention Transformers to hone its understanding and generation of relevant text.
3 - While remarkably adept at emulating human conversation, ChatGPT lacks true understanding or sentience, operating instead through statistical modeling of language.
4 - The alignment problem-ensuring AI's values align with human values-remains at the forefront of OpenAI’s design philosophy, aiming to produce helpful, truthful, and harmless outputs.
5 - ChatGPT's societal implications are far-reaching, challenging our perceptions of creativity, authorship, and the trustworthiness of digital content.
To be fair on the whole explosion in usage and popularity it’s gotten half way through one of the most needed upgrades in computer understanding : Context Clues
Yeah... That makes sense.
Best explainer on this I've seen out there. I've said this on a few other videos you've put up. You need to be put in front of a much larger audience than on youtube. You've got the makings of the new age bill nye.. except you know.. actually having degrees to back you up.
Oh btw the promo for your store mid video. Didn't know you had one. In the process of buying the "f*** around find out" shirt. Why? Because its the best example of fuck around and find out I've ever seen.
When CGPGrey said "The current cutting edge is most likely very ‘I hope you like linear algebra’”, I didn't realize just *how much* linear algebra it was.
Nueral networks aren’t linear 😂
Nueral networks aren’t linear 😂
i asked chatgpt to do a roleplay of a guy with dementia, and lost his wife long time ago. It forgots that it has dementia.
Amazingly explained!!
I wouldn't worry too much about the information apocalips because, LLM are predictable (at the moment) in their output. Wich means you can statisticly determine the probability a text was written by an LLM or a human.
You could have mentioned GPT4 wich has image recognition capabilities and kind of spatial awareness, but I understand this video was only about chatGPT.
The problem of information apocalypse is quantity, since a single LLM can generate tons of content incredibly fast, with a few of them on the market they could rapidly drown out human-made content
Pair that with the fact that they are improving and plenty of people are not properly educated enough on the subject to accurately recognize AI content, it could lead to mass misinformation
Totally relevant: How much force would it take to launch a human body like in the new Zelda TOTK when Link launches out of the towers?
Very succinct explanation! Thanks Kyle! I have been recently interested in how language models took in sequences of words and couldn't figure it out on my own. You explained it perfectly here! Thanks!
CHAT GPT says this all day long = (3rd grade question)
I apologize for any confusion caused by my initial response. As an AI language model, I strive to provide accurate and helpful information, but I can occasionally make mistakes or misunderstand certain nuances. I appreciate your understanding and patience. If you have any further questions or need clarification on any topic, please feel free to ask.
I needed this to be explained to me
I think this is one of the best explanations on the internet rn. Incredible job, Kyle!
Thanks for making this video. It’s at the perfect level of abstraction and detail.
I’ve shared it to a bunch of folks in my life to help them learn how to refute some of the AI BS they keep sending me.
I’m excited for the tech. But I share the same view as Tom Scott. This is a start of something which will has as much impact to the world as the internet did. Both good and bad! It’s going to be an interesting few decades ahead!
10/10 opinion
“If you’re not asking this model a question, there’s nothing going on inside, it’s static, head empty.”
I’m sorry, but that’s most humans. This sentence undermines your argument at the 8-min mark.
Great work! I work in neuroscience and work with neural nets in both data sim and analysis. This was a cogent explanation that simplified the perfect amount. I think this is an important level of technical detail to have out there in charismatic video form. Thanks for doing this! You're v good at your job :)
Underpaid interns: “Hey Chat GPT, summarize ChatGPT LLM like I’m a child”
“Wow Kyle you’re so good at your job”
Say what you will, it's a useful level of demystification that I haven't seen in other discussions of it.
It is always interesting to watch this channel as it provides an intriguing look into the biases of the content provider.
This is probably the best crash course on transformer models I've seen to date. It would've been awesome to hear you cover some of the emergent capabilities.
I think the logical reasoning capabilities, while surprising can still be understood as a symptom of "next word prediction" given the scale of the model.
Like you explained in this video, the weights GPT uses to predict words ("wordiness") aren't well understood. It's quite conceivable that training such a huge amount of parameters with so much data has caused the model to derive underlying logical axioms and causality in language.
In short, it's still trying to optimise for "wordiness" but in this case "wordiness" has expanded to capture something about how logic factors into language, thus the model appears to be capable of thought and problem solving.
I tried ChatGPT exactly once and asked it a avian biology question with citations. It used the old species name not the currently recognized one and literally made up citations. The basic biology was correct, but the more detailed information was wrong. At that point I realized it was a tool for misinformation (even if accidental) and I won't touch the thing again.
This reminds me of all the Star Trek TNG episodes where some neural network was given a chance to write itself / grow, and then became some kind of sentient lifeform - such as the Enterprise D's computer, the Exocomps, and of course, Data himself.
Oh, do you remember the episode with Data on the holo-deck in a Sherlock Holmes-setting? If you want to create AI, just say: "Be a worthy opponent for Data."
Can you tell us how this works for stuff like midjourney or visual versions as well? It would probably be great info for artists who are trying to understand how this might impact their artfield.
Ima shorten the answer go what will happen to the industry... Thanks to free no cost effort to make amazing art this industry will cannibalize itself to the point where with an influx of a lot of derivative works done by these ai stealing from artist will create more noise on a noisy sea. It will loose value and the industry will only be sustain by only a few... People will still make art but digital art is already dead and lacks value.... Physical art will be more and more value done by humans because of the experience it gives you and how rare that unique piece will be... The industry is doomed and will get destroyed thanks to Ai image makers.
@@bananamanchuria Correction: Artists will be destroyed. and it's about time. I am glad AI technology has broken the artists' monopoly on the medium of visual expression, concept art, and music. Now people will be able to create "artistic" pieces for their own projects without having to be extorted by middleman artists.
Of course the artists are going to cry about AI "stealing their jobs." But in reality this is good, because it will force artists to develop skills and creativity which exceed that of AI. The surviving artists will only be the best, while the rest are replaced by low cost and easily accessible AI. This will be a massive change for the better.
@@binbows2258 So are you mad about artists being "middlemen" (what does that even mean in this context?) or do you want only the best of the best artists to exist? You do know the best of the best artists often cost the most to commission for their art, yes? That's what you're paying for after all, their skill.
Unless you just want the best of the best to rip off in midjourney, in which case they still "hold a monopoly" on "art" because everyone will be using their art to train whatever AI they like to use...so you don't actually want artists to be "destroyed"....I don't think you understand your own opinion.
Just admit you didnt get accepted into art school, Adolf.
@@binbows2258I don't know what you think an "Artist monopoly" is, but I guarantee you it most likely doesn't exist. Being an artist and being able to do art is possible by anybody. It just may take some people more practice than others.
It's like saying a carpenter has a "monopoly on wood making." Anyone can learn to be a carpenter if they out in the effort and patience to learn.
Dall E, midjourney, stable diffusion use deep learning diffusion model and artificial neural network
If ChatGPT is going to be outputting as much as humans have ever output since the printing press and we are using text from the internet to train large language models then how far away are we from large language models unknowingly training large language models and what affect will that then have on not only the AI but on the future developmenty of language and communication?
It's likely already happening.
And, this is a frightening thought when you consider how many people blindly believe ChatGPT & the like are actually valid sources.
I read that some AI image generators are already is running into this problem. Initially, AI supporters were opposed to separating AI images from human made artworks on art hosting site because they want to see themselves as real 'artist.' However, newer AI models being trained kept getting AI generated images into their dataset, causing a negative reinforcement of bad traits. Now some in the AI community are advocating separating AI images and human artwork so they can more accurately screen out AI images.
There's a research paper that discussed this. Basically learning on ai generated material makes it forget things "on the edges of the bell curve", so I guess it would simplify the outputs. Worse outcomes in general.
@@Pingviinimursu That assumes no human curation of the generations, generated content that makes it online is likely to have been curated by a human, so only the best generation make it into the wild. So, future models will probably improve as a result, as they'll have less noise from the rubbish SEO companies have created over the last decade.
@@PaulBrunt Yeah, that makes sense, but you assume that a meaningful amount of curation is done. Maybe, maybe not. Can't know till we know.
this was super helpful and educational and I've already done tons of research on the subject